Geocoding with PostGis
Sunday, May 1, 2016
For small scale geocoding needs I’ve always used the excellent Google maps API, but I recently wanted to do some custom geocoding and return the nearest NHS trust based on a user’s postcode. This was a nice opportunity for me to learn a bit about PostGIS, an amazing spatial database extension for PostgreSQL databases with an impressive line-up of features: http://postgis.net/features/.
NHS data is available from a variety of sources; I wrote a bunch of C# scripts to read, clean, and save the data I wanted which aren’t particularly interesting. Postcode data for the UK is available from the ONS:
(There are many other useful spatial data sets there as well.)
Each postcode comes with an easting and a northing, which isn’t the format I wanted to work with, but PostGIS makes it easy to convert between geographies.
First I created a postcode table in my PostGres database (with PostGIS extension enabled) with the following properties:
CREATE TABLE postcode( postcode varchar(10) unique primary key, northing integer, easting integer, longitude decimal(18,6), latitude decimal(18,6), location geometry );
Then wrote a script to insert the postcode data line by line using insert statements like:
INSERT INTO postcode(postcode, easting, northing, location) VALUES ('AB101AA',394251,806376, ST_SetSRID(ST_MakePoint(394251,806376),27700))
The key thing here is to specify the geography 27700, because we’re creating a point from the easting and northing values.
Once all the postcodes are in, the following commands will populate the latitude and longitude columns and set the location column to use the lat/lng geometry instead of the easting/northing one. So you’re converting between geography 27700 and 4326.
UPDATE postcode SET longitude=ST_x(ST_Transform(location,4326)), latitude=ST_y(ST_Transform(location,4326)); UPDATE postcode SET location = ST_SetSRID(ST_MakePoint(longitude, latitude),4326);
Finally, because I knew I wouldn’t need the easting and northing values, I dropped those columns. And that’s the reference table of postcodes set up.
Once you’ve got other tables of entities with locations (and PostGIS supports all kinds of geometries) it’s easy to build queries based on postcodes.
For example, to find the area a postcode is in from a table of areas with locations that are polygons:
SELECT * FROM area WHERE st_contains(location, (SELECT location FROM postcode WHERE postcode = 'SW8 5PZ'));
Or to find the nearest entity from a table of entities with single point locations:
SELECT (SELECT e FROM entity AS e ORDER BY st_distance(p.location, e.location) limit 1) FROM postcode AS p WHERE p.postcode = 'SW8 5PZ';
Finally, a really nice function is ST_AsGeoJSON; return a geometry field in GeoJson format:
SELECT ST_AsGeoJson(location) AS location FROM postcode WHERE postcode = 'SW8 5PZ';
Code for my geocoder, including all the sql I used to populate the database is up on Github: https://github.com/labourdigital/labour-geocoder
And here’s an app we built over at the Labour Party using it: