Import UK postcodes from ONSPD instead of Code-Point Open #216
Commits on Dec 15, 2015
-
Script for importing UK postcodes from ONSPD
Importing everything from one source reduces what we have to download: we already need ONSPD to get the NI and Crown Dependency postcodes, which aren't in Code-Point Open. Other advantages are that ONSPD contains all live and terminated postcodes and (as of the Aug 2015 release at least) every code is a GSS code, rather than a mix of ONS and GSS codes.

We add a toggle to allow importing terminated postcodes, as it can be useful to have old postcodes in the db to allow searching on old addresses.

Note that although Code-Point Open doesn't include terminated postcodes, we can still end up with them in our dataset, but only in a long-lived database that has imported multiple releases of the dataset. For example, if we import the May 2012 dataset and then import the Aug 2012 one when it comes out, our db will contain the postcodes that were terminated between those two releases, but none that were terminated before May 2012. As a result, a db rebuilt from scratch using the current dataset would have a different set of postcodes from one that has been around for a few years and has imported each release as it arrived. Notionally both represent the current data, but one has more postcodes. Using ONSPD and allowing terminated postcodes fixes this problem.
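A minimal sketch of the terminated-postcode toggle described in this message. It assumes the termination date lives in a `doterm` column that is blank for live postcodes; the function name `should_import`, the `include_terminated` flag and the sample rows are hypothetical and are not taken from the importer itself.

```python
import csv
import io


def should_import(row, include_terminated=False):
    """Decide whether a postcode row should be imported.

    Assumption: `row` is a dict with a `doterm` key that is blank for live
    postcodes and holds a termination date otherwise.
    """
    terminated = bool(row.get("doterm", "").strip())
    return include_terminated or not terminated


# Tiny made-up sample; real ONSPD rows carry many more columns.
SAMPLE = "pcd,doterm\nAB1 0AA,199606\nAB10 1AB,\n"

rows = list(csv.DictReader(io.StringIO(SAMPLE)))
live_only = [r["pcd"] for r in rows if should_import(r)]
everything = [r["pcd"] for r in rows if should_import(r, include_terminated=True)]
```

With the toggle off only the live row survives; with it on both rows are kept, which is what lets a freshly built db match a long-lived one.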
Commit b2d1d6d
-
Allow detecting location availability per postcode row
Previously, if `--no-location` was not set we would try to detect a location for each row, and this would break if the location fields could not be coerced into floats. Some datasets mix location and non-location postal codes, and to import them all we had to filter the data and run the importer twice.

This change allows individual importers to implement `location_available_for_row` to say whether the supplied row has location data or not. The method is called on each row; if it reports that we can't extract location fields for that row, the row takes the `--no-location` path. If `--no-location` is set, we always run that path, regardless of the `location_available_for_row` value.
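The hook described here can be sketched roughly as follows. Only the method name `location_available_for_row` and the `--no-location` behaviour come from the message; the class names, the row layout (postcode, easting, northing) and `handle_row` are illustrative assumptions, not the importer's real code.

```python
class PostcodeImporter:
    """Hypothetical stand-in for the importer base class."""

    def __init__(self, no_location=False):
        self.no_location = no_location

    def location_available_for_row(self, row):
        # Default: assume every row carries usable location fields.
        return True

    def handle_row(self, row):
        postcode = row[0]
        # The global flag wins; otherwise the per-row hook decides.
        if self.no_location or not self.location_available_for_row(row):
            return (postcode, None)
        return (postcode, (float(row[1]), float(row[2])))


class MixedDatasetImporter(PostcodeImporter):
    """Example subclass for a dataset that mixes both kinds of row."""

    def location_available_for_row(self, row):
        # Rows with blank coordinate fields have no location.
        return bool(row[1].strip()) and bool(row[2].strip())
```

Because the check happens before any `float()` conversion, rows without coordinates no longer break the import and a single pass over the file is enough.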
Commit 74aa673
-
Provide fixture data for NI council / electoral / wards
In April 2015 councils and wards changed in Northern Ireland, so the old ni-electoral-areas data files no longer represent the truth. The new ni-electoral-areas-2015 file provides the names and GSS codes of the new Districts, Electoral Areas, and Wards of Northern Ireland following the Apr 2015 reorganisation.

We synthesized this from a few datasets.
** The District -> Electoral Area -> Ward breakdown is taken directly from the legislation[2], although this only contains the names.
** The GSS codes of the Districts and Wards are taken from the "Wards (2015) to district council areas (2015) NI lookup"[3] dataset provided by the ONS.
** The GSS codes of the Electoral Areas are taken from running SPARQL queries against the ONS Linked Data Portal[4]. The query used was:

    PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
    PREFIX geography: <http://statistics.data.gov.uk/def/>
    PREFIX statistical-entity: <http://statistics.data.gov.uk/def/statistical-entity#>
    SELECT DISTINCT ?item WHERE {
      ?item rdf:type geography:statistical-geography .
      ?item statistical-entity:code <http://statistics.data.gov.uk/id/statistical-entity/N10> .
    }
    ORDER BY ASC(?code)
    LIMIT 100 OFFSET 0

N10 is the code for a NI Electoral Area (found by looking up one of the areas by name and investigating). We then extract the data for each entry in this result set and extract the GSS code to match up with the name.

This data will be used to help import NI areas from the OSNI boundary datasets, which do not all contain GSS codes.

[1]: https://geoportal.statistics.gov.uk/geoportal/catalog/search/resource/details.page?uuid=%7B2196C1D5-6A11-47DE-BD0E-26311E3D6D9F%7D
[2]: http://www.legislation.gov.uk/uksi/2014/270/made
[3]: https://geoportal.statistics.gov.uk/geoportal/catalog/search/resource/details.page?uuid=%7BFE83C43C-9403-408C-833C-367BC56659C9%7D
[4]: http://statistics.data.gov.uk/
Commit 32d772e
-
Allow importing postcodes with no location
Some rows in the ONSPD have no location: those where the 12th column is a '9'. The existing importers for NI and GB postcodes automatically ignore these, but it can be useful to include them for existence checks even if we can't give more geographical information about them.

We implement `location_available_for_row` to return `False` if the quality field is `'9'`, and `True` otherwise. This lets us import rows that have no location if we want to. To keep the Code-Point Open import behaviour we provide an option `--allow-no-location-postcodes` to turn this on; without this option the importer will not import these rows.
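As a rough illustration of the rule in this message, the sketch below treats the 12th column as the quality indicator and `'9'` as "no location". The helper `should_import_row` and the sample rows are made up for the example; only `location_available_for_row` and the option name come from the message.

```python
QUALITY_COLUMN = 11   # 12th column, zero-based, as stated in the message
NO_LOCATION = "9"     # quality value meaning "no location available"


def location_available_for_row(row):
    return row[QUALITY_COLUMN] != NO_LOCATION


def should_import_row(row, allow_no_location_postcodes=False):
    """Hypothetical filter mirroring `--allow-no-location-postcodes`."""
    return location_available_for_row(row) or allow_no_location_postcodes


# Made-up 12-column rows; only the first and last fields matter here.
ROW_WITH_LOCATION = ["AB10 1AB"] + [""] * 10 + ["1"]
ROW_WITHOUT_LOCATION = ["JE1 1AA"] + [""] * 10 + ["9"]
```

By default a row with quality `'9'` is skipped, matching the Code-Point Open behaviour; passing `allow_no_location_postcodes=True` lets it through without coordinates.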
Commit 7f0f41e
-
Merge onspd and nspd_crown_dependencies importers
Now that we can handle rows with no location we don't need a separate importer for Crown Dependencies. We add a new `--crown-dependencies` option to the onspd importer that takes one of the following values: `include`, `exclude`, or `only`.

The default is `exclude`, which retains the previous behaviour of the onspd importer: it imports only GB postcodes and requires a second run to import the Crown Dependency ones. Choosing `include` means we import all the GB + Crown Dependency postcodes in one pass (we still ignore the NI ones though). The `only` option means we import only Crown Dependency postcodes; this gives us the behaviour of the old nspd_crown_dependencies importer, which is useful should you choose to import GB postcodes from the Code-Point Open dataset.

Note that Crown Dependency postcodes (currently) have no location information in ONSPD. While the importer has an option for allowing postcodes with no location to be imported (`--allow-no-location-postcodes`), this option has no effect on the importing of Crown Dependency postcodes: they are imported solely based on the value of the `--crown-dependencies` option, as outlined above. This could be confusing, but the help text should make it clear.
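The three modes can be sketched as a small filter. This is an illustration only: it assumes Crown Dependency postcodes can be recognised by their postcode area (GY, JE and IM), which may not be how the importer detects them, and it leaves NI handling out entirely. The function names are hypothetical.

```python
# Postcode areas for Guernsey, Jersey and the Isle of Man.
CROWN_DEPENDENCY_PREFIXES = ("GY", "JE", "IM")
MODES = ("include", "exclude", "only")


def is_crown_dependency(postcode):
    return postcode.strip().upper().startswith(CROWN_DEPENDENCY_PREFIXES)


def wanted(postcode, crown_dependencies="exclude"):
    """Return True if the postcode should be imported under the given mode."""
    if crown_dependencies not in MODES:
        raise ValueError("Unknown mode: %r" % (crown_dependencies,))
    crown = is_crown_dependency(postcode)
    if crown_dependencies == "only":
        return crown
    if crown_dependencies == "exclude":
        return not crown
    return True  # "include": take both kinds of postcode
```

Note that the decision depends only on the mode, never on whether the row has a location, which mirrors the interaction with `--allow-no-location-postcodes` described in the message.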
Commit 8dafb12
-
Add command for importing osni boundary data
This is mostly a copy of import_boundary_line, but instead of taking one file as an argument we specify which files to work on as options, because the OSNI releases are single files per shape, unlike Boundary-Line which has everything in one folder. We could expect a user to put everything in one folder first, but we need to handle each shapefile differently depending on how we expect it to be used. For example, the Westminster Parliamentary constituency boundaries have the name in PC_NAME and the GSS code in PC_ID, whereas the LGD file has them in LGDNAME and LGDCode.

We add a new codetype for identifying the OSNI id of the boundaries and a new nametype for the OSNI names.
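One way to picture the per-shapefile handling is a lookup from area type to attribute names. The field names below are the ones quoted in the message; the dictionary, the function and the sample feature (a plain dict with placeholder values) are assumptions made for illustration.

```python
# Attribute names per area type, as quoted in the commit message.
FIELDS_BY_AREA_TYPE = {
    "WMC": {"name": "PC_NAME", "gss": "PC_ID"},
    "LGD": {"name": "LGDNAME", "gss": "LGDCode"},
}


def extract_name_and_gss(area_type, feature):
    """Pull the name and GSS code out of a feature's attributes.

    `feature` is modelled as a plain dict of shapefile attributes.
    """
    fields = FIELDS_BY_AREA_TYPE[area_type]
    return feature[fields["name"]], feature[fields["gss"]]


# Placeholder values, not real OSNI data.
sample_wmc = {"PC_NAME": "EXAMPLE CONSTITUENCY", "PC_ID": "X06000001"}
sample_lgd = {"LGDNAME": "Example District", "LGDCode": "X09000001"}
```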
Commit 503745d
-
Add option for importing OSNI ward boundaries
Name is in WARDNAME and GSS code is in WardCode. We did see them publish this data in a set that had no WardCode, but did have LGDName so we could use our ni-electoral-areas-2015.csv fixture to do a name match to get the GSS code. Hopefully though they won't publish that dataset again.
Commit d861f29
Commits on Dec 16, 2015
-
Import Northern Ireland Assembly constituencies
We process the Westminster file twice, once to generate areas with type WMC and then again to generate areas with type NIE. This is what the old shape-less importer did so I assume it's still good. These NIE areas don't have GSS codes because they are legally identical to the Westminster constituencies and don't have a code issued by the ONS.
Commit 1f4b13a
-
Import NI Electoral Areas (LGE)
These are the areas that live between LGD (NI Councils) and LGW (NI Wards). The shapefile released by the OSNI does not include GSS codes so we have to synthesize them by doing a name lookup against the ni-electoral-areas-2015.csv fixture. We've asked OSNI if they plan to expose GSS codes in this dataset.
Commit b861d8f
-
Note that importing this area generates some warnings about invalid geometry, but the existing "fix_invalid_geos_geometry" is able to turn the data into something that it considers valid.
Commit 47a26c3
-
Commit b924486
-
Simplify field extraction by using objects not hashes
This means we don't need to deal with strings vs callables and can isolate complexity to the area codes that need it (LGE and NIE).
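A sketch of what "objects not hashes" could look like, under the assumption that each area type gets a small extractor object and only the awkward types (LGE and NIE) override behaviour. All class names, the `NAME` field and the sample values are hypothetical; the commit's real classes may differ.

```python
class FieldExtractor:
    """Reads the name and GSS code straight from shapefile attributes."""

    def __init__(self, name_field, code_field):
        self.name_field = name_field
        self.code_field = code_field

    def name(self, feature):
        return feature[self.name_field]

    def gss_code(self, feature):
        return feature[self.code_field]


class NoCodeExtractor(FieldExtractor):
    """For area types with no GSS code at all (NIE in the messages above)."""

    def __init__(self, name_field):
        super().__init__(name_field, None)

    def gss_code(self, feature):
        return None


class LookupCodeExtractor(FieldExtractor):
    """For area types whose code comes from a name lookup (LGE)."""

    def __init__(self, name_field, codes_by_name):
        super().__init__(name_field, None)
        self.codes_by_name = codes_by_name

    def gss_code(self, feature):
        # Case-insensitive match against a prepared name -> code mapping.
        return self.codes_by_name.get(self.name(feature).lower())


lgd = FieldExtractor("LGDNAME", "LGDCode")
nie = NoCodeExtractor("PC_NAME")
lge = LookupCodeExtractor("NAME", {"example area": "X10000001"})
```

Callers always invoke `name()` and `gss_code()`, so they never need to check whether a mapping entry is a string or a callable.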
Commit 5f19d5f
Commits on Dec 17, 2015
-
Allow specifying SRID of OSNI imports
From looking at the shapefiles in a viewing tool (QGIS) it appears that the co-ords are in the NI projection (29902) in some files, but in the 102100 projection in others. For some files this matches the ArcGIS metadata pointed to by the OSNI for each release (e.g. the LGEs dataset[1] points to the following metadata as its source[2], which lists 29902 as the projection, and the LGW dataset[3] points to source metadata[4], which lists the 102100 projection). For others, however, this is not true (e.g. the LGDs dataset[5] has source metadata[6] that says 29902, but the downloaded shapefile is actually in 102100).

As this appears to be changeable per release (possibly per download), we allow for telling the importer what srid a given file is in. If it's not 29902 we convert it to 29902 before importing so that everything is consistent. Note that for some reason the geometry imported from the shapefiles does not contain an SRID, so we have to set it even if we don't have to transform it.

As a further wrinkle, PostGIS doesn't support 102100, but it is mathematically equivalent to 3857, which it does support. Unfortunately using that projection causes failures during point-based lookup of parents, but if we use 4326 instead it works. Apparently 102100 and 4326 are both "web mercator" projections so are probably very similar (if not exactly mathematically equivalent). Interestingly, opening a shapefile that is in 102100 in a viewing tool such as QGIS reports it as 4326, whereas a 29902 one reports as a custom projection that is identical in all but name to 29902. This suggests it's safe to use 4326 as a replacement for 102100.

The defaults we set for the options are based on the SRIDs of the data files we downloaded in Dec 2015; they may change over time.

[1]: http://osni.spatial-ni.opendata.arcgis.com/datasets/981a83027c0e4790891baadcfaa359a3_4
[2]: https://gisservices.spatialni.gov.uk/arcgisc/rest/services/OpenData/OSNIOpenData_LargescaleBoundaries/MapServer/4
[3]: http://osni.spatial-ni.opendata.arcgis.com/datasets/55cd419b2d2144de9565c9b8f73a226d_0
[4]: https://services3.arcgis.com/dNsInyVNGMqG1QjF/arcgis/rest/services/OSNI_Open_Data_Largescale_Boundaries_Wards_2012/FeatureServer/0
[5]: http://osni.spatial-ni.opendata.arcgis.com/datasets/a55726475f1b460c927d1816ffde6c72_2
[6]: https://gisservices.spatialni.gov.uk/arcgisc/rest/services/OpenData/OSNIOpenData_LargescaleBoundaries/MapServer/2
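The SRID decisions in this message can be summarised in a small pure-Python sketch. It only works out which srid to stamp on the geometry and whether a transform is needed; it does not touch any geometry. The function name `plan_import` and the constants are illustrative, while the 102100 to 4326 substitution and the 29902 target are the ones described in the message.

```python
TARGET_SRID = 29902                 # Irish grid, what everything is converted to
SRID_SUBSTITUTES = {102100: 4326}   # stand-in used because PostGIS lacks 102100


def plan_import(declared_srid):
    """Return (srid_to_set, needs_transform) for a shapefile.

    The srid is always set, because the imported geometry carries none;
    a transform is only needed when it differs from the target.
    """
    srid = SRID_SUBSTITUTES.get(declared_srid, declared_srid)
    return srid, srid != TARGET_SRID
```

With GeoDjango this would roughly correspond to assigning the geometry's `srid` and then calling its `transform` method when the second value is true, though the importer's actual calls are not shown in the message.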
Commit 645a722
-
Handle NI postcodes in ONSPD importer
We add an option to allow importing the NI postcodes at the same time as the rest of the postcodes. The option takes 3 values, 'include', 'exclude' and 'only', with the same behaviour as the --crown-dependencies option:
* 'include' will import NI postcodes
* 'exclude' will not import NI postcodes
* 'only' will only import NI postcodes

The default is 'exclude' to maintain previous behaviour. Unlike Crown Dependency postcodes, NI postcodes might have location data, and so the --allow-no-location-postcodes setting (default false) does affect how we import NI postcodes.

Setting both --crown-dependencies and --northern-ireland to 'only' is an error and will halt the importer before it begins. We've also updated the documentation provided by the options to be clearer about how the various options interact.

Unlike the old nspd_ni importer, which relied on the nspd_ni_areas importer having been run and on ni-electoral-areas.csv to directly assign areas to NI postcodes, this importer has no special handling. We assume that the new OSNI importer has been run and the relevant shapefiles have been imported, much like we assume that the boundary-line importer has been run to provide the areas for the rest of the UK.
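The option handling described here, including the "both set to 'only'" error, might look something like the following. It uses plain `argparse` as a stand-in for the management command's option parsing, so the wiring is an assumption; the option names and defaults are the ones given in the message.

```python
import argparse

MODES = ("include", "exclude", "only")


def build_parser():
    parser = argparse.ArgumentParser(description="ONSPD import options (sketch)")
    parser.add_argument("--crown-dependencies", choices=MODES, default="exclude")
    parser.add_argument("--northern-ireland", choices=MODES, default="exclude")
    parser.add_argument("--allow-no-location-postcodes", action="store_true")
    return parser


def parse_options(argv):
    parser = build_parser()
    options = parser.parse_args(argv)
    # Two "only" filters would exclude every row, so stop before importing.
    if options.crown_dependencies == "only" and options.northern_ireland == "only":
        parser.error(
            "--crown-dependencies and --northern-ireland cannot both be 'only'"
        )
    return options
```

`parser.error` prints the message and exits, so the conflicting combination halts the run before any rows are read.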
Commit eef599c
-
Allow specifying srid for GB vs. NI postcodes
The --gb-srid and --ni-srid options have sensible defaults (27700 and 29902 respectively). They override the --srid option on a per-row basis, depending on whether the postcode is for Northern Ireland (e.g. starts with BT) or not.
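A minimal sketch of the per-row choice, assuming the "starts with BT" test mentioned in the message is the whole rule. The function name is hypothetical; the default values are the ones quoted.

```python
def srid_for_postcode(postcode, gb_srid=27700, ni_srid=29902):
    """Pick the srid for a row based on whether the postcode is in NI."""
    is_northern_ireland = postcode.strip().upper().startswith("BT")
    return ni_srid if is_northern_ireland else gb_srid
```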
Commit 91ca8f1
-
Provide ONSPD version of scilly command
By checking the lengths of the rows, one command can work on both Code-Point Open and ONSPD files when dealing with Scilly wards.
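The row-length check could be sketched as below. The message does not give the lengths involved, so the 10-column figure for Code-Point Open and the "anything longer is ONSPD" rule are assumptions made for this example, as is the function name.

```python
CODE_POINT_OPEN_COLUMNS = 10  # assumption: Code-Point Open rows have 10 fields


def detect_format(row):
    """Guess which dataset a CSV row came from using only its length."""
    if len(row) == CODE_POINT_OPEN_COLUMNS:
        return "code-point-open"
    if len(row) > CODE_POINT_OPEN_COLUMNS:
        return "onspd"  # assumption: ONSPD rows are always wider
    raise ValueError("Unrecognised row length: %d" % len(row))
```

Once the format is known, the command can read the ward code from whichever column that dataset uses.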
Commit c55e7f6
Commits on Dec 18, 2015
-
Add script for adding GSS codes to NI Areas
Some mapit installations already have NI Areas, with or without boundaries, but these areas may not have GSS codes. This script uses the ni-electoral-areas-2015.csv hierarchy to find LGDs, LGEs, and LGWs by name and add their GSS codes. Because names are not necessarily unique, it respects the hierarchy in the fixture. If a name cannot be found (it does a case-insensitive lookup) the row is ignored and a warning is issued.
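To show why the hierarchy matters, the sketch below keys wards on (district, electoral area, ward) so that two wards with the same name in different districts stay distinct. The column names and rows are invented for the example and do not reflect the real layout of ni-electoral-areas-2015.csv; the function names are hypothetical.

```python
import csv
import io

# Made-up fixture rows; the real column layout is an assumption.
FIXTURE = """\
lgd_name,lgd_gss,lge_name,lge_gss,lgw_name,lgw_gss
District A,X09000001,Area One,X10000001,Abbey,X08000001
District B,X09000002,Area Two,X10000002,Abbey,X08000002
"""


def build_ward_index(fixture_text):
    """Map lower-cased (district, electoral area, ward) to the ward's code."""
    index = {}
    for row in csv.DictReader(io.StringIO(fixture_text)):
        key = (
            row["lgd_name"].lower(),
            row["lge_name"].lower(),
            row["lgw_name"].lower(),
        )
        index[key] = row["lgw_gss"]
    return index


def ward_gss(index, district, electoral_area, ward):
    """Case-insensitive lookup that warns and returns None on a miss."""
    key = (district.lower(), electoral_area.lower(), ward.lower())
    gss = index.get(key)
    if gss is None:
        print("Warning: no match for %s / %s / %s" % (district, electoral_area, ward))
    return gss


ward_index = build_ward_index(FIXTURE)
```

Looking up "Abbey" on its own would be ambiguous here; including the parent names picks the right ward.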
Commit 822f443
-
Add script for adding names to NI Areas
The names in the OSNI data don't always match the names for the same areas in the ni-electoral-areas-2015.csv fixture, which was extracted from the legislation. In some cases it's just a difference in capitalisation, or a lack of punctuation. In others the names are completely different. For example, in the fixture GSS N09000011 is called "North Down and Ards" but in the OSNI shapefile it is called "East Coast". It turns out this is because the council voted to change the name to the OSNI one, but backed down after an outcry and reverted[1].

This script goes through the fixture and matches on GSS code to find the Areas, adding a new override name to the area if the fixture name is not already present.

[1]: http://www.belfasttelegraph.co.uk/news/northern-ireland/backlash-forces-council-to-ditch-new-east-coast-name-that-cost-thousands-30902221.html
Commit 2617420
-
Correct LGD names in ni-electoral-areas-2015.csv
Mostly this is just extending the name to include the council type (District, Borough, or City), similar to naming of some council areas in the rest of the UK. In the case of "North Down and Ards" we also rename to their final name choice of "Ards and North Down". For "Derry and Strabane" and "Armagh, Banbridge and Craigavon" we also include "City" in the appropriate place ("Derry City" and "Armagh City").
Commit 99847c4
-
Make adding gss codes to ni areas work for real data
We incorporate the feedback from mySociety about running the `mapit_UK_add_gss_codes_to_ni_areas` command against their real data. Because we're doing name matches we need to change our naive `names__name__iexact` match and sanitize the data a bit.
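The message does not say which clean-ups were applied, so the following is only a guess at the kind of normalisation that makes name matching more forgiving than a plain case-insensitive comparison. The function `normalise_name` and its rules (lower-casing, treating "&" as "and", dropping punctuation, collapsing whitespace) are assumptions.

```python
import re


def normalise_name(name):
    """Reduce a name to a comparable form (hypothetical rules)."""
    name = name.lower().replace("&", " and ")
    # Replace anything that is not a basic letter, digit or space.
    name = re.sub(r"[^a-z0-9 ]", " ", name)
    # Collapse runs of whitespace left behind by the replacements.
    return " ".join(name.split())
```

Two names are then treated as the same area when their normalised forms are equal. Note that this simple version also strips accented characters, which a real implementation might want to keep.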
Commit f4e27f3
Commits on Jan 4, 2016
-
Allow add_x_to_ni_areas scripts to work on 1st import
If there are no active generations we set the "current" generation to the "new" generation. Otherwise we try to find objects in the 0th generation and this won't work.
Commit c159ba1
Commits on Jan 5, 2016
-
Convert geometry to application projection in NI shape imports
We used to import all the NI shapes with an srid of 29902 (the Irish grid [1]). For some reason, when the geometry was extracted from the DB it was in 27700 (the GB grid [2]) but had not undergone any transformation from 29902 to 27700. Consequently the NI shapes were in the wrong place (covering Liverpool, North Wales and some of the Irish Sea).

It's not clear how this happened, but we can fix it by always transforming the NI shape data from whatever srid it is provided in into the 27700 srid used by the rest of the UK data. Note that we actually use the `settings.MAPIT_AREA_SRID` srid and not 27700 directly: in most cases of a UK instance of mapit this will be 27700, but on the off chance it's not we don't want things to break.

[1]: http://spatialreference.org/ref/epsg/tm65-irish-grid/
[2]: http://spatialreference.org/ref/epsg/27700/
Commit 0730197