This project loads a file in DTF7.3 format into a database.
DTF7.3 is the Data Transfer Format specification used by local councils to export their address data from their LLPG, in a format matching BS 7666:2006.
The data is loaded into a set of database tables - BLPUs, LPIs and Street Descriptors making it easy to list all the addresses and use sql to query them.
The code also supports DTF delta files, where records are updated and deleted as well as inserted.
The project uses a command line Visual Studio 2015 project with .Net 4.5.2 and Sql Server 2008 R2 and has been designed to be easy to port to different versions/technologies.
It will likely work with most versions of Sql Server, and also earlier versions of .Net.
Once compiled, go to the directory containing the executable (usually bin\debug) and in a command window run a command of the format
parsedtf loadfile "[ConnectionString]" "[FullFileName]"
- ConnectionString - the sql connection string
- FullFileName - the full file name including drive and path
e.g.
parsedtf loadfile "data source=localhost;initial catalog=DTF;integrated security=True;" "C:\DtfFiles\llpg.csv"
There are a few basic test files in the TestDtfFiles folder to try out.
I couldn't find any open source code to do this, yet it's used by all councils in the UK.
Once the addresses are in a database then it opens up the possibility of
- health checks on the LLPG data, making sure fields are populated correctly
- bulk fuzzy address matching to get UPRNs from free typed address text
- address lookup widgets
- Create a blank Sql Server database
- Run Db\Schema.sql to create the schema
- Run Db\Data.sql to populate the tables
- Compile the project
- Run a command line as in the Code Example above
Theres only two functions currently
-
parsedtf loadfile "[ConnectionString]" "[FullFileName]" - loads the DTF file into the database
-
parsedtf clearall "[ConnectionString]" - resets the database
Tables
-
DtfLine - contains all lines from all files that have been loaded, acts like a log.
-
BLPU - Basic Land and Property Unit - DTF Record Identifier 21
-
LPI - Land and Property Identifier - DTF Record Identifier 24
-
StreetDescriptor - Street Descriptor - DTF Record Identifier 15
-
StreetRecord - Street Record - DTF Record Identifier 11
-
RecordIdentifier - a list of all the DTF Record Identifiers
-
Version - a record is generated for each file loaded, and versionid used to refer back to filename / date time stamps
Views
- vwAddresses - contains all the address fields plus record statuses and BLPU Codes
- vwLpi - LPI fields with concatenated PAON and SAON number and description fields
- vwDtfLine - all the DTF lines that have been loaded along with datestamps and source filenames
Stored Procedures
- spTruncateEverything - resets database
Run these from the command line replacing [path] with your path
-
parsedtf loadfile "data source=localhost;initial catalog=DTF;integrated security=True;" "C:[path]\ParseDtf\TestDtfFiles\insert.csv"
-
Check the street record table - it should have 3 records in it
-
parsedtf loadfile "data source=localhost;initial catalog=DTF;integrated security=True;" "C:[path]\ParseDtf\TestDtfFiles\update.csv"
-
The record with USRN = 20900161 has been updated so Street_Start_x = 666666.00, also it's versionId has increased
-
parsedtf loadfile "data source=localhost;initial catalog=DTF;integrated security=True;" "C:[path]\ParseDtf\TestDtfFiles\delete.csv"
-
The record with USRN = 20900162 has been deleted
-
Check the version table and you can see what files were run and when. Messsages field gives you a full breakdown or insert/deletes. Note an update is just a delete followed by an insert.
Please feel free to contact me through Github
MIT