Downloading OS OpenData (GIS) in Bulk

The Ordnance Survey have begun to release their data as free to use. I think this is great, and despite what some say, I think the selection is good as well. I can see why MasterMap and AddressBase, their flagship products, still cost money; there are some significant costs associated with producing them.

There is, however, still a bit of a barrier to entry when using OS OpenData. This guide will hopefully make downloading the data a little easier.

OS OpenData needs to be ordered from:

https://www.ordnancesurvey.co.uk/opendatadownload/products.html

There is no cost, but a valid e-mail address is required.

For my purposes I started with OS Street View® and OS VectorMap™ District for the whole country.

You will receive an e-mail with the download links for the data. Unfortunately these are individual links, since the data is split up into tiles: 55 in total for the whole of the UK. So we want some way to automate downloading them.

In comes curl, made by some nice Swedes. curl is a command-line tool that will download the URL you give it.

Attempt number 1:

curl http://download.ordnancesurvey.co.uk/open/OSSTVW/201305/ITCC/osstvw_sy.zip?sr=b&st=2013-10-03T18:51:25Z&se=2013-10-06T18:51:25Z&si=opendata_policy&sig=YOUR_UNIQUE_CODE_HERE

Result:

[1] 20182
[2] 20183
[3] 20184
[4] 20185
[2]   Done                    st=2013-10-03T18:51:25Z
vesanto@hearts-lnx /data/OS $ ResourceNotFound The specified resource does not exist.
RequestId:54953719-5e43-4791-aa60-51dee61f27ac
Time:2013-10-03T19:20:04.3403546Z

Not quite there. The unquoted ampersands are interpreted by the shell (each one sends a job to the background, which is what those [1] 20182 lines are), so the URL gets cut short. The command was missing the quotes:

curl 'http://download.ordnancesurvey.co.uk/open/OSSTVW/201305/ITCC/osstvw_sy.zip?sr=b&st=2013-10-03T18:51:25Z&se=2013-10-06T18:51:25Z&si=opendata_policy&sig=YOUR_UNIQUE_CODE_HERE'

This was a success, but curl started writing the file straight into the terminal (Ctrl+C stopped it):

[Screenshot from 2013-10-03 20:16:12]

So we need an output file as well:

curl 'http://download.ordnancesurvey.co.uk/open/OSSTVW/201305/ITCC/osstvw_sy.zip?sr=b&st=2013-10-03T18:51:25Z&se=2013-10-06T18:51:25Z&si=opendata_policy&sig=YOUR_UNIQUE_CODE_HERE' -o osstvw_sy.zip

Success:

[Screenshot from 2013-10-04 08:43:17]

To run this in bulk, just format the commands in LibreOffice Calc (or Excel), one row per download link, and create a simple shell script.

Copy the commands into a file and then:

sh the_file_containing_your_commands
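
If you prefer to skip the spreadsheet step, a small loop does the same job. This is only a sketch: it assumes you have pasted the links from the e-mail, one per line, into a file called download_links.txt (a name I have made up here).

# read one signed URL per line and save each zip under its original name
while read url; do
  name=$(basename "${url%%\?*}")   # strip the ?sr=...&sig=... query string
  curl "$url" -o "$name"
done < download_links.txt

The parameter expansion simply removes the query string, so each tile is saved under its original name, e.g. osstvw_sy.zip.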

To load the downloaded data into PostGIS, see my previous post on loading data into PostGIS.

Also a few tips for newer users of OS OpenData:

They are all in the British National Grid projection (EPSG:27700). Watch out for this in QGIS: version 1.8 used the 3-parameter datum transformation, while 2.0 uses the 7-parameter one. This means that if you created data in QGIS 1.8 and are now using 2.0, your data may not line up if you have "on the fly" CRS transformation turned on. To fix this, just go into the layer properties and re-define the projection as British National Grid (EPSG:27700) for all the layers.

If you are working with raster data you need to copy the .tfw files into the same directory as the .tif files. The .tfw (TIFF world) files tell the .tif images where they should sit in the world.
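
The copy itself is a one-liner. A sketch, assuming the world files were unpacked into a folder called georeferencing and the images into a folder called data (adjust the paths to match your download):

# copy the world files in alongside the images (paths are illustrative)
cp georeferencing/*.tfw data/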

Loading Natural Earth Data into PostGIS (PostgreSQL)

Natural Earth provides some of the best data for mapping at global and regional scales. It is clean, accurate, extensive, available at several different scales, and best of all free.

To load the data into PostGIS (PostgreSQL) we will use the vector tools provided with GDAL, mainly ogr2ogr.

After downloading the data (I went for all of the vector data in shapefile format), I first need to generate a list of datasets and their respective file paths. This will be put into a spreadsheet, the command to load the data will be applied to each line, and finally the whole lot will be run as a shell script. Setting up a PostGIS database is covered in my previous post.

My Natural Earth data consisted of:

28 directories, 1472 files

So a little automation is needed. Interestingly, there were also a few .gdb files, such as "ne_10m_admin_1_label_points.gdb"; those we can look into at a later date.

To begin:

ls > my_contents.txt

Produced a decent result, but not quite what I was looking for.

sudo apt-get install tree

tree > natural_earth.txt

Was much better, although with a bit more tuning I’m sure ls would have achieved a better result.


After a bit of work in the spreadsheet, I had what I wanted. Perhaps not the most elegant solution, but certainly enough for my purposes.
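
In hindsight, find would probably have got me the file paths without tree or the spreadsheet. A sketch, run from the top of the Natural Earth directory (the output file name is just one I have picked):

# list every shapefile, one path per line
find . -name "*.shp" > natural_earth_shapefiles.txt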

Now for the ogr2ogr command:

ogr2ogr -nlt PROMOTE_TO_MULTI -progress -skipfailures -overwrite -lco PRECISION=no -f PostgreSQL PG:"dbname='natural_earth' host='localhost' port='5432' user='natural_earth' password='natural_earth'" 10m_cultural/ne_10m_admin_0_antarctic_claim_limit_lines.shp

ogr2ogr is a file converter which does a lot more besides. In this case we are converting the shapefiles into tables in a PostGIS database. Essentially you want to copy the beginning part of the command in front of each file you want to load, changing only the file path: "10m_cultural/ne_10m_admin_0_antarctic_claim_limit_lines.shp".

Our settings:

-nlt PROMOTE_TO_MULTI | Loads all our features as their multi-part equivalents (e.g. Polygon becomes MultiPolygon). This means that a multi-part polygon won't fail the load, because the PostGIS geometry column is created with the multi-part type.

-progress | Shows a progress bar.

-skipfailures | Will not stop at each failure; the failing feature is skipped and loading carries on.

-overwrite | Overwrites a table if there is already one with the same name. Our tables will be called whatever the shapefile is called, since we are not specifying a name.

-lco PRECISION=no | Creates numeric fields without strict width/precision constraints, which helps keep the numbers manageable, especially with this data where precision is not important.

-f PostgreSQL PG:"dbname='DatabaseName' host='IpAddressOfHost' port='5432' user='Username' password='Password'" | Connection details for the database we are loading into.

Now we are ready to run the commands. While ogr2ogr commands can be pasted straight into the terminal, for this task that is not really feasible. So we can create a simple shell script.

Copy the commands into a file and then:

sh your_commands
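
Alternatively, the spreadsheet can be skipped entirely by looping over the shapefiles. A rough sketch, using the same connection details as above and run from the top of the Natural Earth directory:

# load every shapefile found below the current directory into PostGIS
find . -name "*.shp" | while read shp; do
  ogr2ogr -nlt PROMOTE_TO_MULTI -progress -skipfailures -overwrite -lco PRECISION=no -f PostgreSQL PG:"dbname='natural_earth' host='localhost' port='5432' user='natural_earth' password='natural_earth'" "$shp"
done

Each table ends up named after its shapefile, just as with the spreadsheet approach.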

Finally there was one last error, with ne_10m_populated_places.shp. This was due to encoding. The client encoding used by ogr2ogr can be changed from UTF8 to LATIN1 using:

export PGCLIENTENCODING=LATIN1;

After which the file loaded swimmingly.
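
For completeness, the fix amounted to something like the following sketch (the directory follows the pattern of the earlier example, so adjust it if yours differs):

# switch the client encoding for this shell session, reload the failed file, then switch back
export PGCLIENTENCODING=LATIN1
ogr2ogr -nlt PROMOTE_TO_MULTI -progress -skipfailures -overwrite -lco PRECISION=no -f PostgreSQL PG:"dbname='natural_earth' host='localhost' port='5432' user='natural_earth' password='natural_earth'" 10m_cultural/ne_10m_populated_places.shp
unset PGCLIENTENCODING

A quick psql -h localhost -U natural_earth -d natural_earth -c '\dt' will then list the loaded tables.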

Now for some mapping.

Thanks to:

http://unix.stackexchange.com/questions/92387/pasting-into-terminal-deteriorating?noredirect=1#comment139520_92387

http://lists.osgeo.org/pipermail/gdal-dev/2009-May/020771.html