This post will cover:
- Digitising rasters using QGIS.
- Loading data to PostGIS using SPIT.
- Reprojecting raster data using QGIS.
- Lines within a polygon selection in PostGIS.
- Georeferencing vector files using ogr2ogr, search for "The georeferencing:" to skip the other steps.
Georeferencing vector data has long been very difficult using the open-source GIS stack. That has all changed with the release of GDAL 1.10.0. Now georeferencing can be done using the vector translator ogr2ogr.
About half a decade ago I wrote my undergraduate dissertation on: "The influence of the Blaeu Atlas of Scotland on subsequent maps of Scotland". While I was very proud of the work at the time, and the grade was good, the markers comments could be paraphrased as: A very good literature review on the Blaeu Atlas, however somewhat weak on analysis". After a Master's degree I could not agree more. At the time I lacked the knowledge and experience of working with projections. The release of GDAL 1.10.0 gives the perfect opportunity to return and correct those mistakes.
One of my favourite aspects of the Blaeu analysis was a comparison of the Blaeu 1654 Atlas of Scotland coastline to the coastline of subsequent maps of Scotland. I originally did this in GIMP and manually resized the outlines to close approximations. The end result looked good, but it lack real scientific rigour.
A huge thanks to the National Library of Scotland for their work in digitising the Blaeu Atlas of Scotland. A great online version and overview can be found at: http://maps.nls.uk/atlas/blaeu/
With the outline used today found at: http://maps.nls.uk/view/00000383
First we need to create a vector outline from the raster image. This is a simple case of adding the .jpeg image in as a raster file. When prompted for a CRS I chose EPSG:4030 “Unknown datum based upon the WGS 84 ellipsoid” this is just used for meta-data at this point. What you choose does not matter. We just want a vector file in cartesian co-ordinates. We create a new shapefile layer and trace outline. My co-ordinate capture will be very rough. If you are working with more modern maps, you should be as careful as required:
Whenever I needed a break I would end the line. Then I could start a new line snapped to the end of the last one. In the end you can select all of the line segments and “merge selected features” from the advanced digitising toolbar. This would likely have been enough, but I was concerned there would be two overlapping nodes and that I would have a multi-part line. So I exploded the lines into points using the Vector>Geometry Tools>Extract nodes tool. Then Using the handy Points2One plugin (thanks Pavol Kapusta) I could stitch the nodes back together into a single line. If you're going to do something might as well do it correctly. In the end we have 5116 nodes:
With the Blaeu outline captured we need a basemap to georeference it to. In the end we want to re-projectiong it into the WGS 84 / World Mercator projection (EPSG:3395). Mercator will provide a good base of what the original mapper would have been trying to capture. In the UK we can turn to the Ordnance Survey for a basemap, however that basemap needs to be reprojected to EPSG:3395. This is simply done in within QGIS by selecting the "Enable on-the-fly CRS transformation" to EPSG:3395 from Project>project Properties>CRS tab.
From the Ordnance Survey (OS) we have a few open datasets to choose from:
- OS BoundaryLine (High Water Polyline)
- OS BoundaryLine (european electoral regions)
We can see how they line up with the raster products:
- OS MiniScale
- OS 250K
We can see that the High Water Polyline follows MiniScale much better than the electoral regions:
It doesn't line up perfectly with the 250k raster (250k is probably too accurate for this study):
But on the whole it will serve for this purpose, in addition Scotland can be easily extracted:
Now we have the High Water Polyline, which looks good, but we only want mainland Scotland. Easy right?
Well... slighlty more complex. Unfortunately even the Scottish high water lines consisted of 16030 line segments. Select by location in QGIS using the European Electoral Regions (polygon) took too long (hours rather than minutes). So I had to resort to PostGIS to do the selection.
SPIT to PostGIS:
For setting up a PostGIS database see my previous post: Setting up PostgreSQL and PostGIS on Linux Mint.
The query in PostGIS/PostgreSQL to select all of the lines within a polygon (we use the OS boundary line electoral regions):
CREATE TABLE scotland_selection AS SELECT os_high_water_mark.* FROM os_high_water_mark,european_scotland WHERE ST_Intersects(european_scotland.the_geom,os_high_water_mark.the_geom);
Query returned successfully: 4994 rows affected, 764779 ms execution time.
The 12 minute run time says a lot about my poor computer.
Then the lines are joined together, the geometries simplified, and multipart converted to singlepart. This allows us to get rid of the final islands. With the result back in QGIS:
Final result projected into WGS 84 / World Mercator projection (EPSG:3395), which really shows the distortion caused by the British National Grid (EPSG:27700):
We now have a cartesian vector file ready for georeferencing, an OS raster background map to use for georeferencing (OS MiniScale) and an OS vector outline BoundaryLine (High Water Polyline) to use for the final comparison.
We open up two QGIS projects. In one we have our cartesian vector file. In the other we have our basemap:
Using the co-ordinate capture tool, we capture the same point from both projects. Basically we are just capturing co-ordinate pairs from the two maps. The “Copy to clipboard” feature comes in very handy here.
These captured co-ordinates should be pasted into a text file. Something like this:
Dumfries and Galloway 1571.29165 -4217.06023 1571.29165 -4217.06023 -551760.015 7302342.218 -551760.015 7302342.218 Scotland England border 2729.71384 -4421.27506 2729.71384 -4421.27506 -338736.496 7324447.924 -338736.496 7324447.924 Kintyre: 1477.31668 -3328.36427 1477.31668 -3328.36427 -603897.130 7496225.431 -603897.130 7496225.431
Of note here is that the co-ordinate capture tool captured the co-ordinate twice for me. How it works is that it captures one in the original CRS and the selected one. It is also important to keep your co-ordinates in order. I captured the cartesian co-ordinates on the first row and the WGS84 ones on the second row. Also I have attached memorable names to the points, in case there is an issue, the culprit point can be easily be identified.
For my project 23 points were captured:
After the points are captured we switch to ogr2ogr to perform the actual reprojection and to convert the OS outline into World Mercator.
Scotland Mainland Outline from Boundary Line High Water Mark to World Mercator (while I'm converting I will convert it to a GeoJSON for viewing in leaflet (this will be covered by a later post) and strip out any attributes to decrease the file size):
ogr2ogr -progress -f GeoJSON blaeu_outline.geojson -a_srs EPSG:3395 -dsco ATTRIBUTES_SKIP=YES -order 1 -gcp 1571.29165 -4217.06023 -551760.015 7302342.218 -gcp 2729.71384 -4421.27506 -338736.496 7324447.924 -gcp 1477.31668 -3328.36427 -603897.130 7496225.431 -gcp 2011.25239 -3353.35609 -536774.862 7514805.007 -gcp 2021.23961 -3783.80519 -516238.228 7418369.368 -gcp 1386.05253 -2433.53330 -632410.002 7624335.089 -gcp 1609.76622 -1681.49577 -635963.491 7783695.427 -gcp 1732.60900 -1079.26651 -599335.214 7983647.549 -gcp 2102.13607 -655.80846 -556830.013 8062917.700 -gcp 2906.10714 -611.86470 -339110.443 8067291.226 -gcp 2974.02022 -753.68320 -340067.151 8031482.985 -gcp 2858.16849 -1518.70411 -423164.137 7901097.254 -gcp 2617.47653 -1878.74333 -470179.537 7823193.830 -gcp 3788.47786 -1882.73821 -224168.724 7866382.395 -gcp 3920.30914 -2064.50559 -200661.024 7828114.046 -gcp 3271.13996 -3004.30282 -288951.571 7635405.575 -gcp 3216.21026 -3254.98199 -289771.607 7578549.743 -gcp 2608.98740 -3372.33181 -415578.804 7531397.670 -gcp 3120.33297 -3431.75575 -312337.131 7534907.369 -gcp 3433.93162 -3683.43365 -238109.336 7506113.225 -gcp 3495.85237 -3831.74384 -222391.979 7477821.982 -gcp 2050.70190 -4455.44562 -457276.805 7286175.723 1654_Blaeu_map_of_scotland.shp
The key here is the -gcp tag. For each control point (co-ordinate pair) we have one tag:
Dumfries and Galloway 1571.29165 -4217.06023 1571.29165 -4217.06023 -551760.015 7302342.218 -551760.015 7302342.218
-gcp 1571.29165 -4217.06023 -551760.015 7302342.218
So: -gcp ungeoref_x ungeoref_y georef_x georef_y
One -gcp tag for each point.
Means that it will use the a first order polynomial transformation, so affine. For myself the less distortion the better.
The end result is great, it really shows how advanced the Blaeu Atlas was for a 1654 map:
If you are interested in buying a historically amazing map:
Blaeu Atlas Maior of 1665 - Including the atlas of Scotland.
Also please visit the NLS, without who this would have been impossible, they also have great changing exhibits:
National Library of Scotland