Load OpenStreetMap data into a PostGIS database and have it automatically updated in an incremental way (so at each update will be loaded just the changes that meanwhile had happened in the OSM servers).
There are (really) a lot tools available to handle OSM data, and in a few cases they are just alternatives to do the same thing. The documentation of this tools (osm2pgsql, osmupdate, osmfilter, osmconvert, osmium, osmosis, imposm, etc.) is good but not great (and often not up to date), especially when it comes to “recipes” explaining how to put them together to achieve some specific goal. The following notes -to solve the aforementioned problem- work, but is to be considered that probably there is a much more straightforward and clean way to solve it. Special mention this tutorial that was fundamental to understand how to achieve the goal.
The solution passes trough a series of steps that can be wrapped into one or more scripts and then scheduled with CRON.
Phase 1, downloading, preparing and importing the data
Step 1: get OSM data
Daily updated pbf/osm (and shapefiles) datasets -clipped along national borders of most of countries- can be downloaded at Geofrabrik OSM download facility. In the same page are available daily updated datasets with the data of each continent.
Another option is to download the full Planet file, that includes the entire (world) OSM dataset.
Step 2: limit the data from the Planet to a specific area of interest
The following steps can be used to import and update the full Planet, but this will take a long time (depending also on hardware resources of the server/computer where the operations are being done). More probably it would only needed a specific, limited geographic region.
The Planet dataset can then be clipped using a polygon file (.poly) that can be easily created from a shapefile using this QGIS plugin.
The --complete-ways option is not used because is not supported by osmupdate (to be used later) so ways crossing the clip boundaries are removed. For this reason is important that the .poly file must represent an extent slightly bigger than the real area of interest.
$ osmconvert --verbose planet-latest.osm.pbf -B=portugal.poly -o=planet-portugal.o5m
Step 3: filter the dataset and leave just features with a specific OSM attribute/tag
In this specific example only the roads that are tagged as “highway” are the ones to be imported in the database. This tag name is misleading as “highway” means all the roads that can be used by cars, even the smallest ones. Roads/ways that is not possible to use by car are tagged as “pedestrian“.
$ osmfilter --verbose planet-portugal.o5m --keep= --keep-ways="highway" --out-o5m > portugal_estradas.o5m
Step 4: remove broken references
References to nodes which have been excluded because lying outside the geographical borders of the area of interest need to be removed with the option --drop-broken-refs.
The --b=-180,-90,180,90 option defining a global bounding box seems superfluous, but is actually necessary to circumvent a bug in the --drop-broken-refs task that would leave only nodes in the data
$ osmconvert --verbose portugal_estradas.o5m -b=-180,-90,180,90 --drop-broken-refs -o=portugal_estradas_nbr.o5m
Step 5: import the data into PostGIS using osm2pgsql
The important bit here is to use the --slim flag, otherwise later it will not be possible to update this database in an incremental way
$ osm2pgsql --flat-nodes flat_nodes.bin --slim --create --cache 16000 --number-processes 12 --hstore --style openstreetmap-carto.style --multi-geometry portugal_estradas_nbr.o5m -H localhost -d databasename -U username --proj 32629
Phase 2, updating the data
When a Planet file is downloaded it is already old because a new version is published only once a week (and changes in the OSM servers are continuous). So after the first import the data in the database can be immediately updated.
Step 6: update the dataset
With osmupdate we update the dataset obtained in Step 2
$ osmupdate --verbose planet-portugal.o5m planet-portugal-updated.o5m -B=portugal.poly
Step 7: filter the dataset and leave just features with a specific OSM attribute/tag
$ osmfilter --verbose planet-portugal-updated.o5m --keep= --keep-ways="highway" --out-o5m > portugal_estradas_updated.o5m
Step 8: remove broken references
$ osmconvert --verbose portugal_estradas_updated.o5m -b=-180,-90,180,90 --drop-broken-refs -o=portugal_estradas_updated_nbr.o5m
Step 9: create the DIFF file
The DIFF file (in .osc format) contains only the differences between two OSM datasets
$ osmconvert --verbose portugal_estradas_updated_nbr.o5m --diff --fake-lonlat -o=diff.osc
Step 10: import the DIFF file
$ osm2pgsql --flat-nodes flat_nodes.bin --slim --append --cache 16000 --number-processes 12 --hstore --style openstreetmap-carto.style --multi-geometry diff.osc -H localhost -d databasename -U username --proj 32629
Wrap steps 6 to 10 into a batch file and schedule it with CRON to get a fully automatic way to have a continuously updated PostGIS database with OSM data.
The speed of the above process gets a huge boost if it can be done on a SSD rather than a HDD.