OpenStreetMap logo OpenStreetMap

How to find out the total length of highway per type of road / per Country in PostGIS

using the OSM postgis Script Repository This is a part of the final result.

  • Step 1. - Download the North-America osm.pbf file from Geofabrik __(7.4 Gb)__

The link is here You can use this command line to download the file in linux.

wget http://download.geofabrik.de/north-america-latest.osm.pbf
  • Step 2. - Load the data into PostGIS using the scope.sh utility tool

You can find the script here https://github.com/baditaflorin/osm-postgis-scripts/ The aim of this tool is to simplify the process of importing a osm.pbf file into PostGIS. Now the procedure is complicated and you first have to create the database, then to enable postGIS and Hstore on that database, etc

Here, you have a very simple process where you need to tell the name of the postgres user, the name of the database that you will create and the name of the osm.pbf file that you want to import, without the extension. At the end you will see something like this :

INFO: Total execution time: 93403048 milliseconds.

This
     script
            is
               done

                     Cluj
                          Map Analyst Team
                                           Telenav

 Find Postgis Scripts and snippents of code that you can use here
 https://github.com/baditaflorin/osm-postgis-scripts/
 It is a Open Source Project so you can also contribuite
 with you PostGIS Code to make the repository more complete
  • Step 3. - Wait 25 hours for the file to import.

In the end we will have a 280 Gb database, that is composed of over 59M ways, 855M nodes, and 500k relations

  • Step 4. - Find out if a node is from USA,Canada or Mexico.

Now, all of or nodes and ways have GPS coordinates, but we do not know if they are in Canada or Mexico. Fortunately, there is a great guy in Germany that goes by the name of Walter Nordmann that have a server where he hosts the OSM Boundaries Map using the data from OSM. He is doing this as a volunteer, so if yoi can help with any donations this will definitely help him to keep the server running. The website is https://osm.wno-edv-service.de/boundaries/ To do this we will download the admin_level=2 (al_2) boundaries of Canada.Mexico and USA.

  • Step 5. - Load the al_2 boundaries into the same postGIS database

To do this i used QGIS, a free and open source GIS tool, where you first need to connect to the database by doing :

Layer -> Add Layer -> Add PostGIS Layer.

In the new windows, under Connections click New and setup the connection.

After this, click OK and close the windows

We then go to

Database -> DB Manager -> DB Manager

We find and select our database from the left part of the screen.

We click on the Table Menu and we select Import vector layer

We put the name of the new table to be al_2 and we also select the index and we click ok

  • Step 6. - Calculate the total road length per each type of road and for each Country

This is the simple part, because we already have a OSM github script page with PostGIS scripts that you can run it for a city, a country or the planet, and they will work the same. The link is this OSM PostGIS script repository and the script that we are interested can be found here highway_length_per_type_different_attr.sql

  • Step 7. - Run the script

If you would run the script from the osm PostGIS script repo, you will get the total value of all of the roads in North America. This is why we need to see where each of the roads are located, and we do this using a inner join and ST_contains

The modified script is available as a gist here

I have not used the admin_level=2 borders that i have imported because the geometry of them was to complicated and it would have taken to much to get the result.

Instead i over-simplified the geometry until i got something really cartoon-ish compared to the first example. I first used the plugin SimpliPy developed by Albert Ferràs. You can download the plugin from Qgis Web link

  • Step 8 - Test and optimize the script, the polygon for the st_contains

I put a limit of 500 ways and tested to see how it will compare if i try to run the 500 examples using the bulk admin_level=2 boundaries, if i simplify and if i simplify and then also delete a lot of nodes so that i will speed up the process of Geocoding.

admin_level=2 no simplification. al_2_simplipy al_2_simplipy_extreme
16 sec 0.7 sec 0.3 sec

You have to have in mind that at the borders you will get some mixed results, where some parts of a town in Canada can become counted as part of USA, because of the over-simplification, but this is a useful when we have 59M ways, from where 22M ways have highway informations.

If for 500 we had to wait 16 seconds, for 5.000.000 ways, or a quarter of the total number of ways that at the worst case scenario we will have to wait 160.000 seconds, or
44 hours until the task will be completed.
For the complete 22M ways, this would mean around one week until we get the results.! dsds

By doing the simplification we reduce the amount to 0.7 seconds * 10000 = 7000 seconds or 2 hours for 25 % of the dataset. In 8 hours it should give us the results.

The last simplification meant that we start to deleting almost everything, trying to get a square, at least where i know that there is only water.

  • Step 9 - Run the script and get the results

In the end the script completed in 2861 seconds, or around 50 minutes.

The table with the results for North America can be seen here

The table was generated using this link

Email icon Bluesky Icon Facebook Icon LinkedIn Icon Mastodon Icon Telegram Icon X Icon

Discussion

Log in to leave a comment