Jun 13, 2014
 

This post presents some of the most relevant statistical information about the content of the OpenStreetMap database related to Germany.

These data can be further explored on the OpenAlfa Deutschland Straße website.

How to get the OSM database for Germany

OpenStreetMap is a single database that covers the whole world. It can be downloaded as a single “planet.osm” file from the main OSM site, or from one of its mirror sites.

But even the compressed planet.osm file is more than 30 GBytes in size, and processing it to extract the data about a single country may take several days on a typical server.

Instead, it is easier to download a pre-processed extract for the country of interest, available on some OSM mirrors. The list of sites with downloadable country and area extracts can be found here.

To get the OSM data for Germany, we will download the daily Germany extract from geofabrik.de. It can be downloaded with the command:

The downloaded file is an XML document compressed in bzip2 format, and is 3.2 GBytes in size. There are three types of XML elements in the document: nodes, ways and relations, each described below.
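Because the extract is a large bzip2-compressed XML document, it is usually read as a stream rather than loaded into memory. The following is a minimal sketch of that idea, not the procedure used for this post: it counts the three element types with Python's standard library. The function name `count_elements` and the tiny in-memory sample are assumptions made for illustration; for the real extract you would pass the path of the downloaded file to `bz2.open` instead.

```python
import bz2
import io
import xml.etree.ElementTree as ET
from collections import Counter


def count_elements(stream):
    """Count <node>, <way> and <relation> elements in an OSM XML stream."""
    counts = Counter()
    context = ET.iterparse(stream, events=("start", "end"))
    _, root = next(context)  # first event is the start of the enclosing <osm> element
    for event, elem in context:
        if event == "end" and elem.tag in ("node", "way", "relation"):
            counts[elem.tag] += 1
            root.clear()  # discard processed children so memory use stays flat
    return counts


# Hypothetical miniature document standing in for the real extract.
SAMPLE = b"""<?xml version='1.0' encoding='UTF-8'?>
<osm version="0.6">
  <node id="1" lat="50.0" lon="8.0"/>
  <node id="2" lat="50.1" lon="8.1"><tag k="highway" v="bus_stop"/></node>
  <way id="10"><nd ref="1"/><nd ref="2"/></way>
  <relation id="100"><member type="way" ref="10" role=""/></relation>
</osm>"""

# bz2.open accepts a file name or a file object; here we compress the sample in memory.
with bz2.open(io.BytesIO(bz2.compress(SAMPLE)), "rb") as stream:
    counts = count_elements(stream)

print(dict(counts))
```

Decompressing and parsing in a single pass avoids writing the uncompressed XML to disk, which is considerably larger than the 3.2 GByte download.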

Note: geofabrik.de also makes incremental files available for download. Another post in this blog will explain how to maintain an updated copy of the database by downloading and importing these incremental files.

The incremental files available from geofabrik also allow us to estimate the maintenance activity carried out on the content of the database.

A graphical representation of the size of these files makes it easy to conclude that the maintenance effort for Germany is significant, and fairly steady over time:

[Chart: size of the incremental files for Germany over time]

Nodes

A node is a point on the map. The mandatory data for a node is a pair of coordinates (latitude, longitude). Every node is also assigned a unique numerical identifier.

There is also some administrative information associated with the node, such as a timestamp and a version number.

Optionally, a node can have a set of tags attached that carry additional information.

Example:

The example above shows the first two nodes found in the downloaded file. The first node only has the ID and coordinates of the node, together with the administrative information added as attributes of the XML element.

The second node also has a set of associated tags that identify the node as a bus stop named “Bleichplatz”.
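To make the node structure concrete, here is a small sketch that reads nodes into a Python data class. The sample document is hypothetical: the IDs, coordinates, versions and timestamps are invented for illustration, and only the bus stop name “Bleichplatz” comes from the text. The names `Node` and `parse_node` are likewise assumptions of this sketch.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass, field


@dataclass
class Node:
    id: int
    lat: float
    lon: float
    tags: dict = field(default_factory=dict)


def parse_node(elem):
    """Build a Node from a <node> element; tags are optional child elements."""
    tags = {t.get("k"): t.get("v") for t in elem.findall("tag")}
    return Node(int(elem.get("id")), float(elem.get("lat")), float(elem.get("lon")), tags)


# Hypothetical sample: every attribute value here is made up for illustration.
SAMPLE = """<osm version="0.6">
  <node id="1" lat="52.0" lon="9.0" version="1" timestamp="2014-01-01T00:00:00Z"/>
  <node id="2" lat="52.1" lon="9.1" version="3" timestamp="2014-01-02T00:00:00Z">
    <tag k="highway" v="bus_stop"/>
    <tag k="name" v="Bleichplatz"/>
  </node>
</osm>"""

nodes = [parse_node(e) for e in ET.fromstring(SAMPLE).iter("node")]
for n in nodes:
    print(n.id, n.lat, n.lon, n.tags)
```

The first node ends up with an empty tag dictionary, while the second one carries the two tags that mark it as a named bus stop.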

Node tags

In the downloaded file, 6,625,830 nodes have one or more attached tags.

A tag is just a pair (k,v), where k is the name (“key”) of the tag, and v is the value.

There is a set of normalized tag names (such as “name” or “highway”, as can be seen in the example above). But OSM does not enforce the use of those names. As a result, in the database there are a number of non-normalized tag names, tag names with typos, etc.
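A frequency count over tag names is one simple way to spot both the common keys and the misspelled ones. The sketch below shows the idea on a hypothetical four-node sample that includes a deliberately misspelled key (“natrual”); the function names and all sample data are assumptions for illustration, and the counts are not figures from the Germany extract.

```python
import xml.etree.ElementTree as ET
from collections import Counter


def count_tag_keys(root, element_name="node"):
    """Count how many times each tag name (k) appears on the given element type."""
    keys = Counter()
    for elem in root.iter(element_name):
        for tag in elem.findall("tag"):
            keys[tag.get("k")] += 1
    return keys


def count_tag_values(root, key, element_name="node"):
    """Count the values (v) assigned to one tag name on the given element type."""
    values = Counter()
    for elem in root.iter(element_name):
        for tag in elem.findall("tag"):
            if tag.get("k") == key:
                values[tag.get("v")] += 1
    return values


# Hypothetical sample; "natrual" is an intentional typo to mimic non-normalized keys.
SAMPLE = """<osm version="0.6">
  <node id="1" lat="52.0" lon="9.0"/>
  <node id="2" lat="52.1" lon="9.1">
    <tag k="highway" v="bus_stop"/>
    <tag k="name" v="Bleichplatz"/>
  </node>
  <node id="3" lat="52.2" lon="9.2">
    <tag k="natural" v="tree"/>
    <tag k="name" v="Alte Eiche"/>
  </node>
  <node id="4" lat="52.3" lon="9.3">
    <tag k="natrual" v="tree"/>
  </node>
</osm>"""

root = ET.fromstring(SAMPLE)
key_counts = count_tag_keys(root)
natural_values = count_tag_values(root, "natural")
tagged_nodes = sum(1 for n in root.iter("node") if n.find("tag") is not None)

print(key_counts.most_common())
print(natural_values.most_common())
print(tagged_nodes)
```

Keys that appear only a handful of times in a large extract are good candidates for typos, since the well-known names tend to dominate the count.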

It is interesting to analyze statistically the total number of appearances of each tag name. The most frequent are:

Other tag names of interest, although the number of appearances is lower, are:

 

Nodes of type ‘natural’

The most frequent values assigned to the ‘natural’ tag of OSM nodes in Germany, with their number of appearances, are:

Ways

There are 11,620,714 ways in the analyzed database.

A way is an ordered sequence of nodes, identified by a unique numerical ID.

Ways are used to represent many different types of paths: roads, rivers, administrative boundaries, etc.

Optionally, there may be one or more tags assigned to a given way, to provide additional information.

Example:

In this sample, we can see the definition of a way of eleven nodes. The administrative information (version, changeset, etc.) is added to the way as attributes of the XML <way> element.

There are also several tags associated with the way in the sample above that identify it as a street (k=”highway”, v=”tertiary”) named “Österwieher Straße”, with a 50 km/h speed limit.
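The structure of a way can be sketched in the same manner: an ordered list of node references plus an optional tag dictionary. In the hypothetical sample below, the way ID and node references are invented and shortened to three nodes; the street name and the “highway” value come from the text, and the speed limit is assumed to be stored under the standard OSM key “maxspeed”.

```python
import xml.etree.ElementTree as ET


def parse_way(elem):
    """Return (id, ordered node references, tags) for a <way> element."""
    refs = [int(nd.get("ref")) for nd in elem.findall("nd")]  # document order is preserved
    tags = {t.get("k"): t.get("v") for t in elem.findall("tag")}
    return int(elem.get("id")), refs, tags


# Hypothetical sample: the IDs are made up, and the way is shortened to three nodes.
SAMPLE = """<osm version="0.6">
  <way id="500" version="2" changeset="12345">
    <nd ref="101"/>
    <nd ref="102"/>
    <nd ref="103"/>
    <tag k="highway" v="tertiary"/>
    <tag k="name" v="Österwieher Straße"/>
    <tag k="maxspeed" v="50"/>
  </way>
</osm>"""

way_id, refs, tags = parse_way(ET.fromstring(SAMPLE).find("way"))
print(way_id, refs)
print(tags)
```

The order of the references matters: it is what turns a set of points into a path, so the list must not be sorted or deduplicated.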

Way tags

The most frequent names of way tags found in the Germany database are:

We can see that “source” is the most frequent tag name for ways. This tag is used to identify the original source of the data for the way.

The next most frequent way tag name is “highway”. This tag is used to identify not only highways, but also streets, tracks, and in general all kinds of paths that are usually traversed by people.

A fair number of ways also have an associated “name” tag.

Other tags of interest that appear assigned to ways are:

 

Values of tags ‘highway’ assigned to ways

Values of tags ‘natural’ assigned to ways

Values of tags ‘waterway’ assigned to ways

 

Relations

There are 385,081 relations in the OSM database for Germany being analyzed.

A relation in the OSM database is a set of member elements that together form a single entity. Relations are identified by a unique numeric ID, in the same way as nodes and ways. A set of tags can also be assigned to a relation to provide additional information about it.

Members of a relation can be nodes, ways, and even other relations.

Example:

In the example above, the relation is a bus route named “200 (gegen den Uhrzeigersinn)”, operated by “üstra”, that is part of the GVH network. The ordered sequence of ways defines the path followed by the route, and the nodes are the bus stops and other relevant points along the route.
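A relation like this bus route can be sketched as a list of typed members plus a tag dictionary. The sample below is hypothetical: the relation ID, member references and the “stop” role are invented for illustration, while the route name, operator and network come from the text. The class and function names are assumptions of this sketch.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass


@dataclass
class Member:
    type: str  # "node", "way" or "relation"
    ref: int   # ID of the referenced element
    role: str  # free-text role of the member within the relation


def parse_relation(elem):
    """Return (id, ordered members, tags) for a <relation> element."""
    members = [
        Member(m.get("type"), int(m.get("ref")), m.get("role", ""))
        for m in elem.findall("member")
    ]
    tags = {t.get("k"): t.get("v") for t in elem.findall("tag")}
    return int(elem.get("id")), members, tags


# Hypothetical sample: IDs, references and roles are made up for illustration.
SAMPLE = """<osm version="0.6">
  <relation id="900">
    <member type="way" ref="500" role=""/>
    <member type="node" ref="2" role="stop"/>
    <member type="way" ref="501" role=""/>
    <member type="node" ref="7" role="stop"/>
    <tag k="type" v="route"/>
    <tag k="route" v="bus"/>
    <tag k="name" v="200 (gegen den Uhrzeigersinn)"/>
    <tag k="operator" v="üstra"/>
    <tag k="network" v="GVH"/>
  </relation>
</osm>"""

rel_id, members, tags = parse_relation(ET.fromstring(SAMPLE).find("relation"))
route_ways = [m.ref for m in members if m.type == "way"]    # path of the route, in order
stop_nodes = [m.ref for m in members if m.type == "node"]   # stops and other points

print(tags["name"], route_ways, stop_nodes)
```

Members only hold references, so resolving the actual geometry of the route requires looking up the referenced ways and nodes elsewhere in the file.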

Relation tags

Relations are usually given a tag named “type”. In the database being analyzed, the most frequent values assigned to the “type” tag are:

Among them, relations of type “boundary” are used to group the set of ways that delimit a region (administrative or of some other kind), as one or more closed polygons, that may contain inner “holes”.

Relations that carry a “type=boundary” tag are also given a tag named “boundary”, whose value details the type of boundary. The most frequent values assigned to “boundary” tags are:
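The two tallies discussed in this section, values of “type” and values of “boundary” among the boundary relations, can be sketched as a single pass over the relations. The sample data below is hypothetical and only illustrates the filtering logic; its counts are not those of the Germany database, and the function names are assumptions of this sketch.

```python
import xml.etree.ElementTree as ET
from collections import Counter


def relation_tags(rel):
    """Collect the tags of a <relation> element into a dictionary."""
    return {t.get("k"): t.get("v") for t in rel.findall("tag")}


def count_types_and_boundaries(root):
    """Count 'type' values, and 'boundary' values among type=boundary relations."""
    types, boundaries = Counter(), Counter()
    for rel in root.iter("relation"):
        tags = relation_tags(rel)
        if "type" in tags:
            types[tags["type"]] += 1
        if tags.get("type") == "boundary" and "boundary" in tags:
            boundaries[tags["boundary"]] += 1
    return types, boundaries


# Hypothetical sample with four relations; all IDs and values are illustrative.
SAMPLE = """<osm version="0.6">
  <relation id="1"><tag k="type" v="boundary"/><tag k="boundary" v="administrative"/></relation>
  <relation id="2"><tag k="type" v="boundary"/><tag k="boundary" v="administrative"/></relation>
  <relation id="3"><tag k="type" v="boundary"/><tag k="boundary" v="postal_code"/></relation>
  <relation id="4"><tag k="type" v="route"/><tag k="route" v="bus"/></relation>
</osm>"""

types, boundaries = count_types_and_boundaries(ET.fromstring(SAMPLE))
print(types.most_common())
print(boundaries.most_common())
```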

We can see that most of the boundaries found in the database are administrative boundaries, which delimit the regions of Germany at several levels: Bundesländer, Regierungsbezirke, Landkreise, etc.

The second most frequent type of boundary is the area covered by a postal code.

Other boundary types found in the OSM database for Germany are different kinds of geographical areas of interest: protected areas, national parks, etc.

 Posted by at 8:37 pm
