Authoritative Data is Not More Right Just Because It’s Authoritative
Posted by tordans on 1 October 2025 in English.

HeiGIT recently published an analysis together with the German Federal Agency for Cartography and Geodesy (BKG), comparing land cover data from OSM with the official CORINE Land Cover (CLC) dataset from BKG.
I want to use this opportunity to make an appeal to HeiGIT and similar projects analyzing OSM data: the fact that data is published by an authoritative (mostly government) source does not make it more correct than OSM.
I’ve often observed that OSM is compared to external datasets, and the analysis is framed around the question of whether OSM is “right.” This framing does OSM a disservice, because it suggests OSM is wrong and the other dataset is right.
In every comparison I have made between OSM and an open dataset (bicycle parking, public parking spaces, buildings, cycling infrastructure, or cycling routes), both datasets contained errors. Who publishes a dataset has no inherent influence on its quality.
Of course, this does not mean such comparisons should be avoided. They are very useful and important. But I urge you to reconsider how these analyses are communicated. The communication must make clear that such a comparison evaluates both datasets and aims to find similarities and differences. It must be explicit that it is not an evaluation of correctness.
The correctness of data can only be checked against ground truth, usually by analyzing a sample. This is a lot of work, but it is the only approach that truly allows an assessment of data quality.
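To make the idea of a sample analysis concrete, here is a minimal Python sketch of how such a check could look. It is only an illustration: the function names, the tuple layout, and the land cover labels are assumptions made for this example and are not taken from the HeiGIT analysis. The sketch draws a random sample of features, compares both datasets against what was found on the ground, and reports an accuracy estimate with a Wilson confidence interval for each dataset.

```python
import math
import random


def draw_sample(feature_ids, k, seed=0):
    """Pick k features at random to verify on the ground (seeded for repeatability)."""
    return random.Random(seed).sample(list(feature_ids), k)


def wilson_interval(correct, n, z=1.96):
    """Wilson score interval for a proportion (z=1.96 gives roughly 95%)."""
    if n == 0:
        raise ValueError("sample must not be empty")
    p = correct / n
    denom = 1 + z**2 / n
    centre = (p + z**2 / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return centre - half, centre + half


def assess(sample):
    """Estimate accuracy of both datasets from ground-truth observations.

    sample: list of (ground_truth, osm_value, official_value) tuples.
    Both datasets are judged against the ground truth, not against each other.
    """
    n = len(sample)
    result = {}
    for name, idx in (("osm", 1), ("official", 2)):
        correct = sum(1 for row in sample if row[idx] == row[0])
        result[name] = (correct / n, wilson_interval(correct, n))
    return result


# Hypothetical survey result: (ground truth, OSM, official dataset)
survey = [
    ("forest", "forest", "forest"),
    ("forest", "forest", "meadow"),
    ("meadow", "forest", "meadow"),
    ("water", "water", "water"),
]
report = assess(survey)
```

With a sample this small the confidence intervals are very wide, which is exactly the point: a meaningful statement about correctness needs a reasonably large, randomly drawn sample that is checked on the ground.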
At this point, it would also be valuable for such analyses not only to acknowledge that all datasets contain errors, but also to highlight one of the central advantages of OSM compared to other datasets: how errors are handled once they are found.



