Exploring machine learning assisted and traditional digitizing of map features in 'OpenStreetMap'
Posted by bo_hot on 5 October 2020 in English.
We are about to undertake an experiment funded by the NESTA Collective Intelligence Grants to explore emerging trends in map feature digitization. You can read about the project here >
There will be more to follow for those who are interested in participating in the upcoming experiments; however, I thought it may be of interest to share some of the background first.
The project was originally applied for by Felix Delattre (and accepted), and I have the good fortune of supporting its implementation. We are also partnering with 510/Netherlands Red Cross, who are supporting the derivation of the ‘AI-Only’ data set.
I will also highlight that none of the data collected or used in the experiment will be added to the global OSM database. We are using OSMSeed instances with customized Tasking Manager instances to sandbox the entire project for data gathering.
Comments are welcome as we refine the design. The next entry will have more information on the specific experimental design, so that may be a better opportunity to comment as well.
Enjoy.
Experiment: Comparison of machine learning assisted and traditional digitizing of map features in OpenStreetMap
Hypothesis: Machine-predicted map features from state-of-the-art machine learning models can effectively and efficiently (with respect to the quality and speed of mapping results) assist and improve current volunteer mapping in OpenStreetMap.
Context: An evidence-based approach to gain insights into the effectiveness and efficiency of machine learning assisted methods for mapping in OpenStreetMap. Build trust through arguments. The resulting scientific publication shall be open data, open access, and open science. Datasets and all documentation will be published under open licenses so that calculations are transparent and reproducible, providing a general framework for conducting measurable and comparable experiments around this topic.
Sample groups
- Reference data: Well-mapped OpenStreetMap data (over a longer time period)
- Traditional: OpenStreetMap mapping data (single remote mapping event)
- AI-assisted: mapping data with RapiD (single remote mapping event)
- AI-only: Data created by the latest generation AI models (without any editing)
Discussion
Comment from SimonPoole on 5 October 2020 at 19:47
You don’t mention how the models you intend to employ were trained. Regardless of the source of the training data, you need to consider that there may be residual intellectual property from the source of the training data in the output of the models.
Comment from bo_hot on 7 October 2020 at 09:22
Hi Simon,
You raise a good point here. We will be running the comparison experiments on three different models (and their respective training data) that have already been checked, referenced and approved with respect to any residual IP, as you have highlighted. A quick brief below:
You will understand that, to maintain the integrity of the experiments, I have purposely omitted identifiers of who will be providing the data; however, once the work is completed, everything will be made public and open after publication.
I hope this information is helpful for you and all others who may be reading.