
SAM and OSM

Posted by krschap on 8 June 2023 in English. Last updated on 15 June 2023.

Introduction

The purpose of this document is to explore the possibilities of the Segment Anything Model (SAM), developed by Meta, in OpenStreetMap mapping. SAM is a promising model in the field of image segmentation, and its potential application to fAIr could enable the tracing of various features on maps. This document provides an overview of SAM’s working mechanism, test cases conducted with SAM, and two potential integration ideas for utilizing SAM.

SAM Working Mechanism

SAM operates directly on images and generates individual masks based on the training datasets used during model training. Unlike object detection models, SAM focuses on segmenting all elements present in an image, regardless of their nature, based on texture, color, and points of difference from the background. Further details can be found in their research doc: SAM Working Mechanism.
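As a minimal sketch of this behaviour, the segment-anything library can segment every region in an image with no prompt at all, using its automatic mask generator (the checkpoint and image paths below are placeholders):

```python
import cv2
from segment_anything import sam_model_registry, SamAutomaticMaskGenerator

# Load a pretrained SAM checkpoint (ViT-H is the largest published variant).
sam = sam_model_registry["vit_h"](checkpoint="sam_vit_h_4b8939.pth")
mask_generator = SamAutomaticMaskGenerator(sam)

# SAM expects an RGB array; OpenCV reads BGR, so convert.
image = cv2.cvtColor(cv2.imread("drone_tile.png"), cv2.COLOR_BGR2RGB)

# One dict per segmented region: binary mask, bounding box, area, quality scores.
masks = mask_generator.generate(image)
for m in masks:
    print(m["bbox"], m["area"], m["predicted_iou"])
```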

Advantages of SAM for fAIr and OSM Mapping

  • SAM can run inference in the user’s browser, ensuring accessibility and reducing server-side processing.
  • SAM allows excellent automatic tracing of individual features based on user mouse-over interactions, enhancing user interaction and customization in OSM editors (see the sketch below).
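
A rough sketch of that interaction: the segment-anything API also accepts point prompts, so a single click (or hover position) can be turned into a mask. Coordinates and file paths here are illustrative:

```python
import cv2
import numpy as np
from segment_anything import sam_model_registry, SamPredictor

sam = sam_model_registry["vit_h"](checkpoint="sam_vit_h_4b8939.pth")
predictor = SamPredictor(sam)

image = cv2.cvtColor(cv2.imread("drone_tile.png"), cv2.COLOR_BGR2RGB)
predictor.set_image(image)  # the expensive image embedding is computed once

# One foreground click, e.g. where the cursor hovers over a building.
point_coords = np.array([[512, 384]])
point_labels = np.array([1])  # 1 = foreground, 0 = background

masks, scores, _ = predictor.predict(
    point_coords=point_coords, point_labels=point_labels, multimask_output=True
)
best_mask = masks[np.argmax(scores)]  # boolean HxW mask of the hovered feature
```

Because the embedding is computed once per image, repeated hover prompts are cheap, which is what makes in-editor tracing plausible.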

Test Cases with SAM

Test cases were conducted using drone images to evaluate SAM’s performance in tracing various features. The following results were observed:

Initial Impressions:

SAM excels at accurately tracing features, identifying and delineating objects with impressive precision, as demonstrated in the test case.

However, SAM struggles with line features, as evidenced by its poor performance in detecting and identifying road masks (Testcase1).

SAM’s Classification Abilities:

SAM can classify polygons, making it applicable to land use features in OpenStreetMap (OSM). The output may vary with zoom level: at zoom level 19 it may generate a single polygon enclosing all building features, while at zoom level 21 it traces individual buildings (Road Test).
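
The zoom-level effect is easier to reason about in terms of ground resolution: each zoom level halves the metres covered per pixel, so a block that merges into one blob at zoom 19 separates into individual buildings at zoom 21. A rough illustration using the standard Web Mercator resolution formula (the latitude here is arbitrary):

```python
import math

def metres_per_pixel(lat_deg: float, zoom: int) -> float:
    """Approximate ground resolution of a 256 px Web Mercator tile."""
    return 156543.03392 * math.cos(math.radians(lat_deg)) / (2 ** zoom)

for z in (19, 21):
    print(f"zoom {z}: {metres_per_pixel(27.7, z):.3f} m/px")
# zoom 19: ~0.265 m/px; zoom 21: ~0.066 m/px at this latitude
```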

Retraining SAM:

SAM is often misconstrued as a model that can be retrained for object detection purposes. However, SAM’s primary function is segmentation rather than object detection. While additional training datasets can enhance the accuracy of segmentation classes, the base model of SAM already performs excellently, making retraining less preferred for OSM use cases.

Integration Ideas for SAM

Two promising integration ideas for incorporating SAM are proposed:

Idea 1: SAM as Tracer

By leveraging spatial utilities and satellite imagery, SAM can be used to trace features on maps. The workflow involves:

  • Utilizing SAM for feature tracing on the map
  • Assigning labels manually and refining polygons if necessary
  • Running conflation to convert the traced features into OSM XML format (a minimal sketch of this step follows the list)
  • Pushing the converted features to OpenStreetMap
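
The conversion step can be approximated with a small helper that wraps a traced ring of coordinates into OSM XML. This is a minimal sketch, assuming the mask has already been georeferenced to (lon, lat) pairs; the tag shown is only an example:

```python
import xml.etree.ElementTree as ET

def polygon_to_osm_xml(ring, tags):
    """ring: list of (lon, lat) vertices (unclosed); tags: dict of OSM tags.
    Negative ids mark objects that do not exist upstream yet."""
    osm = ET.Element("osm", version="0.6", generator="sam-tracer-sketch")
    node_ids = []
    for i, (lon, lat) in enumerate(ring):
        nid = str(-(i + 1))
        ET.SubElement(osm, "node", id=nid, lat=f"{lat:.7f}", lon=f"{lon:.7f}")
        node_ids.append(nid)
    way = ET.SubElement(osm, "way", id=str(-(len(ring) + 1)))
    for nid in node_ids + [node_ids[0]]:  # repeat the first node to close the ring
        ET.SubElement(way, "nd", ref=nid)
    for k, v in tags.items():
        ET.SubElement(way, "tag", k=k, v=v)
    return ET.tostring(osm, encoding="unicode")

xml = polygon_to_osm_xml(
    [(85.5200, 27.6200), (85.5210, 27.6200), (85.5210, 27.6210)],
    {"building": "yes"},
)
```

The resulting file can be opened in JOSM for manual review before any upload, which keeps a human in the loop.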

[Flowchart: Idea 1 workflow]

The GIF below illustrates SAM’s ability to create masks on mouse hover, which could later be integrated with OSM editors.

[GIF: SAM generating masks on mouse hover]

Idea 2: Integrating SAM with Object Detection Models

[Image: GroundingDINO combined with SAM]

SAM can be combined with object detection models, such as GroundingDINO and MMDetection, to achieve improved segmentation accuracy. GroundingDINO, an open-set object detection model, recently added support for SAM. The object detection model classifies and locates objects on the map, which are then segmented by SAM. GroundingDINO is retrainable and can be used as a base model alongside SAM. For this test I used GroundingDINO, but several other well-performing object detection models could be used.
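
A condensed sketch of that pipeline, following the pattern from the Grounded-Segment-Anything repository (config, checkpoint, and image paths are placeholders, and the thresholds are typical demo values rather than tuned ones):

```python
import torch
from torchvision.ops import box_convert
from groundingdino.util.inference import load_model, load_image, predict
from segment_anything import sam_model_registry, SamPredictor

# 1. Text-prompted detection with GroundingDINO.
dino = load_model("GroundingDINO_SwinT_OGC.py", "groundingdino_swint_ogc.pth")
image_source, image = load_image("aerial.jpg")
boxes, logits, phrases = predict(
    model=dino, image=image, caption="vehicle",
    box_threshold=0.35, text_threshold=0.25,
)

# GroundingDINO returns normalized cxcywh boxes; convert to pixel xyxy.
h, w, _ = image_source.shape
boxes_xyxy = box_convert(
    boxes * torch.tensor([w, h, w, h]), in_fmt="cxcywh", out_fmt="xyxy"
).numpy()

# 2. Each detected box becomes a prompt for SAM.
sam = sam_model_registry["vit_h"](checkpoint="sam_vit_h_4b8939.pth")
predictor = SamPredictor(sam)
predictor.set_image(image_source)
for box in boxes_xyxy:
    masks, _, _ = predictor.predict(box=box, multimask_output=False)
```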

[Flowchart: Idea 2 workflow]

Test Cases with GroundingDINO:

Test cases were conducted using SAM in conjunction with GroundingDINO on aerial images. The following observations were made:

Example: Tracing sea, building blocks, trees, and vehicles

SAM, combined with GroundingDINO, successfully identified and traced a yellow vehicle in the image (yellowveh), which demonstrates the potential for promising performance when trained with relevant data. Currently, GroundingDINO lacks understanding of buildings and swimming pools in drone images; however, when trained with local data specific to these features, it could become highly beneficial. The role of SAM in this context is to precisely trace the located features, resulting in accurate segmentation. TODO: GroundingDINO is also still in research; its inference code is available but its training code is not (it is on their checklist, though). Once training code for GroundingDINO is available, it would be worth testing training with local datasets.

Screenshots, respectively: detecting trees with the object detection model only, building blocks, GroundingDINO output for sea, the SAM mask for sea supplied from the object detection model’s bounding box, and finally the pond on Bing imagery.

[Five screenshots as described above]

Application of SAM in OSM

The utilization of SAM within the OSM and fAIr project presents several opportunities:

  • SAM can be implemented on its own as a tracer for existing OSM editors. A simple web UI platform that can load OAM imagery and start tracing features would be a good way to apply this idea, or it could be offered as a plugin in JOSM.

  • Buildings: While there are still some deficiencies in using SAM for building tracing, an existing high-quality base model, RAMP, can be employed. For other features such as swimming pools, land use, natural elements, sea, and ponds, SAM can be utilized, working with user-provided datasets in the fAIr project.

  • Alignment with the fAIr (HOTOSM) approach: The proposed SAM + object detection model integration aligns with the fAIr approach, but further research is required to address performance considerations, determine the computational requirements for retraining object detection models, and estimate the time required for training and inference processes.

Conclusion

The Segment Anything Model (SAM) developed by Meta shows great promise in the field of image segmentation. Its ability to segment various features based on texture, color, and points of difference presents opportunities for integration into the fAIr project. Through test cases and evaluation, SAM’s strengths and limitations have been identified. Integration ideas, including SAM as a tracer and combining SAM with object detection models, have been proposed. These integration possibilities can enhance feature tracing, classification, and segmentation within the OSM and fAIr project. Further research and testing are necessary to find a good object detection model, optimize performance, and determine the practical implementation of SAM within the fAIr framework.

fAIr is a free and open-source AI project initiated by HOTOSM. Learn more: https://github.com/hotosm/fair

Resources & Credits:

  • https://github.com/IDEA-Research/Grounded-Segment-Anything
  • https://github.com/facebookresearch/segment-anything
  • https://github.com/UX-Decoder/Segment-Everything-Everywhere-All-At-Once
  • https://github.com/RockeyCoss/Prompt-Segment-Anything
  • https://github.com/opengeos/segment-geospatial
  • https://github.com/IDEA-Research/GroundingDINO
  • https://github.com/OptimalScale/DetGPT

Colab test notebooks:

  • Segmentanything2osm: https://colab.research.google.com/drive/1n4OgMZX0JkoFGh_j-NDCqafHtyqnRFfL?usp=sharing
  • Segmentanything2osm (class based): https://colab.research.google.com/drive/1vIsvkkki_EeMWUkxZuQ5HXhU2N9usV4F?usp=sharing
  • Segment any anomaly: https://colab.research.google.com/drive/1gneKlzWxNTFKsil45IFU6cYxgTq9HYYi?usp=sharing
  • Grounded SAM: https://colab.research.google.com/drive/1kNu7zW2AOJsllvF-ROmVne9Uh7X5iswu?usp=sharing


Discussion

Comment from -karlos- on 9 June 2023 at 04:25

Do you intend to run SAM inside an OSM editor on my computer? So I could check the findings before uploading them?

Comment from krschap on 9 June 2023 at 04:29

@karlos! Yes!

Comment from Zaiyi Hu on 12 June 2023 at 13:13

Hello, I’m new to using OpenStreetMap, but I’m delighted to see work combining large vision models like SAM, great! By the way, I thought there was no satellite imagery in OSM, but it seems you can get the data. Where can I get the satellite data?

Comment from krschap on 12 June 2023 at 14:12

@zaiyi, yes. If you want to use Bing/Maxar, they provide TMS, but their license is not compatible with training models; you can run inference on them, though! You need to get the TMS link, fetch tiles from the TMS, and convert them to GeoTIFF, which can then be passed to any model. For open-source imagery that you can even use to train models, one resource is https://openaerialmap.org/

Comment from Zaiyi Hu on 13 June 2023 at 02:46

@Kshitijraj, thank you very much for providing this information so promptly! If I understand correctly, I can get tiles from Bing/Maxar or some other TMS provider and run inference on those images.

I’m trying to obtain both semantic maps (which OSM seems to support) and satellite maps to apply semantic segmentation. I managed it with the Google Static Maps API, but I found that OSM provides more accurate semantic legends for different areas like parks or schools. Is there any possible solution to acquire OSM semantic maps and their corresponding satellite maps?

Comment from krschap on 14 June 2023 at 10:22

@zaiyi, yes, you can do that. I would say grab TMS for both OSM tiles and satellite tiles; remember the tile id will always be the same for a given area, so you can pair your labels with images that way. Learn more here: osm.wiki/Slippy_map_tilenames. If you wish to have only specific features, you can do that as well, such as only ponds and water, or only landuse. You can use Mapbox to customize it easily, or any GitHub tool that creates custom OSM tiles should do the job. A single piece of logic for converting coordinates to tile names will work for both OSM and satellite; you just need to change the TMS source.

Comment from tordans on 24 June 2023 at 04:34

FYI, there is also a little talk about this at https://github.com/facebook/Rapid/issues/922

Comment from RAytoun on 27 July 2023 at 15:55

@Zaiyi Hu Please note that the imagery available to OpenStreetMap is only for tracing features onto OpenStreetMap and should not be used for any other purposes or for download. You are free to use the derived data for other purposes but not the imagery.
