Villages in Andhra Pradesh/Apr-May 2024 Mapping improvement

From OpenStreetMap Wiki

During Apr-May 2024, substantial work was carried out across OSM and Wikipedia to improve Wikidata and Telugu name mapping on OSM, and to clean up data across Wikidata, Wikipedia (English, Telugu) and OSM.


District Summary

Data

As of 2024-06-01:

The scope of the initiative changed to all villages with coordinates on Wikidata, rather than being limited to Revenue villages (RV), as the steps required are more or less the same. This will result in a significant increase in Wikidata-mapped places on OSM. The following table, compiled from the results of Overpass Turbo and Wikidata queries, shows district-wise details.

✅ Summary:

Total OSM places: 36,779

With Wikidata (osm_places_wd): 12,434

Wikidata places (wd_places_RV): 15,323


Wikidata District name osm_places osm_places_wd wd_places_RV
Q110714850 Alluri Sitharama Raju 3785 2351 2956
Q110714857 Anakapalli 1178 540 670
Q15212 Anantapur 1143 407 486
Q110714854 Annamayya 3094 298 443
Q110876712 Bapatla 733 255 268
Q15213 Chittoor 2595 475 782
Q110714859 Dr. B. R. Ambedkar Konaseema 667 277 303
Q15338 East Godavari 518 252 259
Q110714851 Eluru 1532 568 647
Q15341 Guntur 368 195 192
Q110714860 Kakinada 575 290 385
Q15382 Krishna 997 396 455
Q15381 Kurnool 718 406 432
Q110714861 Nandyal 714 353 441
Q110876763 NTR 509 257 292
Q110714862 Palnadu 732 220 350
Q110714856 Parvathipuram Manyam 1311 511 902
Q15390 Prakasam 1662 653 784
Q15383 Sri Potti Sriramulu Nellore 1715 520 637
Q110714863 Sri Sathya Sai 1895 306 445
Q15395 Srikakulam 2398 945 1237
Q110714853 Tirupati 2094 660 994
Q15394 Visakhapatnam 411 122 87
Q15392 Vizianagaram 1364 884 914
Q15404 West Godavari 727 302 272
Q15342 YSR 1434 510 666

Improving accuracy of villages in OSM and Wikidata

The following workflow is strongly advised to minimize mistakes. A composite string of the mandal's Wikidata QID and the village name is matched against a similarly derived string from OSM, to overcome the limitation of single-parameter matching in Openrefine. The following workflow needs revision.
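The composite-string idea described above can be illustrated with a minimal Python sketch. It is not part of the documented workflow (which relies on Openrefine and manual review); the column names are taken from the sample extracts on this page, while the function names, the `|` separator, the normalisation rules and the example rows are assumptions made for illustration.

```python
import unicodedata


def composite_key(mandal_qid: str, village_name: str) -> str:
    """Join a mandal QID and a normalised village name into one match key."""
    name = unicodedata.normalize("NFKC", village_name).casefold()
    name = " ".join(name.split())  # collapse runs of whitespace
    return f"{mandal_qid.strip()}|{name}"


def match_rows(wd_rows, osm_rows):
    """Return (wd_qid_village, osmfullid_village) pairs with equal, unambiguous keys."""
    osm_index = {}
    for row in osm_rows:
        key = composite_key(row["osm_wikidata_mandal"], row["osm_name_village"])
        osm_index.setdefault(key, []).append(row["osmfullid_village"])

    matches = []
    for row in wd_rows:
        key = composite_key(row["wd_qid_mandal"], row["wd_label_village"])
        candidates = osm_index.get(key, [])
        if len(candidates) == 1:  # ambiguous or missing keys are left for manual review
            matches.append((row["wd_qid_village"], candidates[0]))
    return matches


# Hypothetical rows for illustration only; these are not real QIDs or OSM ids.
wd_rows = [
    {"wd_label_village": "Example Village", "wd_qid_village": "Q_VILLAGE_1", "wd_qid_mandal": "Q_MANDAL_A"},
    {"wd_label_village": "Example Village", "wd_qid_village": "Q_VILLAGE_2", "wd_qid_mandal": "Q_MANDAL_B"},
]
osm_rows = [
    {"osm_name_village": "example  village", "osmfullid_village": "n1", "osm_wikidata_mandal": "Q_MANDAL_A"},
    {"osm_name_village": "Example Village", "osmfullid_village": "n2", "osm_wikidata_mandal": "Q_MANDAL_B"},
]

print(match_rows(wd_rows, osm_rows))
```

Two villages with the same name in different mandals get different keys, which is the point of including the mandal QID. Exact key equality does not catch spelling variations, so those still need the manual matching described in the workflow.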

Work flow

Note: Before running any of the scripts, change them to refer to the required district. The standard scripts provided use Bapatla district as an example. Store all files in a single work directory to avoid confusion.

  1. Select a district to work on; note its name and QID in Wikidata and its name in OSM.

Cleanup existing data on Wikidata, OSM

  1. Check for duplication of coordinate locations and update with the correct coordinates from the Andhra Pradesh State GIS portal (APSGISP)
    1. Script for identifying places (more than one) with the same coordinates
    2. Download in tsv format, rename as <district>-wd-cleanup.tsv, and open it in an editor like "featherpad" on Ubuntu, which displays the URLs as clickable links.
Example data extract with duplicated coordinates
wdloc count places qids
Point(80.267 16.3) 4 Bethapudi, Vemavaram, Takkellapadu, Erraguntlapadu http://www.wikidata.org/entity/Q13004307, http://www.wikidata.org/entity/Q13010165, http://www.wikidata.org/entity/Q15702442, http://www.wikidata.org/entity/Q16343986
Point(80.462 16.459) 3 Pamulapadu, Bejatpuram, Damarapalle http://www.wikidata.org/entity/Q16338696, http://www.wikidata.org/entity/Q13004303, http://www.wikidata.org/entity/Q15704109
Point(80.329 16.073) 2 Pusulur, Annaparru http://www.wikidata.org/entity/Q13002436, http://www.wikidata.org/entity/Q12990340
  1. Check whether any of the villages are outside district borders
Using Wikidata query map view

As the Wikidata query service can display results on a map, one can easily locate places lying far from the district by comparing the result map with a map of the specific district.

Using JOSM
    1. Place location in an admin area as per Wikidata, to generate geojson using Openrefine
    2. Use the map view for a quick glance to spot potential errors. Create a geojson file (following the instructions for creating geojson using Openrefine; after export, rename the file type to geojson). Load both the district boundary and the places data as two different layers in JOSM to see all the errors clearly.
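The duplicate-coordinate check in step 1 of this section is done with a Wikidata query in the workflow. As a rough offline illustration of the same grouping, the following Python sketch groups places by their coordinate text and keeps only coordinates shared by more than one place. The function name and the input rows are hypothetical.

```python
from collections import defaultdict


def duplicated_coordinates(rows):
    """Group (label, qid, wkt_point) rows by coordinate; keep coordinates used more than once."""
    groups = defaultdict(list)
    for label, qid, wkt_point in rows:
        groups[wkt_point.strip()].append((label, qid))
    return {loc: places for loc, places in groups.items() if len(places) > 1}


# Hypothetical input rows: (label, QID, coordinate as WKT text).
rows = [
    ("Village A", "Q_A", "Point(80.267 16.3)"),
    ("Village B", "Q_B", "Point(80.267 16.3)"),
    ("Village C", "Q_C", "Point(80.329 16.073)"),
]

for loc, places in duplicated_coordinates(rows).items():
    labels = ", ".join(label for label, _ in places)
    print(loc, len(places), labels)
```

Each printed line has the same shape as the example extract above: the coordinate, the number of places sharing it, and their labels.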

Check the difference in locations for the existing mapped data

  1. Check whether the currently mapped data is correct, i.e. the distance between the Wikidata location and the OSM location is less than 2 km.
    1. Distance between village location as per Wikidata and OSM in a district for identifying potential errors (with distance rounded to two decimal places).
    2. Update coordinates in Wikidata or OSM as required to fix any errors.
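The workflow computes this distance with a query. For checking individual rows offline, a minimal Python sketch of the 2 km test might look like the following. It assumes the Wikidata coordinate is given as `Point(longitude latitude)` text, as in the sample extracts on this page, and uses the haversine great-circle formula with a mean Earth radius of 6371 km; the function names are hypothetical.

```python
import math
import re

# Wikidata coordinate text is written as "Point(longitude latitude)".
POINT_RE = re.compile(r"Point\(\s*(-?\d+(?:\.\d+)?)\s+(-?\d+(?:\.\d+)?)\s*\)")


def parse_wkt_point(text):
    """Parse 'Point(lon lat)' into a (lat, lon) tuple of floats."""
    match = POINT_RE.fullmatch(text.strip())
    if match is None:
        raise ValueError(f"not a WKT point: {text!r}")
    lon, lat = float(match.group(1)), float(match.group(2))
    return lat, lon


def haversine_km(lat1, lon1, lat2, lon2, radius_km=6371.0):
    """Great-circle distance in kilometres between two lat/lon points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * radius_km * math.asin(min(1.0, math.sqrt(a)))


def check_distance(wd_point, osm_lat, osm_lon, threshold_km=2.0):
    """Return (distance rounded to two decimals, True if the pair needs review)."""
    lat, lon = parse_wkt_point(wd_point)
    dist = round(haversine_km(lat, lon, osm_lat, osm_lon), 2)
    return dist, dist >= threshold_km


# The OSM coordinates below are hypothetical values chosen for illustration.
print(check_distance("Point(80.227 16.314)", 16.314, 80.227))
print(check_distance("Point(80.227 16.314)", 16.364, 80.227))
```

The first call compares a point with itself and is not flagged; the second moves the OSM point 0.05 degrees north (roughly 5.5 km) and is flagged for review.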

Get "todo" data from Wikidata, OSM

  1. Extract from Wikidata the villages that have no Wikidata mapping in OSM
    1. Villages with mandal info but without Wikidata in OSM, in name, qid order with wd prefix (for matching with the overpass-turbo query). In this script, the QID for the desired district needs to be updated in the MINUS sub query as well before executing.
    2. Download as csv and rename it as <district>-wd-todo.csv
Sample data extract
wd_label_village wd_qid_village wd_loc_village wd_label_mandal wd_qid_mandal
113 Thalluru Q12989716 Point(80.227 16.314) Phirangipuram mandal Q59205264
  1. Extract places without Wikidata on OSM as csv
    1. Overpass-turbo script for Places without Wikidata in a district with osm prefix
    2. Export the data using Export -> "raw data directly from Overpass API". Rename it as <district>-osm-todo.tsv
Sample data extract
osm_name_village osmfullid_village osm_wikidata_mandal osm_name_mandal osmfullid_mandal
Medavaripalem n5375826762 Q30642580 Prathipadu r10160414
  1. Extract places without Wikidata as .osm
    1. Script for places without Wikidata in xml
    2. Export the data using Export -> "load data into an OSM editor: JOSM", and from JOSM save as <district>.osm

Reconcile data

Note: Because of the manual matching involved, the operation automation feature is not being used

  1. Load <district>-osm-todo.tsv in Openrefine. Export it as csv. The resulting file <district>-osm-todo.tsv.csv should be renamed as <district>-osm-todo.csv (the Openrefine reconcile software takes only a csv formatted file as input).
  2. Load <district>-wd-todo.csv in Openrefine.
  3. In a terminal, run the reconcile-csv software with <district>-osm-todo.csv as input, selecting osm_name_village and osmfullid_village as parameters.
  4. Reconcile wd_label_village using the CSV reconcile service. Opt for additional matching of wd_qid_mandal with osm_wikidata_mandal, but do not tick auto matching.
  5. Use the judgement facet on wd_label_village to review automatically matched items.
  6. For the remaining items, go through the options provided to find the most probable match, based on spelling variations and the mandal information presented in the row and in the drop down for the item below the cursor.
  7. Now add a column based on the reconciliation, with name osmfullid and value cell.recon.match.id. This gives the unique geometry identifier for the village.
  8. Now reconcile wd_qid_village as identifiers using the Wikidata reconciliation service.
  9. Add additional columns based on the wd_qid_village reconciliation, such as Len, Lte.
  10. Rename wd_qid_village column as wikidata
  11. Rename columns Len as name, Lte as name:te
  12. Export columns osmfullid, wikidata, name, name:te as <district>-wd-todo-reconciled.csv
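Step 1 above converts the tab-separated OSM export to CSV through Openrefine. As an alternative illustration, the same conversion can be sketched with Python's standard `csv` module. This assumes the export is plain tab-separated text with a header row and no quoted fields; the function name is hypothetical, and the sample row is the one shown in the OSM data extract earlier on this page.

```python
import csv
import io


def tsv_to_csv(tsv_file, csv_file):
    """Copy rows from a tab-separated text stream to a comma-separated one; return the row count."""
    # QUOTE_NONE makes the reader treat any quote characters in the TSV literally.
    reader = csv.reader(tsv_file, delimiter="\t", quoting=csv.QUOTE_NONE)
    writer = csv.writer(csv_file, lineterminator="\n")
    count = 0
    for row in reader:
        writer.writerow(row)
        count += 1
    return count


sample_tsv = (
    "osm_name_village\tosmfullid_village\tosm_wikidata_mandal\tosm_name_mandal\tosmfullid_mandal\n"
    "Medavaripalem\tn5375826762\tQ30642580\tPrathipadu\tr10160414\n"
)
out = io.StringIO()
rows_written = tsv_to_csv(io.StringIO(sample_tsv), out)
print(rows_written)
print(out.getvalue())
```

With real files, the two streams would be opened with `open(path, newline="", encoding="utf-8")` so that the `csv` module controls line endings.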

Update OSM file with reconciled data and upload

  1. Use OSMCSVappender to update the <district>.osm file with the reconciled data (wikidata, name, name:te) from <district>-wd-todo-reconciled.csv. Allow overwriting, as the existing names may have different spellings.
  2. Use JOSM to load the updated file
  3. Search for modified place=* to verify that the updated file is accurate
  4. upload to OSM
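To make clear what the update in step 1 amounts to, here is a rough Python sketch that applies reconciled rows to an .osm XML document. It is not the OSM CSV appender's implementation and is not a replacement for it. It assumes the CSV columns exported in the reconcile step (osmfullid, wikidata, name, name:te), an osmfullid made of a type letter (n, w, r) followed by the numeric id, and the JOSM file convention of marking changed elements with action="modify". The sample XML and CSV values are hypothetical.

```python
import csv
import io
import xml.etree.ElementTree as ET

TYPE_BY_PREFIX = {"n": "node", "w": "way", "r": "relation"}


def apply_tags(osm_root, rows, tag_columns=("wikidata", "name", "name:te")):
    """Set tags from reconciled rows on matching elements; return how many elements changed."""
    index = {
        (el.tag, el.get("id")): el
        for el in osm_root
        if el.tag in TYPE_BY_PREFIX.values()
    }
    changed = 0
    for row in rows:
        full_id = row["osmfullid"]
        element = index.get((TYPE_BY_PREFIX.get(full_id[:1]), full_id[1:]))
        if element is None:
            continue  # id not present in this file
        touched = False
        for key in tag_columns:
            value = (row.get(key) or "").strip()
            if not value:
                continue  # never overwrite an existing tag with an empty value
            existing = next((t for t in element.findall("tag") if t.get("k") == key), None)
            if existing is None:
                ET.SubElement(element, "tag", {"k": key, "v": value})
                touched = True
            elif existing.get("v") != value:
                existing.set("v", value)  # overwrite, as existing spellings may differ
                touched = True
        if touched:
            element.set("action", "modify")  # JOSM uploads only elements marked as modified
            changed += 1
    return changed


# Hypothetical .osm content and reconciled CSV, for illustration only.
osm_xml = """<osm version='0.6'>
  <node id='1' lat='16.3' lon='80.2' version='1'>
    <tag k='place' v='village'/>
    <tag k='name' v='Old Name'/>
  </node>
</osm>"""
reconciled_csv = "osmfullid,wikidata,name,name:te\nn1,Q_EXAMPLE,New Name,ఉదాహరణ\n"

root = ET.fromstring(osm_xml)
changed = apply_tags(root, csv.DictReader(io.StringIO(reconciled_csv)))
print(changed)
print(ET.tostring(root, encoding="unicode"))
```

Whatever tool performs the update, the result should still be reviewed in JOSM before upload, as steps 2 and 3 describe.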

Verify accuracy of mapping

  1. Wait up to 10 minutes (usually 2-5 min) to allow the update to sync with the Sophox.org service.
  2. Check whether the mapped data is correct, i.e. the distance between the Wikidata location and the OSM location is less than 2 km.
  3. Distance between village location as per Wikidata and OSM in a district for identifying potential errors (with distance rounded to two decimal places).
  4. Update coordinates in Wikidata or OSM as required to fix any errors.

Software resources

  • JOSM
  • OSM CSV appender
  • Openrefine
  • Featherpad

Progress status

Scope: Update villages without Wikidata on OSM with wikidata, name and name:te tags, based on matching by name (text) and mandal (QID).

April 2024

About 853 Wikidata IDs and Telugu labels were added to places in AP during the month, raising the total from 1739 to 2592. The bulk of these were added by Phani, who worked on Vizianagaram villages and hamlets. Coordinates were updated for about 84 places in Wikidata with locations from APSGISP. About 20 places had coordinates outside AP, and these were fixed. In all, 999 were touched. Here is a query for visualization. (Image post on Telegram) After trying different tools and scripts, a clear workflow needs to be written up to speed up the remaining work.

May-June 2024

Notes for updating the below table.

wikidata | District name | Village Wikidata, OSM cleanup | Reconcile Village on wikidata with places on OSM and update wikidata, name, name:te on OSM | Village WD-OSM mismatch fix status | date, contributor, status update
Q110714850 Alluri Sitharama Raju 2024-06-09,User:Ph4ni,user:arjunaraoc: added 227 wikidata entries on OSM, checked about 70 entries. Rest of the work is left to Ph4ni, as he has taken up the responsibility for ASR. Final stats: Wikidata villages mapped to OSM XXX/XXX, Place count with wikidata on OSM:XXX/XXX
Q110714857 Anakapalli ☑Y ☑Y ☑Y 2024-06-01,User:Ph4ni,user:arjunaraoc: added 78 wikidata entries on OSM. Final stats: Wikidata villages mapped to OSM 520/614, Place count with wikidata on OSM:533/1173
Q15212 Anantapur ☑Y ☑Y ☑Y 2024-04-26,user:arjunaraoc: About 3 inside, 0 outside district corrected in wikidata.
2024-05-25,user:arjunaraoc: Added wikidata to 340 places. Final stats: Wikidata villages mapped to OSM 371/415, Place count with wikidata on OSM:379/1080
Q110714854 Annamayya ☑Y ☑Y ☑Y 2024-05-24,user:arjunaraoc: Added wikidata to 241 places. Final stats: Wikidata villages mapped to OSM 271/345, Place count with wikidata on OSM:271/2719
Q110876712 Bapatla ☑Y ☑Y ☑Y 2024-04-12,user:arjunaraoc: 8 villages outside district, 2 inside corrected;
2024-04-27,user:arjunaraoc: RV location for Jangamaheswaram, Modepalli did not have visible houses in aerial imagery;
2024-04-30,user:arjunaraoc: 19 RV could not be synced with OSM, due to inadequate info from APSGISP or other open sources. Some of them, being urban centres, are excluded from the work. 196 places have wikidata, including about 98 RV wikidata changes during April 2024. 2 places without village names were deleted, as they would not be useful. 2 places created, 96 entries tagged with wikidata. Kodavalivaripalem is supposed to be next to Kesavarappadu as per ward sachivalayam data, but is found to be far north of that. That location was modified on OSM. The Audipudi location from wikidata corresponds to Swarnapalem on OSM. The Jangamaheswarapuram location from APSGISP does not map to residential landuse areas. Information from open sources is not sufficient to resolve these issues. About 15 RV could not be reconciled to places on OSM. Four RVs are currently part of urban centres and are excluded from reconciliation. Only about 90% success, considering that out of 179 RV with coords, only 159 (88.8%) are mapped with wikidata on OSM. About 88 RV still need coordinates on wikidata.
2024-05-17 user:Arjunaraoc, 197 places have wikidata on osm out of which 174 are villages. One non RV added. About 125 places(mostly hamlets) with coords exist on wikidata, and 522 places exist on OSM which could not be matched.
Q15213 Chittoor ☑Y ☑Y ☑Y 2024-04-14,user:arjunaraoc: About 12 corrected in wikidata, 1 in OSM; 2024-05-23,user:arjunaraoc: Added wikidata to 376 places. Final stats: Wikidata villages mapped to OSM 414/523, Place count with wikidata on OSM:424/2481. Mandal boundary inaccuracies might have prevented more matching. Many duplicate names and single-letter names are present.
Q110714859 Dr. B. R. Ambedkar Konaseema ☑Y ☑Y ☑Y 2024-06-01,User:Ph4ni,user:arjunaraoc: added 160 wikidata entries on OSM. Final stats: Wikidata villages mapped to OSM 263/306, Place count with wikidata on OSM:273/601
Q15338 East Godavari ☑Y ☑Y ☑Y 2024-05-28,User:Ph4ni,user:arjunaraoc: added 11 wikidata entries on OSM. Final stats: Wikidata villages mapped to OSM 248/248, Place count with wikidata on OSM:250/390
Q110714851 Eluru ☑Y ☑Y ☑Y 2024-05-31,User:Ph4ni,user:arjunaraoc: added 46 wikidata entries on OSM. Final stats: Wikidata villages mapped to OSM 531/595, Place count with wikidata on OSM:563/1260
Q15341 Guntur ☑Y ☑Y ☑Y 2024-04-09,user:arjunaraoc: error >2km corrected; 2024-05-16,user:arjunaraoc; 81 places including about 5 hamlets updated, total place nodes count with wikidata became 193 out of which 157 are villages.
Q110714860 Kakinada ☑Y ☑Y ☑Y 2024-05-31,User:Ph4ni,user:arjunaraoc: added 0 wikidata entries on OSM. Final stats: Wikidata villages mapped to OSM 283/312, Place count with wikidata on OSM:291/566
Q15382 Krishna ☑Y ☑Y ☑Y 2024-05-29,user:arjunaraoc: Added wikidata to 345 places. Final stats: Wikidata villages mapped to OSM 381/437, Place count with wikidata on OSM:503/899
Q15381 Kurnool ☑Y ☑Y ☑Y 2024-05-26,user:arjunaraoc: Added wikidata to 282 places. Final stats: Wikidata villages mapped to OSM 376/419, Place count with wikidata on OSM:380/702
Q110714861 Nandyal ☑Y ☑Y ☑Y 2024-05-26,user:arjunaraoc: Added wikidata to 309 places. Final stats: Wikidata villages mapped to OSM 344/418, Place count with wikidata on OSM:327/680
Q110876763 NTR ☑Y ☑Y ☑Y 2024-05-30,User:Ph4ni,user:arjunaraoc: Added wikidata to 176 places. Final stats: Wikidata villages mapped to OSM 244/305, Place count with wikidata on OSM:255/454
Q110714862 Palnadu ☑Y ☑Y ☑Y 2024-04-12,user:arjunaraoc: two villages outside district, 8 inside corrected in wikidata
2024-05-17 user:arjunaraoc updated 102 taking the total to 220, out of which 186 are villages. 497 on OSM, 176 on wikidata with coordinates need work. (80+ revenue villages on wikidata need coordinates)
Q110714856 Parvathipuram Manyam ☑Y ☑Y ☑Y 2024-06-04,User:Ph4ni,user:arjunaraoc: added 77 wikidata entries on OSM. Final stats: Wikidata villages mapped to OSM 573/797, Place count with wikidata on OSM:589/1263
Q15390 Prakasam ☑Y ☑Y ☑Y 2024-04-12,user:arjunaraoc: 1 village outside district, 1 inside corrected
2024-05-19,user:arjunaraoc: 535 wikidata added on OSM. Final stats: Wikidata villages mapped to OSM 595/993, Place count on OSM:1030,
Q15383 Sri Potti Sriramulu Nellore ☑Y ☑Y ☑Y 2024-04-13,user:arjunaraoc: About 10 inside district corrected in wikidata, 2 position corrected in OSM; 2024-05-21,user:arjunaraoc:Updated 443 places. Final stats: Wikidata villages mapped to OSM 512/577, Place count on OSM:1729,
Q110714863 Sri Sathya Sai ☑Y ☑Y ☑Y 2024-04-26,user:arjunaraoc: About 6 inside, 1 outside district corrected in wikidata;

2024-05-25,user:arjunaraoc: Added wikidata to 271 places. Final stats: Wikidata villages mapped to OSM 334/345, Place count with wikidata on OSM:306/1501

Q15395 Srikakulam ☑Y ☑Y ☑Y 2024-06-03,User:Ph4ni,user:arjunaraoc: added 81 wikidata entries on OSM. Final stats: Wikidata villages mapped to OSM 930/1089, Place count with wikidata on OSM:947/2348
Q110714853 Tirupati ☑Y ☑Y ☑Y 2024-04-13,user:arjunaraoc: About 10 inside district corrected in wikidata, 2 duplicate nearby villages deleted in OSM, purple triangle is to be preferred over green circle or star for location info in Bharatmaps;2024-05-23,user:arjunaraoc: Added wikidata to 450 places. Final stats: Wikidata villages mapped to OSM 494/643, Place count with wikidata on OSM:508/1884
Q15394 Visakhapatnam ☑Y ☑Y ☑Y 2024-04-13;User:Ph4ni:Corrected most locations;
2024-04-26; user:arjunaraoc:Reviewed and corrected two locations inside district

2024-05-30,user:arjunaraoc: Final stats: Wikidata villages mapped to OSM 80/100, Place count with wikidata on OSM:119/404

Q15392 Vizianagaram ☑Y ☑Y ☑Y 2024-04-12,User:Ph4ni:Corrected locations of about 20 villages inside the district in wikidata. Remaining are close enough;
2024-04-26,user:arjunaraoc: Reviewed and fixed 8 places inside district in Wikidata, About 5 fixed in OSM.

2024-06-03,User:Ph4ni,user:arjunaraoc: added 6 wikidata entries on OSM. Final stats: Wikidata villages mapped to OSM 881/908, Place count with wikidata on OSM:894/1346

Q15404 West Godavari ☑Y ☑Y ☑Y 2024-05-30,User:Ph4ni,user:arjunaraoc: Added wikidata to 9 places. Final stats: Wikidata villages mapped to OSM , Place count with wikidata on OSM:299/690
Q15342 YSR ☑Y ☑Y ☑Y 2024-05-27,user:arjunaraoc: Added wikidata to 311 places. Final stats: Wikidata villages mapped to OSM 335/438, Place count with wikidata on OSM:369/996