Fixing Chinese Place Name Display Issues in OpenStreetMap with a C++ Batch Script

Posted by lpf452 on 14 September 2025 in English. Last updated on 1 October 2025.

As an OpenStreetMap contributor, I’ve always been dedicated to enhancing the detail and usability of local data. Recently, however, I ran into a frustrating problem: in areas I’ve mapped, the Chinese names for many places fail to display correctly in certain applications and services (like OsmAPP, JawgMaps, and MapTiler). Instead, they either fall back to Pinyin or default to the English name (name:en), which looks odd—especially when a primary name tag clearly exists but is simply ignored.

The root of the problem lies in the peculiar rendering rules of these applications, which often prioritize name:[lang] tags that match the user’s language. Even though we add a name tag, the absence of an explicit name:zh or name:zh-Hans tag can leave the renderer confused, causing it to fall back to name:en or just display the Pinyin transliteration.
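To make the failure mode concrete, here is a minimal sketch of one plausible lookup order. It is an assumption for illustration only, not the actual code of OsmAPP, JawgMaps, or MapTiler; the function name `pickLabel` and the exact fallback order are hypothetical.

```cpp
#include <map>
#include <string>
#include <vector>

using Tags = std::map<std::string, std::string>;

// Hypothetical label picker: tries name:<lang> for each preferred language,
// then name:en, and only falls back to the plain name tag at the very end.
std::string pickLabel(const Tags& tags,
                      const std::vector<std::string>& preferredLangs) {
    for (const std::string& lang : preferredLangs) {
        auto it = tags.find("name:" + lang);
        if (it != tags.end()) return it->second;
    }
    auto en = tags.find("name:en");
    if (en != tags.end()) return en->second;
    auto plain = tags.find("name");
    return plain != tags.end() ? plain->second : std::string();
}
```

Under this assumed order, an element tagged only with name and name:en shows its English name to a Chinese-language user, and adding name:zh or name:zh-Hans is enough to bring the Chinese name back.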

Manually adding these tags to thousands of elements is obviously out of the question. You can’t just copy and paste your way through it; the sheer monotony would be mind-numbing. So, I decided to automate the process by writing a script.

Tech Stack and Script Logic

When high performance is a priority, C++ is the natural choice. I also leveraged two powerful open-source libraries:

  1. pugixml: A lightweight, high-performance C++ XML parser, perfect for rapidly reading and writing large .osm files.
  2. OpenCC: The community’s go-to library for Simplified and Traditional Chinese conversion, which I used to generate name:zh-Hant tags.

The core logic of my script is as follows:

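  1. Read and parse: Use pugixml to load the local .osm data file obtained from an Overpass API query.
  2. Iterate over elements: Loop over every node, way, and relation in the file.
  3. Locate targets: Check whether the element has a tag with k="name".
  4. Generate tags: If a name tag is found, do the following:
    • Copy the value of the name tag into a new <tag k="name:zh" v="..."/>.
    • Copy the value of the name tag again into a new <tag k="name:zh-Hans" v="..."/>.
    • Call OpenCC (with the s2twp.json configuration) to convert the name value from Simplified Chinese to the Traditional Chinese used in Taiwan, and create <tag k="name:zh-Hant" v="..."/>.
  5. Generate the change file: Write every modified element (keeping its original version number) to a new .osc (osmChange) file, ready for upload.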

The full entry includes some AI-generated code for reference (I was too lazy to write it myself).