Data items: Difference between revisions

From OpenStreetMap Wiki
Yurik (talk | contribs)
Yurik (talk | contribs)
First part of a massive refresher to the data item documentation
Line 10: Line 10:
* This wiki will be able to show data as info cards and tables, without duplicating and complicated template hackery.
* This wiki will be able to show data as info cards and tables, without duplicating and complicated template hackery.


This project's goal is NOT to replace the primary tag storage for the OSM database, nor to use opaque IDs instead of the human readable key=value strings to tag features. We are only trying to improve this wiki's Key:* and Tag:* pages, making them more useful to various tools.
This project's goal is NOT to replace the primary tag storage for the OSM database, nor to use opaque IDs instead of the human readable key=value strings to tag features. We are only trying to improve metadata documentation, making it more useful to various tools.


== How can I help? ==
== How can I help? ==
{{Hidden
''Looking for volunteers...''
|''Looking for volunteers...  click "show" -->''
|2=
; Community and content
; Community and content
* Set up a wiki portal, possibly similar to [https://www.wikidata.org/wiki/Wikidata:Community_portal Wikidata's community portal] (but simpler), where community can:
* Set up a wiki portal, possibly similar to [https://www.wikidata.org/wiki/Wikidata:Community_portal Wikidata's community portal] (but simpler), where community can:
Line 20: Line 22:
** discuss Wikibase data structures
** discuss Wikibase data structures
* Create Lua modules to generate tag tables, such as {{t|Template:Bridge:movable}}, {{t|Map Features:highway}}, or {{t|Template:Religions}}.
* Create Lua modules to generate tag tables, such as {{t|Template:Bridge:movable}}, {{t|Map Features:highway}}, or {{t|Template:Religions}}.
** '''Implementation note:''' ''Wikibase only links Tags to the corresponding Key, but Keys do not list all possible Tags. To generate a table, we must have a list of items somewhere. We could create a new WB key property that lists all tags, and use a bot to maintain it, or we could list all needed tags as a template parameter, e.g. for highway, <code><nowiki>{{...|motorway|trunk|primary|secondary|...}}</nowiki></code>. List as a template parameter does not need to be localized, and it could specify proper ordering of items (not available in WB). Lua code would use <code>mw.wikibase.getEntityIdForTitle("Key:highway=motorway")</code> to find the right data.''
** '''Implementation note:''' ''Wikibase only links Tags to the corresponding Key, but Keys do not list all possible Tags. To generate a table, we must have a list of items somewhere. We could create a new WB key property that lists all tags, and use a bot to maintain it, or we could list all needed tags as a template parameter, e.g. for highway, <code><nowiki>{{...|motorway|trunk|primary|secondary|...}}</nowiki></code>. List as a template parameter does not need to be localized, and it could specify proper ordering of items (not available in WB). Lua code would use <code>mw.wikibase.getEntityIdForTitle("Key:highway=motorway")</code> to find the right data.''


; Technical
; Technical
Line 32: Line 34:


; tasks in progress
; tasks in progress
* Change {{t|KeyDescription}}, {{t|ValueDescription}}, and {{t|RelationDescription}} to get data from the Wikibase. The {{T|KeyDescription/Sandbox}} is mostly done, and can already be used as a replacement for '''KeyDescription'''. (''being worked on by {{ping|Teester}} and {{ping|Yurik}}'')
* Change {{t|RelationDescription}} to get data from the Wikibase, the same way {{T|KeyDescription}} does. (''being worked on by {{ping|Yurik}}'')


; done!
; done!
* <strike>Add helper templates, e.g. {{T|O|Q2}} (link to {{O|Q2}}), {{T|label|Q2}} (label of the {{O|Q2}}). See also [https://www.wikidata.org/wiki/Template:Q Wikidata's Q], [https://www.wikidata.org/wiki/Template:label label], and other similar templates. Ideally we should have exactly the same functionality, except that we may need to have different template names.</strike> Thanks {{ping|Teester}}!!!
* <strike>Add helper templates, e.g. {{T|O|Q2}} (link to {{O|Q2}}), {{T|label|Q2}} (label of the {{O|Q2}}). See also [https://www.wikidata.org/wiki/Template:Q Wikidata's Q], [https://www.wikidata.org/wiki/Template:label label], and other similar templates. Ideally we should have exactly the same functionality, except that we may need to have different template names.</strike> Thanks {{ping|Teester}}!!!
* <strike>Create {{T|desc|Q2}} (description of the {{O|Q2}}) template</strike> Thanks {{ping|Teester}}!!!
* <strike>Create {{T|desc|Q2}} (description of the {{O|Q2}}) template</strike> Thanks {{ping|Teester}}!!!
* <strike>Change {{t|KeyDescription}} and {{t|ValueDescription}} to get data from the Wikibase. ({{ping|Yurik}})</strike>
}}


== Tag Keys ==
== Tag Keys ==
Each unique OSM '''Key''' is stored as a separate page in the Item namespace:
Each OSM '''Key''' is stored as a separate page in the Item namespace. For example, see {{O|Q104}} that describes a {{key|bridge:movable}}:
* There must be only one item per tag key, e.g. {{key|highway}} or {{key|landuse}}.
* The key string will be stored as a {{O|P16}}
* The key string will also be stored as a sitelink, e.g. '''highway''' becomes a link to [[Key:highway]] page ''(note that "Key:" is part of the title, and not a wiki namespace)''
** Sitelink will be shown in the upper right corner as a link
** Ensures each sitelink is unique
** Allows item's data to be used on the linked page with <nowiki>{{#statements:...}}</nowiki> and Lua
** Unlike Wikipedia, sitelinks to non-existent Key:* pages is allowed
* Item must have instance-of property set to Key
* A bot will create all keys with a significant usage (see below)

=== Example ===
See {{O|Q104}} that describes a {{key|bridge:movable}}:
{| class=wikitable
{| class=wikitable
! property || type || value example || description
! property || type || value example || description
|-
|-
| label || string || '''en''' - <code>bridge:movable</code> || Set '''English''' to the key's value, exactly the same as '''P16''' below. Some languages have '''nativekey''' (localized key). Use the corresponding label language for that. ''Note that same as "en", the localized label must be unique in that language.''
| local sitelink || string || [[Key:bridge:movable]] || link to the '''key:*''' pages, even if they do not exist
|-
|-
| description || string || '''en''' - ''The mechanism by which a movable bridge moves to clear the way below.''<br>'''ru''' - ''Механизм, которым переносной мест освобождает проходимость внизу.'' || Describe the key using proper sentences (first word capitalized, ending with a period). Must not contain any wiki markup or HTML. Must be less than 250 symbols.
| {{O|P2}}<br>''{{desc|P2}}'' || Item || {{O|Q7}} || indicate the type of the item
|-
|-
| {{O|P16}}<br>''{{desc|P16}}'' || String || <code>"bridge:movable"</code> || shows the exact form of the key as used in OSM. Must never be changed once the item is created.
| sitelink || string || [[Key:bridge:movable]] || Links to the '''Key:...''' pages, even if the page does not exist. Sitelink is shown in the upper right corner of the item page.
|-
|-
| {{O|P5}}<br>''{{desc|P5}}'' || Items || {{O|Q4}}, {{O|Q5}}, {{O|Q6}} || what kind of OSM objects this key should be used on - e.g. relation, way, area, node
| {{O|P2}}<br>''{{desc|P2}}'' || Item || {{O|Q7}} || Indicate the type of the item. Set to Q7 for keys.
|-
|-
| {{O|P16}}<br>''{{desc|P16}}'' || String || <code>bridge:movable</code> || Shows the exact form of the key as used in OSM. Must never be changed once the item is created. Due to technical limitations, keys "Key:water tap", "Key:water_tap", and "Key:water_tap_" have identical wiki pages/sitelinks - "Key:water tap". In this case, set this property to multiple strings, but mark one as "preferred" (small up arrow left of value).
| {{O|P24}}<br>''{{desc|P24}}'' || Items || {{O|Q7}} || what kind of OSM objects this key should NOT be used on
|-
|-
| {{O|P5}}<br>''{{desc|P5}}'' || Items || {{O|Q4}}, {{O|Q5}}, {{O|Q6}} || What kind of OSM objects this key should be used on - e.g. relation, way, area, node. Use {{O|P27}} to exclude certain regions. See {{O|Q501|P5}} example.
| {{O|P4}}<br>''{{desc|P4}}'' || Commons file || [[File:MovableBridge roll.gif|100px]] || An image from [[Wikimedia Commons]] (Wikibase cannot use images stored on OSM wiki itself)
|-
|-
| {{O|P25}}<br>''{{desc|P25}}'' || Item || {{O|Q4712}} || The group this item belongs to. In the current model, each key belongs to just one group. In theory we could use it to attach multiple ones, changing the meaning of the "group" to something of a "label"/"meta-tag".
| {{O|P24}}<br>''{{desc|P24}}'' || Items || {{O|Q3}} || What kind of OSM objects this key should NOT be used on. Use {{O|P26}} to limit this just to certain regions. See {{O|Q501|P24}} example.
|-
| {{O|P4}}<br>''{{desc|P4}}'' || Commons file || [[File:MovableBridge roll.gif|100px]] || An image from [[Wikimedia Commons]]. Technical limitations do not allow OSM own images to be used here. Please upload our local images to Commons under the proper license (preferred), or use {{O|P28}}. To use a different image for a specific language region, add another value and set {{O|P26}}. Make sure to set ''preferred'' status for the default image.
|-
| {{O|P28}}<br>''{{desc|P28}}'' || String || <code>Noexit.jpg</code> || An image stored on OSM wiki, without the ''File:'' prefix. If possible, please use {{O|P4}} with an image from Commons. See previous line.
|-
| {{O|P25}}<br>''{{desc|P25}}'' || Item || {{O|Q4712}} || The group this item belongs to. In the current model, each key belongs to just one group. In theory we could use it to attach multiple groups, changing the meaning of the "group" to something like a "label"/"meta-tag".
|-
|-
| {{O|P6}}<br>''{{desc|P6}}''<br>&nbsp;&nbsp;{{O|P11}}<br>&nbsp;&nbsp;{{Desc|P11}}
| {{O|P6}}<br>''{{desc|P6}}''<br>&nbsp;&nbsp;{{O|P11}}<br>&nbsp;&nbsp;{{Desc|P11}}
Line 76: Line 74:
| {{O|P9}}<br>''{{desc|P9}}'' || Item || {{O|Q8}} || what type of key is this? one of well known values, external id, etc.
| {{O|P9}}<br>''{{desc|P9}}'' || Item || {{O|Q8}} || what type of key is this? one of well known values, external id, etc.
|}
|}


* There must be only one item per tag key, e.g. {{key|highway}} or {{key|landuse}}.
* The key string will be stored as a {{O|P16}}
* The key string will also be stored as a sitelink, e.g. '''highway''' becomes a link to [[Key:highway]] page ''(note that "Key:" is part of the title, and not a wiki namespace)''
** Sitelink will be shown in the upper right corner as a link
** Ensures each sitelink is unique
** Allows item's data to be used on the linked page with <nowiki>{{#statements:...}}</nowiki> and Lua
** Unlike Wikipedia, sitelinks to non-existent Key:* pages is allowed
* Item must have instance-of property set to Key
* A bot will create all keys with a significant usage (see below)



=== Optional properties ===
=== Optional properties ===

Revision as of 04:22, 15 December 2018

Goal

This page documents how to store structured tag metadata on this wiki using the Wikibase extension, the same software that runs Wikidata. (initial discussion)

Wikibase allows the OSM community to store multilingual tag descriptions and community-defined metadata on the OSM wiki in a way that is useful to both humans and tools.

  • Tools such as the iD editor and Taginfo are now able to get tag information without complex and error-prone parsing of the wiki markup. Eventually the data may include tag suggestions, validation rules, common pitfalls, and more.
  • Data consumers will be able to get structured metadata to help process the main OSM database.
  • This wiki will be able to show data as info cards and tables, without duplication or complicated template hackery.

This project's goal is NOT to replace the primary tag storage for the OSM database, nor to use opaque IDs instead of the human readable key=value strings to tag features. We are only trying to improve metadata documentation, making it more useful to various tools.

How can I help?

Looking for volunteers...
Community and content
  • Set up a wiki portal, possibly similar to Wikidata's community portal (but simpler), where the community can:
    • propose new properties
    • write guidelines/docs
    • discuss Wikibase data structures
  • Create Lua modules to generate tag tables, such as {{Template:Bridge:movable}}, {{Map Features:highway}}, or {{Template:Religions}}.
    • Implementation note: Wikibase only links Tags to the corresponding Key, but Keys do not list all possible Tags. To generate a table, we must have a list of items somewhere. We could create a new WB key property that lists all tags and use a bot to maintain it, or we could list all needed tags as template parameters, e.g. for highway, {{...|motorway|trunk|primary|secondary|...}}. A list passed as template parameters does not need to be localized, and it can specify the proper ordering of items (not available in WB). Lua code would use mw.wikibase.getEntityIdForTitle("Key:highway=motorway") to find the right data.
Technical
  • Add Wikibase support to external tools. Simple usage: get the localized description of a key or tag. Complex usage: allow the user to add a missing description or edit an existing one, especially when the user is creating a new key.
  • Port simple validation rules, e.g. regex-based, to use Wikibase data.
  • Help parse various tables of tag data. Even if you can only generate plain files with data, user:Yurik can quickly import them.
tasks in progress
done!

Tag Keys

Each OSM Key is stored as a separate page in the Item namespace. For example, see bridge:movable (Q104) that describes a bridge:movable=*:

{| class=wikitable
! property || type || value example || description
|-
| label || string || en - bridge:movable || Set the English label to the key's value, exactly the same as P16 below. Some languages have a nativekey (localized key); use the corresponding label language for it. Note that, as with "en", the localized label must be unique in that language.
|-
| description || string || en - The mechanism by which a movable bridge moves to clear the way below.<br>ru - Механизм, которым переносной мест освобождает проходимость внизу. || Describe the key using proper sentences (first word capitalized, ending with a period). Must not contain any wiki markup or HTML. Must be shorter than 250 characters.
|-
| sitelink || string || Key:bridge:movable || Links to the Key:... page, even if the page does not exist. The sitelink is shown in the upper right corner of the item page.
|-
| instance of (P2)<br>that class of which this subject is a particular example and member (subject typically an individual member with a proper name label); different from P3 (subclass of) || Item || key (Q7) || Indicates the type of the item. Set to Q7 for keys.
|-
| permanent key ID (P16)<br>A string representing the key ID. Once set on a key data item, this value should never be changed. || String || bridge:movable || Shows the exact form of the key as used in OSM. Must never be changed once the item is created. Due to technical limitations, keys "Key:water tap", "Key:water_tap", and "Key:water_tap_" have identical wiki pages/sitelinks - "Key:water tap". In this case, set this property to multiple strings, but mark one as "preferred" (small up arrow left of value).
|-
| P5 || Items || way (Q4), area (Q5), relation type (Q6) || What kind of OSM objects this key should be used on - e.g. relation, way, area, node. Use (DEPRECATED) excluding region qualifier (P27) to exclude certain regions. See noexit (Q501) example.
|-
| P24 || Items || node (Q3) || What kind of OSM objects this key should NOT be used on. Use limited to language (P26) to limit this just to certain regions. See noexit (Q501) example.
|-
| image (DEPRECATED) (P4)<br>image of relevant illustration of the subject || Commons file || File:MovableBridge roll.gif || An image from Wikimedia Commons. Technical limitations do not allow OSM's own images to be used here. Please upload our local images to Commons under the proper license (preferred), or use image (P28). To use a different image for a specific language region, add another value and set limited to language (P26). Make sure to set preferred status for the default image.
|-
| image (P28)<br>Image of a relevant illustration of the subject. (Note: without "File:" and set one to "preferred rank") || String || Noexit.jpg || An image stored on OSM wiki, without the File: prefix. If possible, please use image (DEPRECATED) (P4) with an image from Commons. See previous line.
|-
| group (P25)<br>Indicates which group the given tag or a key belongs to. Target must have instance-of = group. || Item || bridges (Q4712) || The group this item belongs to. In the current model, each key belongs to just one group. In theory we could use it to attach multiple groups, changing the meaning of the "group" to something like a "label"/"meta-tag".
|-
| status (P6)<br>Community acceptance status. Use reference to link to the proposal discussion page (P11).<br>&nbsp;&nbsp;proposal discussion (P11)<br>&nbsp;&nbsp;Link to the key or tag proposal page. Can be used as reference for status (P6). || Item || approved (Q15)<br>&nbsp;&nbsp;reference link || The community's approval status, together with a reference link to the discussion page (optional).
|-
| key type (P9)<br>Type of the key entity, e.g. enum, external id. Do not use this for groups or statuses. || Item || well-known values (Q8) || What type of key this is: one of the well-known values, an external id, etc.
|}


  • There must be only one item per tag key, e.g. highway=* or landuse=*.
  • The key string will be stored as a permanent key ID (P16)
  • The key string will also be stored as a sitelink, e.g. highway becomes a link to Key:highway page (note that "Key:" is part of the title, and not a wiki namespace)
    • Sitelink will be shown in the upper right corner as a link
    • Ensures each sitelink is unique
    • Allows item's data to be used on the linked page with {{#statements:...}} and Lua
    • Unlike Wikipedia, sitelinks to non-existent Key:* pages are allowed
  • Item must have instance-of property set to Key
  • A bot will create all keys with a significant usage (see below)
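
Because an item may carry several permanent key ID (P16) values with one marked as preferred, a consumer has to pick the right one. The following is a minimal Python sketch of that selection, assuming the standard Wikibase entity JSON layout (claims, then the property ID, then a list of statements, each with a rank and a mainsnak.datavalue.value). The function name and the sample data are hypothetical and only serve as an illustration.

```python
from typing import Optional

KEY_ID_PROPERTY = "P16"  # permanent key ID


def permanent_key_id(entity: dict) -> Optional[str]:
    """Return the key string of an item, favouring the "preferred" statement."""
    candidates = []
    for statement in entity.get("claims", {}).get(KEY_ID_PROPERTY, []):
        snak = statement.get("mainsnak", {})
        # Skip "novalue"/"somevalue" snaks, which carry no datavalue.
        if snak.get("snaktype") == "value":
            rank = statement.get("rank", "normal")
            candidates.append((rank, snak["datavalue"]["value"]))
    # Try ranks from most to least authoritative; deprecated values are ignored.
    for wanted_rank in ("preferred", "normal"):
        for rank, value in candidates:
            if rank == wanted_rank:
                return value
    return None


# Hypothetical sample, shaped like one entity of a wbgetentities response.
sample_entity = {
    "claims": {
        "P16": [
            {
                "rank": "normal",
                "mainsnak": {
                    "snaktype": "value",
                    "datavalue": {"value": "water_tap_", "type": "string"},
                },
            },
            {
                "rank": "preferred",
                "mainsnak": {
                    "snaktype": "value",
                    "datavalue": {"value": "water_tap", "type": "string"},
                },
            },
        ]
    }
}

print(permanent_key_id(sample_entity))  # prints "water_tap"
```

Falling back to a normal-rank value keeps the sketch usable for the common case where an item has a single P16 statement and no rank was set explicitly.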


Optional properties

Here are some additional properties that may be needed in some cases.

{| class=wikitable
! name || type || description
|-
| rename-to || reference(s) || For simple cases (e.g. frequent typos), indicates that this key should be replaced with another key, or possibly one of the other keys(?)
|-
| value validation regex (P13) || string || Regular expression to test the validity of the tag's value. May also be used for role names. The wrapping ^( and )$ are assumed. Do not use for enum-like values, e.g. noexit=yes should be a tag, not a regex.<br>See population (Q574) example.
|-
| limited to language (P26) || item || A qualifier property to specify when a statement only applies to a given documentation language, and nowhere else. Most of the time this is a mistake and should be fixed.<br>(must be used as a qualifier of another statement)<br>See noexit (Q501) example.
|-
| (DEPRECATED) excluding region qualifier (P27) || item || This qualifier will soon be deleted. Do not edit. A qualifier property to specify when a statement applies everywhere except a given region.<br>(must be used as a qualifier of another statement)<br>See noexit (Q501) example.
|}
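
To illustrate the note that the wrapping ^( and )$ are assumed for value validation regex (P13), here is a minimal Python sketch of how a tool might apply a stored pattern. The function name and the sample patterns are hypothetical (they are not the values stored on any item), and the regex dialect used by real consumers may differ from Python's.

```python
import re


def value_is_valid(pattern: str, value: str) -> bool:
    """Test a tag value against a stored pattern, adding the implied ^( and )$."""
    wrapped = "^(" + pattern + ")$"
    # fullmatch also rejects a trailing newline, which "$" alone would tolerate.
    return re.fullmatch(wrapped, value) is not None


# Hypothetical patterns, for illustration only.
print(value_is_valid("[0-9]+", "12345"))       # True
print(value_is_valid("[0-9]+", "12 345"))      # False
print(value_is_valid("yes|no", "yesterday"))   # False
```

The parentheses matter for patterns with alternation: without the group, ^yes|no$ would accept any value that merely starts with "yes" or ends with "no".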

Item Creation Process

An automated tool creates all significantly used tag keys when they are detected in the OSM database (taginfo API). The bot should:

  • create an item for any key with 10+ usages if it matches ^[a-z0-9]+([-:_\.][a-z0-9]+)*$, or for any key with 1000+ usages regardless of the key syntax (see talk page)
  • set the item's label to be the same as the key (can be changed by the user, e.g. name:en → English Name)
  • set the item's description from the corresponding wiki page's info card (if available, from all languages)
  • set used-by, recommended tags, implies, and any other easy-to-figure-out data from the info cards
  • NOT update any fields that are already set, e.g. if the description does not match the wiki, leave it as is
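
The creation rule in the first bullet can be sketched as a small predicate. This is only an illustration in Python, not the bot's actual code: the function and constant names are hypothetical, while the regular expression and the two thresholds are taken from the list above.

```python
import re

# Pattern and thresholds copied from the rule above; the names are illustrative.
WELL_FORMED_KEY = re.compile(r"^[a-z0-9]+([-:_\.][a-z0-9]+)*$")
MIN_USAGE_WELL_FORMED = 10
MIN_USAGE_ANY_SYNTAX = 1000


def should_create_item(key: str, usage_count: int) -> bool:
    """Return True if a data item would be created for this key."""
    if usage_count >= MIN_USAGE_ANY_SYNTAX:
        # Heavily used keys qualify regardless of their syntax.
        return True
    well_formed = WELL_FORMED_KEY.fullmatch(key) is not None
    return well_formed and usage_count >= MIN_USAGE_WELL_FORMED


print(should_create_item("bridge:movable", 10))   # True
print(should_create_item("Bridge Movable", 999))  # False
print(should_create_item("Bridge Movable", 1000)) # True
```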

If the item's key was incorrect, the item should be edited with some metadata to indicate the proper name. Merging items will be prohibited because that would remove the original key (sitelink).

Eventually, it would be better for OSM tools (iD, JOSM, ...) to ask the user for the metadata, and use MW API to create new items.

Tag values

For keys like Key:highway, there is a list of well-known (enum) values, e.g. highway=residential, highway=service, highway=footway. These items are stored the same way as Keys, with the following differences: the sitelink points to a Tag:key=value page, instance-of is Tag instead of Key, the item links to the corresponding Key item, and it uses permanent tag ID (P19) instead of permanent key ID (P16).

Example

See bridge:movable=bascule (Q888) that describes a bridge:movable=bascule. See all items that link to bridge:movable.

{| class=wikitable
! property || type || value example || description
|-
| local sitelink || string || Tag:bridge:movable=bascule || Link to the Tag:* page, even if it does not exist.
|-
| instance of (P2) || Item || key (Q7) || Indicates the type of the item.
|-
| permanent tag ID (P19) || String || "bridge:movable=bascule" || Shows the exact form of the tag as used in OSM. Must never be changed once the item is created.
|-
| ERROR: Invalid ID || Items || way (Q4), area (Q5) || What kind of OSM objects this key should be used on - e.g. relation, way, area, node.
|-
| image (DEPRECATED) (P4) || Commons file || || An image from Wikimedia Commons (Wikibase cannot use images stored on OSM wiki itself).
|-
| key for this tag (P10) || Item || bridge:movable (Q104) || Link to the corresponding key item.
|}

Meta item

There will be a number of items that are neither a Key nor a Tag. For example, there need to be several items to represent the meta concepts themselves, like Node, Way, Area, Relation, Key, Tag, and perhaps others. All such items are sub-classes of the OpenStreetMap concept (Q10) meta item.

Additionally, we may want to label each Key with one or more values, e.g. classify keys as belonging to roads / buildings / business / etc, in which case those labels will also have to be meta items.

API access and querying

  • The easiest way for an external tool to get all the data about a key is to use this API call:
https://wiki.openstreetmap.org/w/api.php?action=wbgetentities&sites=wiki&titles=Key:bridge:movable&languages=en|fr
Use languages to filter labels and descriptions to the needed languages.
Add &format=json to get the actual JSON instead of HTML.
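Due to MediaWiki limitations, the titles value should be "Key:" followed by the key, with every underscore replaced by a space and the result trimmed, i.e. ("Key:" + key).replace('_', ' ').trim() with the replacement applied to all underscores. Use permanent key ID (P16) to get the actual format of the key.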
  • Soon the Wikibase data should be made available as downloadable dumps. This will allow us to set up a SPARQL endpoint similar to https://query.wikidata.org, which in turn will help any tools to run complex queries against it.
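
As an illustration of the API call described in the first bullet, the following Python sketch builds the request URL for a given key. The endpoint and the parameters (action, sites, titles, languages, format) are the ones shown above; the function names are hypothetical, and the sketch only constructs the URL without performing any network request.

```python
from typing import Iterable
from urllib.parse import urlencode

API_URL = "https://wiki.openstreetmap.org/w/api.php"


def key_to_title(key: str) -> str:
    """Turn an OSM key into the sitelink title: underscores to spaces, then trim."""
    return ("Key:" + key).replace("_", " ").strip()


def build_entity_url(key: str, languages: Iterable[str]) -> str:
    """Build a wbgetentities URL that returns the data item of a key as JSON."""
    params = {
        "action": "wbgetentities",
        "sites": "wiki",
        "titles": key_to_title(key),
        "languages": "|".join(languages),  # limits labels and descriptions
        "format": "json",                  # JSON instead of the HTML preview
    }
    return API_URL + "?" + urlencode(params)


print(build_entity_url("bridge:movable", ["en", "fr"]))
print(key_to_title("water_tap_"))  # prints "Key:water tap"
```

The resulting URL can be fetched with any HTTP client. Since the title is a normalized form, the exact key should still be read from permanent key ID (P16) in the response.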

Quality Control

There are several additional extensions designed to validate Wikibase data and to find items that do not pass validation. These may not be installed in the first deployment stage.

Limitations

  • This Wikibase cannot yet reference local images. So for the moment, all its images must come from the Wikimedia Commons.
  • The sitelink in the upper right corner does not show whether the Tag:* or a Key:* page exists or not
  • All sitelinks must use spaces instead of underscores. API sitelink search does not work otherwise. See permanent key ID (P16) and permanent tag ID (P19) for the correct value. Note that regular MediaWiki Key:* and Tag:* pages have the same issue, and use a special hack to change the title.
  • MediaWiki converts underscores to spaces and trims the title, so Key:_abc_ would become Key: abc. There is no way to have two items with sitelinks Key:_abc and Key:_abc_ -- they are treated as the same, and the second one fails.
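
The collision described in the last bullet can be detected before items are created. The Python sketch below groups keys by their normalized sitelink title and reports groups with more than one key. It is an illustration only: the function names are hypothetical, and the normalization covers just the underscore and trimming behaviour described on this page, not every MediaWiki title rule.

```python
from collections import defaultdict
from typing import Dict, Iterable, List


def sitelink_title(key: str) -> str:
    """Normalize a key to its sitelink title: underscores to spaces, then trim."""
    return ("Key:" + key).replace("_", " ").strip()


def find_sitelink_collisions(keys: Iterable[str]) -> Dict[str, List[str]]:
    """Map each sitelink title shared by several keys to the keys that share it."""
    groups = defaultdict(list)
    for key in keys:
        groups[sitelink_title(key)].append(key)
    return {title: found for title, found in groups.items() if len(found) > 1}


print(find_sitelink_collisions(["_abc", "_abc_", "highway"]))
# prints {'Key: abc': ['_abc', '_abc_']}
```

Keys reported this way would need to share a single item, with each exact spelling stored as a separate permanent key ID (P16) value.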

See also