Data items: Difference between revisions

From OpenStreetMap Wiki
Yurik (talk | contribs)
m Yurik moved page OpenStreetMap:Wikibase to OpenStreetMap:Data Items: Wikibase has caused some confusion

Revision as of 02:12, 16 December 2018

Intro

This page documents how to store structured tag metadata on this wiki using data items provided by the Wikibase extension - the same software that runs Wikidata. (initial discussion)

Data items allow the OSM community to store multilingual tag descriptions and any other community-defined metadata on the OSM wiki in a way useful to both humans and tools.

  • Tools such as the iD editor and Taginfo are now able to get tag information without complex and error-prone parsing of the wiki markup. Eventually the data may include tag suggestions, validation rules, common pitfalls, and more.
  • Data consumers are able to get structured metadata to help process the main OSM database.
  • This wiki can now show data as info cards and tables, without duplication or complicated template hackery.
  • All metadata can be analyzed using Sophox queries.

This project's goal is NOT to replace the primary tag storage for the OSM database, nor to use opaque IDs instead of the human readable key=value strings to tag features. We are only trying to improve metadata documentation, making it more useful to various tools.

How can I help?

Looking for volunteers...
Community and content
  • Set up a wiki portal, possibly similar to Wikidata's community portal (but simpler), where the community can:
    • propose new properties
    • write guidelines/docs
    • discuss Wikibase data structures
  • Create Lua modules to generate tag tables, such as {{Template:Bridge:movable}}, {{Map Features:highway}}, or {{Template:Religions}}.
    • Implementation note: Wikibase only links Tags to the corresponding Key, but Keys do not list all possible Tags. To generate a table, we must have a list of items somewhere. We could create a new WB key property that lists all tags and use a bot to maintain it, or we could list all needed tags as a template parameter, e.g. for highway, {{...|motorway|trunk|primary|secondary|...}}. A list passed as a template parameter does not need to be localized, and it could specify the proper ordering of items (not available in WB). Lua code would use mw.wikibase.getEntityIdForTitle("Tag:highway=motorway") to find the right data.
Technical
  • Add Wikibase support to external tools. Simple usage: get the localized description of a key/tag. Complex usage: allow the user to add a missing description or edit an existing one, especially when the user is creating a new key.
  • Port simple validation rules, e.g. regex-based, to use Wikibase data.
  • Help parse various tables of tag data. Even if you can only generate plain files with data, user:Yurik can quickly import them.

Tag Keys

Each OSM Key is stored as a separate page in the Item namespace. For example, see bridge:movable (Q104), which describes the key bridge:movable=*:

property | type | value example | description
description | string | en - The mechanism by which a movable bridge moves to clear the way below.; ru - Механизм, которым переносной мест освобождает проходимость внизу. | This is the primary way to describe the key, using proper sentences (first word capitalized, ending with a period). Must not contain any wiki markup or HTML. Must be less than 250 symbols. When translating, it is usually enough to add just the description to the item.
label | string | en - bridge:movable | Label usage is still a bit undecided for the key/tag data items. For now, bot sets the English label to the key's value, exactly the same as P16 below. Some languages have nativekey (localized key) that was added to the labels as well. Do not add a copy of the English label to any other languages. Note that same as "en", the localized label must be unique in that language.
sitelink | string | Key:bridge:movable | Links to the Key:... pages, even if the page does not exist. Sitelink is shown in the upper right corner of the item page.
instance of (P2): that class of which this subject is a particular example and member (subject typically an individual member with a proper name label); different from P3 (subclass of) | Item | key (Q7) | Indicate the type of the item. Set to Q7 for keys.
permanent key ID (P16): A string representing the key ID. Once set on a key data item, this value should never be changed. | String | bridge:movable | Shows the exact form of the key as used in OSM. Must never be changed once the item is created. Due to technical limitations, keys "Key:water tap", "Key:water_tap", and "Key:water_tap_" have identical wiki pages/sitelinks - "Key:water tap". In this case, set this property to multiple strings, but mark one as "preferred" (small up arrow left of value).
(property not rendered: invalid ID) | Items | way (Q4), area (Q5), relation type (Q6) | What kind of OSM objects this key should be used on - e.g. relation, way, area, node. Use (DEPRECATED) excluding region qualifier (P27) to exclude certain regions. See noexit (Q501) example.
(property not rendered: invalid ID) | Items | node (Q3) | What kind of OSM objects this key should NOT be used on. Use limited to language (P26) to limit this just to certain regions. See noexit (Q501) example.
image (DEPRECATED) (P4): image of relevant illustration of the subject | Commons file | | An image from Wikimedia Commons. Technical limitations do not allow OSM own images to be used here. Please upload our local images to Commons under the proper license (preferred), or use image (P28). To use a different image for a specific language region, add another value and set limited to language (P26). Make sure to set preferred status for the default image.
image (P28): Image of a relevant illustration of the subject. (Note: without "File:" and set one to "preferred rank") | String | Noexit.jpg | An image stored on OSM wiki, without the File: prefix. If possible, please use image (DEPRECATED) (P4) with an image from Commons. See previous line.
group (P25): Indicates which group the given tag or a key belongs to. Target must have instance-of = group. | Item | bridges (Q4712) | The group this item belongs to. In the current model, each key belongs to just one group. In theory we could use it to attach multiple groups, changing the meaning of the "group" to something like a "label"/"meta-tag".
status (P6): Community acceptance status. Use reference to link to the proposal discussion page via proposal discussion (P11), the link to the key or tag proposal page. | Item | approved (Q15), with a reference link | community's approval status, together with a reference link to the discussion page (optional)
key type (P9): Type of the key entity, e.g. enum, external id. Do not use this for groups or statuses. | Item | well-known values (Q8) | Describes the type of values this key is expected to have. If there is a well known list of values, use Q8. Other types are TBD.
value validation regex (P13): Regular expression to test the validity of the tag's value. May also be used for role names. The wrapping ^( and )$ are assumed. Do not use for enum-like values, e.g. noexit=yes should be a tag, not a regex. | string | [0-9]+ | A regular expression that can be used to validate the value of this key. In this case the value must be one or more digits. Validators will add the ^ and $ symbols. See population (Q574) example.
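
The description of value validation regex (P13) says that validators add the wrapping ^( and )$ themselves. A minimal sketch of such a validator follows, assuming the stored pattern is compatible with Python's re module (the page does not say which regex dialect is intended); the function name is_valid_value is hypothetical.

```python
import re


def is_valid_value(value: str, p13_pattern: str) -> bool:
    """Check a tag value against a pattern stored in P13.

    Assumption: the pattern is stored without anchors, so it is wrapped
    in a group and matched against the whole value, which has the same
    effect as adding ^( and )$ around it.
    """
    return re.fullmatch("(?:" + p13_pattern + ")", value) is not None


if __name__ == "__main__":
    # The example pattern from the table: one or more digits.
    print(is_valid_value("12345", "[0-9]+"))
    print(is_valid_value("12a", "[0-9]+"))
```

Wrapping the pattern in a group before anchoring matters for patterns with alternation: without the group, a|bc would only be anchored on one side of each alternative.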

Tag values

For keys like Key:highway, there is a list of well-known values such as highway=residential, highway=service, highway=footway. These values are stored similarly to keys. See bridge:movable=bascule (Q888), which describes the tag bridge:movable=bascule. See all items that link to bridge:movable.

property | type | value example | description
label | string | en - bridge:movable=bascule | Set English to the tag's value, exactly the same as P19 below. Some languages have nativekey=nativevalue (localized key/value). Use the corresponding label language for that. Note that same as "en", the localized label must be unique in that language.
description | string | en - A type of movable bridge, a bascule bridge contains one or two spans, one end of which is free and swings upwards. A counterweight at the pivoting end of the span or spans balances the weight as the free end rises.; pl - Most zwodzony jest to rodzaj mostu w którym co najmniej jedno przęsło jest podnoszone. Mosty zwodzone mogą być jedno- lub dwuskrzydłowe. | Describe the tag using proper sentences (first word capitalized, ending with a period). Must not contain any wiki markup or HTML. Must be less than 250 symbols.
sitelink | string | Tag:bridge:movable=bascule | Links to the Tag:... pages, even if the page does not exist. Sitelink is shown in the upper right corner of the item page.
instance of (P2) | Item | tag (Q2) | Indicate the type of the item. Set to Q2 for tags.
permanent tag ID (P19) | String | bridge:movable=bascule | Shows the exact form of the tag as used in OSM. Must never be changed once the item is created. Due to technical limitations, tags "Tag:water tap=yes", "Tag:water_tap=yes", and "Tag:water_tap=yes_" have identical wiki pages/sitelinks - "Tag:water tap=yes". In this case, set this property to multiple strings, but mark one as "preferred" (small up arrow left of value).
key for this tag (P10) | Item | bridge:movable (Q104) | Every tag item links to the corresponding key item, making it easier to query and validate.

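Tags may also use the two properties that list which OSM objects an item should and should NOT be used on (their names fail to render on this revision), as well as image (DEPRECATED) (P4), image (P28), group (P25), status (P6), and value validation regex (P13). See their descriptions in the Tag Keys section above.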

Meta item

There will be a number of items that are neither a Key nor a Tag. For example, there need to be several items to represent the meta concepts themselves, like Node, Way, Area, Relation, Key, Tag, and perhaps others. All such items are sub-classes of the OpenStreetMap concept (Q10) meta item.

Additionally, we may want to label each Key with one or more values, e.g. classify keys as belonging to roads / buildings / business / etc, in which case those labels will also have to be meta items.

Item Creation Process

A bot has created all significantly used keys and tags, and will continue creating these items when they are detected in the OSM database (taginfo API) or on the wiki. The bot will:

  • create an item for any key with 10+ usages if it matches ^[a-z0-9]+([-:_\.][a-z0-9]+)*$, or for any key with 1000+ usages regardless of the key syntax (see talk page)
  • set the item's label to be the same as the key
  • set the item's description from the corresponding wiki page's info card (if available, from all languages)
  • set used-by, recommended tags, implies, and any other easy-to-figure-out data from the info cards
  • NOT update any fields modified by a user, e.g. if the description in FR has been changed by a user, the bot should not change it
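
The key-selection rule in the first bullet can be sketched as follows. This is only an illustration of the stated thresholds and regex, not the bot's actual code; the function name should_create_item is hypothetical, and "10+" and "1000+" are read as "at least 10" and "at least 1000".

```python
import re

# Key syntax quoted in the rule above.
KEY_SYNTAX = re.compile(r"^[a-z0-9]+([-:_\.][a-z0-9]+)*$")


def should_create_item(key: str, usage_count: int) -> bool:
    """Decide whether a key qualifies for an automatically created item."""
    # Heavily used keys qualify regardless of their syntax.
    if usage_count >= 1000:
        return True
    # Otherwise the key needs at least 10 usages and a well-formed name.
    return usage_count >= 10 and KEY_SYNTAX.fullmatch(key) is not None


if __name__ == "__main__":
    for key, count in [("bridge:movable", 10), ("water_tap_", 50), ("Bridge Movable", 1000)]:
        print(key, count, should_create_item(key, count))
```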

Eventually, it would be better for OSM tools (iD, JOSM, ...) to ask the user for the metadata, and use MW API to create new items.

API access and querying

  • The easiest way for an external tool to get all the data about a key is to use this API call:
https://wiki.openstreetmap.org/w/api.php?action=wbgetentities&sites=wiki&titles=Key:bridge:movable&languages=en|fr
Use languages to filter labels and descriptions to the needed languages.
Add &format=json&formatversion=2 to get the actual JSON instead of HTML.
Due to MediaWiki limitations, the titles value should be "Key:" + key with every underscore replaced by a space and the result trimmed, e.g. ("Key:" + key).replace(/_/g, ' ').trim() in JavaScript. Use permanent key ID (P16) to get the actual format of the key. Make sure to get the "preferred" value, just in case more than one value is present.
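
As a rough sketch of the steps above, the following builds the request URL for a key and then picks the "preferred" permanent key ID (P16) value from a returned entity. The helper names are hypothetical, and the entity layout (claims, rank, mainsnak.datavalue.value) is assumed to follow the standard Wikibase JSON format; no network request is made here.

```python
from typing import Optional
from urllib.parse import urlencode

API_URL = "https://wiki.openstreetmap.org/w/api.php"


def sitelink_title(key: str) -> str:
    """Turn an OSM key into the sitelink title expected by the API."""
    # Every underscore becomes a space, then the result is trimmed.
    return ("Key:" + key).replace("_", " ").strip()


def entity_url(key: str, languages: list) -> str:
    """Build a wbgetentities URL that returns JSON for the given key."""
    params = {
        "action": "wbgetentities",
        "sites": "wiki",
        "titles": sitelink_title(key),
        "languages": "|".join(languages),
        "format": "json",
        "formatversion": "2",
    }
    return API_URL + "?" + urlencode(params)


def preferred_key_id(entity: dict) -> Optional[str]:
    """Return the preferred P16 value, falling back to a normal-rank one."""
    statements = entity.get("claims", {}).get("P16", [])
    for wanted_rank in ("preferred", "normal"):
        for statement in statements:
            if statement.get("rank") == wanted_rank:
                # Assumes a plain string value is present on the statement.
                return statement["mainsnak"]["datavalue"]["value"]
    return None


if __name__ == "__main__":
    print(entity_url("bridge:movable", ["en", "fr"]))
```

A caller would fetch the URL, look up the single item under the "entities" field of the response, and pass it to preferred_key_id to recover the exact key spelling.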

Quality Control

There are several additional extensions designed to validate Wikibase data and to find items that do not pass validation. These extensions may not be installed in the first deployment stage.

Limitations

  • This Wikibase cannot yet reference local images. So for the moment, all its images must come from the Wikimedia Commons.
  • The sitelink in the upper right corner does not show whether the Tag:* or Key:* page exists.
  • All sitelinks must use spaces instead of underscores. API sitelink search does not work otherwise. See permanent key ID (P16) and permanent tag ID (P19) for the correct value. Note that regular MediaWiki Key:* and Tag:* pages have the same issue, and use a special hack to change the title.
  • MediaWiki removes spaces/underscores from the key, so Key:_abc_ would become Key: abc. There is no way to have two items with sitelinks Key:_abc and Key:_abc_: they are treated as the same, and saving the second one fails.

See also