Wikidata/Development/Representing values

From Meta, a Wikimedia project coordination wiki

This document is a draft, and should not be assumed to represent the ultimate structure.

This draft describes how data values in Wikidata would be represented. Feedback and comments are more than welcome. This draft will refine the Wikidata data model.

Number and quantities

This refines and should eventually replace the section on numbers in the data model

A number is represented by its quantity value, together with the uncertainties, a confidence, and an optional unit of measurement:

  • quantity value: signed decimal number
  • upper: unsigned decimal number
  • lower: unsigned decimal number
  • unit: a Wikidata item (e.g. Q11573 for Metre)

The given quantity value is interpreted as the main value of the Value. Upper and lower specify variations or uncertainties of the quantity value in the positive (upper) or negative (lower) direction. This makes it possible to capture expressions such as 12300 ± 100. For many practical purposes, only the quantity value might be used (e.g., for sorting), but the uncertainty can provide valuable information for presentation (e.g., for selecting a reasonable representation in unit conversions).
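As an illustration only, the four fields above could be held in a small record like the following Python sketch. The class name QuantityValue, the field name amount, the zero defaults, and the bounds() helper are assumptions made for this example; they are not part of the draft or of any Wikidata API.

```python
from dataclasses import dataclass
from decimal import Decimal
from typing import Optional, Tuple


@dataclass(frozen=True)
class QuantityValue:
    """Hypothetical container for the four fields listed in this draft."""
    amount: Decimal               # "quantity value": signed decimal number
    upper: Decimal = Decimal(0)   # unsigned variation in the positive direction
    lower: Decimal = Decimal(0)   # unsigned variation in the negative direction
    unit: Optional[str] = None    # Wikidata item ID, e.g. "Q11573" for Metre

    def __post_init__(self) -> None:
        # upper and lower are unsigned, so negative values are rejected.
        if self.upper < 0 or self.lower < 0:
            raise ValueError("upper and lower must be non-negative")

    def bounds(self) -> Tuple[Decimal, Decimal]:
        """Return the (lowest, highest) values implied by the uncertainties."""
        return (self.amount - self.lower, self.amount + self.upper)


# The expression 12300 +/- 100 from the text above.
population = QuantityValue(Decimal("12300"), Decimal("100"), Decimal("100"))
print(population.bounds())  # (Decimal('12200'), Decimal('12400'))
```

Sorting a list of such records by the main value alone would then be `sorted(values, key=lambda v: v.amount)`, while a presentation layer could consult bounds() when choosing how many digits to show.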

To represent that a quantity value has a certain number of significant digits, the uncertainty would be expressed as ±1 in the least significant digit. Example: to say that a city has a population of 12,300 people with three significant digits, we would use a quantity value of 12300 and an uncertainty of 100. With four significant digits, we would use the same quantity value but an uncertainty of 10.
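The significant-digit rule above can be sketched as a small calculation. The function name uncertainty_for_significant_digits is hypothetical, and rejecting zero is an assumption of this example, since the draft does not say how a zero value should be handled.

```python
from decimal import Decimal


def uncertainty_for_significant_digits(value: Decimal, digits: int) -> Decimal:
    """Return one unit in the least significant of `digits` significant digits."""
    if digits < 1:
        raise ValueError("digits must be at least 1")
    if value == 0:
        # Assumption: significant digits are not well defined for zero.
        raise ValueError("value must be non-zero")
    # adjusted() is the power of ten of the most significant digit,
    # e.g. 4 for 12300; stepping down (digits - 1) places reaches the
    # least significant digit.
    exponent = value.adjusted() - (digits - 1)
    return Decimal(10) ** exponent


print(uncertainty_for_significant_digits(Decimal("12300"), 3))  # 100
print(uncertainty_for_significant_digits(Decimal("12300"), 4))  # 10
```

The result would be stored as both the upper and the lower uncertainty of the quantity value.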

The unit specifies the unit the quantity has been measured in. It is represented as an item rather than as a string, since a string like "m" might represent different units in different contexts. The value should be meaningful independently of the declaration information for its property (from which more details about units could possibly be obtained).

Coordinate

For the updated and final version, see the section on geographic locations in the data model

Time

For the updated and final version, see the section on dates and times in the data model

Other data types

The other data types are described and listed in the data model. They are:

  • Items
  • Media
  • Strings
  • Mono- and multilingual texts
  • URLs