Community Wishlist Survey 2017/Wikidata/Create a new class of statements which are automatically generated based on a query

From Meta, a Wikimedia project coordination wiki

Create a new class of statements which are automatically generated based on a query

  • Problem: Many facts relevant to an item are stored on other items and are currently very hard to reach from templates that use Wikidata. For example:
    • Taxon items have the property parent taxon (P171). An infobox that shows the genus, family or order of a given organism needs a way to move up a chain of P171 statements until the required rank is reached.
    • Items about people have the property place of death (P20), which stores the most exact item for the place of death: a house, street, hospital, neighborhood, etc. An infobox that shows a person's place of death usually needs the city or town where they died. A query to look up the city of death of Pyotr Tchaikovsky is SELECT DISTINCT ?city { ?city ^(wdt:P20/wdt:P131*) wd:Q7315; wdt:P31/wdt:P279* wd:Q515 . }. That information is very hard to access through Lua calls and is not accessible at all through {{#statement:...}} calls. The same issue arises for "country of birth" or "country of death".
    • Several properties have an inverse constraint, for example mother (P25) / child (P40). We could retire the child (P40) property and compute it automatically from mother (P25). That would let us keep the information in only one place.
  • Who would benefit: Wikidata users, infobox writers, Wikidata maintainers
  • Proposed solution: Create infrastructure for read-only properties that are not directly editable but are precomputed from a SPARQL query over other properties and items. Users would see and access them in much the same way as the current properties.
  • More comments:
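
To make the path in the Tchaikovsky query concrete, here is a minimal Python sketch of the same traversal over a toy in-memory statement store. It is not how Wikidata, WDQS or any Lua module implements it. The STATEMENTS data, the X_ placeholder IDs and all function names are made up for illustration; only Q7315, Q515, P20, P131, P31 and P279 come from the query above.

```python
from collections import deque

# Toy statement store: {item: {property: [values]}}.
# IDs starting with "X_" are hypothetical placeholders, not real Wikidata items.
STATEMENTS = {
    "Q7315": {"P20": ["X_house"]},
    "X_house": {"P131": ["X_district"], "P31": ["X_building"]},
    "X_district": {"P131": ["X_city"], "P31": ["X_district_class"]},
    "X_city": {"P131": ["X_country"], "P31": ["X_big_city"]},
    "X_big_city": {"P279": ["Q515"]},
    "X_country": {"P31": ["X_country_class"]},
}


def values(item, prop):
    """Values of one property on one item (empty list if absent)."""
    return STATEMENTS.get(item, {}).get(prop, [])


def closure(starts, prop):
    """Items reachable from `starts` via zero or more `prop` steps (prop*)."""
    ordered = list(dict.fromkeys(starts))  # de-duplicate, keep order
    visited = set(ordered)
    queue = deque(ordered)
    while queue:
        current = queue.popleft()
        for nxt in values(current, prop):
            if nxt not in visited:  # guards against cycles in the data
                visited.add(nxt)
                ordered.append(nxt)
                queue.append(nxt)
    return ordered


def is_instance_of(item, target_class):
    """Rough equivalent of: item wdt:P31/wdt:P279* target_class."""
    return target_class in closure(values(item, "P31"), "P279")


def cities_of_death(person, city_class="Q515"):
    """Rough equivalent of: person wdt:P20/wdt:P131* ?city, filtered by class."""
    candidates = closure(values(person, "P20"), "P131")
    return [c for c in candidates if is_instance_of(c, city_class)]


print(cities_of_death("Q7315"))
```

In a Lua module every call that corresponds to values() on a new item means loading another entity, which is the cost a precomputed read-only statement would avoid.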

Discussion

  • This sounds like adding a reasoning layer to the Wikidata Query Service. Maybe it should be a separate instance, but in principle this sounds like it could be a good idea. Or by pre-compute do you mean re-computing these things every time something changes? That might be a lot trickier (do you update every location in a country if the country name is changed?) ArthurPSmith (talk) 20:22, 15 November 2017 (UTC)
    I do not know how often one would have to refresh such pre-computed statements, but whatever the rate is, I am sure it will be more efficient than the current state, where infobox templates perform in Lua the equivalent of SELECT DISTINCT ?city { ?city ^(wdt:P20/wdt:P131*) wd:Q7315; wdt:P31/wdt:P279* wd:Q515 . } just to get a city of death. I was looking into doing it for the c:Module:Creator infobox I maintain and got advice from others who had already implemented it, but given the potential of loading multiple items to get this one piece of information, I figured there has to be a better solution. --Jarekt (talk) 21:31, 15 November 2017 (UTC)
  • I don't know that we need a new class of statements--what we would need is a way to say "this property can be followed" in the data exposed to the API, and have the API follow it until it can no longer be followed or it hits some predefined limit (perhaps set by an editor), or some such. --Izno (talk) 03:41, 16 November 2017 (UTC)
  • @Jarekt: the problem here is that we're unlikely to have a Wikidata Query Service cluster powerful enough for what's proposed here until late 2018, so everything involving SPARQL queries falls outside the scope of this proposal. See also my rejection of a somewhat related proposal here: SPARQL queries can be very slow, and it's completely out of the question that we allow them to slow down page viewing/editing. I'm not archiving this proposal, however, to see if a limited, simpler solution that doesn't involve WDQS can be viable. Max Semenik (talk) 09:01, 24 November 2017 (UTC)
Max Semenik: I understand that the proposal might not be technically feasible, but if nothing else we can find out how many people think it is a good idea. I was also imagining that the values would be precomputed and stored or cached. I also thought it might save time, because when I asked how to access the city of birth using Lua I was told about modules that already do the equivalent of SELECT DISTINCT ?city { ?city ^(wdt:P20/wdt:P131*) wd:Q7315; wdt:P31/wdt:P279* wd:Q515 . } in Lua. However, since that requires loading many items for a single piece of information, precomputing seemed like a saner solution.
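
As a rough illustration of the "follow the property up to a limit" idea and of caching the result, here is a small Python sketch for the parent taxon (P171) case. It makes several simplifying assumptions: each taxon has at most one parent, the rank is read from taxon rank (P105, which the proposal does not mention), the TAXA data and X_ IDs are hypothetical placeholders, and lru_cache merely stands in for stored precomputed values with no invalidation when the underlying items change.

```python
from functools import lru_cache

# Toy data: {taxon: {"P171": parent taxon, "P105": taxon rank}}.
# All "X_" identifiers are hypothetical placeholders, not real Wikidata items.
TAXA = {
    "X_species": {"P171": "X_genus", "P105": "X_rank_species"},
    "X_genus": {"P171": "X_family", "P105": "X_rank_genus"},
    "X_family": {"P171": "X_order", "P105": "X_rank_family"},
    "X_order": {"P105": "X_rank_order"},
}


@lru_cache(maxsize=None)  # stand-in for a precomputed, stored value
def ancestor_with_rank(taxon, rank, max_hops=20):
    """Follow P171 upward until a taxon with the wanted rank is found.

    Checks the starting taxon plus at most `max_hops` parents.
    Returns None if the chain ends or the hop limit is reached first.
    """
    current = taxon
    for _ in range(max_hops + 1):
        data = TAXA.get(current)
        if data is None:  # unknown item: the chain cannot be followed
            return None
        if data.get("P105") == rank:
            return current
        current = data.get("P171")
        if current is None:  # no parent: top of the chain
            return None
    return None  # hop limit reached


print(ancestor_with_rank("X_species", "X_rank_family"))
```

The hop limit bounds how many items a single lookup can touch, which is the kind of predefined limit suggested above. A real system would also need to decide when cached values are refreshed.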

Voting
