Research talk:Wikidata gap analysis

From Meta, a Wikimedia project coordination wiki

Importance levels for properties?

@Ironholds: At Research:Wikidata_gap_analysis#Proposed_Wikidata_improvements: "Importance levels for properties, either on a per-property or per-item basis, should be introduced ..."(etc). Seems like you're describing ranks. Were you aware of ranks when you wrote this or is this a new concept to you? Do ranks cover what you were thinking of? If not, why not? Multichill (talk) 19:04, 8 June 2015 (UTC)

"Ranks" are for ranking multiple values of the same property. They don't help rank different properties against each other. Anyway, who says that being president is more important than winning the Nobel Prize? There have been 60 presidents but only a few of those have won the Nobel. Filceolaire (talk) 13:09, 10 June 2015 (UTC)

See this discussion on Wikidata

See d:Wikidata:Project_chat#Evaluation_of_WD_by_WMF for a discussion of this report on Wikidata. Filceolaire (talk) 12:57, 10 June 2015 (UTC)

My general comment would be that if WMF want to use info from Wikidata then they need to participate more in Wikidata, reviewing property proposals etc. If they had done this then they would, for instance, have learned much sooner that generating descriptions from statements is a much quicker route to getting descriptions in 200 languages than relying on the "Description" field. There are 20 million item descriptions but fewer than 2,000 properties.
More dialog needed. Filceolaire (talk) 13:43, 10 June 2015 (UTC)
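The idea of generating descriptions from statements can be illustrated with a minimal sketch. It is not the method used by any existing tool: the label table, the phrase patterns and the `describe` function below are hand-written assumptions for illustration, and nothing is fetched from Wikidata. It only shows that once labels and one phrase pattern exist for a language, a description can be composed for every item carrying the relevant statements.

```python
# Hand-written sample data for illustration; nothing here is fetched from Wikidata.
LABELS = {
    "Q36180": {"en": "writer", "de": "Schriftsteller"},
    "Q183": {"en": "Germany", "de": "Deutschland"},
}

# One hypothetical phrase pattern per language.
PATTERNS = {
    "en": "{occupation} from {country}",
    "de": "{occupation} aus {country}",
}


def describe(statements, lang):
    """Compose a short description from an item's statements.

    Returns None when the pattern, a statement, or a label is missing
    for the requested language.
    """
    pattern = PATTERNS.get(lang)
    if pattern is None:
        return None
    try:
        occupation = LABELS[statements["P106"]][lang]  # P106: occupation
        country = LABELS[statements["P27"]][lang]      # P27: country of citizenship
    except KeyError:
        return None
    return pattern.format(occupation=occupation, country=country)


item = {"P106": "Q36180", "P27": "Q183"}
print(describe(item, "en"))
print(describe(item, "de"))
```

Adding a language in this sketch means adding one pattern and the missing labels, rather than writing a description for every item by hand.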
There seems to be a pretty massive disconnect in that study. We're not studying the project to focus on what Wikidata needs to be self-contained for Wikidata; we're studying it to focus on what Wikidata needs to actually be integrated into other projects. See the examples at the top; more properties would absolutely be great but I fail to see how that would help to provide, say, language-localised search of any reliability. Ironholds (talk) 15:17, 10 June 2015 (UTC)

See also de:Wikipedia_Diskussion:Meinungsbilder/Nutzung_von_Daten_aus_Wikidata_im_ANR/Vor_dem_Start#WMF_Analyse_von_Wikidata. --Atlasowa (talk) 13:05, 12 June 2015 (UTC)

Atlasowa, could you translate your first comment there? Because Google Translate doesn't make it seem particularly polite, and I have absolutely no intention of spending my time discussing research with people who can't afford to be polite. Ironholds (talk) 19:04, 12 June 2015 (UTC)
Not sure what you mean, Ironholds. I wrote that this analysis was requested by T91423 and quoted your conclusion and recommendation. Then I added that the analysis didn't use the words "references" or "sources" and that the research question RQ7: Trustworthiness wasn't even examined. I find that surprising and disappointing ("wow"), because sourcing and trustworthiness are really important: the Wikidata project goals state "Wikidata will not be about the truth, but about statements and their references." The context of the German discussion is an RfC about allowing or not allowing Wikidata usage in German WP, and to what extent and how. It's a long and serious discussion, and the lack of sources, references and trustworthiness in Wikidata is a major point.
[Image: Wikidata "description" shown on Wikipedia.]
[Image: "Live" vandalism from 9 December 2014 to 29 January 2015.]
What I find strange is that this gap analysis falls quite short in actual data analysis (Succu wrote: "Ironholds, please publish the dataset of your study." [1]), while you give longer conclusions, recommendations and improvement proposals. Your "Proposed Wikidata improvements" 2) is "A focus should be put on description localisation, which is currently lagging substantially behind the unstructured data on our projects and the structured labels on Wikidata". How trustworthy are these Wikidata "descriptions" currently, according to your gap analysis? Because Wikipedians see that WMF is silently pushing these Wikidata "descriptions" into the official Wikipedia apps and Wikipedia mobile, that they are vandalised and hardly patrolled, and that WMF is doing this without consensus or even consultation of the community. Who needs a "Wikidata description" anyway between the WP article lemma "Adventskalender" and the intro "Ein Adventskalender (in Österreich Adventkalender; auch Weihnachtskalender) gehört seit dem 19. Jahrhundert zum christlichen Brauchtum in der Zeit des Advents." (roughly: "An Advent calendar (in Austria Adventkalender; also Weihnachtskalender) has been part of Christian custom during Advent since the 19th century.")? From a Wikipedian perspective, this doesn't really add value to Wikipedia but rather adds a new entry point for vandalism. Now to the Wikidata perspective. It is not Wikidata's goal to build a free-form text mini-Wikipedia, but to build a free database of structured data. Automatic descriptions can be constructed from Wikidata statements (https://phabricator.wikimedia.org/T64695 http://tools.wmflabs.org/autodesc "13M items x 287 languages = 4 billion descriptions to fill in manually." "a little developer time can save megahours (new unit!) of volunteers performing needless work."). And if you want natural language, there is http://www.mediawiki.org/wiki/Extension:TextExtracts, which extracts text from Wikipedia articles. So why is WMF pushing for manual Wikidata descriptions? And I don't see why they are needed for multilingual search anyway.
Re: "Infobox usage could be investigated once those problems have been addressed." You DO realise that WMF is already silently pushing Wikidata infoboxes into Wikipedia? Again, without consensus or even consultation of the community. For reading AND editing. There is no indication that the data comes from Wikidata rather than Wikipedia. It allows adding personal religious beliefs without sources or references, despite Wikidata d:Property:P140 "religion of a person or organization (MUST be claimed by the subject or documented by historical sources)"! This is already built by WMF and online, so what is actually the point of the WMF analysis? Any consequences? Will this experiment be pulled? --Atlasowa (talk) 14:59, 15 June 2015 (UTC)
I know it's not Wikidata's goal; it is, however (or was when I did this report), one of the WMF's goals to actually be able to surface this data in context. I don't design the apps and I'm not a community liaison so I'm not sure why you're asking me hypothetical questions around the design of the app, or bringing the WMF's overall actions in here. That has nothing to do with this research, which was a highly specific question from my then-manager that I answered. I have no idea what you mean by "experiment"; this isn't an experiment, it was a one-off research report. Ironholds (talk) 15:35, 15 June 2015 (UTC)