WikiCite 2016/Report/Group 4
Group 4: (Semi-)automated ways to add references to Wikidata statements
Room 124, 4:00 - 6:00 pm • Etherpad: Room 124
Goal
Improve tools for semi-automated statement and reference creation (e.g. StrepHit, ContentMine)
Participants
- Adam Shorland (Wikimedia Deutschland, Wikidata), Thursday
- Alex Kalderimis (RefMe), Wednesday
- Marco Fossati (Fondazione Bruno Kessler (FBK)), both days
- Scott Chamberlain (rOpenSci), Wednesday
- Thomas Arrow (ContentMine), Thursday
- Till Sauerwein (Universität Würzburg (University of Würzburg)), both days
Summary
This work group had very specific goals that were met. The report is included in wikidata:Wikidata:Requests_for_comment/Semi-automatic_Addition_of_References_to_Wikidata_Statements.
Introduction
Quoting from the StrepHit project page:
- The trustworthiness of Wikidata assertions plays the most crucial role in delivering a high-quality, reliable Knowledge Base: in order to assess their truth, assertions should be validated against third-party resources, and few efforts have been carried out under this perspective. One form of validation can be achieved via references to external (i.e., non-wiki), authoritative sources. This has motivated the development of the primary sources tool: it will serve as a platform for users to either accept or reject new references and/or assertions coming from third-party datasets. We argue that there is a need for datasets which guarantee at least one reference for each assertion.
Recommendations
The StrepHit and ContentMine teams have a common vision for Wikidata. If we manage to join forces through the support of the Wikimedia Grants program (both StrepHit and ContentMine have proposals for project grants, cf. #Resources), we will produce protocols for Wikimedia that leverage semi-automated approaches to extract facts from reliable Web sources.
Discussion
The discussion focused on the usability of the primary sources tool, a platform for data curation in Wikidata. The outcomes can be browsed at wikidata:Wikidata:Requests_for_comment/Semi-automatic_Addition_of_References_to_Wikidata_Statements. Previous discussion on the tool is available at wikidata:Wikidata_talk:Primary_sources_tool.
Resources
- StrepHit
  - 1.1 beta release: https://github.com/Wikidata/StrepHit/releases/tag/1.1-beta
  - technical documentation: mw:StrepHit
  - initial IEG: Grants:IEG/StrepHit:_Wikidata_Statements_Validation_via_References
  - renewal proposal: Grants:IEG/StrepHit:_Wikidata_Statements_Validation_via_References/Renewal
  - midpoint report: Grants:IEG/StrepHit:_Wikidata_Statements_Validation_via_References/Midpoint
  - final report: Grants:IEG/StrepHit:_Wikidata_Statements_Validation_via_References/Final
- Primary sources tool
  - initial project page: wikidata:Wikidata:Primary_sources_tool
  - central discussion: wikidata:Wikidata:Requests_for_comment/Semi-automatic_Addition_of_References_to_Wikidata_Statements
- ContentMine
  - homepage: http://contentmine.org/
  - Project Grant proposal: Grants:Project/WikiFactMine
Appendix: workgroup notes
Raw notes from group 4.