Group 4: (Semi-)automated ways to add references to Wikidata statements
Room 124, 4:00 - 6:00 pm • Etherpad: Room 124
Improve tools for semi-automated statement and reference creation (e.g. StrepHit, ContentMine)
- Adam Shorland (Wikimedia Deutschland, Wikidata), Thursday
- Alex Kalderimis (RefMe), Wednesday
- Marco Fossati (Fondazione Bruno Kessler (FBK)), both days
- Scott Chamberlain (rOpenSci), Wednesday
- Thomas Arrow (ContentMine), Thursday
- Till Sauerwein (Universität Würzburg (University of Wurzburg)), both days
This work group had very specific goals that were met. The report is included in wikidata:Wikidata:Requests_for_comment/Semi-automatic_Addition_of_References_to_Wikidata_Statements.
Quoting from the StrepHit project page:
- The trustworthiness of Wikidata assertions plays the most crucial role in delivering a high-quality, reliable Knowledge Base: in order to assess their truth, assertions should be validated against third-party resources, and few efforts have been carried out under this perspective. One form of validation can be achieved via references to external (i.e., non-wiki), authoritative sources. This has motivated the development of the primary sources tool: it will serve as a platform for users to either accept or reject new references and/or assertions coming from third-party datasets. We argue that there is a need for datasets which guarantee at least one reference for each assertion.
The StrepHit and ContentMine teams have a common vision for Wikidata. If we manage to join forces through the support of the Wikimedia Grants program (both StrepHit and ContentMine have proposals for project grants, cf. #Resources), we will produce protocols for Wikimedia that leverage semi-automated approaches to extract facts from reliable Web sources.
The discussion focused on the usability of the primary sources tool, a platform for data curation in Wikidata. The outcomes can be browsed at wikidata:Wikidata:Requests_for_comment/Semi-automatic_Addition_of_References_to_Wikidata_Statements. Previous discussion on the tool is available at wikidata:Wikidata_talk:Primary_sources_tool.
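The accept/reject workflow described in the StrepHit quote can be sketched in a few lines. This is a minimal illustration only, not the primary sources tool's actual data model or API: the names `CandidateStatement`, `Decision`, `has_reference`, and `curate` are hypothetical, and the item, property, and URL values are placeholders. It encodes the one rule stated in the quote, namely that each assertion should carry at least one reference.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List


class Decision(Enum):
    PENDING = "pending"
    ACCEPTED = "accepted"
    REJECTED = "rejected"


@dataclass
class CandidateStatement:
    """A statement proposed by a third-party dataset (hypothetical model)."""
    subject: str                      # item ID, placeholder such as "Q42"
    prop: str                         # property ID, placeholder such as "P69"
    value: str                        # value of the statement
    reference_urls: List[str] = field(default_factory=list)
    decision: Decision = Decision.PENDING


def has_reference(candidate: CandidateStatement) -> bool:
    """True if the statement carries at least one external reference."""
    return len(candidate.reference_urls) > 0


def curate(candidate: CandidateStatement, accept: bool) -> CandidateStatement:
    """Record a curator's decision; unreferenced statements cannot be accepted."""
    if accept and not has_reference(candidate):
        raise ValueError("cannot accept a statement without a reference")
    candidate.decision = Decision.ACCEPTED if accept else Decision.REJECTED
    return candidate


if __name__ == "__main__":
    referenced = CandidateStatement(
        "Q42", "P69", "Q691283", ["https://example.org/source"]
    )
    print(curate(referenced, accept=True).decision)
```

Rejecting a statement is always allowed in this sketch; only acceptance is gated on the presence of a reference, which mirrors the "at least one reference for each assertion" requirement.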
- StrepHit
  - 1.1 beta release: https://github.com/Wikidata/StrepHit/releases/tag/1.1-beta
  - technical documentation: mw:StrepHit
  - initial IEG: Grants:IEG/StrepHit:_Wikidata_Statements_Validation_via_References
  - renewal proposal: Grants:IEG/StrepHit:_Wikidata_Statements_Validation_via_References/Renewal
  - midpoint report: Grants:IEG/StrepHit:_Wikidata_Statements_Validation_via_References/Midpoint
  - final report: Grants:IEG/StrepHit:_Wikidata_Statements_Validation_via_References/Final
- Primary sources tool
  - initial project page: wikidata:Wikidata:Primary_sources_tool
  - central discussion: wikidata:Wikidata:Requests_for_comment/Semi-automatic_Addition_of_References_to_Wikidata_Statements
Appendix: workgroup notes
Raw notes from group 4.