WikiCite 2016/Proposals/Generation of referenced Wikidata statements with StrepHit
Proposal
Background
Data quality in Wikidata is crucial, and references to trustworthy third-party sources are one way to ensure it. Many Wikidata statements are either unsourced or sourced only to Wikimedia sister projects (typically Wikipedia, via bots). Adding references to such small units of information can be a cumbersome task for human editors.
StrepHit aims to reduce this effort: it is a Natural Language Processing system that reads documents from reliable Web sources and produces referenced Wikidata statements.
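To make "referenced Wikidata statement" concrete, here is a minimal sketch of one such statement and its serialization. It assumes a QuickStatements-like, tab-separated syntax in which `S854` marks a reference URL; the `ReferencedStatement` class, its field names, and the example.org URL are hypothetical and chosen only for illustration, not taken from the StrepHit code base.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReferencedStatement:
    """Hypothetical container for one statement plus its source."""
    subject: str        # Wikidata item ID, e.g. "Q42"
    prop: str           # Wikidata property ID, e.g. "P569" (date of birth)
    value: str          # value already serialized for the target syntax
    reference_url: str  # URL of the document the statement was extracted from

    def to_quickstatements(self) -> str:
        # Assumption: "S854" is the source-side form of P854 (reference URL),
        # and string values such as URLs are wrapped in double quotes.
        fields = [self.subject, self.prop, self.value,
                  "S854", f'"{self.reference_url}"']
        return "\t".join(fields)


# Illustrative data only: the URL below is a placeholder, not a real source.
statement = ReferencedStatement(
    subject="Q42",
    prop="P569",
    value="+1952-03-11T00:00:00Z/11",
    reference_url="https://example.org/biography/douglas-adams",
)
line = statement.to_quickstatements()
print(line)
```

Keeping the reference URL in the same record as the claim reflects the point of the proposal: each statement travels with the source that supports it.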
Aim
- Play with the current StrepHit dataset: biographies in English;
- create and fill a Request for Comments;
- encourage referenced data donations through the primary sources tool:
  - @Daniel Mietchen, Aubrey, and Thomas: follow up on past discussions with ContentMine and Hypothes.is people.
Demo
Install the primary sources tool gadget to check out the StrepHit dataset; instructions are at wikidata:Wikidata:Primary_sources_tool#How_to_use
Skills needed
- Basic understanding of how Wikidata works;
- communication strategies for community engagement, in order to:
  - raise awareness of StrepHit's potential impact;
  - attract new primary sources tool users.
Phabricator task
None yet.
See also
Participants
- Hjfocs
- Aubrey (talk) 08:06, 21 May 2016 (UTC)
- add your name here