WikiCite 2016 is an event focused on designing data models and technology to improve the coverage, quality, standards-compliance and machine-readability of citations and source metadata in Wikipedia, Wikidata and other Wikimedia projects. Our goal in particular is to define a technical roadmap for building a repository of Wikimedia references in Wikidata.
Building a central repository of citations in Wikidata
There is currently a lot of momentum around citations and bibliographic metadata at the Wikimedia Foundation, in the movement, and across a number of open science, library tech, and open access organizations. The idea of building a repository to store all citations and source metadata across Wikimedia projects has been proposed in different forms for the past 10 years. Wikipedia is currently one of the most popular entry points into the scientific literature and ranks among the top 5 referrers of scholarly citations. However, as of today, references and source metadata are still second-class citizens in Wikimedia projects. References are the most fundamental building block of open knowledge, but:
- they are still served by fragile mechanisms such as citation templates;
- they are inconsistently represented across (and sometimes within) articles, languages and Wikimedia projects;
- the data they store is curated in multiple places and often ill-formed, incomplete and not machine-readable;
- references often fail to use identifiers such as DOIs or PubMed IDs, even when such identifiers are available for the cited source.
Reusing sources across articles, languages or projects is still a complex task, and conducting research on the use of references and citations in Wikimedia projects requires sophisticated information extraction skills. In addition, an overwhelming number of statements in Wikidata are currently either not sourced at all or generically sourced to a Wikimedia site rather than to specific references.
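To illustrate why studying references currently takes information extraction work, here is a minimal sketch that pulls DOIs out of citation templates in raw wikitext. The helper name `extract_dois`, the regular expression and the sample wikitext are assumptions made for illustration only, not an existing tool. Real citation templates vary across wikis and can nest other templates, so a simple pattern like this one will miss or misread some cases.

```python
import re

# Assumption: DOIs appear as a `|doi=` parameter inside citation templates.
# The pattern is a simplification and does not parse nested templates.
DOI_PARAM = re.compile(r"\|\s*doi\s*=\s*(10\.\d{4,9}/[^\s|}]+)", re.IGNORECASE)


def extract_dois(wikitext):
    """Return DOIs found in `|doi=` parameters, in order, without duplicates."""
    seen = set()
    dois = []
    for match in DOI_PARAM.finditer(wikitext):
        doi = match.group(1)
        key = doi.lower()  # DOIs are case-insensitive, so compare in lowercase
        if key not in seen:
            seen.add(key)
            dois.append(doi)
    return dois


# Hypothetical wikitext: two distinct DOIs, one repeated with different
# spacing and case, and one reference that carries no identifier at all.
sample = (
    "Text.<ref>{{cite journal |title=A |doi=10.1371/journal.pone.0000001 |pmid=123}}</ref> "
    "More.<ref>{{Cite journal | doi = 10.1000/XYZ123 }}</ref> "
    "Again.<ref>{{cite journal |doi=10.1000/xyz123}}</ref> "
    "Bare.<ref>{{cite web |url=http://example.org |title=No identifier}}</ref>"
)

print(extract_dois(sample))
```

The last reference in the sample yields nothing, which is the case described above: a citation without an identifier cannot be matched to its source without further lookup.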
Initiatives such as the Wikipedia Library have focused primarily on the outreach and programmatic angle, while efforts focused on infrastructure in Wikimedia projects (like WikiProject Source Metadata and WikiProject Open Access) have not been strategically aligned across all parties involved. We believe the time is ripe to align and innovate around the major efforts to create the infrastructure, data models, automated tools and user interfaces needed to build a central repository of citations in Wikidata and to support high-quality sourcing of free knowledge.
A timeline of previous efforts
- Seminal work towards the design of a central repository of citations leveraging Wikidata started at the 2012 Wikimedia Hackathon in Berlin.
- These efforts continued through a series of "citathons" hosted over the years at:
  - 2014: Wikimania London
  - 2014: Rich citations hackathon at the Public Library of Science
  - 2015: Wikimania Mexico
  - 2016: Wikimedia Developer Summit
- In 2015, a Wikibase test instance called LibraryBase, with a dedicated SPARQL endpoint, was set up on Wikimedia Labs.
- For a history of related initiatives predating Wikidata, see this page.
The event will be focused on the short-term goal of defining a roadmap and the technical requirements for developing a central Wikidata repository for references and bibliographic metadata. This will facilitate research on references across Wikimedia projects and provide much needed tools to support the sourcing work of volunteers and movement affiliates contributing open content to Commons and Wikidata.
We will also discuss a long-term vision aiming to facilitate the integration of citation and bibliographic metadata with other scholarly and linked data repositories. You can learn more about these goals from a joint presentation we gave in December at WMF in partnership with Crossref.
We are bringing together Wikidatans, Wikipedians, developers, data modelers, and information and library science experts from organizations including Wikimedia DE, Wikimedia IT, Wikimedia DC, Wikimedia NYC, Internet Archive, Crossref, Zotero, CSL, Jisc, ContentMine, Google, DataCite, NISO, OCLC, Fondazione Bruno Kessler, eLife Sciences, RefMe and the NIH. We are also inviting academic researchers with experience working with Wikipedia's citations and bibliographic data from several institutions (CNRS, École Normale Supérieure, University of Aix-Marseille, Imperial College London, University of Pittsburgh, University of Chicago, University of Leeds, University of Manchester, Wellcome Trust Sanger Institute, University of Würzburg, University of Trento, University of Miami Library, Mannheim University Library, the German National Library of Science and Technology, University of Washington, TU Denmark).
Please add your name here after filling out the Eventbrite registration form (you'll receive a link from the organizers).
- Dario Taraborelli (Wikimedia Research, organizer)
- Daniel Mietchen (National Institutes of Health - NIH, organizer)
- Lydia Pintscher (Wikimedia Deutschland, Wikidata, organizer)
- Jonathan Dugan (organizer)
- Patrice Bellot (LSIS / OpenEdition Lab)
- Scott Chamberlain (rOpenSci)
- Thomas Steiner (Google)
- Marin Dacos (OpenEdition Lab)
- Luca Martinelli (Wikimedia Italia)
- Finn Årup Nielsen (TU Denmark)
- Nettie Lagace (National Information Standards Organization, NISO)
- Sebastian Karcher (Northwestern University)
- Marco Fossati (Fondazione Bruno Kessler)
- Cristian Consonni (Wikimedia Italia, Università di Trento)
- Philipp Zumstein (Mannheim University Library)
- James Hare (WikiProject X, Wikimedia DC)
- Merrilee Proffitt (OCLC Research)
- Chris Wilkinson (eLife Sciences)
- Joe Wass (Crossref)
- Mike Showalter (OCLC)
- BrillLyle aka Erika Herzog (Wikimedia NYC)
- Laura Rueda (DataCite)
- Rachael Lammey (Crossref)
- Antonin Delpeuch (Dissemin)
- Jake Orlowitz (Ocaasi) (Wikipedia Library)
- Andra Waagmeester (Micelio)
- Chris Keene (Jisc)
- John Kaye (Jisc)
- Lambert Heller (TIB, Germany)
- Jens Nauber (SLUB Dresden, Germany)
- Mairelys Lemus-Rojas (University of Miami Libraries)
- Chiara Storti (Wikimedia Italia)
- Brian Mingus (Cognatory)
- Sébastien Santoro
- Daniel Kinzler (Wikimedia Deutschland, Wikidata)
- Adam Shorland (Wikimedia Deutschland, Wikidata)
- Katie Filbert (Wikimedia Deutschland, Wikidata)
- Katherine Thornton (University of Washington)
- Jon Tennant (Imperial College London, ScienceOpen)
- Jonas Kress (Wikimedia Deutschland)
- Jan Zerebecki (Wikimedia Deutschland)
- Andrea Zanni (Wikisource)
- Eamon Duede (University of Chicago)
- Marielle Volz (Wikimedia Foundation)
- Karen Coyle
- Heather Ford (University of Leeds)
- Magnus Manske (Wellcome Trust Sanger Institute)
- Till Sauerwein (Universität Würzburg)
- Konrad Förstner
- Adam Becker
- Aaron Halfaker (Wikimedia Research)
- Alex Kalderimis (RefMe)
- Thomas Arrow (ContentMine)
How to apply
Applications are now closed
- if you were pre-invited and have already filled in a form, you will receive a separate note with a registration link from the organizers
- if you have not been invited but you would like to participate, please submit an application to give us some information about you and your interest and expected contribution to the event. We'll send out notifications of acceptance by April 15.
- March 29, 2016: applications open
- April 11, 2016: applications close
- April 15, 2016: notifications of acceptance are issued (if you applied for a travel grant, we'll be able to confirm by this date if we can cover the costs of your trip)
- May 25-26, 2016: event takes place
- The precise schedule has not been defined yet, but we expect to start ca. 9am on both days, with an open end on May 25 and a closing session around 5pm on May 26.
- The focus will be on dialogue, not monologues, and on hands-on activities aimed at improving or establishing workflows or learning by doing.
- GLS Campus Berlin
- Kastanienallee 82
- 10435 Berlin Prenzlauer Berg
- phone: +49 (030) 780 089 550
- email: firstname.lastname@example.org
- Nearby hotels include:
- Dario Taraborelli
- Jonathan Dugan
- Lydia Pintscher
- Daniel Mietchen
- Cameron Neylon
You can contact the organizers via email@example.com
WikiCite is cohosted by the Wikimedia Foundation and Wikimedia Deutschland. It is generously supported by Crossref, the Gordon and Betty Moore Foundation, and others (additional organizations will be announced soon). Funding to cover the cost of the event has been approved by the Wikimedia Foundation Board of Trustees.
- Source Metadata Project Workboard at Phabricator
- December 2015: Wikipedia as the front matter to all research on Meta – joint Crossref/Wikimedia presentation on citations and references in Wikimedia projects
- 2014: Reform of citation structure for all Wikimedia projects at IdeaLab
- 2012: Wikidata Bibliographic data notes from Berlin Hackathon 2012
- 2012: Open Metadata Handbook at Wikibooks