Research:Wikiwho Provenance Api
This page in a nutshell: This research aims to provide an API for retrieving provenance and change information for individual tokens in any Wikipedia article revision.
We aim to provide a performant API that returns, for every token (words and functional characters) in any specified Wikipedia article, the revision in which that token originated and all changes ever applied to it, with high accuracy. This makes it possible to retrieve metadata such as a token's original author and its presence in past revisions and/or over time, which can also be used to extract disputes about content.
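To make the per-token output more concrete, the following minimal sketch models one token's provenance record and derives the token's presence in a given revision. The class name `TokenProvenance`, its field names, and the use of small sequential integers as revision identifiers are assumptions made for illustration; they are not the API's actual response schema.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class TokenProvenance:
    """Hypothetical record for one token; field names are illustrative only."""
    text: str            # the token itself (a word or a functional character)
    origin_rev: int      # revision in which the token first appeared
    origin_editor: str   # author of the origin revision
    deleted_in: List[int] = field(default_factory=list)       # revisions that removed it
    reintroduced_in: List[int] = field(default_factory=list)  # revisions that restored it

    def present_in(self, rev: int) -> bool:
        """Return True if the token exists in revision `rev`.

        Assumes revision identifiers increase with time and that deletions
        and reintroductions alternate, starting with a deletion.
        """
        if rev < self.origin_rev:
            return False
        deletions = sum(1 for r in self.deleted_in if r <= rev)
        reintroductions = sum(1 for r in self.reintroduced_in if r <= rev)
        # The token is present when every deletion so far has been undone.
        return deletions == reintroductions


token = TokenProvenance("provenance", origin_rev=3, origin_editor="Alice",
                        deleted_in=[5, 9], reintroduced_in=[7])
print([rev for rev in range(1, 11) if token.present_in(rev)])  # [3, 4, 7, 8]
```

In this toy history the token is added in revision 3, removed in 5, restored in 7 and removed again in 9, so it is present only in revisions 3, 4, 7 and 8.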
We currently offer the service in English, German, Turkish and Basque, with more languages planned.
A beta version of the API is live and working for en.wikipedia.org, although we are still improving its performance. See the API and its documentation here: https://api.wikiwho.net/api/
The algorithm used to mine provenance for single tokens is described in our corresponding paper, including runtime and precision evaluations for English. Further information can also be found at f-squared.org/wikiwho
Regarding the "precision" of the method: previous research has shown that identifying the "correct" original author of a piece of text in a Wikipedia article is not trivial. We therefore rely on an extraction method that has been scientifically shown to perform at 95% precision, higher than any other algorithm proposed for the task, as far as we can tell. We think that this is crucial for use in production.
Apart from direct queries to the WikiWho API, some use cases already exist:
- Use case 1: whoCOLOR: a userscript that highlights selected pieces of text in an article, annotated with their provenance (author). Other features currently being built include a conflict view that highlights the most deleted and reintroduced pieces of text, as well as a word history view that shows, for each word/token, when it was originally introduced and its individual deletion/reintroduction history. See examples, description, screenshots and download link at this website: f-squared.org/whovisual. Described in an ICWSM workshop paper.
- Use case 2: whoVIS: a prototype of an editor-editor interaction network visualization for individual articles, based on the words/tokens deleted and reintroduced by editors. Also at f-squared.org/whovisual. A WWW Conference demo paper describes the system.
- Use case 3: WikiEdu Dashboard (see "Assessment tools" -> Article symbol)
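The conflict view and the interaction network in use cases 1 and 2 both build on per-token deletion and reintroduction events. The following is a minimal sketch of how such events could be aggregated, assuming a hypothetical chronological list of `(token, action, editor)` tuples. The function names `conflict_scores` and `interaction_edges` and the simple counting rules are illustrative assumptions, not the methods actually used by whoCOLOR or whoVIS.

```python
from collections import Counter
from typing import Dict, List, Tuple

# Hypothetical event: (token, action, editor), listed in chronological order.
# action is one of "add", "delete", or "reintroduce".
Event = Tuple[str, str, str]


def conflict_scores(events: List[Event]) -> Counter:
    """Count deletions and reintroductions per token (original adds are ignored)."""
    scores: Counter = Counter()
    for token, action, _editor in events:
        if action in ("delete", "reintroduce"):
            scores[token] += 1
    return scores


def interaction_edges(events: List[Event]) -> Counter:
    """Count directed editor pairs (actor, target).

    A delete or reintroduce action targets the editor who last changed
    the same token; editors undoing their own change are skipped.
    """
    last_editor: Dict[str, str] = {}
    edges: Counter = Counter()
    for token, action, editor in events:
        previous = last_editor.get(token)
        if action != "add" and previous is not None and previous != editor:
            edges[(editor, previous)] += 1
        last_editor[token] = editor
    return edges


events = [
    ("controversy", "add", "Alice"),
    ("harassment", "add", "Bob"),
    ("controversy", "delete", "Bob"),
    ("controversy", "reintroduce", "Alice"),
    ("controversy", "delete", "Bob"),
]
print(conflict_scores(events).most_common(1))  # [('controversy', 3)]
print(interaction_edges(events)[("Bob", "Alice")])  # 2
```

Tokens with the highest scores are candidates for highlighting in a conflict view, while the edge counts could serve as weights in an editor-editor network.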
- Flöck, Fabian, and Maribel Acosta. "WikiWho: Precise and efficient attribution of authorship of revisioned content." Proceedings of the 23rd International Conference on World Wide Web. ACM, 2014.
- de Alfaro, Luca, and Michael Shavlovsky. "Attributing authorship of revisioned content." Proceedings of the 22nd International Conference on World Wide Web, May 13-17, 2013, Rio de Janeiro, Brazil.
- Flöck, Fabian, et al. "Towards Better Visual Tools for Exploring Wikipedia Article Development–The Use Case of “Gamergate Controversy”." Ninth International AAAI Conference on Web and Social Media. 2015.
- Flöck, Fabian, and Maribel Acosta. "whoVIS: Visualizing editor interactions and dynamics in collaborative writing over time." Proceedings of the 24th International Conference on World Wide Web. ACM, 2015.