RENDER/Wikipedia Case Study
Motivation for the Wikipedia Case Study
The main goal of this case study is to increase the quality of Wikipedia by understanding and encouraging its knowledge diversity. The Neutral Point of View (NPOV) is one of the fundamental principles of Wikipedia and all Wikimedia projects. However, articles are often written by editors who may be biased towards a certain point of view. Wikipedians work hard and engage in numerous and sometimes tiresome debates in order to achieve an appropriate and neutral description. Achieving the Neutral Point of View requires either that an editor is able to transcend their personal point of view or that the significant points of view are covered by multiple editors.
Research topics of the RENDER Case Study in Wikipedia
For the case study the following research topics were defined:
- Understand and predict socio-technical mechanisms leading to biases
For relevant phenomena that potentially lead to imbalances in the coverage, representation and accuracy of information, we will develop models that show how they function, what effects they actually have on diversity, and which patterns they display that can be used to detect and predict them. As a first step we will look into the existence and effects of territoriality and ownership behaviour in Wikipedia, and into other possible effects of heightened vigilance, e.g. vigilance triggered by vandalism attacks. Specifically, we want to know whether these effects impair the ability of new and occasional editors to add their points of view to an article. The analysis incorporates several features of authors, articles and edits taken from the complete revision history of the articles and their discussion pages.
- Identify and extract diversely expressed information
We will provide mechanisms to identify and extract opinions, viewpoints and sentiments based on the available Wikipedia meta-data, going significantly beyond shallow text mining and information extraction. These mechanisms will use concepts represented in the article text, their relation to each other, temporally coincident comments on the article talk pages, and answers to these, as well as discussions on the responsible contributors’ talk pages.
- Represent and process diversely expressed information
We will design methods that use opinions and viewpoints to summarize, understand, and visualize the flow of discussions on a specific topic. A highly expressive formalization of discussions is not feasible, because of the limitations of formal knowledge representation languages and the computational complexity of inference over such rich formalizations. Our methods will therefore leverage semi-structured data such as fragments of articles and associated change information, as well as lightweight representations and reasoning that make key aspects of diversity explicit.
Working steps of the RENDER Case Study in Wikipedia
- Definition and evaluation of metrics to analyse and to assess the quality of Wikipedia articles in terms of diversity (deadline 30.09.2011)
- The results were documented in the final report.
- Development of tools for diversity management in Wikipedia and testing of their scalability and usage (deadline 30.09.2012)
- Understanding the feedback-effects of metrics on Wikipedia (deadline 31.03.2013)
- Evaluation of the tools for diversity management in Wikipedia (deadline 30.09.2013)
Use Case Scenarios for Wikipedia in RENDER
As a first step, we have identified the following three use cases for Wikipedia:
Display warnings to readers when detecting patterns of bias
Does an article cover all facts about its topic? We will collect and analyse different sources that deal with a particular topic and compare their fact coverage with that of the Wikipedia article about the same topic. For this task we also plan to compare different language versions of an article, to detect whether there are biases in certain language editions.
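The comparison of fact coverage described above could be sketched roughly as follows. This is only an illustration under strong assumptions: facts are taken to be already extracted and normalised into comparable strings, and the function name `fact_coverage` and the source labels are hypothetical, not part of any RENDER tool.

```python
def fact_coverage(article_facts, source_facts):
    """Compare the facts in an article with those found in external sources.

    article_facts: set of normalised fact strings from the Wikipedia article
                   (assumption: fact extraction has already happened).
    source_facts:  dict mapping a source label to its set of fact strings.

    Returns (coverage, missing), where coverage is the share of source facts
    that also appear in the article and missing is the set of absent facts.
    """
    # Pool the facts of all external sources into one reference set.
    reference = set().union(*source_facts.values())
    if not reference:
        # Nothing to compare against, so nothing is missing.
        return 1.0, set()
    covered = reference & set(article_facts)
    missing = reference - covered
    return len(covered) / len(reference), missing
```

The same function could be applied to another language version of the article by passing that version's facts as one of the sources, which would give a crude signal of differences between language editions.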
Incentivising readers to extend articles
If our assessment system detects a lack of significant points of view within an article, the reader could be offered links to sources or extracted summaries of the missing facts. This will support readers in contributing to Wikipedia. It will also raise their awareness of different points of view and of the collaborative work that goes into an article.
Notifying authors when an article is out of date
For this task we will use two indicators to show that an article or a part of an article needs to be updated.
- If the articles on the same topic in several other language versions have recently been edited heavily, this may point to an important recent event that needs to be added.
- We will monitor external sources to detect certain types of events and analyse the news streams for biases. Thus we will be able to present a balanced view of changes in the world to Wikipedia editors.
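The first indicator could be approximated as in the sketch below. It assumes that revision timestamps per language version have already been collected; the function names and all thresholds (`window_days`, `min_edits`, `min_languages`) are hypothetical values chosen for illustration, not parameters defined by the project.

```python
from datetime import datetime, timedelta


def heavily_edited_languages(edit_times, now, window_days=7, min_edits=20):
    """Return the language codes whose article had many recent edits.

    edit_times: dict mapping a language code to a list of edit datetimes.
    The window and threshold are illustrative assumptions.
    """
    cutoff = now - timedelta(days=window_days)
    return sorted(
        lang
        for lang, times in edit_times.items()
        if sum(t >= cutoff for t in times) >= min_edits
    )


def may_be_outdated(target_lang, edit_times, now,
                    window_days=7, min_edits=20, min_languages=3):
    """Flag the target article if several other versions are busy but it is not."""
    busy = heavily_edited_languages(edit_times, now, window_days, min_edits)
    others = [lang for lang in busy if lang != target_lang]
    return target_lang not in busy and len(others) >= min_languages
```

A flag raised this way is only a hint: heavy editing can also come from vandalism or edit wars, so an editor would still need to check whether a real-world event is behind it.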