Research:Wikimedia France Research Award/nominated papers/A Content-Driven Reputation System for the Wikipedia

A Content-Driven Reputation System for the Wikipedia

A Content-Driven Reputation System for the Wikipedia, by Thomas Adler and Luca de Alfaro, was published in the Proceedings of the 16th International World Wide Web Conference, 2007.

See the full text here.

Summary

The paper presents a reputation system for Wikipedia authors.

Most reputation systems, such as those used in social networks or e-commerce, rely on user-to-user comments or other ratings. In the proposed reputation system for the Wikipedia, reputation is content-driven: authors gain reputation when the edits they make to Wikipedia articles are preserved by subsequent authors. Conversely, authors lose reputation when their edits are rolled back or quickly undone. Author reputation is thus computed from content evolution alone, so no “badmouthing” or commissioned praise is possible.

Content change is measured either as text life (whether the text an author added survives or is deleted) or as edit life (how much of the article structure resulting from an edit is kept after the next edit).
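
The text-life idea can be illustrated with a small sketch. This is a toy scheme under stated assumptions, not the formula from the paper: the function name `text_life_reputation`, the `reward` and `penalty` parameters, and the word-level matching via Python's `difflib` are all choices made here for illustration. Each word is attributed to the author who introduced it; that author gains a point whenever a different author keeps the word and loses a point when a different author removes it.

```python
from collections import defaultdict
from difflib import SequenceMatcher


def text_life_reputation(revisions, reward=1.0, penalty=1.0):
    """Toy text-life reputation (illustrative assumption, not the paper's formula).

    revisions: list of (author, text) pairs in chronological order.
    Returns a dict mapping each author to a reputation score.
    """
    reputation = defaultdict(float)
    prev_words = []   # words of the previous revision
    prev_owners = []  # prev_owners[i] introduced prev_words[i]

    for author, text in revisions:
        words = text.split()
        owners = [author] * len(words)  # unmatched words are new, owned by the editor
        kept = set()                    # indices of previous words that survive

        matcher = SequenceMatcher(a=prev_words, b=words, autojunk=False)
        for block in matcher.get_matching_blocks():
            for k in range(block.size):
                i, j = block.a + k, block.b + k
                owners[j] = prev_owners[i]  # surviving word keeps its original owner
                kept.add(i)
                if prev_owners[i] != author:
                    reputation[prev_owners[i]] += reward

        # Words removed by someone other than their owner cost the owner reputation.
        for i, owner in enumerate(prev_owners):
            if i not in kept and owner != author:
                reputation[owner] -= penalty

        reputation[author] += 0.0  # make sure every author appears in the result
        prev_words, prev_owners = words, owners

    return dict(reputation)


history = [
    ("alice", "the cat sat"),
    ("bob", "the cat sat down"),
    ("carol", "the cat sat"),
]
print(text_life_reputation(history))
# {'alice': 6.0, 'bob': -1.0, 'carol': 0.0}
```

In this example alice's three words survive two later revisions, while bob's single added word is removed by carol. Authors are not rewarded or penalized for keeping or removing their own text, a simplification chosen here so that self-edits cannot inflate a score.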

The author reputation can be used to flag new contributions from low-reputation authors, or to allow only authors with high reputation to contribute to controversial or critical pages. A reputation system for Wikipedia could also provide an incentive for high-quality contributions.

The authors also implement the proposed system and use it to analyze the entire Italian and French Wikipedias during their first years (totalling 691,551 pages and 5,587,523 revisions). The results show that the proposed notion of reputation has good predictive value. The machine-calculated results are also checked against the judgments of a group of 7 volunteers who rated revisions made to the Italian Wikipedia.

  • Anonymous authors are shown to be the largest source of short-lived contributions.
  • Changes made by low-reputation authors have a significantly larger probability of being of poor quality or of being later undone.
  • Comparison with edit-count reputation (the more edits you have, the better your reputation) shows that content-driven reputation performs slightly better.
  • Author reputation is also a useful factor in predicting the survival probability of fresh text.
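
One conventional way to quantify this kind of predictive value is to treat "author reputation below a threshold" as a flag and measure its precision and recall against whether each edit turned out to be short-lived. The sketch below assumes that framing; the function name `precision_recall`, the threshold, and the sample data are hypothetical and are not taken from the paper.

```python
def precision_recall(edits, threshold):
    """Score a low-reputation flag as a predictor of short-lived edits.

    edits: list of (author_reputation, was_short_lived) pairs.
    An edit is flagged when its author's reputation is below `threshold`.
    Returns (precision, recall); each is 0.0 when its denominator is zero.
    """
    flagged = [short for rep, short in edits if rep < threshold]
    true_positives = sum(1 for short in flagged if short)
    total_short = sum(1 for _, short in edits if short)

    precision = true_positives / len(flagged) if flagged else 0.0
    recall = true_positives / total_short if total_short else 0.0
    return precision, recall


# Made-up sample: (reputation at edit time, whether the edit was short-lived)
sample = [(0.1, True), (0.2, False), (0.9, False), (0.8, True), (0.05, True)]
precision, recall = precision_recall(sample, threshold=0.5)
print(round(precision, 2), round(recall, 2))  # 0.67 0.67
```

The same function can score any other per-author number, such as a plain edit count, which would allow a side-by-side comparison of the kind described in the list above.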

Jury comments

A major contribution to the long-running debate on article quality, which is still a hot topic. Thomas Adler and Luca de Alfaro have continued working on this topic since 2007 and have refined their method.

Vote

  1. Even though the resulting WikiTrust software did not become as widely used as once anticipated, it has been an innovative and influential approach. This appears to have been the first academic paper to use a content persistence metric as a quantitative indicator of content quality. Tbayer (WMF) (talk) 22:47, 10 March 2013 (UTC)
  2. Warfair (talk) 01:32, 19 March 2013 (UTC)