Research:Improving Wikipedia's important articles
- Wiki user TCO
The goal is to understand how well we are achieving high quality on the most important articles (rated important either subjectively or by web traffic), to understand some of the factors that slow quality improvement on those high-priority topics, and to propose some recommendations.
Some of the main findings:
- While we produce a lot of Featured and Good Articles, most of them are on trivial topics
- At the same time, many highly important topics are low quality
- Concentrating on important articles is a vastly more efficient way to serve the readers, even with the added difficulty of bigger topics. (Differences in page views can be 3 orders of magnitude.)
- Current social rewards ("star stickers"), however, motivate writers in the opposite direction: toward producing large numbers of articles on obscure topics.
- The solution to the problem is to ADD new social rewards that recognize quality achievements on high importance articles. (Adding new rewards is much easier to accomplish than trying to withdraw or change the current rewards.)
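The efficiency argument in the findings above can be illustrated with some back-of-the-envelope arithmetic. This is only a sketch: the view counts and the tenfold effort multiplier are hypothetical numbers chosen for illustration, not figures from the report, and `views_per_effort` is a made-up helper name.

```python
def views_per_effort(monthly_views: float, effort: float) -> float:
    """Reader page views served per unit of writing effort."""
    if effort <= 0:
        raise ValueError("effort must be positive")
    return monthly_views / effort


# Hypothetical inputs (assumptions, not data from the report):
# an obscure topic with 1,000 views/month and baseline effort of 1,
# versus an important topic with 1,000,000 views/month (3 orders of
# magnitude more) that is assumed to be 10x harder to write.
obscure = views_per_effort(monthly_views=1_000, effort=1.0)
important = views_per_effort(monthly_views=1_000_000, effort=10.0)

ratio = important / obscure
print(f"Important topic serves {ratio:.0f}x more readers per unit of effort")
```

Under these assumed numbers, the important article still returns far more reader benefit per unit of effort even after charging it with ten times the work.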
The research is really a set of several sub-projects ("deep dives") on different facets of quality and importance:
- Vital Articles
- Featured and Good Articles programs
- Relevance to readers
- Content checks of Featured Article Candidates
- Declining output of Featured Articles
- Featured Article writing patterns
- Champions or star collectors? (importance versus volume tradeoff with FAs)
- High or low view topic concentration? (Cases)
- Four Award
- “Waddesdon Road railway station” (an FA)
- Hurricane WikiProject
- Ucucha FAs (user writing on rare, new mammals)
- Some high importance efforts
- Project Elements
- NARA contest
- Aviation master plan
- Wikimedia Foundation quality strategy
- Pulling it all together (final thoughts)
A mixture of qualitative and quantitative research was done (cases, web traffic). Page views were collected manually from stats.grok.se. More detail is in the Backups section of the completed report.
I'm very open to questions about the report or to thinking about future work. The report has generated a lot of controversy, and I am not monitoring most of the talk pages or defending the work in general. It is not a work of great statistical or academic gravity (think business report more than journal paper). But there are some interesting questions here, and attempts to examine them.
Wikimedia Policies, Ethics, and Human Subjects Protection
No private information was used.
Benefits for the Wikimedia community
No tangible benefits. Report is advisory. (The "leader board" might be used for social awards that are importance weighted.)
Work was done between late October 2011 and late November 2011.
No funding was tapped.
Formal publications from the report: (none yet).
- Gorbatai, Andreea D., “Exploring Underproduction in Wikipedia”, WikiSym Proceedings, 2011.
- Note: this publication was used to produce the slide in the presentation titled "Academic research showed that Wiki fails on its most important products" and is credited there. The rest of the report is independent of it (and the bulk of the work was done before seeing this paper).
- (unpublished) Gorbatai also produced a new analysis of quality versus traffic (eyeball view of quality), shown on the slide titled "69% of readers are seeing a low quality article. Only 3% a high quality article". I suspect this is the first time such an analysis has been done, although Wiki analysis is hard to search, and the view seems both powerful and an obvious one to perform, so it may have been done before.
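
A quality-versus-traffic view of this kind can be approximated by weighting each article's quality rating by its page views. The sketch below is one plausible way to do that, not the method actually used for the slide. The function name, the quality labels, and the sample numbers are all hypothetical.

```python
from collections import defaultdict


def reader_share_by_quality(articles):
    """Return the fraction of total page views that land on each quality class.

    `articles` is an iterable of (quality_class, page_views) pairs.
    """
    totals = defaultdict(float)
    for quality, views in articles:
        if views < 0:
            raise ValueError("page views cannot be negative")
        totals[quality] += views

    grand_total = sum(totals.values())
    if grand_total == 0:
        return {}
    return {quality: views / grand_total for quality, views in totals.items()}


# Hypothetical sample (made-up labels and view counts, for illustration only).
sample = [("high", 50), ("low", 700), ("medium", 200), ("high", 50)]
shares = reader_share_by_quality(sample)

for quality in sorted(shares):
    print(f"{quality}: {shares[quality]:.0%} of page views")
```

Weighting by views rather than counting articles is what turns "share of articles at each quality level" into "share of readers seeing each quality level".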