Research:Wikipedia Primary School SSAJRP programme/Evaluation

From Meta, a Wikimedia project coordination wiki



Project evaluation main goals

  • Analyzing the state of the art of topics related to the Wikipedia Primary School project
  • Verifying the project's impact on Wikipedia content
  • Evaluating significant variables of interest for other research projects

Evaluation steps

  1. Current state, Hypothesis, 2014
  2. Intermediate state, Comparison, 2015
  3. Final state, Conclusions, 2016

Hypothesis and questions:

  • If an article is impossible to find it is useless. How easy is it to find an article?
    • Single article
      • Number of templates and categories
      • Number of visits (per categories and portals)
      • Featured in homepage as an article and as a fact
      • Links to the article and interlinks
      • Portals and Wikiprojects
      • Google Rank
    • Correlation with other articles
      • Total amount of articles in the same categories
  • How to “calculate” the quality of an article through quantitative data? Is it possible to define variables in order to extract significant qualitative information?
    • Positive variables: Number of references and interlinks, Mentions, Portals and Wikiprojects, etc.
    • Negative variables: Issues, Stub, etc.
    • Article: References and external links, Issues Templates, Edits
    • Direct evaluations: Features in homepage, Article quality, Monitoring, Mentions
    • Indirect evaluations: Links to the article, Number of visits, Page rank
    • Assigning points according to direct and indirect evaluations in order to generate a quality rank.
  • Usually people write Wikipedia articles according to their interests. Can we drive interest?
    • Article size (and variation)
    • Editor task force and casual contributors
    • Talk page
    • New users
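
The idea of assigning points to generate a quality rank can be sketched as follows. This is only an illustration: the source names the positive and negative variables but gives no point values, so the weights and the function names `quality_score` and `rank_articles` are assumptions made for this example.

```python
# Hypothetical weights: the evaluation lists these variables,
# but the point values below are assumptions for illustration only.
POSITIVE_WEIGHTS = {"references": 1.0, "interlinks": 0.5, "mentions": 2.0}
NEGATIVE_WEIGHTS = {"issues": 3.0, "stub": 10.0}


def quality_score(metrics: dict) -> float:
    """Add weighted positive variables and subtract weighted negative ones."""
    gained = sum(w * metrics.get(name, 0) for name, w in POSITIVE_WEIGHTS.items())
    lost = sum(w * metrics.get(name, 0) for name, w in NEGATIVE_WEIGHTS.items())
    return gained - lost


def rank_articles(articles: dict) -> list:
    """Return article titles ordered from highest to lowest score."""
    return sorted(
        articles,
        key=lambda title: quality_score(articles[title]),
        reverse=True,
    )
```

A missing variable counts as zero, so articles with incomplete data can still be ranked. In practice the weights would need to be calibrated against the direct evaluations, such as article quality assessments.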


  1. Define the list of 100 articles to evaluate
  2. Extract data to visualize
  3. Concept of the data visualization
  4. First draft
  5. Review and second iteration
  6. Final data visualization

Preliminary analysis


The preliminary analysis aims to give a general view of the current status of the selected articles. There are 171 articles under analysis.
The main goals of the preliminary analysis are the following:

  • Understand the current state of the selected articles
  • Define and visualize a set of parameters that can lead to the understanding of the ‘quality’ of the selected articles
  • Explore the relations among articles


To carry out the visual evaluation, a scraper has been implemented. Its name is Wikimole, and it is available on GitHub for further development.
The following is the process Wikimole uses:

  1. Wikimole gets data by using the Wikipedia API, a jQuery page scraper and other external data sources such as Google PageRank and the Wikipedia article traffic statistics.
  2. After an external step of data filtering and cleaning, Wikimole visualizes the data by using Gephi and D3.js.
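
The first step, collecting data through the Wikipedia API, could look roughly like the sketch below. It is not Wikimole's actual code: the function names are hypothetical, and it assumes the standard MediaWiki Action API `query` module with the `categories` and `templates` properties and the default JSON response layout. The network request itself is left out, so the example only builds the request URL and counts the entries in a response that has already been decoded.

```python
from urllib.parse import urlencode

# Assumed endpoint: the English Wikipedia Action API.
API_ENDPOINT = "https://en.wikipedia.org/w/api.php"


def build_query_url(title: str) -> str:
    """Build a URL asking for the categories and templates of one article."""
    params = {
        "action": "query",
        "titles": title,
        "prop": "categories|templates",
        "cllimit": "max",  # return as many categories as allowed
        "tllimit": "max",  # return as many templates as allowed
        "format": "json",
    }
    return API_ENDPOINT + "?" + urlencode(params)


def count_templates_and_categories(response: dict) -> dict:
    """Count categories and templates per page in a decoded API response."""
    counts = {}
    for page in response.get("query", {}).get("pages", {}).values():
        counts[page["title"]] = {
            "categories": len(page.get("categories", [])),
            "templates": len(page.get("templates", [])),
        }
    return counts
```

The resulting counts correspond to the "Number of templates and categories" variable listed among the hypotheses. Long result sets may be split across several responses, which this sketch does not handle.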

Intermediate analysis


The intermediate analysis provides an overview of the state of the selected articles six months after the preliminary analysis.
There are now 176 articles under analysis. Among them, 36 have been reviewed thanks to the involvement of the Wikipedia Community, and 25 have been reviewed by expert reviewers.

The main goals of the intermediate analysis are the following:

  • Understand the impact of the Wikipedia Primary School project on the selected articles
  • Provide actionable information on how to improve the selected articles
  • Investigate the user interest in editing Wikipedia articles and how it changes over time

The data visualizations show two new features:

  • Benchmarks against the previous analysis
  • The articles reviewed by the Wikipedia Community and by expert reviewers


Phase 0

Report of the initial status of the selected articles.

Phase 1

Report of the provisional status of the selected articles considering the previous visual analysis. All datasets are open and available online.

Phase 2

Report of the final status of the selected articles considering the previous visual analysis. All datasets and protocols are open and available online.


  • Between 2014 and 2015 the total number of editors slightly decreased (from 845 to 773, -8.5%), while the number of edits slightly increased (from 1463 to 1481, +1.2%). Edits per editor therefore rose from about 1.73 to about 1.92, which suggests a greater average effort per editor in contributing to the development of Wikipedia articles. Among articles with no or very few edits in 2015 we found:


Open datasets

All datasets are open and available online.

Related articles


Giovanni Profeta is leading the evaluation of the project with a specific focus on information design.