Grants:Programs/Wikimedia Research Fund/Trails: A Reader's Guide to Wikipedia

status: not funded
Trails: A Reader's Guide to Wikipedia
start and end dates: July 2023 - July 2024
budget (USD): 48,969 USD
fiscal year: 2022-23
applicant(s): Rudy Arthur and Tristan Cann



Rudy Arthur and Tristan Cann

Affiliation or grant type

University of Exeter


Rudy Arthur and Tristan Cann

Wikimedia username(s)

Project title

Trails: A Reader's Guide to Wikipedia

Research proposal


Description of the proposed project, including aims and approach. Be sure to clearly state the problem, why it is important, why previous approaches (if any) have been insufficient, and your methods to address it.

Wikipedia is well on its way to becoming the “essential infrastructure of the ecosystem of free knowledge”. A recent survey of Wikipedia readers across languages finds that in-depth reading of Wikipedia articles is much more common in countries with a low Human Development Index (HDI). In poorer countries, people with less access to quality learning materials in their own languages are turning to Wikipedia, especially in STEM. It is in this context that we propose Wikipedia “Trails”.

Consider approaching the topic of Statistics as a beginner through Wikipedia. A natural place to start would be the main Statistics article. This page has links to the Islamic Golden Age, Sexual Selection and biographies of figures like Stanley Smith Stevens and John Graunt. Beginners do not know how to differentiate core knowledge from interesting tangents, so the same features that make this a high-quality article also make it difficult for a beginner to use for learning. The more condensed “Outline of Statistics” page presents dozens of links, from the essential (“Median”) to the obscure (“Polychoric correlation”), with little to indicate which is more useful to a learner.

The Wikimedia environment already provides resources such as Wikibooks and Wikiversity. However, these are less well known than Wikipedia and have significantly fewer resources in languages other than English. Our proposal is to use statistical methods to produce “trails”, or paths through a topic area, that guide a reader along a learning progression. We will characterise articles (or sections of articles) using network science and NLP techniques, with data such as network centrality, shared references, page views, text and topic overlap, reading level metrics (if available) and article quality ratings. These metrics will be synthesised across groups of articles to provide a user with a “trail” through that topic. The trail can be chosen in a number of ways, for example to visit as many articles as possible or to take the shortest path from one article to another. In every case the trail should maintain as much continuity as possible, so that the reader does not have to make large jumps in understanding or get lost in “non-core” material.
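As an illustration only, the "shortest path with continuity" idea could be sketched as follows. This is a minimal sketch under stated assumptions, not the proposed system: the article keywords and link graph are hypothetical toy data, the continuity measure is assumed to be the Jaccard overlap of topic keywords, and the names `TOPICS`, `LINKS`, `step_cost` and `trail` are invented for this example. A standard Dijkstra search then picks the route whose steps involve the smallest jumps in topic.

```python
import heapq

# Hypothetical toy data (not real Wikipedia measurements):
# each article is described by a small set of topic keywords.
TOPICS = {
    "Statistics": {"data", "probability", "inference"},
    "Mean": {"data", "average"},
    "Median": {"data", "average", "order"},
    "Variance": {"data", "average", "spread"},
    "Islamic Golden Age": {"history"},
    "Standard deviation": {"spread", "data"},
}

# Hypothetical link graph: article -> articles it links to.
LINKS = {
    "Statistics": ["Mean", "Islamic Golden Age", "Variance"],
    "Mean": ["Median", "Variance"],
    "Median": ["Mean"],
    "Variance": ["Standard deviation"],
    "Islamic Golden Age": ["Standard deviation"],
    "Standard deviation": [],
}


def jaccard(a, b):
    """Topic overlap between two keyword sets, between 0 and 1."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def step_cost(src, dst):
    """Assumed continuity cost: low when two articles share many topics."""
    return 1.0 - jaccard(TOPICS[src], TOPICS[dst])


def trail(start, goal):
    """Dijkstra search for the link path with the lowest total step cost."""
    queue = [(0.0, start, [start])]
    settled = {}
    while queue:
        cost, node, path = heapq.heappop(queue)
        if node == goal:
            return path, cost
        if node in settled and settled[node] <= cost:
            continue  # already reached this article more cheaply
        settled[node] = cost
        for nxt in LINKS.get(node, []):
            heapq.heappush(queue, (cost + step_cost(node, nxt), nxt, path + [nxt]))
    return None, float("inf")  # no chain of links connects the two articles


if __name__ == "__main__":
    path, cost = trail("Statistics", "Standard deviation")
    print(" -> ".join(path), round(cost, 3))
```

On this toy data the trail from "Statistics" to "Standard deviation" passes through "Variance" rather than "Islamic Golden Age", because the tangent shares no topics with its neighbours. In practice the cost could combine several of the metrics listed above, such as centrality, page views or reading level, rather than keyword overlap alone.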

This language-agnostic approach will allow learners, especially in low-income countries, to make better use of the existing free resources on Wikipedia for learning. It could even be incorporated into projects like Wikiversity and Wikibooks as a guide for content creators making study guides or syllabi.


Advisors:
  • Tatjana Baleta, Wikimedia Visiting Fellow at the Global Systems Institute at the University of Exeter. UK resident.
  • Sam Walton, Senior Product Manager, Wikimedia Foundation. UK resident.


Approximate amount requested in USD.

48,969 USD

Budget Description

Briefly describe what you expect to spend money on (specific budgets and details are not necessary at this time).

The budget will be spent on the researcher's salary. The amount requested will cover 6 months of postdoc time on Exeter's standard pay scale. This should be enough time to produce a prototype version of the tool described. Dr. Arthur will act as an advisor and supervisor at no cost. Details to be determined, but we aim to pay Open Access publishing costs from Exeter University's institutional open access fund.


Address the impact and relevance to the Wikimedia projects, including the degree to which the research will address the 2030 Wikimedia Strategic Direction and/or support the work of Wikimedia user groups, affiliates, and developer communities. If your work relates to knowledge gaps, please directly relate it to the knowledge gaps taxonomy.

This work supports the 2030 Strategy goal Knowledge as a Service by building a tool that can help creators in their work across languages and in some cases fill in the gaps where resources are missing or being built. We work towards the Knowledge Equity goal by targeting the user experience of underprivileged groups across the world, building a tool that will enhance their current engagement with Wikipedia.

Of the 10 strategy recommendations, this work is especially relevant to Identifying Topics for Impact, Innovation in Free Knowledge and Improved User Experience. Moreover, it mainly targets groups (non-English speaking, lower income) who have historically not been centred in research on the Wikipedia community.


Plans for dissemination.

We hope first to write the system up as an academic publication, targeting a high-impact journal, e.g. “Studies in Higher Education”, or a conference venue, e.g. “Proceedings of the ACM on Human-Computer Interaction”. We will use Exeter University’s networks to publicise the academic work. We will also seek feedback from and collaboration with our advisors, especially the Wikimedia fellow recently engaged at Exeter, and will reach out to Wikipedia users for feedback.

Past Contributions

Prior contributions to related academic and/or research projects and/or the Wikimedia and free culture communities. If you do not have prior experience, please explain your planned contributions.

Dr. Arthur is an expert in Computational Social Science, NLP and network analysis. He has worked on the social-media geography of the UK, changes in the labour market due to COVID, and online disaster communication, among other topics. He has experience bringing academic research into practice: working with environmental charities and IBM, and launching a company to provide weather analytics. Dr. Arthur is an innovative teacher who has developed new courses, lecture series and a YouTube channel, and he leads Exeter’s MSc in Data Science. A strong educational background will be crucial in evaluating the resources generated by the project.

Dr. Cann is an expert in network approaches to online communication and misinformation.

I agree to license the information I entered in this form, excluding the pronouns, countries of residence, and email addresses, under the terms of Creative Commons Attribution-ShareAlike 4.0. I understand that the decision to fund this Research Fund application, the application itself, and all the information entered by me in this form, excluding the pronouns, countries of residence, and email addresses of the personnel, will be published on Wikimedia Foundation Funds pages on Meta-Wiki and will be made available to the public in perpetuity. To make the results of my research actionable and reusable by the Wikimedia volunteer communities, affiliates and Foundation, I agree that any output of my research will comply with the WMF Open Access Policy. I also confirm that I have read the privacy statement and agree to abide by the WMF Friendly Space Policy and Universal Code of Conduct.