Research:Understanding the use of maintenance templates
This page documents a research project in progress.
Information may be incomplete and change as the project progresses.
Please contact the project lead before formally citing or reusing results from this page.
Maintenance templates are used by Wikipedia editors to identify articles or sections that require improvement. These templates serve as visible tags that highlight specific issues and invite editors to address them. There are hundreds of such templates, each marking a specific type of issue. For example, the list of cleanup templates includes tags for issues such as {{Copy edit}}. In this project, our goal is to obtain a systematic understanding of maintenance templates in Wikipedia.
Motivation
Previous work has found individual templates very useful for building models that support editors, for example in content reliability (e.g., {{Unreferenced}}), edit checks (e.g., {{Peacock}} or {{POV}}), and structured tasks for newcomers (e.g., {{Underlinked}} or {{Orphan}}). Templates provide a rich annotation of article content because: i) they specify whether something needs improvement and what; ii) the associated edits (e.g., those in which a template is removed) show what has to change in the content for the issue to be considered resolved.
In this project, we want to expand previous approaches using individual sets of templates by considering maintenance templates more generally. The goal is to obtain a systematic understanding of how maintenance templates are used in order to develop models for supporting editors. The project will directly contribute to the execution of the AI strategy for contributors:
- The AI Strategy aims to engage a new generation of editors by, e.g., “generating valuable types of suggested edits” or “giving feedback on edits”. Structured tasks have been shown to cause more newcomers to publish a constructive edit. However, only a few structured tasks are available (add-a-link and add-an-image). Templates are a way to scale to many more tasks. Several structured tasks have been identified as relevant product use-cases for AI. The advantage is that tasks derived from templates are not arbitrary but annotated by the community.
- The AI Strategy aims to support patrollers and moderators with AI-assisted workflows so that editors can work effectively to maintain Wikipedia’s quality. Templates make it possible to identify specific issues in articles and/or specific edits. This in turn makes it possible to develop models that automatically identify issues at scale and potentially propose ways to resolve them.
Specifically, we will work towards the following deliverables:
- A dataset about the use of more than 100 maintenance templates across languages
- Insights about the use of a wide range of maintenance templates on Wikipedia
- A model to detect maintenance templates in Wikipedia articles and/or individual edits.
- A model to suggest edits that fix the issues flagged by actual or predicted maintenance templates (probably harder and only feasible for some specific templates)
Method
Generating datasets from templates
Below we propose a draft method for identifying a set of maintenance templates that correspond to specific tasks.
- Start with the cleanup templates in enwiki.[1]
- Get corresponding templates in other languages from interlanguage links.
- Filter cleanup templates that can be mapped to a specific and relevant task.
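The interlanguage-link step above could be sketched as follows. This is a minimal illustration, not the project's pipeline: it assumes the MediaWiki Action API (`action=query` with `prop=langlinks` and `formatversion=2`) as the source of interlanguage links, the helper names `langlinks_url` and `parse_langlinks` are hypothetical, and the sample response is made up so that the parsing can be shown without a network call. Result continuation for templates with many links is not handled.

```python
from urllib.parse import urlencode

API_URL = "https://en.wikipedia.org/w/api.php"


def langlinks_url(template_title: str) -> str:
    """Build an Action API URL that asks for a page's interlanguage links."""
    params = {
        "action": "query",
        "prop": "langlinks",
        "titles": template_title,
        "lllimit": "max",
        "format": "json",
        "formatversion": "2",
    }
    return f"{API_URL}?{urlencode(params)}"


def parse_langlinks(response: dict) -> dict[str, str]:
    """Map language code -> linked page title from a langlinks response."""
    mapping = {}
    for page in response.get("query", {}).get("pages", []):
        for link in page.get("langlinks", []):
            # With formatversion=2 the title is expected under "title";
            # the "*" fallback is a precaution for the legacy key name.
            mapping[link["lang"]] = link.get("title") or link.get("*")
    return mapping


# Made-up response for illustration only; the linked titles are placeholders,
# not the real names of the corresponding templates.
sample_response = {
    "query": {
        "pages": [
            {
                "title": "Template:Copy edit",
                "langlinks": [
                    {"lang": "de", "title": "Vorlage:Platzhalter"},
                    {"lang": "fr", "title": "Modèle:Exemple"},
                ],
            }
        ]
    }
}

print(langlinks_url("Template:Copy edit"))
print(parse_langlinks(sample_response))
```

Running this over every template in the enwiki cleanup list would give one row per template with its counterparts per language, which is the table the filtering step would then operate on.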
A method for generating the dataset could be:
- Identify all current articles with maintenance templates using mediawiki-content-current dumps.
- Identify all revisions where maintenance templates were added or removed using mediawiki-content-history dumps.
- Parse content features of articles from wikitext.
- Parse content features of the diff when templates were added or removed using the mwedittypes library.
- The wiki-reliability project can be used as a reference for this part[2].
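As a rough illustration of the "added or removed" step, the sketch below compares the template names found in two consecutive revisions. It is a simplification under stated assumptions: it uses a regular expression instead of a full wikitext parser or the mwedittypes library named above, it does not resolve template redirects or aliases, and the names `extract_templates`, `template_changes`, and the `tracked` set are hypothetical.

```python
import re

# Captures the template name that follows "{{", up to the first "|" or "}}".
TEMPLATE_NAME = re.compile(r"\{\{\s*([^{}|]+?)\s*(?=\||\}\})")


def normalize(name: str) -> str:
    """Treat underscores as spaces, collapse whitespace, capitalize first letter."""
    name = " ".join(name.replace("_", " ").split())
    return name[:1].upper() + name[1:]


def extract_templates(wikitext: str) -> set[str]:
    """Return the normalized names of all templates transcluded in the text."""
    return {normalize(m.group(1)) for m in TEMPLATE_NAME.finditer(wikitext)}


def template_changes(old_text: str, new_text: str, tracked: set[str]) -> dict:
    """Report which tracked templates were added or removed between revisions."""
    old = extract_templates(old_text) & tracked
    new = extract_templates(new_text) & tracked
    return {"added": new - old, "removed": old - new}


tracked = {"Unreferenced", "Copy edit", "Orphan"}
old_revision = "{{Unreferenced|date=May 2020}}\nSome claim."
new_revision = "Some claim.<ref>Source</ref>\n{{copy_edit}}"

print(template_changes(old_revision, new_revision, tracked))
```

Applied to each pair of consecutive revisions in the history dumps, this yields the add and remove events from which the dataset rows would be built.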
Understanding the usage of templates
Using this dataset, we aim to understand how templates are used:
- What is the coverage of templates across languages? This could help identify which kinds of issues exist in different language editions.
- How many articles currently contain one of the maintenance templates? This can identify the scope of the problem associated with the template.
- (Optional) Which articles contain the templates (topic, quality, etc.)? This provides additional information about where these problems occur.
- How has the number of articles containing maintenance templates changed over time? This indicates whether the problem is growing or diminishing.
- What is the amount or type of content that changes in an article when an issue gets fixed? This can identify how well defined an issue associated with the template is.
- What is the average time it takes to fix an issue (i.e. time between adding/removing the template)? This can identify the difficulty for editors in fixing the issue associated with the template.
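The time-to-fix question could be approached along these lines. The sketch assumes a hypothetical input format, a time-ordered list of `(timestamp, has_template)` pairs per article, and the function name `template_lifetimes` is illustrative. Spans where the template is still present at the end of the history are left out, so a summary over the returned values describes only resolved issues.

```python
from datetime import datetime, timedelta
from statistics import median


def template_lifetimes(revisions):
    """Return the duration of every closed add -> remove span.

    `revisions` is an iterable of (timestamp, has_template) pairs sorted by
    time. A template that is still present at the end is not counted.
    """
    lifetimes = []
    added_at = None
    for timestamp, has_template in revisions:
        if has_template and added_at is None:
            added_at = timestamp  # template appears in this revision
        elif not has_template and added_at is not None:
            lifetimes.append(timestamp - added_at)  # template was removed
            added_at = None
    return lifetimes


# Illustrative history for a single article and a single template.
history = [
    (datetime(2020, 1, 1), False),
    (datetime(2020, 1, 10), True),
    (datetime(2020, 1, 15), True),
    (datetime(2020, 2, 9), False),
    (datetime(2020, 3, 1), True),  # still open, so not counted
]

lifetimes = template_lifetimes(history)
median_days = median(span.days for span in lifetimes)
print(lifetimes, median_days)
```

The median is used here instead of the mean because a few very long-lived templates could dominate an average; which summary statistic suits the analysis best is an open choice.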
Identifying issues
We aim to develop a model (likely based on large language models) that predicts whether a given article revision should include a specific maintenance template. It is very likely that some templates will be easier to predict than others. Some open questions to consider:
- Models: should each template have its own model or should there be a single model for all templates? Or something in between; e.g., individual models for core policies and a combined model for others?
- Data: it will be crucial to evaluate on fresh data created after the LLM was trained, as previous experiments showed that performance can drop substantially on new data.
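One simple way to set up the fresh-data evaluation is a split by date. The sketch below is only an illustration: the `(revision_date, label)` example format, the function name `temporal_split`, and the cutoff date are all assumptions, and the actual cutoff would depend on the model being evaluated.

```python
from datetime import date


def temporal_split(examples, cutoff):
    """Split (revision_date, label) examples into (seen, fresh) lists.

    Examples dated after `cutoff` count as fresh, i.e. they could not have
    been part of the model's training data.
    """
    seen, fresh = [], []
    for example in examples:
        (fresh if example[0] > cutoff else seen).append(example)
    return seen, fresh


# Hypothetical labeled revisions: (date of the revision, template present?).
examples = [
    (date(2022, 5, 1), True),
    (date(2023, 11, 20), False),
    (date(2024, 8, 3), True),
]

seen, fresh = temporal_split(examples, cutoff=date(2024, 1, 1))
print(len(seen), len(fresh))
```

Reporting metrics separately on the two parts would make any drop on new data directly visible.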
The models can be utilized in different forms for different use cases:
- Identifying issues in articles. This could feed into newcomer tasks.
- Detecting potential issues in edits, which could inform edit-check tools.
- Identifying articles where an existing template is no longer needed.
- Identifying specific regions in the article (section/paragraph/sentence/word) that are associated with the template. For some templates, the issues might be more localized than others.
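For the localization use case, a first and very coarse signal is the section in which a template was placed. The sketch below groups template names by section heading. It is a simplification: it relies on regular expressions instead of a wikitext parser, the function name `templates_by_section` is hypothetical, and the position of a tag is only a proxy for where the underlying issue sits.

```python
import re

# Matches wikitext headings such as "== History ==" or "===Sub===".
HEADING = re.compile(r"^(={2,6})[ \t]*(.+?)[ \t]*\1[ \t]*$", re.MULTILINE)
# Captures the template name that follows "{{", up to the first "|" or "}}".
TEMPLATE_NAME = re.compile(r"\{\{\s*([^{}|]+?)\s*(?=\||\}\})")


def templates_by_section(wikitext: str) -> list[tuple[str, set[str]]]:
    """Return (section title, template names) pairs in document order.

    The lead section, which has no heading, gets the empty string as title.
    """
    sections = []
    title, start = "", 0
    for match in HEADING.finditer(wikitext):
        sections.append((title, wikitext[start:match.start()]))
        title, start = match.group(2), match.end()
    sections.append((title, wikitext[start:]))
    return [(t, set(TEMPLATE_NAME.findall(body))) for t, body in sections]


article = (
    "{{Orphan}}\nLead text.\n"
    "== History ==\n{{Unreferenced section|date=May 2020}}\nOld claim.\n"
    "== Legacy ==\nMore text."
)

for section_title, names in templates_by_section(article):
    print(repr(section_title), names)
```

Finer-grained localization (paragraph, sentence, or word) would need signals beyond tag placement, for example the text that changes in the edit that removes the template.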
Suggesting improvements
Develop a model capable of suggesting edits or revisions that resolve the identified issue (i.e., leading to the removal or replacement of the associated template).
Results
Work in progress. Results will be shared when available.
References
- Maik Anderka and Benno Stein. 2012. A breakdown of quality flaws in Wikipedia. In Proceedings of the 2nd Joint WICOW/AIRWeb Workshop on Web Quality (WebQuality '12). Association for Computing Machinery, New York, NY, USA, 11–18. https://doi.org/10.1145/2184305.2184309
- KayYen Wong, Miriam Redi, and Diego Saez-Trumper. 2021. Wiki-Reliability: A Large Scale Dataset for Content Reliability on Wikipedia. In Proceedings of the 44th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR '21). Association for Computing Machinery, New York, NY, USA, 2437–2442. https://doi.org/10.1145/3404835.3463253