
Research:Crowdsourced Content Moderation

From Meta, a Wikimedia project coordination wiki
Created
16:01, 6 May 2025 (UTC)
Duration:  2025-01 – ??
Wikipedia, content moderation, crowdsourcing

This page documents a research project in progress.
Information may be incomplete and change as the project progresses.
Please contact the project lead before formally citing or reusing results from this page.


This work advances the initial effort to develop a working definition for moderation activity and moderators.

Moderation is the social, technical and governance work needed to sustain an online community. This includes the creation, revision and enforcement of community values, rules, and norms. On Wikipedia, moderation takes various forms to ensure that content and practices align with established policies. One particularly important moderation form is the crowdsourced use of templates, which help inform editors and readers about articles requiring maintenance, such as those needing additional references, formatting, or other improvements. Here, we present our ongoing work examining data from multiple language editions of Wikipedia to develop a taxonomy of article maintenance templates, expanding on previous approaches based exclusively on English Wikipedia. We conclude by outlining potential directions for future research.

Introduction


Wikipedia is a very popular, free online encyclopedia created and maintained by volunteers. Anyone can edit its content, but all information must follow policies established by the community through a decentralized governance system[1], such as neutrality, verifiability, and no original research. For this reason, volunteers across the different language editions take on moderation tasks to ensure that content follows Wikipedia's policies and that the integrity of knowledge is preserved.

Different types of content moderation tasks exist on Wikipedia, with important differences compared to social media platforms[2]. Clearly capturable moderation activities, such as the removal of content (in the form of page deletion) or the exclusion of users (in the form of blocks), can only be carried out by Wikipedia editors with elevated user rights. These editors, referred to as admins, are selected periodically by the community[3]. In contrast, many other content moderation tasks are crowdsourced, allowing almost any community member to contribute. These forms of moderation, accessible to any user, tend to be actions that are less easily captured. They generally fall into two categories: patrolling and maintenance.

Patrolling plays a crucial role in combating disinformation, copyright infringement, libel, slander, personal threats, and other forms of vandalism[4]. Research has explored various mechanisms such as content flagging and reverts. Compared to reverts, which negatively impact newcomer retention[5], templates highlight issues that contribute to article improvement but do not necessarily affect the behavior of the editors they address[6]. Templates are pieces of wiki code that generate visual markers for several purposes, including flagging issues for other editors to fix[7]. The two most common forms of article maintenance templates are message boxes and inline-cleanup tags. Both serve the same purpose, but message boxes appear at the top of an article or section, whereas inline tags are placed within the text. For example, to address promotional language ("peacock terms"), English Wikipedia moderators use either a message box or an inline-cleanup tag.

Understanding how article maintenance templates are used in practice, and whether this tagging is widespread across various language communities, is essential for developing advanced moderation tooling. Templates have been used to test the placement and design of trust indicators to help readers assess content quality[8] or as a source of tasks for editors through recommender systems[9][10]. Another key application of Wikipedia templates is in machine learning models designed to assist editors. ORES, the first machine learning scoring service on Wikipedia, led to the development of specific templates that editors used to request new features[11]. Models designed to detect changes in the quality of Wikipedia articles have also included occurrences of templates as a feature[12]. More recently, templates have been compiled in the process of creating datasets of Wikipedia articles annotated with content reliability issues[13][14][15], which can be used to train patrolling machine learning models.

Given the growing interest in crowdsourced content moderation in social media platforms[16][17], it is important to understand how it works on platforms with a long history of community moderation like Wikipedia. Our work expands previous analyses of Wikipedia article maintenance templates by examining data from multiple language editions.


Moderation Work


Academic research on volunteer moderation on Wikipedia has tended to focus almost exclusively on the role of administrators as a proxy for "moderator". There are strong similarities between the position of moderator on other platforms reliant on volunteer moderation labor (such as Reddit, Twitch, or Discord) and Wikipedia administrators. Indeed, searching for the term "moderator" in English Wikipedia's Wikipedia namespace redirects readers to the page about administrators.

However, moderation work on Wikipedia extends beyond the activities of administrators alone. Both the social modeling[18][19] and governance work[20] of moderation can be carried out by users who are not administrators. We focus on article maintenance as an aspect of moderation work on Wikipedia. Article maintenance is governed by community-created policies, whose complexity is enabled by distributing the work needed to write and update them[21]. The work of enforcing these policies is likewise distributed. We use the framework of crowdsourced moderation to better understand this form of moderation. Crowdsourced moderation is generally characterized by allowing a large population of users to engage in moderation activity, as opposed to allowing a handful of trusted users with elevated user rights to carry out moderation on behalf of the greater community. While "pure" versions of either model are rare in practice, crowdsourced moderation has been studied in Slashdot's rating system[22], in crowd-worker political fact-checking[23], and in the context of COVID-19 disinformation on Reddit[24]. For this project, we propose an intersection of volunteer moderation and crowdsourced moderation: moderation actions carried out by volunteer members of the community being moderated, which do not require extra permissions beyond basic user rights.

Wikipedia Templates


Wikipedia templates have attracted scholarly interest, particularly with the rise of research on Wikipedia itself. One of the earliest studies examined templating mechanisms in popular state-of-the-art wiki engines[25]. Subsequent research explored the use of specific templates in different language editions of Wikipedia. For instance, the {{complex}} template in Simple English Wikipedia has been analyzed for its role in flagging articles that fail to meet simplicity and readability standards[26]. Another study, focused on the {{NPOV}} template in English Wikipedia, found that editing activity increases right after the template is added and that articles disputed by a few editors before being tagged as controversial take longer to resolve[27].

To the best of our knowledge, the most comprehensive analysis of Wikipedia article maintenance templates identified up to 388 quality flaws categorized into 12 types: verifiability, wiki tech, general cleanup, expand, unwanted content, style of writing, neutrality, merge, cleanup of specific subjects, structure, time-sensitive, and miscellaneous[28]. While this classification offers a broad overview of maintenance templates in Wikipedia articles, the findings are based exclusively on the English language edition. Differences in templating practices can be expected between language editions, motivating efforts in cross-lingual template alignment[29]. Given the growing interest in multilingual Wikipedia research[30], it remains uncertain how well these categories represent the many existing language editions.

Taxonomy Development in Wikimedia Research


Besides the aforementioned categories of English Wikipedia quality flaws, this work is inspired by previous research efforts aimed at categorizing different areas of interest within Wikimedia projects. A paradigmatic example is the taxonomy of knowledge gaps[31]. By analyzing over 250 references from researchers and community members, its authors identified gaps in readership, contributorship, and content. This taxonomy later served as a foundation for categorizing knowledge integrity risks in Wikipedia[32], which distinguishes between internal community risks, internal content risks, and external risks. In fact, knowledge integrity has been a focal area for multiple taxonomy-building efforts by researchers interested in Wikimedia projects. For instance, other researchers employed surveys and interviews to develop a taxonomy of the mechanisms that readers use to assess the credibility of Wikipedia, including features of articles, readers, and Wikipedia itself[33]. Another closely related example, which is especially relevant to the use of templates, is the taxonomy of reasons why inline citations are required on English Wikipedia[34]. All these studies highlight the increasing research interest in systematically classifying different aspects of Wikipedia's content and activity.

Data Exploration


To guide the development of the taxonomy of article maintenance templates of Wikipedia, we begin by analyzing a month of editing activity across multiple language editions. In particular, we examine HTML data of revisions from October 2024 using the Wikimedia Enterprise API. We opt for HTML data over traditional wikitext data because they allow us to directly capture templates rendered as message boxes and inline-cleanup tags. This method eliminates the need to pre-identify specific templates for content moderation, as required in earlier approaches[35]. Furthermore, HTML data have been preferred for capturing how templates incorporate elements that remain hidden when working only with wikitext, such as links, tables, images, or references[36].
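As an illustration of why rendered HTML makes this capture straightforward, the sketch below counts the two template forms in a revision's HTML using only the Python standard library. The CSS class names ("ambox" for message boxes, "Inline-Template" for inline-cleanup tags) follow English Wikipedia's HTML conventions and are an assumption for other editions; our pipeline relies on mwparserfromhtml rather than this hand-rolled parser.

```python
from html.parser import HTMLParser

class MaintenanceTemplateFinder(HTMLParser):
    """Count elements that look like article maintenance templates.

    Assumes English Wikipedia's HTML conventions: message boxes carry the
    CSS class "ambox" and inline-cleanup tags the class "Inline-Template".
    Other language editions may use different class names.
    """

    def __init__(self):
        super().__init__()
        self.message_boxes = 0
        self.inline_tags = 0

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        if "ambox" in classes:
            self.message_boxes += 1
        elif "Inline-Template" in classes:
            self.inline_tags += 1

html = (
    '<table class="box-More_citations_needed metadata ambox"></table>'
    '<p>A claim.<sup class="noprint Inline-Template Template-Fact">'
    '[citation needed]</sup></p>'
)
finder = MaintenanceTemplateFinder()
finder.feed(html)
print(finder.message_boxes, finder.inline_tags)  # 1 1
```

The same scan over wikitext would instead require a pre-compiled list of template names, which is exactly the dependency the HTML approach avoids.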

Our parsing process is as follows. First, we retrieve the HTML content of each revision and its parent revision. Next, we utilize the mwparserfromhtml Python library, which efficiently extracts major article metadata with their respective type and status information. To identify article maintenance templates, we focus on message boxes and inline-cleanup tags. Finally, we compare the templates found in each revision and its parent revision to determine when they were added or removed.
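The final comparison step can be sketched as a simple set difference over template names; the extraction of those names from HTML (which we delegate to mwparserfromhtml) is abstracted away here, and the example template names are illustrative.

```python
def template_diff(parent_templates, revision_templates):
    """Return (added, removed) maintenance templates for one revision.

    Inputs are the template names extracted from the HTML of a revision
    and of its parent revision; names are compared as sets, so repeated
    transclusions of the same template count once.
    """
    parent, current = set(parent_templates), set(revision_templates)
    return sorted(current - parent), sorted(parent - current)

added, removed = template_diff(
    ["Citation needed", "Orphan"],
    ["Citation needed", "Peacock"],
)
print(added, removed)  # ['Peacock'] ['Orphan']
```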

For our data exploration, we focus on a selection of widely popular Wikipedia language editions: English, French, German, Spanish, Japanese, Russian, Italian, Chinese, Polish, Dutch, and Swedish. The table below presents several metrics derived from the use of message boxes and inline-cleanup tags in these languages. As expected, English Wikipedia, the most popular language edition, exhibits the highest values across all metrics. Surprisingly, the values for German Wikipedia are notably low. These differences could be attributed to the distinct nature of moderation practices in that language edition, such as the extensive use of the Flagged Revisions extension, which restricts the visibility of new revisions until they are reviewed by experienced editors, as well as specific policies that limit the use of article maintenance templates. The high values observed for Russian Wikipedia are also noteworthy, likely explained by bot-driven moderation activity.

Metrics for crowdsourced content moderation on Wikipedia with article maintenance templates across language editions (ISO 639 codes) in October 2024. Metrics include the number of templates added (Tₐ), templates removed (Tᵣ), revisions with a template added or removed (Rₜ), pages with a template added or removed (Pₜ).
lang Tₐ Tᵣ Rₜ Pₜ
arz 134 30 132 123
de 99 278 337 299
en 95,359 86,486 106,109 71,358
es 5,173 3,856 5,575 4,167
fr 6,831 5,583 7,719 5,275
it 2,163 1,149 2,127 1,675
ja 10,010 7,322 11,797 9,283
nl 633 403 764 601
pl 649 549 773 671
ru 23,173 23,937 28,359 24,446
sv 2,021 2,412 3,579 3,273
zh 6,991 5,068 7,469 5,815
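Assuming one record per revision that touched a maintenance template, the table's four metrics can be aggregated as follows; the (lang, page_id, rev_id, n_added, n_removed) record layout is an assumption for illustration, not the schema of our pipeline.

```python
from collections import defaultdict

def language_metrics(events):
    """Aggregate per-revision template diffs into per-language metrics.

    `events` is an iterable of (lang, page_id, rev_id, n_added, n_removed)
    tuples, one per revision that added or removed at least one article
    maintenance template. Returns (Ta, Tr, Rt, Pt) per language: templates
    added, templates removed, distinct revisions, and distinct pages.
    """
    stats = defaultdict(lambda: {"Ta": 0, "Tr": 0, "revs": set(), "pages": set()})
    for lang, page_id, rev_id, n_added, n_removed in events:
        s = stats[lang]
        s["Ta"] += n_added
        s["Tr"] += n_removed
        s["revs"].add(rev_id)
        s["pages"].add(page_id)
    return {
        lang: (s["Ta"], s["Tr"], len(s["revs"]), len(s["pages"]))
        for lang, s in stats.items()
    }

metrics = language_metrics([
    ("nl", 1, 10, 2, 0),  # two templates added to page 1
    ("nl", 1, 11, 0, 1),  # one removed from page 1 in a later revision
    ("nl", 2, 12, 1, 1),  # one swapped on page 2
])
print(metrics["nl"])  # (3, 2, 3, 2)
```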

To gain deeper insight into the types of editors involved in crowdsourced content moderation, we analyze the distribution of revisions adding or removing article maintenance templates by the edit count buckets of editors. We classify bots and IP editors as separate categories from registered human editors because of the distinct nature of their editing activity. Results are shown in the figure below and reveal that content moderation revisions involving templates are predominantly performed by experienced users with a high number of prior edits, a pattern consistent across language editions. Bots perform the majority of moderation revisions in Russian Wikipedia and play an important role in Dutch, Swedish, and Chinese Wikipedia as well. Additionally, IP editors contribute a notable share of content moderation revisions in certain language editions, such as Japanese and Italian.
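A minimal sketch of the bucketing logic follows; the power-of-ten bucket boundaries and the simple flags for bots and IP editors are assumptions chosen to mirror conventions common in Wikimedia research, not necessarily the exact buckets used in the figure.

```python
def edit_count_bucket(user):
    """Assign an editor to an experience category.

    Bots and IP editors are kept apart from registered human editors
    because of the distinct nature of their editing activity; registered
    editors are bucketed by prior edit count (boundaries assumed).
    """
    if user.get("is_bot"):
        return "bot"
    if user.get("is_anonymous"):
        return "ip"
    edits = user.get("edit_count", 0)
    for upper in (10, 100, 1000, 10000):
        if edits < upper:
            return f"<{upper} edits"
    return "10000+ edits"

print(edit_count_bucket({"is_bot": True}))       # bot
print(edit_count_bucket({"edit_count": 57}))     # <100 edits
print(edit_count_bucket({"edit_count": 20000}))  # 10000+ edits
```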

Percentage of crowdsourced moderation revisions by edit count bucket.

User expertise can be characterized not only by the edit record, but also by the advanced rights groups to which editors belong. The figure below shows the distribution of revisions involving moderation by these groups. For readability, we only show user groups that account for more than 10% of edits in at least one language edition. Also, as editors can belong to more than one user group, the total percentage in a given language edition may exceed 100%. In contrast to the previous analysis, we do not observe a pattern consistent across language editions. Many moderation revisions in German, Polish, and Russian Wikipedia are made by users with editor and autoreview rights, which are required to operate the aforementioned Flagged Revisions extension. We also observe a substantial percentage of edits by users with extendedconfirmed rights, which aligns with the analysis of edit count buckets, as this right is granted in specific language editions after 30 days and at least 500 edits. Wikipedia admins, that is, editors with sysop rights, are not particularly involved in most languages, except for Swedish. Notably, in French and Spanish Wikipedia, more than half of revisions adding or removing article maintenance templates are made by users without advanced rights (none).
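Because a single revision counts toward every user group its author belongs to, per-group percentages are computed as in the sketch below, with "none" standing in for editors without advanced rights; the input layout is an illustrative assumption.

```python
from collections import Counter

def group_percentages(revisions):
    """Percentage of moderation revisions attributable to each user group.

    `revisions` holds one list of user groups per revision; an empty list
    means the author had no advanced rights and counts as "none". Since a
    revision counts toward every group of its author, the percentages for
    one language edition can sum to more than 100.
    """
    counts = Counter()
    for groups in revisions:
        for group in groups or ["none"]:
            counts[group] += 1
    total = len(revisions)
    return {group: 100 * n / total for group, n in counts.items()}

pct = group_percentages([["sysop", "autoreview"], ["autoreview"], []])
print({g: round(p, 1) for g, p in sorted(pct.items())})
# the three shares sum to more than 100 because of the multi-group author
```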

Percentage of crowdsourced moderation revisions by user group.

Taxonomy of Article Maintenance Templates


We further organize the article maintenance templates obtained from our dataset into the following categories, based on content policies that were present across all studied Wikipedia language editions. Template names retrieved from our HTML data are categorized via regular expressions, based on the template documentation available on the corresponding language edition.

  • Reliability: templates related to the principle of source reliability.
  • Verifiability: templates related to the principle of source or claim verifiability.
  • Original research: templates related to the prohibition of original research as a valid source.
  • Citation: other templates related to citation policies.
  • Translation: templates around requesting, expanding, or correcting translations.
  • Collaboration: templates indicating work-in-progress on a given page, or those that invite another user's input or contributions.
  • Maintenance: templates related to the organization of articles, including page moves, renames, merges, deletions, disambiguation, categorization, dead links, and orphan articles.
  • Clarity: templates that point out issues of clarity, or any situation where the text is difficult to understand.
  • Accuracy: templates that point out issues with the factual accuracy of claims within a page.
  • Formatting: templates that point out non-standard or incorrect formatting of pages and sections.
  • Grammar: templates for copy-editing, minor typographic errors, spelling and grammar issues.
  • Style: other templates related to style, tone, and incorrect or non-standard formatting.
  • Language bias: templates that point out biased language, such as non-neutral, fannish or in-universe points of view, editorializing, or promotional language.
  • Content bias: templates that point out biases in the structure of an article beyond language, such as unbalanced representations of controversies, or undue weight given to particular perspectives.
  • Neutrality: templates that point out issues with bias and neutrality in a page, not covered by "language bias" or "content bias".
  • Notability: templates about notability, or whether or not a given subject is suitable to be on Wikipedia.
  • Copyright: templates about copyright concerns or violations.
  • Paid editing: templates about undisclosed conflicts of interest or undisclosed paid editing, grouped together due to the similarity of policies in addressing both situations.
  • Multiple: templates that are routinely used to indicate issues across two or more high-level categories, or placeholder templates that nevertheless indicate some kind of problem within the flagged text.
  • Miscellaneous: all other templates in the dataset. The most notable examples were those used to format pages in and of themselves, such as infoboxes, reference formatting templates, or purely communicative templates used to convey warnings or commendations.
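The regular-expression categorization described above can be sketched as a first-match lookup. The patterns below are illustrative stand-ins for a few English-language template names; the actual rules are derived from each language edition's template documentation and cover far more names.

```python
import re

# Illustrative first-match rules; real patterns come from each language
# edition's template documentation.
CATEGORY_PATTERNS = [
    ("citation", re.compile(r"citation needed|unreferenced|\bcn\b", re.I)),
    ("original research", re.compile(r"original research|synthesis", re.I)),
    ("language bias", re.compile(r"peacock|weasel|advert", re.I)),
    ("notability", re.compile(r"notability", re.I)),
]

def categorize(template_name):
    """Return the first category whose pattern matches, else "miscellaneous"."""
    for category, pattern in CATEGORY_PATTERNS:
        if pattern.search(template_name):
            return category
    return "miscellaneous"

print(categorize("Citation needed"))  # citation
print(categorize("Peacock"))          # language bias
print(categorize("Infobox person"))   # miscellaneous
```

First-match ordering means a template matching several rules lands in the earliest category, which is why routinely ambiguous templates get the dedicated "multiple" category instead.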


English Wikipedia had the largest number of unique non-miscellaneous templates, followed by Chinese Wikipedia and Russian Wikipedia, as shown in the table below. Confirming our earlier findings, German Wikipedia is a clear outlier that uses very few article maintenance templates for its size and activity. Every language edition includes some untranslated templates, which are aliases for localized policy templates. The most common untranslated templates used across all language editions are derivatives of the English Wikipedia citation needed template, such as cn (an abbreviation of citation needed) or fact. In language editions other than English, these templates redirect to the local equivalent of the citation needed template. Chinese Wikipedia uses the highest proportion of non-local-language templates, with the majority of unique non-miscellaneous templates in the dataset being in Latin characters (n=179, 68.6%) versus non-Latin characters (n=82, 31.4%). This is presumably done to avoid privileging either the Traditional or Simplified Chinese writing system.

Number of unique, non-miscellaneous article maintenance templates (Uₜ) across language editions (ISO 639 codes).
lang de en es fr it ja nl pl ru sv zh
Uₜ 4 944 128 110 16 150 12 22 247 144 261


The most commonly used templates across all language editions, aside from German Wikipedia, are multiple-issue templates or citation templates, as shown in the table below. They are also some of the least specific templates in those categories, even in language editions that have many specialized templates for specific policy violations (like French, Russian, or English Wikipedia).

When categorizing these templates, we found many situations where certain wikis' use of templates does not align with the others. For instance, French Wikipedia's Interprétation personnelle template is written for cases where original research is used, where unpublished or unverifiable sources are used, or when a claim comes from personal interpretation. On English Wikipedia, the same situations would be covered by the separate templates Original research, Opinion, Synthesis, or Self-published source.

Generally, across all wikis aside from German, we see that less specific categories are more likely to be added or removed. This suggests that templates do not serve as markers of specific policy violations. These templates indicate what might need to be fixed, where that fix should be applied, and possibly how the fix should be done, but not always why the fix is needed in the first place.

Metrics on the use of templates by category, across all language editions excluding German Wikipedia. Metrics include count of templates in the category (C), number of templates from the category added or removed (T), revisions with a template in that category added or removed (Rₜ), and pages with a template in that category added or removed (Pₜ).
category C T Rₜ Pₜ
citation 1,430 226,434 165,671 149,864
multiple 296 119,027 68,550 63,481
maintenance 1,068 93,444 80,247 77,657
other 5,762 75,189 52,739 47,953
collaboration 415 26,307 21,657 19,779
style 992 22,008 19,224 18,205
clarity 907 20,742 18,265 17,357
verifiability 237 10,420 8,598 8,108
translation 464 9,559 8,457 7,934
formatting 113 7,273 5,261 4,939
notability 104 6,549 6,478 6,294
reliability 146 4,510 3,344 3,125
original research 95 2,803 2,166 2,061
copyright 59 2,009 1,975 1,870
accuracy 74 1,723 1,700 1,658
grammar 61 1,190 1,172 1,132
language bias 68 808 802 780
paid editing 32 754 754 734
neutrality 59 444 418 400
content bias 16 87 87 83

Conclusion


We have presented a preliminary look at Wikipedia article maintenance templates. By exploring data from multiple language editions, we have organized templates into 20 distinct categories. Some of these categories (e.g., verifiability, style of writing, neutrality) were already identified in earlier work focused on English Wikipedia templates. Our work expands on this by introducing additional categories, like notability and paid editing, which are tied to policies crucial for maintaining knowledge integrity in Wikipedia.

Recent research has highlighted variations in rules and rule-making across major language editions[37]. Our findings further emphasize discrepancies in template usage and its relationship to policies across language communities. This not only provides insight into cultural differences between these communities but also presents an opportunity for tools that could facilitate better alignment and collaboration.

Our analysis has some limitations, the most significant being the limited time frame of the data we examined, which only covers revisions from October 2024. This limitation is due to the data engineering resources available at the time the analysis was conducted. However, we anticipate that upcoming data pipelines will not only enable us to gather more data but also help address other important questions, e.g., the time it takes for different types of templates to be resolved[38] and the extent to which maintenance templates contribute to improvements in article quality[39] and readability[40].

Another notable limitation is that our observations of moderation activity do not provide insight into how editors learn to use article maintenance templates. Research indicates that Wikipedia learning is experiential, with editors facing specific expectations and rules that require developing certain skills to contribute effectively[41]. Therefore, future work should include complementary methods, like surveys and interviews, to better understand how moderators develop their skills and face challenges.

Next Directions


References

  1. Forte, Andrea; Bruckman, Amy (2008). "Scaling Consensus: Increasing Decentralization in Wikipedia Governance". Proceedings of the 41st Annual Hawaii International Conference on System Sciences (HICSS 2008). IEEE. pp. 157–157. 
  2. Yasseri, Taha; Menczer, Filippo (2023). "Can Crowdsourcing Rescue the Social Marketplace of Ideas?". Communications of the ACM (ACM) 66 (9): 42–45. 
  3. Leskovec, Jure; Huttenlocher, Daniel; Kleinberg, Jon (2010). "Governance in Social Media: A case study of the Wikipedia promotion process". Proceedings of the International AAAI Conference on Web and Social Media 4 (1). pp. 98–105. 
  4. Morgan, Jonathan (2019). "Patrolling on Wikipedia". 
  5. Halfaker, Aaron; Kittur, Aniket; Riedl, John (2011). "Don't bite the newbies: how reverts affect the quantity and quality of Wikipedia work". Proceedings of the 7th International Symposium on Wikis and Open Collaboration. pp. 163–172. 
  6. Pavalanathan, Umashanthi; Han, Xiaochuang; Eisenstein, Jacob (2018). "Mind Your POV: Convergence of Articles and Editors Towards Wikipedia's Neutrality Norm". Proceedings of the ACM on Human-Computer Interaction (ACM New York, NY, USA) 2 (CSCW): 1–23. 
  7. Viegas, Fernanda B; Wattenberg, Martin; McKeon, Matthew M (2007). "The Hidden Order of Wikipedia". Online Communities and Social Computing: Second International Conference, OCSC 2007, Held as Part of HCI International 2007, Beijing, China, July 22-27, 2007. Proceedings 2. Springer. pp. 445–454. 
  8. Kuznetsov, Andrew; Novotny, Margeigh; Klein, Jessica; Saez-Trumper, Diego; Kittur, Aniket (2022). "Templates and Trust-o-meters: Towards a widely deployable indicator of trust in Wikipedia". Proceedings of the 2022 CHI Conference on Human Factors in Computing Systems. pp. 1–17. 
  9. Cosley, Dan; Frankowski, Dan; Terveen, Loren; Riedl, John (2007). "SuggestBot: Using Intelligent Task Routing to Help People Find Work in Wikipedia". Proceedings of the 12th international conference on Intelligent user interfaces. pp. 32–41. 
  10. Warncke-Wang, Morten; Ho, Rita; Miller, Marshall; Johnson, Isaac (2023). "Increasing Participation in Peer Production Communities with the Newcomer Homepage". Proceedings of the ACM on Human-Computer Interaction (ACM New York, NY, USA) 7 (CSCW2): 1–26. 
  11. Halfaker, Aaron; Geiger, R Stuart (2020). "ORES: Lowering Barriers with Participatory Machine Learning in Wikipedia". Proceedings of the ACM on Human-Computer Interaction (ACM New York, NY, USA) 4 (CSCW2): 1–37. 
  12. Das, Paramita; Guda, Bhanu Prakash Reddy; Seelaboyina, Sasi Bhushan; Sarkar, Soumya; Mukherjee, Animesh (2022). "Quality Change: Norm or Exception? Measurement, Analysis and Detection of Quality Change in Wikipedia". Proceedings of the ACM on Human-Computer Interaction (ACM New York, NY, USA) 6 (CSCW1): 1–36. 
  13. Bertsch, Amanda; Bethard, Steven (2021). "Detection of Puffery on the English Wikipedia". Proceedings of the Seventh Workshop on Noisy User-generated Text (W-NUT 2021). pp. 329–333. 
  14. De Kock, Christine; Vlachos, Andreas (2022). "Leveraging Wikipedia article evolution for promotional tone detection". Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). pp. 5601–5613. 
  15. Ando, Kenichiro; Sekine, Satoshi; Komachi, Mamoru (2024). "WikiSQE: A Large-Scale Dataset for Sentence Quality Estimation in Wikipedia". Proceedings of the AAAI Conference on Artificial Intelligence 38 (16). pp. 17656–17663. 
  16. Chuai, Yuwei; Tian, Haoye; Pröllochs, Nicolas; Lenzini, Gabriele (2024). "Did the Roll-Out of Community Notes Reduce Engagement With Misinformation on X/Twitter?". Proceedings of the ACM on Human-Computer Interaction (ACM New York, NY, USA) 8 (CSCW2): 1–52. 
  17. Matamoros Fernandez, Ariadna; Jude, Nadia Alana (2025). "The importance of centering harm in data infrastructures for ’soft moderation’: X’s Community Notes as a case study". New Media and Society (SAGE Publications Ltd). 
  18. Seering, Joseph; Kraut, Robert; Dabbish, Laura (2017). "Shaping Pro and Anti-Social Behavior on Twitch Through Moderation and Example-Setting". Proceedings of the 2017 ACM conference on computer supported cooperative work and social computing. pp. 111–125. 
  19. Matias, J Nathan (2019). "Preventing harassment and increasing group participation through social norms in 2,190 online science discussions". Proceedings of the National Academy of Sciences (National Academy of Sciences) 116 (20): 9785–9789. 
  20. Matias, J Nathan (2019). "The civic labor of volunteer moderators online". Social Media + Society (SAGE Publications Sage UK: London, England) 5 (2): 2056305119836778. 
  21. Butler, Brian; Joyce, Elisabeth; Pike, Jacqueline (2008). "Don't look now, but we've created a bureaucracy: the nature and roles of policies and rules in wikipedia". Proceedings of the SIGCHI conference on human factors in computing systems. pp. 1101–1110. 
  22. Lampe, Cliff; Zube, Paul; Lee, Jusil; Park, Chul Hyun; Johnston, Erik (2014). "Crowdsourcing civility: A natural experiment examining the effects of distributed moderation in online forums". Government Information Quarterly 31 (2): 317–326. ISSN 0740-624X. doi:10.1016/j.giq.2013.11.005. 
  23. Thebault-Spieker, Jacob; Venkatagiri, Sukrit; Mine, Naomi; Luther, Kurt (2023). "Diverse Perspectives Can Mitigate Political Bias in Crowdsourced Content Moderation". 2023 ACM Conference on Fairness, Accountability, and Transparency. pp. 1280–1291. doi:10.1145/3593013.3594080. 
  24. Iqbal, Waleed; Arshad, Muhammad Haseeb; Tyson, Gareth; Castro, Ignacio (2022). "Exploring Crowdsourced Content Moderation Through Lens of Reddit during COVID-19". Proceedings of the 17th Asian Internet Engineering Conference. pp. 26–35. ISBN 978-1-4503-9981-4. doi:10.1145/3570748.3570753. 
  25. Di Iorio, Angelo; Vitali, Fabio; Zacchiroli, Stefano (2008). "Wiki Content Templating". Proceedings of the 17th International Conference on World Wide Web. pp. 615–624. 
  26. Gaio, Loris; den Besten, Matthijs; Rossi, Alessandro; Dalle, Jean-Michel (2009). "Wikibugs: Using Template Messages in Open Content Collections". Proceedings of the 5th International Symposium on Wikis and Open Collaboration. pp. 1–7. 
  27. Rossi, Alessandro; Gaio, Loris; den Besten, Matthijs; Dalle, Jean-Michel (2010). "Coordination and Division of Labor in Open Content Communities: The Role of Template Messages in Wikipedia". 2010 43rd Hawaii International Conference on System Sciences. pp. 1–10. 
  28. Anderka, Maik; Stein, Benno (2012). "A breakdown of quality flaws in Wikipedia". Proceedings of the 2nd joint WICOW/AIRWeb workshop on web quality. pp. 11–18. 
  29. Bouma, Gosse; Duarte, Sergio; Islam, Zahurul (2009). "Cross-lingual Alignment and Completion of Wikipedia Templates". Proceedings of the third international workshop on cross lingual information access: Addressing the information need of multilingual societies (CLIAWS3). pp. 21–29. 
  30. Johnson, Isaac; Lescak, Emily (2022). "Considerations for Multilingual Wikipedia Research". arXiv. 
  31. Redi, Miriam; Gerlach, Martin; Johnson, Isaac; Morgan, Jonathan; Zia, Leila (2020). "A Taxonomy of Knowledge Gaps for Wikimedia Projects (second draft)". arXiv. 
  32. Aragón, Pablo; Sáez-Trumper, Diego (2021). "A preliminary approach to knowledge integrity risk assessment in Wikipedia projects". arXiv. 
  33. Elmimouni, Houda; Forte, Andrea; Morgan, Jonathan (2022). "Why People Trust Wikipedia Articles: Credibility Assessment Strategies Used by Readers". Proceedings of the 18th International Symposium on Open Collaboration. pp. 1–10. 
  34. Redi, Miriam; Fetahu, Besnik; Morgan, Jonathan; Taraborelli, Dario (2019). "Citation Needed: A Taxonomy and Algorithmic Assessment of Wikipedia's Verifiability". The World Wide Web Conference. pp. 1567–1578. 
  35. Wong, KayYen; Redi, Miriam; Saez-Trumper, Diego (2021). "Wiki-Reliability: A Large Scale Dataset for Content Reliability on Wikipedia". Proceedings of the 44th International ACM SIGIR Conference on Research and Development in Information Retrieval. pp. 2437–2442. 
  36. Mitrevski, Blagoj; Piccardi, Tiziano; West, Robert (2020). "WikiHist.html: English Wikipedia's Full Revision History in HTML Format". Proceedings of the International AAAI Conference on Web and Social Media 14. pp. 878–884. 
  37. Hwang, Sohyeon; Shaw, Aaron (2022). "Rules and Rule-Making in the Five Largest Wikipedias". Proceedings of the International AAAI Conference on Web and Social Media 16. pp. 347–357. 
  38. Mindel, Vitali; Aaltonen, Aleksi; Rai, Arun; Mathiassen, Lars; Jabr, Wael (2024). "Timely Quality Problem Resolution in Peer-Production Systems: The Impact of Bots, Policy Citations, and Contributor Experience". Information Systems Research (INFORMS). 
  39. Das, Paramita; Johnson, Isaac; Saez-Trumper, Diego; Aragón, Pablo (2024). "Language-Agnostic Modeling of Wikipedia Articles for Content Quality Assessment across Languages". Proceedings of the International AAAI Conference on Web and Social Media 18. pp. 1924–1934. 
  40. Trokhymovych, Mykola; Sen, Indira; Gerlach, Martin (2024). "An Open Multilingual System for Scoring Readability of Wikipedia". arXiv. 
  41. McDowell, Zachary J; Vetter, Matthew A (2022). "Wikipedia as Open Educational Practice: Experiential Learning, Critical Information Literacy, and Social Justice". Social Media + Society (SAGE Publications Sage UK: London, England) 8 (1): 20563051221078224.