Strategy/Multigenerational/Artificial intelligence for editors
This strategy brief was published in April 2025 and was authored by Chris Albon and Leila Zia from the Wikimedia Foundation. It is also available on Figshare and Wikimedia Commons.
Executive summary
The community of volunteers is the most important and unique element behind the success of Wikipedia, a leading example of encyclopedic knowledge governance around the world. For over a decade, the community and the Wikimedia Foundation (WMF) have developed and used artificial intelligence (AI) to power bots, products, and features.[1] The dual forces of recent advances in AI and the increasing challenges of moderating a complex knowledge ecosystem motivated us to develop a strategy to further leverage AI’s potential and mitigate its risks on the projects. This strategy focuses on using AI to enhance the work of editors in the areas where AI can have the highest impact on the projects: automating routine and repetitive tasks that do not require human judgment or discussion, onboarding and mentoring new editors, freeing time for editors to focus on local and specialized encyclopedic knowledge across languages, and safeguarding knowledge integrity, all while preserving human agency. By significantly and purposefully supporting editors’ work with AI in these areas, this strategy aims to maintain and strengthen Wikipedia’s position as a trusted source of knowledge, driven by the community for generations to come.
Scope
This strategy sets a high-level direction for the development, hosting, and use of AI in product, infrastructure, and research at WMF in direct service of the editors of the Wikimedia projects. We further narrow the focus to the Wikimedia projects deemed most important for sharing encyclopedic knowledge.
In adopting the above scope, we explicitly place the following outside the scope of this strategy: 1) WMF’s AI advocacy, as well as what affiliates and volunteers choose to do with AI; 2) AI work that WMF may do outside of the Wikimedia projects and the technology platforms that WMF owns; 3) strategic recommendations on how WMF interacts with technology companies that use Wikimedia’s content to build their own AI; 4) the use of AI by WMF for applications that are internal to the organization.
Horizon
This strategy aims to direct the work of the organization from July 1, 2025 to June 30, 2028, within the scope specified above.
Updating cadence
The space of research and development in AI is highly dynamic. We recommend revisiting this strategy once a year to consider updates. If major breakthroughs or changes take place, we will revisit this strategy outside of the annual cycle.
Goal
AI offers opportunities across many aspects of the Wikipedia experience. With this strategy, we recommend, in particular, developing and hosting AI-powered technologies to:
- Onboard new editors.
- Motivate existing editors to continue their encyclopedic work by reducing their workload and supporting them in the areas where they are uniquely positioned to contribute.
- Strengthen Wikipedia’s position as the most trusted source of encyclopedic knowledge in more languages.
- Establish the Foundation as the leader in developing and using AI with a human-first approach, prioritizing tools that empower rather than replace editors, protect community values, and increase access to knowledge.
Key questions
- What are the different strategies WMF could take in applying AI to the domain of editing (content generation and content moderation), and what are the tradeoffs of choosing each of those strategies?
- Which of the strategies should we pursue?
- With this strategy, how will our development, hosting and use of AI align with our core values?
- With this strategy, how will humans interact with AI?
- With this strategy, how should we apply AI to content generation? How should we apply AI to content moderation? How should we prioritize AI use between improving existing content quality and generating new contributions?
- With this strategy, what investments will be needed for success? What investments should be altered or discontinued?
Current state
As the internet continues to change and the use of AI increases, we expect the knowledge ecosystem to become increasingly polluted with low-quality content, misinformation, and disinformation. We hope that people will continue to care about high-quality, verifiable knowledge and to seek the truth, and we are betting that they will want real people to be the arbiters of knowledge. We believe that Wikipedia, made by human volunteers, can be the backbone of truth that people turn to, either on the Wikimedia projects or through third-party reuse.
Wikipedia’s model of collective knowledge generation has demonstrated its ability to create verifiable and neutral encyclopedic knowledge. The Wikipedian community and WMF have long used AI to support the work of volunteers while centering the role of the human. Today we use AI to help editors detect vandalism on all Wikipedia sites, translate content for readers, predict article quality, quantify the readability of articles, suggest edits to volunteers, and more. We have done so in line with Wikipedia’s values around community governance, transparency, support of human rights, open source, and others. That said, we have applied AI to the editing experience only modestly, when opportunities or technologies presented themselves. We have not undertaken a concerted effort to improve the editing experience of volunteers with AI, as we have chosen not to prioritize it over other opportunities.
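As an illustration of how such model-backed support is typically consumed by tools and bots, the sketch below requests an article-quality prediction for a single revision from the Foundation’s Lift Wing model-serving platform. It is a minimal sketch: the endpoint path and model name follow the publicly documented Lift Wing conventions, the revision ID is a placeholder, and both should be checked against the current API documentation before use.

```python
import requests

# Minimal sketch: ask Lift Wing (WMF's model-serving platform) to score the
# quality of one English Wikipedia revision. The endpoint path and model name
# follow the documented /service/lw/inference/v1/models/{model}:predict
# convention; verify against the current Lift Wing documentation before use.
LIFT_WING_URL = (
    "https://api.wikimedia.org/service/lw/inference/v1/models/"
    "enwiki-articlequality:predict"
)

def predict_article_quality(rev_id: int) -> dict:
    """Return the raw quality prediction for a single revision."""
    response = requests.post(
        LIFT_WING_URL,
        json={"rev_id": rev_id},
        headers={"User-Agent": "example-quality-check/0.1"},
        timeout=30,
    )
    response.raise_for_status()
    return response.json()

if __name__ == "__main__":
    print(predict_article_quality(1234567))  # 1234567 is a placeholder revision ID
```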
Recent advances in AI have opened new possibilities in the creation and consumption of content. The ability of large language models (LLMs) to summarize and generate natural language text makes them particularly well-suited to Wikipedia’s focus on written knowledge. The long-term potential of these technologies to create high-quality, scalable user experiences is significant and warrants careful consideration. At the same time, the same technologies pose risks to the Wikimedia projects and the workflows of editors. For instance, many people and governments across the world can now generate, within minutes, thousands of Wikipedia-like articles that at first glance look legitimate yet may be hard-to-detect hoaxes. While generating new content has become extremely cheap and accessible, verifying content remains slow and costly, and verifiability is the backbone of encyclopedic work. Editors need significant support from WMF to harness the best of what AI can offer the projects in the face of these opportunities and challenges.
Possible solutions
In developing this strategy, we explored multiple options and considered many trade-offs. Our goal was to determine the best path for integrating AI into our work while upholding our values and ensuring that the editors and the communities remain at the core of the project.
We first explored the option of supporting editors with AI only incidentally. This is the status quo option, closest to how we currently do research, product, and feature development: we invest some resources in AI, but not a lot. Following this path means not responding to a changing internet,[2] to the peril of the projects. As shared earlier, because recent advances in AI have made content generation easy while the cost of verification remains high for editors, editors would be at significant risk of overload and burnout. Maintaining the status quo also risks leaving Wikipedia’s user experience behind the expectations of modern internet users, and even more so those of the next generation. While Wikipedia as a platform has traditionally evolved at a slower pace, the broader internet landscape continues to advance rapidly, setting new standards for usability, mobile-first design, interactivity, and accessibility. Without adapting to meet these expectations, we risk alienating both current and future users and diminishing Wikipedia’s value and relevance.
The second possible strategy would be to invest in AI knowledge generation over human knowledge generation. Advances in AI make it clear that in the coming years there will be increasing attempts to use AI for knowledge creation or curation by directly summarizing primary, secondary, and tertiary sources. There are obvious appeals to this approach for companies, such as efficiency and scalability. But there are also downsides, such as limited human oversight, vulnerability to biases, hallucinations, the potential to spread misinformation and disinformation, limited local context, significant invisible human labor,[3] and a weak ability to handle nuanced topics. Adopting this strategy carries a further, more important risk: since volunteers are the core and unique element of success for the Wikimedia projects, this strategy could discourage existing volunteers, to the peril of the projects.
The strategies discussed above will not enable us to reach our goals for a multigenerational Wikipedia. Therefore, we recommend a third strategy: make a significant and targeted investment in supporting editors with AI. While companies are leaning away from human-created knowledge, we should lean into the collective power of editors and use AI to assist them. Humans supported by AI will be more effective at generating knowledge than humans or AI alone. In addition, we propose using AI in areas where it is uniquely positioned to support editors and advance the goals of the Wikimedia Movement. This targeted approach is important because it allows us to achieve the highest impact within the realities of our budget and resources.
Prioritized strategy and tradeoffs
Our prioritized strategy is to invest in AI to support editors in areas where AI has a unique advantage over other technologies in solving problems of impact, and to prioritize editors’ agency in interacting with AI. More specifically, we recommend investing in AI to support editors as follows:
Prioritize AI-assisted workflows in support of moderators and patrollers. Recent advances in AI, particularly generative AI, have made content generation significantly easier, and that introduces significant risk for editors and the projects because validating content remains costly. Thousands of hard-to-detect Wikipedia hoaxes and other forms of misinformation and disinformation can be produced in minutes.[4] We should therefore prioritize using AI to support knowledge integrity and increase moderators’ capacity to maintain Wikipedia’s quality. Overloading editors with the task of managing an influx of AI-assisted content risks burnout and compromises Wikipedia’s quality and existence. This focus on workflows for moderators and patrollers ensures Wikipedia remains a trusted source and allows editors to do their work effectively.
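To make this concrete, here is a minimal sketch of how a patrolling tool could triage recent edits by machine-predicted revert risk, so that human patrollers review the riskiest edits first. The recent-changes query uses the standard MediaWiki Action API; the revert-risk endpoint and response shape follow the public Lift Wing documentation as we understand it, and the 0.9 threshold is an arbitrary illustrative value, so all of these should be verified before use.

```python
import requests

API_URL = "https://en.wikipedia.org/w/api.php"
REVERT_RISK_URL = (
    "https://api.wikimedia.org/service/lw/inference/v1/models/"
    "revertrisk-language-agnostic:predict"
)
HEADERS = {"User-Agent": "example-patrol-triage/0.1"}

def recent_edit_revisions(limit: int = 25) -> list[int]:
    """Fetch revision IDs of the most recent edits via the MediaWiki Action API."""
    params = {
        "action": "query",
        "list": "recentchanges",
        "rctype": "edit",
        "rcprop": "ids|title",
        "rclimit": limit,
        "format": "json",
    }
    data = requests.get(API_URL, params=params, headers=HEADERS, timeout=30).json()
    return [rc["revid"] for rc in data["query"]["recentchanges"]]

def revert_risk(rev_id: int) -> float:
    """Score one revision with the language-agnostic revert-risk model.
    The response shape (output -> probabilities -> true) is assumed from the
    public model documentation; verify before relying on it."""
    resp = requests.post(
        REVERT_RISK_URL,
        json={"rev_id": rev_id, "lang": "en"},
        headers=HEADERS,
        timeout=30,
    )
    resp.raise_for_status()
    return resp.json()["output"]["probabilities"]["true"]

if __name__ == "__main__":
    # Surface the riskiest edits first; a human patroller still decides what to do.
    scored = sorted(((revert_risk(r), r) for r in recent_edit_revisions()), reverse=True)
    for score, rev_id in scored:
        flag = "REVIEW FIRST" if score > 0.9 else ""  # 0.9 is an arbitrary example threshold
        print(f"rev {rev_id}: revert risk {score:.2f} {flag}")
```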
Create more time for editing, human judgment, discussion, and consensus building. Editors spend a significant amount of time on preparatory work before they can edit Wikipedia; part of this time goes to finding the information they need for editing, discussion, or decision making. AI excels at tasks such as information retrieval, translation, and pattern detection. By automating these repetitive tasks, AI frees up editors’ time to focus on the areas of encyclopedic work that require human expertise: editing, discussion, consensus building, and making judgment calls in complex situations where the stakes are high and the impact is significant.
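As one small example of the information retrieval described above, the following sketch uses the standard MediaWiki search API to surface the project-namespace pages most relevant to a question an editor has during a discussion; the example query is a placeholder.

```python
import requests

API_URL = "https://en.wikipedia.org/w/api.php"
HEADERS = {"User-Agent": "example-policy-lookup/0.1"}

def find_project_pages(query: str, limit: int = 5) -> list[str]:
    """Search the 'Wikipedia:' project namespace for pages relevant to a question."""
    params = {
        "action": "query",
        "list": "search",
        "srsearch": query,
        "srnamespace": 4,  # namespace 4 holds policies, guidelines, and other project pages
        "srlimit": limit,
        "format": "json",
    }
    data = requests.get(API_URL, params=params, headers=HEADERS, timeout=30).json()
    return [hit["title"] for hit in data["query"]["search"]]

if __name__ == "__main__":
    # Placeholder question an editor might need guidance on during a discussion.
    print(find_project_pages("reliability of self-published sources"))
```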
Create more time for editors to share local perspectives or context on a topic. Editors of less-represented languages face pressure to create more content in their local languages. Automating the translation and adaptation of common topics[5] allows editors to enrich encyclopedic knowledge with the cultural and local knowledge and nuances that AI models cannot provide. This lets them invest more time in creating content that strengthens Wikipedia as a diverse, global encyclopedia.
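A small illustration of the supporting automation this implies: the sketch below uses the standard MediaWiki langlinks API to check which articles from a shared list of common topics are still missing in a target language, so editors can decide where translation and local adaptation are most needed. The topic list and target language code are placeholders, and producing the actual translation drafts would happen separately, for example through the Content Translation tooling, for editors to review and adapt.

```python
import requests

API_URL = "https://en.wikipedia.org/w/api.php"
HEADERS = {"User-Agent": "example-content-gap-check/0.1"}

def missing_in_language(titles: list[str], lang: str) -> list[str]:
    """Return the English titles that have no interlanguage link to `lang`."""
    params = {
        "action": "query",
        "titles": "|".join(titles),
        "prop": "langlinks",
        "lllang": lang,
        "lllimit": "max",
        "format": "json",
    }
    data = requests.get(API_URL, params=params, headers=HEADERS, timeout=30).json()
    # Pages with no "langlinks" entry have no counterpart in the target language.
    return [
        page["title"]
        for page in data["query"]["pages"].values()
        if "langlinks" not in page
    ]

if __name__ == "__main__":
    # Placeholder topics and target language ("ha" = Hausa) for illustration only.
    topics = ["Photosynthesis", "World Health Organization", "Climate change"]
    print(missing_in_language(topics, "ha"))
```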
Engage new generations of editors with guided mentorship, workflows, and assistance. Editors drive knowledge curation and governance. For the projects to be multigenerational, new editors must encounter editing workflows that fit their expectations and must find effective ways to get help. AI offers opportunities for generating valuable types of suggested edits that make sense for a new generation. Generative AI in particular offers a promising solution for automated mentoring and guidance of newcomers. AI can provide personalized support, from retrieving information and understanding policies to giving feedback on edits, helping newcomers feel confident and capable.
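The sketch below illustrates one possible shape of such guidance: an open-weight, instruction-tuned model is prompted to answer a newcomer’s question, point to the relevant policy, and encourage the newcomer to make the edit themselves. The model name, prompt wording, and use of the Hugging Face transformers pipeline are illustrative assumptions, not a description of any existing Foundation feature.

```python
from transformers import pipeline

# Illustrative only: the model name is a placeholder for whichever open-weight,
# instruction-tuned model would actually be hosted; this is not an existing
# Wikimedia Foundation feature.
mentor = pipeline("text-generation", model="some-org/open-weight-instruct-model")

def answer_newcomer(question: str) -> str:
    """Draft a policy-grounded answer for a newcomer; a human mentor, or the
    newcomer themselves, still decides what to do with it."""
    prompt = (
        "You are a patient Wikipedia mentor. Answer the newcomer's question, "
        "name the relevant policy page, and encourage them to make the edit "
        "themselves rather than doing it for them.\n\n"
        f"Question: {question}\nAnswer:"
    )
    return mentor(prompt, max_new_tokens=200)[0]["generated_text"]

if __name__ == "__main__":
    print(answer_newcomer("My first edit was reverted. What should I do next?"))
```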
How we will implement this strategy
Our implementation of this strategy is shaped by WMF’s vision, mission, guiding principles, privacy policy, human rights policy, 2030 Movement Strategy, and Multigenerational pillars. Below we highlight the core tenets, drawn from or informed by these sources, that should define how we implement this strategy.
- We adopt a human-centered approach. We empower and engage humans, and we prioritize human agency.
- We prioritize using open source AI technologies or open-weight models, and we develop only open source AI.[6]
- Our use of AI will allow editors to focus more on what they want to accomplish, not how to technically achieve it.
- We coordinate with Wikimedia affiliates, and we invest in the distributed network of people, institutions, and organizations that contribute to this strategy.
- We prioritize transparency.
- We prioritize multilinguality in nuanced ways.
- We continue to offer a space where humans can share in the sum of all encyclopedic knowledge without the fear of persecution or censorship.
Trade-offs
In arriving at the above prioritized strategy, we faced trade-offs and had to make choices, which we share below. Note that implementing this strategy will require further trade-offs and choices by us and other decision makers in WMF. We currently capture draft implementation trade-offs in the Appendix.
Content generation vs. content integrity. Our resources are limited, and we cannot significantly support editors with AI for both content generation and content integrity at the same time. We decided to first prioritize using AI to support editors in ensuring content integrity. By doing so we want to ensure that moderators and patrollers are adequately supported for any surge of new content on the projects. Our reasoning is that new encyclopedic knowledge can only be added to Wikipedia at a throughput defined by the capacity of existing editors to moderate that content. If we invest heavily in new content generation before moderation, the new content will overwhelm the editors’ capacity to moderate it. This balance may well shift over time as the needs of moderation versus new content shift.
Open source models vs. open-weight models.[7] We commit to building open source AI models. However, we have to acknowledge that our resources are too limited to develop our own open source foundation model, which would require thousands of new servers[8] and hundreds of thousands of hours of work by machine learning engineers and researchers. For this reason, we have chosen to use open-weight models when necessary to build features that support editors. We hope that open source foundation models able to compete on standard evaluations will be released in the future.
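In practice, this trade-off means that a feature built on an open-weight model would typically be served from published weights files (often in Safetensors format) loaded with open source libraries on Wikimedia’s own infrastructure, even though the model’s training code and data are not public. A minimal sketch, with a placeholder model name:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Placeholder name: any open-weight model whose published weights (typically
# .safetensors files) can be downloaded and served on our own servers with
# open source libraries, even though its training data and code are not open.
MODEL_NAME = "some-org/open-weight-model"

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForCausalLM.from_pretrained(MODEL_NAME)

inputs = tokenizer("Verifiability is a core content policy because", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```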
Use AI in many places vs. use AI for specific areas of impact. Our resources (expertise, funds for infrastructure, etc.), even considering the collective resources of the community and the broader free knowledge ecosystem, are not sufficient for planning, developing, tuning, and using AI across myriad applications without focus. Through this strategy we have chosen to limit the applications to four main areas of impact. We acknowledge that there is hype and excitement around AI, and we expect to be asked to apply AI to more and more applications, which may create friction with our focused approach. This tension can create frustration among those advocating for other areas of impact, and it can slow down the work on this strategy, as we may need to regularly revisit our prioritization.
Acknowledgements
This strategy brief is possible thanks to the contributions and input of numerous people who engaged with us between June 2024 and February 2025. We recognize and thank these individuals below.
Throughout the process, Selene Deckelmann and Marshall Miller supported us in a variety of ways, including re-scoping the work to focus specifically on editors and providing extensive feedback, particularly in the early stages of the strategy. Nadee Gunasena partnered with Selene and us to create spaces and opportunities for us to engage and gather input from different groups. Miriam Redi provided frequent feedback on the early stages of our thinking and work. These conversations had varying dimensions, from the importance of prioritizing “open and free” to prioritizing a sustainable symbiosis between Wikipedia and generative AI. We would also like to thank Isaac Johnson for supporting us early on in arriving at a more nuanced understanding of generative AI for Wikipedia and multilingualism, and for proposing the framework of using AI where it is uniquely positioned (compared to other social or technical solutions that may not scale) to support editors (e.g., mentorship).
In July and August 2024, we held a few sessions with WMF’s senior leadership to learn about their perspectives and priorities. These sessions were important because we wanted organizational alignment around the strategy we were developing, and aligning with leadership was one important aspect of that. We thank (in order of last name) Lane Becker, Nadee Gunasena, Maryana Iskander, Stephen LaPorte, Lisa Seitz Gruwell, Amy Tsay, Denny Vrandečić, and Yael Weissburg for deeply engaging with our questions and sharing their thoughts and perspectives freely.
In August, we held a session with some of the affiliates and volunteers during Wikimania 2024. We thank those who joined that conversation for sharing their perspectives and providing valuable feedback on our thinking. One of our learnings from the session was that multiple affiliates looked forward to having clarity on “how” we would implement the AI strategy; inspired by those conversations, we dedicated a subsection of this strategy brief to that topic.
And lastly, we would like to thank Pablo Aragón, Adam Baso, Suman Cherukuwada, Rita Ho, Caroline Myrick, and Santhosh Thottingal for their questions, comments or input that helped improve this work.
Footnotes
- ↑ See ClueBot NG, one of the first AI-powered community bots, and WMF-developed and hosted AI models.
- ↑ See Strategy/Multigenerational.
- ↑ Humans in the AI loop: the data labelers behind some of the most powerful LLMs' training datasets
- ↑ See Asaf Bartov's presentation in CEE 2024 for examples.
- ↑ Examples of such common topics include, but are not limited to, the List of articles every Wikipedia should have/Expanded.
- ↑ Note that the code for the major publicly available LLM technologies is currently not open; for some of these AI models, only the weights are open.
- ↑ Open source models provide access to the training data and code, while open-weight models provide only the trained parameters (weights), often in Safetensors format. These weights can be hosted on Wikimedia’s infrastructure using open source software libraries.
- ↑ For comparison, according to one source, Meta has 600,000 GPUs for AI, while the Wikimedia Foundation currently has fewer than 20.