Research:Implications of ChatGPT for knowledge integrity on Wikipedia

Created: 06:07, 12 July 2023 (UTC)
Duration: July 2023 – June 2024
Keywords: ChatGPT, misinformation, AI, large language models, policies

This page documents a research project in progress.
Information may be incomplete and change as the project progresses.
Please contact the project lead before formally citing or reusing results from this page.


Large Language Models (LLMs) like ChatGPT have captured global attention, promising great advances in knowledge. Some in the Wikimedia community have identified the possibilities of LLMs: enabling editors to generate a first draft of an article, summarise sources, produce transcriptions of video and query Wikidata content more easily.[1][2] Others have highlighted the risk that LLMs could produce vast swathes of AI-generated content or automated comments that simulate the appearance of discussion, debate and consensus, making it difficult to maintain quality, verified, consensus-driven content.[1][2] The aim of this project is to explore the implications of content-generating AI systems such as ChatGPT for knowledge integrity on Wikipedia and to investigate whether Wikipedia rules and practices are robust enough to deal with the next generation of AI tools. Knowledge integrity is a foundational principle of Wikipedia practice: the verifiability of Wikipedia content is a core content policy, and Wikipedians have been working to understand how to counter systemic bias on the project for almost two decades. By garnering perspectives from Wikimedia practitioners, LLM experts and academic and grey literature about the possible (and evolving) implications of these systems, and by analysing current policies and practices for vetting automated tools, this project will map out the most important areas for possible policy expansion and adjustment of current practice to deal with possible risks to the Wikipedia project. This work supports the 2030 Strategic Direction in its aim to ensure that Wikimedia constitutes essential infrastructure of the ecosystem of free knowledge.[3] It will also provide insight into potential two-way information flows between Wikipedia and AI systems, with the aim of developing strategies to ensure that this flow comprises comprehensive, reliable, and high-quality information.

Methods

This project engages with data collection on three distinct levels:

  1. Online documentation of Wikimedia policies, as well as wiki talk pages, Signpost articles, and Wikimedia mailing lists discussing ChatGPT, LLMs (large language models), and algorithmically generated content (a minimal collection sketch follows this list).
  2. Focus groups and interviews with Wikimedians discussing the above. Potential participants will be identified through self-selection, online discussion participation, and adjacency to knowledge integrity functions and concerns.
  3. A systematic review of literature investigating the risks derived from LLMs or other algorithmic content-generation systems like ChatGPT for online collaborative media.
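
As one concrete illustration of the first data-collection level, the sketch below shows how on-wiki discussions mentioning LLMs could be located programmatically via the standard MediaWiki search API. It is a minimal sketch under stated assumptions: the search terms, namespaces and result limit are illustrative placeholders, not the project's actual query set.

```python
# Minimal sketch: locate on-wiki discussions mentioning LLMs via the MediaWiki search API.
# The search terms, namespaces and result limit are illustrative assumptions,
# not the project's actual query set.
import requests

API_URL = "https://meta.wikimedia.org/w/api.php"

def search_discussions(term, namespaces="1|4|5", limit=50):
    """Return page titles whose text matches `term` in the given namespaces
    (1 = Talk, 4 = Project, 5 = Project talk)."""
    params = {
        "action": "query",
        "format": "json",
        "list": "search",
        "srsearch": term,
        "srnamespace": namespaces,
        "srlimit": limit,
    }
    response = requests.get(API_URL, params=params, timeout=30)
    response.raise_for_status()
    return [hit["title"] for hit in response.json()["query"]["search"]]

if __name__ == "__main__":
    for term in ("ChatGPT", "large language models"):
        titles = search_discussions(term)
        print(f"{term}: {len(titles)} pages, e.g. {titles[:3]}")
```

In practice the same query pattern could be pointed at other Wikimedia wikis by swapping the API endpoint, with results then screened manually for relevance.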

As noted, the project is focused on the systematic analysis of the implications of latest-generation LLMs for Wikimedia policy and practice. Accordingly, we apply analytical methods from policy analysis and applied epistemology to:

  • analyse relevant Wikipedia content policy and practice
  • model the epistemic processes embedded in the practices of the Wikipedia community, including the application of policy
  • understand the potential interactions of LLMs in Wikipedia knowledge generation, verification and dissemination
  • assess the risks for Wikipedia information integrity presented by LLMs
  • analyse potential interventions in policy and practice to mitigate these risks.

We consider it critical to ground this analysis in the actual practices of the Wikipedia community. Policy analysis alone cannot provide adequate insight into epistemic practice, nor, by extension, into any risks presented by LLMs. Our analysis will therefore be grounded in the everyday experience and actual practice of Wikipedia editors and Wikimedia Foundation experts working on information integrity issues, with particular consideration given to the context of community, academic and public conversations about LLMs and their implications for knowledge and truth.

Timeline

July 2023

In phase 1 we will conduct a comprehensive review of:

  • on-wiki discussions about LLMs and their possible impact on Wikipedia
  • relevant grey literature from commercial operators such as OpenAI, Google (Bard) and Microsoft (Bing)
  • approaches to verifiability from alternative operators like Mozilla.AI
  • current moves to regulate, govern or issue moratoria on LLMs (e.g. the open letter to pause “Giant AI Experiments” organised by the Future of Life Institute[4])
  • current Wikipedia policies (e.g. verifiability and bot policies) and practices (e.g. page patrol) most likely to be affected by the introduction of latest-generation LLMs like ChatGPT
  • digital media policy and governance literature and applied epistemology literature that focuses on issues of information quality, verification, transparency and data provenance, particularly on Wikipedia

This phase will identify a series of venues where public debate is happening around these issues. This will enable us to hone our interview list and interview questions in preparation for phase 2. We will continue to monitor these venues through to the final phase of the project. This phase will also provide a corpus of relevant Wikipedia policies and a comprehensive understanding of recent research that will provide a basis for our analysis.

August–September 2023

In phase 2 we will conduct a series of semi-structured interviews that build on the data collected in phase 1. The objective of this phase is to identify risks to Wikipedia’s information integrity from LLMs (to answer RQ1) and to gather information on the application of Wikipedia policy and processes in practice (to suggest leads for RQ2). We will conduct about 15 interviews with the following groups, using snowball sampling to find those with relevant experience:

  • Wikimedians: starting with an in-person focus group at Wikimania Singapore, followed by individual interviews, including with editors identified in Wikimedia-l conversations. The goal is to understand to what extent LLMs are already having an impact on Wikipedia practice, which areas of practice might be most affected, and whether there are other risks not already identified that would be useful to consider. We will focus on community members who have direct experience working in areas most likely to be affected by or related to LLMs (e.g. new page patrol, bot policy).
  • Wikimedia Research Team members: particularly those connected to the Knowledge Integrity program. The goal with this group is to understand how knowledge integrity relates in practice to questions of verifiability and provenance and to garner ideas about what is possible in terms of governing LLMs (given previous practice in relation to governing other automated processes and tools).
  • OpenAI and other LLM practitioners (e.g. at Mozilla.AI and FOSS alternatives). The goal in interviewing this group is to understand how engineers are thinking about threats to information integrity from their products and what is being done or considered to mitigate those threats.

October–December 2023

In phase 3 we will analyse the data gathered in phases 1 and 2. This will involve the application of a range of methods from digital ethnography, applied epistemology and policy analysis:

  • Analyse interview data from phase 2 using close reading and thematic analysis techniques from similar ethnographic studies (for example, see Ford, 2022[5]).
  • Drawing on this analysis and relevant data from phase 1, analyse and model Wikimedians’ verification practices using frameworks derived from existing analyses of the epistemology of Wikipedia[6][7], as well as process mapping and epistemic network analysis[8][9][10] (an illustrative co-occurrence sketch follows this list).
  • Identify risks in existing policy and practice presented by the latest generation of LLMs using policy analysis methodology, including scenario analysis, case-study analysis and risk analysis, against the models and data obtained from (1) and (2).
  • Identify potential interventions in Wikipedia policy and practice to mitigate the risks identified in (3), again using applied epistemology and policy analysis methods. This will draw on data obtained from phase 2 consultations with Wikimedians and Wikimedia Research as well as the results from steps 2 and 3 of the phase 3 analysis.
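
The second item above refers to process mapping and epistemic network analysis. The sketch below illustrates only the general idea behind such network-style analysis of qualitative data: it builds a co-occurrence network over a handful of invented qualitative codes using networkx, standing in for the kind of coded interview data the project would actually produce. The codes and segments are placeholders, not project data.

```python
# Illustrative sketch only: a co-occurrence network of qualitative codes,
# in the spirit of the epistemic network analysis cited above.
# The codes and coded segments below are invented placeholders.
from itertools import combinations
import networkx as nx

# Each coded interview segment is represented by the set of codes applied to it.
coded_segments = [
    {"verifiability", "source quality"},
    {"verifiability", "LLM risk", "bot policy"},
    {"LLM risk", "new page patrol"},
    {"source quality", "LLM risk", "verifiability"},
]

graph = nx.Graph()
for segment in coded_segments:
    for a, b in combinations(sorted(segment), 2):
        # Edge weight counts how often two codes appear in the same segment.
        if graph.has_edge(a, b):
            graph[a][b]["weight"] += 1
        else:
            graph.add_edge(a, b, weight=1)

# Rank code pairs by co-occurrence strength.
for a, b, data in sorted(graph.edges(data=True), key=lambda e: -e[2]["weight"]):
    print(f"{a} -- {b}: {data['weight']}")
```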

January–June 2024

Consolidate and write up results for publication and presentation. We are planning three outputs for the project:

  1. A research report for the Wikimedia community highlighting the risks and possible mitigations against those risks. The report will aim to inform Wikipedia policy and practice, supporting knowledge integrity and increasing the resilience of Wikipedia and other Wikimedia projects to threats posed by LLMs. Once the initial report is drafted, we will send it to all interviewees for feedback, conduct final clarification interviews over email or video calls where necessary, and then publish the final report aimed at the Wikimedia community. We also intend to present results from the report at WikiWorkshop 2024.
  2. After receiving feedback on the draft report, which will be integrated into our research results, we will finalise a journal article for our research audience. We aim to publish this in a peer-reviewed, open-access Q1 policy-oriented journal such as Policy & Internet. The intended audience is digital media and media policy scholars. As well as contributing to scholarship on knowledge integrity and verification on Wikipedia, the article is likely to make an early contribution to understanding the implications of AI-generated content for information-integrity policy and practice in the digital environment more generally.
  3. We intend to hold a public-facing event at the Centre for Media Transition at UTS to communicate results to a wide audience, including researchers in digital media and in media, AI and technology policy; the tech industry; and policy practitioners.


Policy, Ethics and Human Subjects Research

This research focuses on achieving data saturation through non-intrusive and non-disruptive methods. While it uses emergent approaches, all contact will be preceded by desk-based research drawing on Wikimedia texts as well as scholarly research and reports. Interviews and focus groups will be conducted in discussion-focused environments like Wikimania and will address topics of direct relevance to participants' daily practices. All participation is voluntary, and candidates will be positioned to decide whether, and how much, to contribute. Their contributions will be de-identified, and the research will adhere to rigorous academic ethical standards monitored by UTS ethical review processes.

Results

Forthcoming, 2024

Resources

Grants:Programs/Wikimedia Research Fund/Implications of ChatGPT for knowledge integrity on Wikipedia

References

  1. Harrison, S. (2023, January 12). “Should ChatGPT Be Used to Write Wikipedia Articles?”. Slate. https://slate.com/technology/2023/01/chatgpt-wikipedia-articles.html
  2. Wikimedia contributors (2023a). Community Call Notes. Accessed 29 March 2023. https://meta.wikimedia.org/w/index.php?title=Wikimedia_Foundation_Annual_Plan/2023-2024/Draft/External_Trends/Community_call_notes&oldid=24785109
  3. Wikimedia contributors (2023b). Movement Strategy. Accessed 29 March 2023. https://meta.wikimedia.org/w/index.php?title=Movement_Strategy&oldid=24329161
  4. Pause Giant AI Experiments: An Open Letter. (2023, March 22). Future of Life Institute. https://futureoflife.org/open-letter/pause-giant-ai-experiments/
  5. Ford, H. (2022). Ethnographies of the Digitally Dispossessed. In The Routledge Companion to Media Anthropology. Routledge.
  6. Frost-Arnold, K. (2019). Wikipedia. In The Routledge Handbook of Applied Epistemology (1st ed., Vol. 1, pp. 28–40). Routledge. https://doi.org/10.4324/9781315679099-3
  7. Fallis, D. (2008). Toward an epistemology of Wikipedia. Journal of the American Society for Information Science and Technology, 59(10), 1662–1674. https://doi.org/10.1002/asi.20870
  8. Reijula, S., & Kuorikoski, J. (2021). Modeling Epistemic Communities. In M. Fricker, P. J. Graham, D. K. Henderson, & N. J. L. L. Pedersen (Eds.), The Routledge handbook of social epistemology. Routledge.
  9. Sullivan, E., Sondag, M., Rutter, I., Meulemans, W., Cunningham, S., Speckmann, B., & Alfano, M. (2020). Vulnerability in Social Epistemic Networks. International Journal of Philosophical Studies, 28(5), 731–753. https://doi.org/10.1080/09672559.2020.1782562
  10. Shaffer, D. W., Collier, W., & Ruis, A. R. (2016). A Tutorial on Epistemic Network Analysis: Analyzing the Structure of Connections in Cognitive, Social, and Interaction Data. Journal of Learning Analytics, 3(3), Article 3. https://doi.org/10.18608/jla.2016.33.3