Grants:Programs/Wikimedia Research Fund/Implications of ChatGPT for knowledge integrity on Wikipedia

status: Funded

Wikimedia Research Fund

Implications of ChatGPT for knowledge integrity on Wikipedia

start and end dates: July 2023 - July 2024

budget (USD): 32,449 USD

fiscal year: 2022-23

applicant(s): Heather Ford, Michael Davis and Marian-Andrei Rizoiu

Overview

Applicant(s)

Heather Ford, Michael Davis and Marian-Andrei Rizoiu

Affiliation or grant type

University of Technology Sydney

Author(s)

Heather Ford, Michael Davis and Marian-Andrei Rizoiu

Wikimedia username(s)

Heather Ford: User:hfordsa

Project title

Implications of ChatGPT for knowledge integrity on Wikipedia

Research proposal

Description

Description of the proposed project, including aims and approach. Be sure to clearly state the problem, why it is important, why previous approaches (if any) have been insufficient, and your methods to address it.

The aim of this project is to explore the implications of content-generating AI systems such as ChatGPT for knowledge integrity on Wikipedia and for information derived from Wikipedia.

Research questions:

Does the potential use of ChatGPT and other AI tools for generating content threaten the integrity of information on Wikipedia, or of information derived from Wikipedia and distributed elsewhere? If so, what policy and process interventions may mitigate this threat?

We will answer these questions by focusing on the following themes:

1. The implications of AI-generated content for knowledge integrity and information quality on Wikipedia and for information derived from Wikipedia

2. Analysis of existing policies and processes to identify potential gaps relevant to (1)

3. Recommendations on strategies to fill gaps identified in (2).

Methodology:

a) Information gathering: interviews with Wikipedians/editors and with members of the Wikimedia data science and policy/integrity teams, as well as interviews with OpenAI and ChatGPT users

b) Extrapolation of the epistemic process from step (a) and Wikipedia guidelines, supplemented with analysis of a sample of Wikipedia page histories and data from existing studies

c) Methodological analysis using a framework derived from information studies, data science and social epistemology

d) Content analysis of page content / AI-generated data

The project will be housed at the Centre for Media Transition (CMT) at the University of Technology Sydney. CMT’s interdisciplinary approach cuts across law, policy, communication and journalism, with technology as a key theme. We conduct both fundamental and applied research, developing theoretical frameworks from empirical data to inform effective policy and industry interventions. We collaborate with industry, government, academia and civil society on research, engagement and education projects around the Asia Pacific.

Our broad-based approach understands information quality as contextual, i.e. dependent on features and practices of the environment in which information is developed and applied. Instead of focusing only on obvious cases of false information at the extreme, we are interested in practices of information production on Wikipedia more generally and the extent to which they enable robust, secure and ethical information production.

Personnel

Monica Attard, co-director of the Centre for Media Transition, University of Technology Sydney (Australia)
Derek Wilding, co-director of the Centre for Media Transition, University of Technology Sydney (Australia)

Budget

Approximate amount requested in USD.

32,449 USD

Budget Description

Briefly describe what you expect to spend money on (specific budgets and details are not necessary at this time).

The budget will be spent on personnel (a part-time 0.6 FTE research assistant for 6 months), teaching buy-out for two academics for one tutorial each during the term when the fieldwork will be conducted ($6,752), a public event hosted at UTS ($3,376) and 15% overheads required by the university.

Impact

Address the impact and relevance to the Wikimedia projects, including the degree to which the research will address the 2030 Wikimedia Strategic Direction and/or support the work of Wikimedia user groups, affiliates, and developer communities. If your work relates to knowledge gaps, please directly relate it to the knowledge gaps taxonomy.

This project will contribute significantly to understanding the implications of AI technologies such as ChatGPT for knowledge integrity on Wikipedia. This work supports the 2030 Strategic Direction in its aim to make Wikipedia the essential infrastructure of the ecosystem of free knowledge. It will provide insight into potential two-way information flows between Wikipedia and AI systems, with the aim of developing strategies to ensure that this flow involves comprehensive, reliable, and high-quality information.

Dissemination

Plans for dissemination.

We will produce at least one open access article in a high-ranking communications journal (e.g. Journal of Computer-Mediated Communication) and a public report aimed at the Wikimedia community, focusing on the practical implications of the research for ensuring knowledge integrity on Wikipedia in the face of AI tools such as ChatGPT. We will present the research at a relevant forum (e.g. Wikimedia Research Showcase, Wiki Workshop or Wikimania) and at an event at UTS for researchers and Wikipedians.

Past Contributions

Prior contributions to related academic and/or research projects and/or the Wikimedia and free culture communities. If you do not have prior experience, please explain your planned contributions.

Heather Ford is a former Wikimedia Foundation Advisory Board Member, former Executive Director of iCommons and co-founder of Creative Commons South Africa. A longtime Wikipedia researcher, she recently published a book with MIT Press on Wikipedia and the 2011 Egyptian Revolution.

Michael Davis is a social epistemologist working on disinformation. He formerly worked in the Disinformation Taskforce at the Australian Communications and Media Authority.

Marian-Andrei Rizoiu leads the Behavioral Data Science lab at UTS. His research crosses computer science, psycholinguistics and digital communication to understand human attention dynamics in the online environment.

None of the team has received prior funding from the Wikimedia Foundation or the Wikimedia Research Fund.

I agree to license the information I entered in this form, excluding the pronouns, countries of residence, and email addresses, under the terms of Creative Commons Attribution-ShareAlike 4.0. I understand that the decision to fund this Research Fund application, the application itself, and all the information entered by me in this form, excluding the pronouns, countries of residence, and email addresses of the personnel, will be published on Wikimedia Foundation Funds pages on Meta-Wiki and will be made available to the public in perpetuity. To make the results of my research actionable and reusable by the Wikimedia volunteer communities, affiliates and Foundation, I agree that any output of my research will comply with the WMF Open Access Policy. I also confirm that I have read the privacy statement and agree to abide by the WMF Friendly Space Policy and Universal Code of Conduct.

Yes