We are a research group at Cornell University studying ways to encourage healthier online discussions. Our group has previously collaborated successfully with the Wikipedia community towards this goal.[1] We are now running a user study that will directly involve Wikipedia editors. The study revolves around a prototype browser extension, "ConvoWizard", which uses AI technology[2] to provide Wikipedia editors with real-time warnings of rising tension within conversations. Specifically, whenever an editor who has ConvoWizard installed replies to a discussion on a talk page or noticeboard, the tool will estimate whether the discussion looks to be getting tense (i.e., likely to deteriorate into incivility) and give feedback on how the editor’s own draft reply might affect the estimated tension. ConvoWizard is based on a tool that we previously piloted on Reddit; those interested in finding out more can read NPR's coverage of that study.
We are actively recruiting participants for this study. Anyone interested in joining can sign up using this form.
Participants will be asked to install and use the ConvoWizard browser extension for a specified period of time. Note that ConvoWizard is only officially supported on Chrome and Firefox. Other Chromium-based browsers such as Arc, Brave, and newer versions of Microsoft Edge may also work but have not been tested, and Safari is not supported. During this period, ConvoWizard will record participants' commenting behavior with the tool enabled (e.g., what edits they make to their draft before posting it) to enable research on the effects of using the tool. Participants will also be asked to fill out a pre-survey and a post-survey, which will ask general questions about their commenting habits and their thoughts on ConvoWizard.
Official recruitment is planned to begin in mid-June 2023. After signing up, participants will be sent a link to install ConvoWizard. We ask that participants use ConvoWizard for a minimum of two months (starting from the time of installation), although they may keep using it longer if they find it helpful. Regardless of whether they choose to continue using ConvoWizard, at the end of the two-month period participants will be asked to fill out the post-survey, which should take no more than 30 minutes.
Policy, Ethics and Human Subjects Research
To understand the effects of using the tool, ConvoWizard will collect information about how participants interact with it. Specifically, whenever a participant starts drafting a reply to a discussion, the following information is logged, regardless of whether they go on to post the reply publicly:
- The timestamp at which the user started drafting the reply and, if the reply is posted, the timestamp at which it was posted
- The ID of the comment the user was replying to (as represented in the Wikipedia page HTML)
- Periodic snapshots of the contents of the user's in-progress reply text box
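For readers who find a concrete shape helpful, the three logged items above can be pictured as a single record per drafting session. The following is only an illustrative sketch in Python: the class names, field names, and the placeholder comment ID are hypothetical and are not taken from ConvoWizard's actual code or storage format.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import List, Optional


@dataclass
class DraftSnapshot:
    """One periodic capture of the reply text box (hypothetical name)."""
    captured_at: datetime
    text: str


@dataclass
class DraftLogRecord:
    """Hypothetical shape of the log entry for one drafting session."""
    parent_comment_id: str                 # ID of the comment being replied to
    started_at: datetime                   # when the user began drafting
    posted_at: Optional[datetime] = None   # stays None if the reply is never posted
    snapshots: List[DraftSnapshot] = field(default_factory=list)

    def add_snapshot(self, text: str, when: datetime) -> None:
        """Append a snapshot of the in-progress draft."""
        self.snapshots.append(DraftSnapshot(captured_at=when, text=text))

    def mark_posted(self, when: datetime) -> None:
        """Record the time at which the reply was publicly posted."""
        self.posted_at = when


# Example session: two snapshots are taken, then the reply is posted.
start = datetime(2023, 6, 15, 12, 0, 0, tzinfo=timezone.utc)
record = DraftLogRecord(parent_comment_id="example-comment-id", started_at=start)
record.add_snapshot("I think this is", datetime(2023, 6, 15, 12, 0, 30, tzinfo=timezone.utc))
record.add_snapshot("I think this source is fine.", datetime(2023, 6, 15, 12, 1, 0, tzinfo=timezone.utc))
record.mark_posted(datetime(2023, 6, 15, 12, 1, 10, tzinfo=timezone.utc))
```

In this sketch, a session that is abandoned without posting would simply keep `posted_at` as `None`, mirroring the point that drafts are logged whether or not they are posted.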
ConvoWizard does not collect any other sort of user information, and we use standard Chrome/Firefox security features to ensure that it cannot do anything outside the context of Wikipedia pages.
All data collected during the study will be stored securely and confidentially on Cornell servers. The data will be stored only for the duration of the study and the subsequent analysis period, and will be deleted afterwards. We do not collect any personally identifying information, with one exception: a participant may voluntarily choose to disclose their email address. For participants who do not wish to disclose this information, we fully support the Wikipedia email system or User talk page posts as methods of communication.
This study has been reviewed and approved under Cornell IRB #2007009714.
[1] For a recent example, see this study we conducted on talk page moderation.
[2] For those with a technical background in machine learning and/or natural language processing who are interested in more details about the technology, it is introduced in this paper; the model is also open-source and its training data is publicly accessible and documented.