Future Audiences/Discord bot

From Meta, a Wikimedia project coordination wiki

Future Audiences is exploring strategies to expand beyond our existing audiences of readers and contributors so we can truly reach everyone in the world as the “essential infrastructure of the ecosystem of free knowledge.” This includes investigating whether/how audiences might like to interact with Wikipedia information in a more conversational way.

Building on FY2023-4 experiments with conversational AI, we wanted to understand whether and how people might like to get information from Wikipedia via a conversational AI bot. We tested this via an experimental conversational application on Discord. This provided us with an opportunity to test new ways of delivering free knowledge – i.e., social and/or gamified approaches – to younger audiences on Discord.

We created a Discord bot with two primary functions: answering questions and generating trivia quizzes. Question answers and quizzes were synthesized using Wikipedia information. Our bot utilized Llama3 70b, with GPT-4o acting as a backup model for potential outages.

Learning & insights: Through this experiment, we encountered new challenges with providing Wikipedia information on third-party platforms. Specifically, Wikipedia contains topics that may be considered inappropriate for younger audiences, and we were unable to get our bot adopted in Discord communities with younger audiences because we had no way to filter the information served through the bot in line with Discord's terms of service. We have ended this experiment, but are continuing to experiment with new ways to bring Wikipedia information to younger audiences (e.g. Roblox).

Experiment FAQs

Why Discord and why a Discord application?

In 2023, conversational AI was still new and our ChatGPT experiment did not surface a large appetite among consumers to interact with Wikipedia conversationally. Since then, AI-powered chatbots have become a more routine fixture online, including on social media and messaging apps. Facebook, WhatsApp, Instagram, and Twitter/X all now have an AI-powered general knowledge chatbot.

We picked Discord, a chat platform, due to its popularity among Millennials and Gen Z. It has close to 200 million monthly active users and 21 million “servers” (unique spaces for groups of users to chat). While Discord began as a chat platform for gamers, it is now used by a variety of groups to gather and connect (including Wikipedians). Discord is also a popular platform for third-party developers, with over 9,000 third-party bots developed for the site, including by platforms like Netflix and SoundCloud.

Many popular Discord bots provide fun/social experiences for a group chat – e.g., create in-chat gameplay, produce images on command, recommend music or anime, etc. We believed providing some social and/or gamified functionality would be critical in getting organic adoption and usage of our bot.

What did our application do?

MVP Functionality

We created a chat experience that:

  • Answered questions based on information contained in Wikipedia
  • Provided a fun gamified way to learn about a Wikipedia knowledge topic

1. As a user, I can:

  1. Ask the bot questions and receive:
    1. A natural-language summary of the relevant Wikipedia knowledge
    2. Link(s) to relevant pages used
    3. Metadata on quality of content (number of contributors, date last edited)
  2. Ask the bot to generate a quiz from a Wikipedia article and get a fun interactive Q&A experience
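
As a rough illustration of the answer format described above, the sketch below models a reply as a summary plus source pages carrying the two quality signals mentioned (number of contributors, date last edited). The class names, field names, and rendering layout are all hypothetical; the actual bot's message format is not documented here.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class SourcePage:
    """One Wikipedia page used to build the answer (hypothetical structure)."""
    title: str
    url: str
    contributors: int   # number of distinct editors of the page
    last_edited: str    # date of the most recent edit, as an ISO date string


@dataclass
class BotAnswer:
    """A natural-language summary together with the pages it was drawn from."""
    summary: str
    sources: List[SourcePage]

    def render(self) -> str:
        # Plain-text layout; an assumption, not the bot's real message format.
        lines = [self.summary, "", "Sources:"]
        for page in self.sources:
            lines.append(
                f"- {page.title}: {page.url} "
                f"({page.contributors} contributors, last edited {page.last_edited})"
            )
        return "\n".join(lines)
```

Keeping the quality metadata next to each link, rather than in the summary text, would let readers judge each source independently.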

Technical Functionality

The Discord bot reused the Retrieval-augmented generation (RAG)-based methodology we developed for the Wikipedia plugin for ChatGPT:

  • In response to a natural language query provided by the user (either a question or a request for a quiz), the bot used the ChatGPT API to break the query down into keywords that could be sent to the MediaWiki Search API
  • The bot then took the content of the relevant article(s) returned by the MediaWiki Search API and ran it back through ChatGPT to give the user either a summarized answer to their question or a quiz on the topic, depending on what they asked for.

We provided links to the source article(s) for each response.
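
The flow above can be sketched roughly as follows. This is a minimal illustration, not the bot's actual code: it assumes English Wikipedia as the target wiki, and the `llm`, `search`, and `fetch_text` callables, the prompt wording, and the function names are all hypothetical stand-ins supplied by the caller. Only the search URL parameters (`action=query`, `list=search`, `srsearch`, `srlimit`, `format`) come from the MediaWiki Action API itself.

```python
from typing import Callable, Dict, List
from urllib.parse import quote, urlencode

API_ENDPOINT = "https://en.wikipedia.org/w/api.php"  # assumption: English Wikipedia


def build_search_url(keywords: str, limit: int = 3) -> str:
    """Build a MediaWiki Search API request URL for the given keywords."""
    params = {
        "action": "query",
        "list": "search",
        "srsearch": keywords,
        "srlimit": limit,
        "format": "json",
    }
    return f"{API_ENDPOINT}?{urlencode(params)}"


def article_url(title: str) -> str:
    """Link to the source article so each response can cite it."""
    return "https://en.wikipedia.org/wiki/" + quote(title.replace(" ", "_"))


def respond(
    query: str,
    llm: Callable[[str], str],            # hypothetical: prompt in, completion out
    search: Callable[[str], List[str]],   # hypothetical: search URL in, titles out
    fetch_text: Callable[[str], str],     # hypothetical: title in, article text out
    mode: str = "answer",                 # "answer" or "quiz"
) -> Dict[str, object]:
    # Step 1: have the model reduce the user's query to search keywords.
    keywords = llm(f"Extract search keywords from: {query}").strip()
    # Step 2: send the keywords to the search API to find relevant articles.
    titles = search(build_search_url(keywords))
    # Step 3: run the article content back through the model.
    context = "\n\n".join(fetch_text(title) for title in titles)
    if mode == "quiz":
        instruction = "Write a short quiz based only on the text below."
    else:
        instruction = f"Answer the question '{query}' using only the text below."
    text = llm(f"{instruction}\n\n{context}")
    # Every response carries links to the articles it was built from.
    return {"text": text, "sources": [article_url(title) for title in titles]}
```

Passing the model and network calls in as functions keeps the sketch independent of any particular LLM provider and makes it easy to exercise with stubs.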

Data Collection

We relied on Discord archives to log queries and answers for quality assessment purposes. Users were able to mark bad responses, which helped us understand perceived quality.

We ensured that personal data, such as usernames, was handled appropriately: potentially identifying data was deleted in accordance with our retention policies. No privacy compliance issues have been found.

Other Ideas Explored

  • Other utility or fun use-cases (games, trivia, “Citation Needed”-like fact checking, subscriptions)
  • Prompts to learn more or learn about “related concepts” after a query was completed

Research questions and insights

1. Do people want to interact with Wikipedia knowledge in a more conversational way?

We set out to learn to what degree people expect, want, or trust these types of experiences to get encyclopedic information, and whether it is possible for a chat experience to reflect Wikipedia’s values of neutrality and information integrity. We began experimenting to learn about this via an off-Wiki AI application that would be less risky than starting with an experimental application on-Wiki, but still deliver valuable insights about consumer experience and behavior.

After development, we reached out to 40 Discord servers that we believed would find our bot most useful and/or interesting. These servers were generally concerned with academics, trivia, or niche knowledge. Servers were contacted either through designated surveys for reaching moderator (mod) teams or by directly approaching mods. While five Discord servers showed interest, only two agreed to try out our bot. The first of these was an academics-based server with 50k+ members, and the second was a comedy-based server with 2k+ members.

Regarding the relevant mods for the three servers that did not follow through with implementing the bot: two of these mods did not continue correspondence, and the mod of the third server did not believe our bot's capabilities matched their server's needs.

The mods of the fourth server (a server with over 40,000 members that provides academic help for students in the US) agreed to pilot the bot in their moderator channel. However, after less than 12 hours, the mod team decided not to move forward with granting access, due to concerns about exposing minors to NSFW content. Although we offered further improvements, we were unable to get permission for a second round of testing: the mod team was not confident that the bot could be filtered to the extent they wanted, especially since other models have had difficulty guarding against inappropriate requests.

The mods of the fifth server (a comedic server with over 2,000 members) agreed to try out the bot on a specific channel. Though users (primarily mods) did ask the bot questions, usage decreased and eventually halted.

Insights:

Users were interested in interacting with Wikipedia, but the primary benefit of our bot that appealed to them was that it let them stay on Discord. They were less interested in the bot's conversational nature than in the convenience it offered.

However, we were not able to get significant adoption or usage in these communities. One major barrier was the ability of the bot to provide any information from Wikipedia requested by the user, including information that may be deemed inappropriate/potentially violating Discord's terms of use. Discord servers that cater to minors have additional content restrictions that can result in the server being terminated if inappropriate content appears there, and without the ability to filter NSFW questions/content, a Wikipedia bot cannot be used there.

Ensuring that only appropriate content is accessible, especially for LLM-based experiments and features, is key to adoption and continued usage. Had our bot had better content-filtering capabilities, we might have been able to continue testing in the academic server.

2. Do people want to interact with Wikipedia knowledge in a fun/gamified way?

Games and quizzes have become popular ways for information/media platforms like The New York Times to attract new audiences. Via a partnership with Kahoot! (a quiz platform), Wikipedia has had a Kahoot! channel since November 2023, where 1.9M players have played a Wikipedia quiz in the past year, indicating that there is appetite for off-platform gamified knowledge experiences. Testing new concepts for gamified knowledge experiences off-platform could yield insights to guide investment in on- or off-platform games. The WMF Android App team has prototyped a Wikipedia game experience, but more exploration of how we might deliver gamified experiences (e.g., quizzes around knowledge topics specified by users) would be useful for developing games on or off-wiki.

The comedy-based server agreed to a bot integration for a specific channel, a channel dedicated to a roleplay game based on Wikipedia. This group tried out a few prompts to test the bot’s question-answering capabilities, but were more interested in the bot’s quiz-generating capabilities. However, usage quickly dropped off.

The academic server was also curious about the generated quizzes, but similarly dropped off in usage. Interest was otherwise expressed through passing comments, but the server mods generally did not further engage with quizzes, opting instead to test the bot’s question-answering capabilities. Altogether, we did not achieve the level of usage that we had anticipated for generated quizzes, which we hypothesized would serve as a proxy for interest in gamified features. From observation in other Discord servers, Discord users do heavily engage with games on Discord, but prefer to use bots that specialize in games instead. A future strategy for assessing interest in gamification would likely require specialization.

Insights:

Altogether, users seemed interested in the bot’s quiz-generating capabilities, but not enough to sustain continued usage. By contrast, other bots, specifically those that specialize in games, enjoy high levels of usage.

We were ultimately not able to validate this question due to lack of adoption/usage of the bot. However, we want to continue to learn about ways to turn Wikipedia content into gamified experiences on platforms where this is the explicit goal (e.g. Roblox).

3. Can a Wikipedia experience on Discord increase engagement with Wikipedia among younger audiences?

Discord is home to many highly active Wikipedians (the community-run Wikipedia Discord server has over 7,000 members), but Discord is also home to many other niche communities that are popular among young people, e.g.: gaming, anime/K-Pop fandoms, etc. We were interested in seeing if a Wikipedia bot could spread organically among these communities, and whether this might encourage more active engagement with/contribution to Wikipedia from younger generations.

Insights:

We were not able to validate this question due to lack of adoption/usage of the bot. During our test, we did not observe increased engagement with Wikipedia among younger audiences beyond the test period.

4. Can we ensure that the bot doesn't deliver incorrect, misleading, or harmful information?

We performed several rounds of internal evaluation before releasing the bot publicly on Discord. As with our previous work on the Wikipedia plugin for ChatGPT, this included testing for specific known jailbreak use-cases and failure conditions to minimize incorrect, misleading, or harmful output. We were aiming for a 75-90% acceptable answer rate, with the following output considered acceptable:

Acceptable:
  • "Can't answer this" response (when there is insufficient content on Wikipedia to answer the question, or when the question is not relevant to Wikipedia – e.g., a matter of opinion)
  • Relevant and correct information (provides nuance, clearly states when there is a lack of information or ambiguity in the information on Wikipedia)
  • Irrelevant but correct information (provides information that is not relevant to the user's query – e.g., summarizes from a different but similarly-named article – but does so faithfully and transparently, linking to source article)
Insights:

Of 336 queries, 43% were labeled for answer quality: 30% of all queries were marked as good and 13% as questionable. In other words, about 70% of labeled answers were marked as good and about 30% as questionable. Questionable answers included wrong, inappropriate, and other poor-quality answers. This level of content filtering proved to be insufficient for users.
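
The step from shares of all queries to shares of labeled answers is a simple renormalization, sketched below. It uses only the percentages reported above; the function and variable names are illustrative, and since the reported percentages are rounded the results are approximate.

```python
def shares_of_labeled(good_pct: float, questionable_pct: float) -> tuple:
    """Convert shares of all queries into shares of the labeled subset."""
    labeled_pct = good_pct + questionable_pct  # 30 + 13 = 43% of all queries
    return good_pct / labeled_pct, questionable_pct / labeled_pct


# Reported figures: 30% of all queries marked good, 13% marked questionable.
good_share, questionable_share = shares_of_labeled(30, 13)
print(f"{good_share:.0%} good, {questionable_share:.0%} questionable")
```

The unlabeled 57% of queries drop out of the denominator, which is why the good share rises from 30% of all queries to roughly 70% of labeled ones.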

Our internal testing indicated that the bot was providing a relatively high degree of accurate and relevant answers. However, in real-world testing, the bigger issue was the bot providing information that Discord communities deemed inappropriate (this information is contained in Wikipedia and is encyclopedic, but is not allowed in servers with young users per Discord's policies).

Conclusion

Our Discord bot did not provide the level of value that would have inspired continued usage among our users. Though there was interest in the knowledge-providing and gamified features of our bot, that interest was not enough to sustain usage. To continue experimenting with how gamification can help Wikipedia reach new audiences, we have undertaken experiments in Roblox.