Research:Computational Audiology and Large Language Models: Enhancing Hearing Health Literacy through Wikimedia Projects

Contact: Hector Gabriel Corrale de Matos, Lilian Cássia Bornia Jacob
Duration: September 2024 – March 2026
Keywords: Artificial Intelligence, Computational Audiology, Hearing Health Literacy

This page documents a research project in progress.
Information may be incomplete and change as the project progresses.
Please contact the project lead before formally citing or reusing results from this page.

Introduction and Background


Accessible, accurate health information is critical for public health, yet “information disorder” and online misinformation pose serious threats, especially in healthcare. Large Language Model (LLM) chatbots such as ChatGPT have emerged as powerful tools in education and health communication, with the potential to combat misinformation by providing reliable, conversational access to knowledge. In the field of hearing health, computational audiology has developed as an intersection of audiology and artificial intelligence, applying computing methods to improve hearing care outcomes on a global scale. This approach highlights the importance of digital health literacy – the skills to find, understand, and use health information via digital tools – as a means to empower patients and practitioners alike. Improving health literacy, along with ensuring information is readable and understandable, can greatly enhance self-care, prevention, and health outcomes.

Hearing loss is a widespread public health challenge: approximately 1.5 billion people worldwide experience some degree of hearing loss (about one in five people), and this number could reach 2.5 billion (about one in four) by 2050. Many cases are preventable or untreated, impacting quality of life through communication difficulties, social isolation, and even cognitive decline. Hearing health literacy has thus emerged as a strategic priority to address the global burden of hearing loss. By improving individuals’ understanding of hearing health (prevention, diagnosis, treatment options, assistive technologies, etc.), we can reduce stigma and delays in seeking care. However, achieving this requires not only medical interventions but also effective educational approaches and broad access to trustworthy information.

Wikimedia projects provide a vital open platform for disseminating health information and promoting health education globally. For example, Wikipedia articles on hearing-related topics and Wikidata entries for medical knowledge can reach millions of readers, complementing formal healthcare guidance. Integrating cutting-edge AI such as ChatGPT with Wikimedia’s free knowledge resources presents an innovative approach to boosting hearing health literacy. By leveraging ChatGPT’s conversational abilities alongside Wikipedia and Wikidata’s extensive, community-reviewed content, we aim to help students, professionals, and patients access and contribute high-quality information about hearing health. This research project builds on prior initiatives (such as a FAPESP-funded Wikipedia Education Program in hearing health) and seeks to explore how Large Language Models, combined with Wikimedia content, can enhance learning and public engagement in hearing healthcare.

Objectives


Main Objective


Evaluate the use of computational audiology, specifically the integration of ChatGPT, in promoting hearing health literacy, by leveraging information and open data from Wikimedia free knowledge projects in both educational and clinical contexts.

Specific objectives

  • Use ChatGPT as a tool to facilitate health literacy in hearing health education and clinical practice.
  • Train undergraduate students in Speech-Language Pathology and Audiology (SLP-Audiology) to edit and contribute to Wikimedia projects (e.g. Wikipedia, Wikidata) and to use ChatGPT effectively in this process.
  • Improve and expand hearing health content on Wikipedia and Wikidata by using ChatGPT as part of an educational methodology, thereby increasing the quality and reach of hearing health information.
  • Evaluate the quality of the Wikimedia content (e.g. Wikipedia articles) that SLP-Audiology students create or improve, through assessments by human experts and machine-learning-based metrics.
  • Assess the level of hearing health literacy in patients with hearing loss, using instruments developed for this purpose, and track any improvements throughout the study.
  • Verify and enhance access to hearing health information on Wikimedia projects for patients and the general public, as a means to support hearing health literacy initiatives.

Parallel objectives


Additionally, in parallel with the above goals, the project will explore new computational audiology approaches in both clinical hearing care and educational research, and contribute to advancing the implementation of AI technologies (like chatbots) in healthcare and educational settings. These parallel initiatives ensure the research not only addresses immediate educational outcomes but also informs broader applications of AI in audiology and health education.

Ethical Considerations


The research protocol was submitted to and approved by the Research Ethics Committee of the Bauru School of Dentistry, University of São Paulo (CEP-FOB-USP), under approval No. 6.823.110 on May 14, 2024. The clinical study protocol is currently under development by the research team and will be registered with the Brazilian Registry of Clinical Trials (ReBEC).

SPIRIT-Artificial Intelligence Protocol


The SPIRIT-AI (Standard Protocol Items: Recommendations for Interventional Trials – Artificial Intelligence) statement was used to develop the research project. SPIRIT-AI extends the original SPIRIT checklist and is specifically designed to guide the preparation of clinical trial protocols that involve AI components. Using this checklist ensures that the study protocol meets the specific requirements of AI-based interventions. The checklist will be available in the Open Science Framework repository.

Methodology


This project is designed as a two-phase, mixed-methods intervention study, with both phases using a cross-sectional approach and randomized paired sampling to form control and experimental groups. Below, we outline Phase I (educational intervention) and Phase II (clinical intervention) methodologies:

Phase I: LLM in Educational Activities (Wikipedia/Wikidata Integration)


Context and Participants: Phase I takes place in an academic setting – specifically, the Audiological Theory and Diagnosis III course for second-year undergraduate SLP-Audiology students at the Bauru School of Dentistry, University of São Paulo (FOB-USP). All 38 students enrolled in this course will be invited to participate, with an expected sample of about 16–32 students who consent (the final number depends on voluntary uptake). Participants are paired and grouped using stratified randomization based on prior academic performance, ensuring that each intervention group student is matched with a control group counterpart of similar academic level. This pairing aims to balance the groups and reduce bias when comparing outcomes.
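
As a rough illustration of this pairing procedure, the sketch below ranks students by prior performance, forms adjacent matched pairs, and randomly assigns one member of each pair to each group. This is a minimal sketch under our own assumptions: the names and grades are invented, and the study’s actual randomization procedure may differ.

```python
# Minimal sketch of stratified randomized pairing by prior academic
# performance. Illustrative only: names and grades are invented, and the
# study's actual randomization procedure may differ.
import random

def pair_and_assign(students, seed=42):
    """students: list of (name, grade) tuples -> (intervention, control)."""
    rng = random.Random(seed)
    # Rank by prior performance so adjacent students form matched pairs.
    ranked = sorted(students, key=lambda s: s[1], reverse=True)
    intervention, control = [], []
    for i in range(0, len(ranked) - 1, 2):
        pair = [ranked[i], ranked[i + 1]]
        rng.shuffle(pair)  # coin flip within each matched pair
        intervention.append(pair[0])
        control.append(pair[1])
    return intervention, control

eig, ecg = pair_and_assign([("A", 8.4), ("B", 8.1), ("C", 6.9), ("D", 6.7)])
print("eIG:", eig, "| eCG:", ecg)
```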

Intervention Design: Students will undergo training and then engage in a Wikipedia/Wikidata editing assignment about hearing health. They are divided into two groups: an Education Intervention Group (eIG) and an Education Control Group (eCG).

Intervention (eIG): Students in this group will use ChatGPT (GPT-3.5/GPT-4) as a writing assistant to help research and generate content for Wikipedia articles and Wikidata entries related to hearing health. ChatGPT is accessed via the OpenAI web interface (using institutional accounts) and used in Portuguese, the students’ working language. The chatbot can help with tasks like outlining articles, suggesting references, and translating or simplifying technical information, under the students’ direction.

Control (eCG): Students in this group will perform the same Wikipedia/Wikidata editing tasks without using ChatGPT or any AI assistance. This allows the project to isolate the effect of the AI tool by comparing outcomes between AI-assisted and non-assisted student work.

Training and Support: Before beginning the editing tasks, all participating students attend training sessions on both Wikimedia editing and effective use of ChatGPT. Training covers Wikipedia editing basics (using guides like Wikipedia de A a Z and a Wikiversity tutorial on Hearing Health) as well as Wikimedia’s content policies, citation practices, and use of Wikidata for structured data. In parallel, students are introduced to ChatGPT’s functionality and best practices: they learn about AI’s capabilities and limitations, ethical considerations of using AI in health contexts, and techniques for writing good prompts (drawing on resources such as an AI in Higher Education Quick Start Guide and OpenAI’s prompt engineering guidelines). This combined training ensures that students have the digital skills to contribute productively to Wikimedia projects and to use the chatbot responsibly as a learning aid. All editing activities take place in a supervised computer lab with internet access, where students log into their Wikipedia/Wikidata accounts and the ChatGPT interface as needed.

Activities: After training, students propose or select specific topics in hearing health (e.g. articles on hearing loss, ear diseases, interventions, public health campaigns, etc.) to improve or create on Wikipedia (and corresponding data items on Wikidata). Over the academic term (one semester), each student (or student pair) in both groups works on their assigned topic. Those in the eIG group can query ChatGPT for help – for example, asking for an explanation of a medical term, a summary of research findings, or suggestions on how to simplify complex jargon – and then use that information (with proper citations) to edit Wikipedia. All students are encouraged to incorporate reliable sources (scholarly references, WHO data, etc.) into their contributions. The Wikimedia edits made by both groups are tracked and recorded (for instance, via the Programs & Events Dashboard or page revision histories) for later analysis.
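
For illustration, contribution tracking of this kind can also be done directly against the MediaWiki API. The sketch below pulls a student’s recent edits from Portuguese Wikipedia via list=usercontribs; the username is hypothetical, and in practice the Programs & Events Dashboard mentioned above would be the simpler route.

```python
# Hedged sketch: listing a (hypothetical) student's contributions on
# Portuguese Wikipedia via the MediaWiki API's list=usercontribs module.
import requests

API = "https://pt.wikipedia.org/w/api.php"

def user_contribs(username, limit=50):
    params = {
        "action": "query",
        "list": "usercontribs",
        "ucuser": username,
        "uclimit": limit,
        "ucprop": "title|timestamp|sizediff|comment",
        "format": "json",
    }
    resp = requests.get(API, params=params, timeout=30)
    resp.raise_for_status()
    return resp.json()["query"]["usercontribs"]

# "ExampleStudent" is a placeholder username, not a real participant.
for edit in user_contribs("ExampleStudent"):
    print(edit["timestamp"], edit["title"], edit.get("sizediff", 0))
```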

Evaluation (Phase I): We will evaluate two main outcomes in Phase I: student learning (health literacy gains and wiki-editing skills) and content quality of their Wikipedia/Wikidata contributions. Several instruments and metrics are used for this assessment:

Knowledge Retention Quiz: Immediately after the training sessions and again after the editing phase, students complete an Assessment Questionnaire to gauge how well they retained the training content (covering Wikimedia editing principles and hearing health facts). This helps measure the educational impact of the intervention (with vs. without ChatGPT support).

Wikipedia Contribution Quality: The articles edited or created by students are evaluated through a combination of human review and automated analysis. Instructors or project researchers will perform a Wiki Academic Writing Assessment – essentially reviewing the entries for accuracy, clarity, proper sourcing, and completeness (using a rubric or the Wiki Education Foundation’s evaluation framework). Additionally, we will use Wikimedia’s machine-assisted tools to get quantitative indicators of quality: for example, Lift Wing scoring to estimate content quality, plagiarism detection tools (like CopyVio Detector) to ensure originality, and Wikidata-specific quality checks (e.g. the ProVe tool for item completeness). By comparing these metrics between the eIG and eCG groups, we can determine whether ChatGPT assistance led to measurable improvements in content quality or editing efficiency.
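
To show what the Lift Wing scoring step might look like, the sketch below requests an article-quality prediction for a given revision ID from Wikimedia’s Lift Wing inference API. The model name and revision ID are illustrative assumptions; per-wiki model availability varies, so the appropriate model for the target wiki would need to be confirmed.

```python
# Hedged sketch: requesting an article-quality score from Wikimedia's
# Lift Wing inference API for a given revision. Model name and revision
# ID below are illustrative placeholders, not the study's actual values.
import requests

LIFTWING = "https://api.wikimedia.org/service/lw/inference/v1/models"

def article_quality(rev_id, model="enwiki-articlequality"):
    resp = requests.post(
        f"{LIFTWING}/{model}:predict",
        json={"rev_id": rev_id},
        headers={"User-Agent": "hearing-health-research-sketch/0.1"},
        timeout=30,
    )
    resp.raise_for_status()
    return resp.json()

print(article_quality(1234567))  # hypothetical revision ID
```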

Engagement and Editing Behavior: We will also monitor editing metrics such as the number of edits, bytes added, use of references, and whether students continue editing beyond the assignment. These data (available via the wiki platform) could indicate whether the ChatGPT group felt more confident or productive in editing. Any notable differences in behavior or feedback from students (e.g. their opinions on using the chatbot) will be documented qualitatively.

At the end of Phase I, results from the above evaluations will be analyzed statistically to test our hypothesis that the AI-assisted group produces higher-quality content and/or gains more knowledge than the control group. If the LLM-augmented approach shows significant benefits, it will support the viability of LLMs as educational tools in this domain. These findings will also guide adjustments for Phase II and be prepared for dissemination.
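
Because the design matches each eIG student with an eCG counterpart, a paired comparison is the natural analysis. The sketch below, with invented scores, shows one way to choose between a paired t-test and the Wilcoxon signed-rank test based on a normality check of the pairwise differences; the study’s actual statistical plan may differ.

```python
# Illustrative Phase I analysis sketch: compare matched eIG/eCG pairs on a
# quality score, selecting a paired parametric or non-parametric test from
# a normality check of the pairwise differences. All data are invented.
import numpy as np
from scipy import stats

eig_scores = np.array([78, 85, 90, 72, 88, 81, 76, 93])  # hypothetical
ecg_scores = np.array([74, 80, 85, 70, 82, 79, 75, 86])  # hypothetical

diffs = eig_scores - ecg_scores
_, p_norm = stats.shapiro(diffs)

if p_norm > 0.05:  # differences look approximately normal
    stat, p = stats.ttest_rel(eig_scores, ecg_scores)
    test = "paired t-test"
else:
    stat, p = stats.wilcoxon(eig_scores, ecg_scores)
    test = "Wilcoxon signed-rank"

print(f"{test}: statistic={stat:.3f}, p={p:.4f}")
```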

Phase II: Chatbot-Augmented Hearing Care Sessions (Clinical Application)


Context and Participants: Phase II moves the research into a clinical environment – the Clinical Audiological Diagnosis course (clinical internship) for third-year SLP-Audiology students, conducted at the Hearing Care Service (Audiology Clinic) of FOB-USP. Only students who participated in Phase I (and met a minimum performance/knowledge criterion in Phase I) are eligible to continue into Phase II. We anticipate about 6–16 students in total for Phase II, which will be organized into three small groups for the intervention and control conditions. Each of these groups will work with a cohort of patients at the clinic.

On the patient side, Phase II will recruit adult patients with hearing loss (of mild to moderate degree) who come for services at the university’s Audiology Clinic. Patients will be invited to participate during their routine appointment intake, with a target sample of ~60–120 patients overall (about 20–40 patients assigned to each student group in this pilot). Inclusion criteria ensure patients have the capacity to participate in a counseling dialogue (e.g. excluding those with severe cognitive impairments or illiteracy that would preclude the literacy assessment). All participating patients sign informed consent, and their involvement consists mainly of receiving guidance and answering survey questions.

Intervention Design: Phase II employs a controlled intervention trial in the clinical setting, featuring three groups of student–patient interactions: two intervention variants and one control. Each group consists of 2–6 students (paired with patients one-on-one during sessions) under one of the following conditions:

Clinical Intervention Group (cIG): Students use a general-purpose LLM (GPT-4) assistant (using Microsoft’s Bing Chat / Copilot interface) during patient counseling sessions. In practice, this means while a student is explaining hearing test results or answering a patient’s questions about their condition, the student can consult the chatbot in real time (or just before the session) for any additional information or clarification. The chatbot here operates with its standard knowledge base (no special training beyond its default GPT-4 model), providing answers based on general training data. This simulates using a state-of-the-art AI assistant as a clinical reference tool.

Clinical Wikimedia Intervention Group (cWIG): Students use a Wikipedia/Wikidata-augmented LLM (GPT-4) during sessions. This is a specialized setup in which the chatbot (through the Microsoft Copilot interface) uses Retrieval-Augmented Generation (RAG): it can fetch up-to-date information from Wikipedia articles or Wikidata facts in response to prompts. For example, if a patient asks about a specific treatment or the prevalence of a condition, the chatbot can pull the relevant data from Wikipedia to provide a factually grounded answer. The aim is to combine the conversational strength of GPT-4 with the reliability of vetted Wikimedia content. Students in cWIG will incorporate this augmented chatbot to support the guidance they give patients, ideally providing more accurate or source-backed information.
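
To make the retrieval-augmented pattern concrete, the minimal sketch below fetches a vetted Wikipedia summary and prepends it to the prompt sent to a language model. This only illustrates the underlying idea under our own assumptions: the study itself relies on Microsoft’s Copilot interface, and the ask_llm call is a placeholder for whichever LLM client is actually used.

```python
# Minimal RAG illustration: ground an LLM prompt in a Wikipedia summary.
# The study uses Microsoft Copilot; this DIY sketch only shows the idea.
import requests

def wikipedia_summary(title, lang="pt"):
    """Fetch the lead-section extract via the Wikipedia REST API."""
    url = f"https://{lang}.wikipedia.org/api/rest_v1/page/summary/{title}"
    resp = requests.get(url, timeout=30)
    resp.raise_for_status()
    return resp.json()["extract"]

def grounded_prompt(question, title):
    context = wikipedia_summary(title)
    return (
        "Answer using only the context below, and say so if it is "
        f"insufficient.\n\nContext: {context}\n\nQuestion: {question}"
    )

prompt = grounded_prompt(
    "Qual a prevalência de perda auditiva?", "Perda_auditiva"
)
# answer = ask_llm(prompt)  # placeholder for the actual LLM client
print(prompt[:300])
```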

Clinical Control Group (cCG): Students conduct their patient guidance sessions without any AI assistance, following the usual standard of care and counseling they have been taught. They rely solely on their own knowledge and any printed reference materials, which reflects typical clinical practice. This group serves as the baseline to compare how having an AI assistant might change the information given to patients or affect patient outcomes.

All participating students, regardless of group, are instructed to cover the same basic scope during the Guidance Sessions: e.g. reviewing the patient’s audiological evaluation results, discussing the nature of the hearing loss or ear condition, advising on management (like hearing aids, follow-up, or preventive measures), and answering the patient’s questions. Each session is supervised by clinical faculty or the research team to ensure patient safety and that misinformation is not provided. Interaction with the AI (for cIG and cWIG) is designed to be optional and supplementary – students are not required to use it for every question, but they have it available as a tool. When used, the student may either consult the chatbot privately (and then relay the information to the patient) or, when appropriate, show the patient relevant information from Wikipedia directly. The research team logs these chatbot interactions for later analysis (recording the questions asked and answers given, without storing any patient identifiers). Sessions typically last the standard length (~15–30 minutes), within which the AI consultation might take a few minutes.
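
A de-identified log of this kind could be structured as sketched below. This is hypothetical: the field names and session codes are our own assumptions rather than the study’s actual schema; the point is that each exchange is tied to an opaque session code and group label, never to a patient identifier.

```python
# Hypothetical sketch of a de-identified chatbot interaction log.
# Field names are assumptions, not the study's actual schema.
from dataclasses import dataclass, field, asdict
from datetime import datetime, timezone
import json

@dataclass
class ChatExchange:
    session_code: str  # opaque session code, never a patient identifier
    group: str         # "cIG", "cWIG", or "cCG"
    question: str      # what the student asked the chatbot
    answer: str        # what the chatbot returned
    timestamp: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )

log = [ChatExchange("S-017", "cWIG",
                    "Prevalence of presbycusis in adults over 60?",
                    "Roughly one in three adults over 65... [example]")]
print(json.dumps([asdict(e) for e in log], indent=2))
```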

Evaluation (Phase II): The primary outcome in Phase II is the hearing health literacy of the patients after receiving counseling. We measure this using a custom Hearing Health Literacy Assessment Questionnaire (developed by the research team), which patients fill out at the end of their session. This questionnaire (delivered via a tablet or paper, with assistance if needed) collects data on two aspects:

  • (i) the patient’s background and demographic factors (age, education, duration of hearing loss, etc.) and
  • (ii) their health information habits and understanding. Key questions ask how confident they feel about understanding their hearing condition, whether they know where to find information (e.g. have they used the internet or Wikipedia for hearing health info before), their perceived need for further healthcare, and their ease of understanding the explanations given during the session.

The responses will be compared across the three groups to see if those who experienced an AI-augmented session show higher comprehension or satisfaction. We hypothesize that patients guided with the support of the wiki-augmented chatbot (cWIG) may demonstrate the greatest gains in understanding, given the more evidence-based information provided, but this will be tested against the control group (cCG) statistically.

In addition to patient surveys, Phase II collects qualitative and quantitative data on the content of the guidance sessions. The research team will analyze the “dialogue” that occurred in each session, particularly for the AI groups: what questions were asked of the chatbot, what answers were given, and how the student integrated that information when speaking to the patient. A Chatbot Interaction Analysis Protocol (developed by the research team) will be applied to evaluate the relevance and accuracy of AI-provided content. For example, we might rate each AI answer on whether it was factually correct (and aligned with Wikipedia/Wikidata sources), and whether it addressed the patient’s question effectively. Any differences in the quality of information exchange (AI-assisted vs not) will be noted. We will also observe the students’ behavior – does having the chatbot make them more confident or does it introduce any delays? – and gather feedback from students about the experience afterward.

All Phase II data (survey results, session logs, etc.) will undergo statistical analysis similar to Phase I. We will use appropriate tests (parametric or non-parametric, depending on data distribution) to compare outcomes between the two intervention groups and the control. The relatively small sample of students means this phase is exploratory (pilot-scale), but with up to ~100 patients we expect to see indicative trends. Our analysis will examine whether the presence of a general or wiki-augmented chatbot significantly improves patient literacy scores, and whether one type of chatbot outperforms the other. This will inform conclusions on the practicality and benefit of integrating LLMs into routine hearing care consultations.
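
As an illustration of the planned parametric/non-parametric choice across the three groups, the sketch below (with invented literacy scores) falls back from a one-way ANOVA to the Kruskal-Wallis test when a normality check fails. The actual tests, sample sizes, and any post-hoc corrections will follow the study’s statistical analysis plan.

```python
# Illustrative Phase II sketch: compare literacy scores across cIG, cWIG,
# and cCG, using one-way ANOVA if all groups pass a normality check and
# Kruskal-Wallis otherwise. All scores below are invented placeholders.
from scipy import stats

cig  = [62, 70, 66, 73, 68, 64]   # hypothetical literacy scores
cwig = [71, 75, 80, 69, 77, 74]
ccg  = [58, 63, 60, 65, 61, 59]

normal = all(stats.shapiro(g).pvalue > 0.05 for g in (cig, cwig, ccg))

if normal:
    stat, p = stats.f_oneway(cig, cwig, ccg)
    test = "one-way ANOVA"
else:
    stat, p = stats.kruskal(cig, cwig, ccg)
    test = "Kruskal-Wallis"

print(f"{test}: statistic={stat:.3f}, p={p:.4f}")
```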

Timeline


The project is slated to run from 2024 to 2026, divided into the two phases described above. Below is a timeline of the major stages and milestones:

Milestones and current status (the original schedule spans four half-year periods, from January 2024 through December 2025):

  • Project planning and ethics approval (research proposal submitted to IRB) – Done
  • Phase I participant recruitment (invite and enroll students) and preparatory training sessions on Wikimedia editing and ChatGPT use – Done
  • Phase I Wikimedia contributions by students (Wikipedia/Wikidata editing assignments on hearing health) – Current
  • Phase I evaluation: collect post-training quizzes, track edits, survey student experiences (Wiki Academic Writing Assessment), and apply quality metrics (Lift Wing) – Current
  • Phase I data analysis: analyze student outcomes and content quality; perform statistical comparisons – Current
  • Interim results and dissemination: compile a partial report; present Phase I findings at workshops/conferences – Current
  • Phase II preparation: update training content for clinicians; organize the patient guidance session schedule; recruit patients and brief participating students – Current
  • Phase II implementation: conduct chatbot-assisted and control guidance sessions with patients; administer literacy questionnaires – Current
  • Phase II outcome analysis: evaluate questionnaire data and analyze chatbot interaction logs; perform statistical analysis of Phase II results – Current
  • Final report and publications: integrate Phase I and II results into the final research report; draft journal papers and share outcomes with the Wikimedia community (e.g., Diff blog, Meta page) – Current

Note: Throughout the project timeline, ongoing tasks include project management and community engagement. For instance, the project will maintain a presence on Wikimedia platforms: updates or preliminary results may be posted on Meta-Wiki or discussed in the Wikimedia Research community, and the project is expected to be featured on the Wikimedia Diff blog to disseminate progress to a broader audience. By the end of 2025, we anticipate having completed both phases, analyzed the data, and begun publishing the findings.

Research Team


Host Institution: Bauru School of Dentistry, University of São Paulo (FOB-USP), Brazil – Department of Speech-Language Pathology and Audiology

Project Lead: Dr. Lilian Cássia Bornia Jacob – Principal Investigator.

Student Researcher: B.Sc. Hector Gabriel Corrale de Matos – Graduate student (master’s). Responsible for executing the project.

Funding


The research project titled Computational audiology as a gateway for hearing health literacy: usage of ChatGPT for education and health information in Wikimedia projects is funded by the São Paulo Research Foundation (FAPESP) under Grant No. 2024/05572-7 for the period May 2025 – April 2026.

The principal investigator is Prof. Dr. Lilian Cássia Bornia Jacob and the associate investigator is Hector Gabriel Corrale de Matos, both based at the Department of Speech-Language Pathology and Audiology, Bauru School of Dentistry, University of São Paulo, Brazil.

Beyond the FAPESP grant described above, this project does not receive funding from the Wikimedia Foundation or any other source.