Research:Wikipedia and consumer health information behaviour

Duration:  2021-Summer – 2022-Summer/Fall

This page documents a research project in progress.
Information may be incomplete and change as the project progresses.
Please contact the project lead before formally citing or reusing results from this page.


Background

Gutenberg's press in the 15th century disrupted the esoteric tradition of the academy. The printing press reduced the cost of printing books and opened the gates to knowledge and stories. Likewise, the emergence of Wikipedia two decades ago removed the cloak of the academy and democratized human knowledge. Traditionally, new knowledge has been created by researchers and published in academic journals that are, more often than not, locked behind a paywall. Further, financial means alone are not sufficient, as the language of academia, and in particular of medicine, is not universally accessible.

The quality of Wikipedia’s health content has received the vast majority of the academic attention paid to Wikipedia in the context of its usage as a health information resource. The reports of Wikipedia’s quality in the academic literature generally focus on Wikipedia’s suitability for patients or the general health consumer, students in health sciences, or professionals in the field of health and wellness.

To date, quality assessments of Wikipedia have covered gastroenterology, nephrology, cancer, autoimmune disorders, medicinal drugs or herbal supplements, and more.[1] These assessments evaluate readability, reliability, and accuracy or completeness, and specifically discuss their findings in relation to the public consumer or patient.

There is some agreement that Wikipedia is suitable for patients, and a 2010 study found that while Wikipedia is not necessarily the superior resource, it is the preferred resource. Wikipedia’s medical content is valuable simply because it is available, it is free, and it is accessible to any individual with an internet connection and reading skills. People are reading Wikipedia and people are reading its medical content.[2] More recent research has found that efforts to improve Wikipedia’s medical content have improved its readability for the general public (Brezar & Heilman, 2019), efforts that come from volunteer editors, students,[3][4][5][6] and professionals[7] alike.

Wikipedia’s influence is not yet fully understood. It has been assumed that “the public relies on free online medical information for making health decisions”[8] but it has not yet been documented whether there are outcomes, such as decision making, related to the health information behaviour of individuals who access Wikipedia’s health and medical content. An investigation into the consumer health information behaviour (CHIB) of individuals who use Wikipedia’s health and medical content could produce evidence that would provide information behaviour researchers, health professionals, and Wikipedia advocates, particularly WikiProject Medicine, with a richer understanding of health information seeking, sharing, use, or encounters on Wikipedia and the outcomes of this behaviour, if any.

Conceptual framework

The researcher will take a criticalist approach to this study. A criticalist research philosophy is framed by the acknowledgement of unequal power structures in society, among other factors such as human consciousness, biases and values[9]. Criticalist researchers conduct their investigations from the perspective of an advocate with the goal that their findings might influence social change[10]. The goal of this study is to produce insight into the role of Wikipedia in consumer health information behaviour and how it might intersect with the digital divide phenomenon. The outcomes of this research could influence the strength of the Wikipedia editing community’s efforts to attract more health and medical experts to contribute to its pages, thus raising the quality, readability, and completeness of the world’s most frequently accessed web site for health information.

Through this criticalist lens, the researcher will apply Bodie and Dutta’s theory, framed by their Updated Integrative Model of eHealth Use[11], that:

…structural inequities reinforce themselves and continue to contribute to healthcare disparities through the differential distribution of technologies that simultaneously enhance and impede literacy, motivation, and ability of different groups (and individuals) in the population.[12]

Bodie and Dutta explicitly consider the influence of an individual’s situation, personality, demographic and internet use history as determinants of their position in the digital divide and subsequently as factors that can influence CHIB. While this model is not grounded in the library and information science discipline, it identifies the relationship between health outcomes, personal characteristics, ability, motivation, and health literacy in the context of understanding the role of Wikipedia’s health and medical content in CHIB. The researcher will apply this theory to the role of Wikipedia to explore whether these structural inequities, operationalized as micro-level variables (e.g. income, education, access to internet at home, situation) influence patterns of use of Wikipedia’s health and medical pages.

In order to explore actual use of Wikipedia’s health and medical content, the “Longo Health Information Model: Information seeking, passive receipt, and use” will be used to frame the research. Longo’s model includes a comprehensive list of possible personal variables (e.g. education, health literacy, language) and contextual variables (e.g. health status, situation, reason for searching) that influence consumer health information behaviour. Further, it provides space for both active use of Wikipedia’s health and medical content (active information seeking) as well as more passive encounters with Wikipedia (passive receipt of information). Finally, Longo’s model (Figure 1) offers the researcher an opportunity to consider four possible outcomes of the CHIB: empowerment, satisfaction, changes to activities of daily living, and health outcomes.

Research questions

  1. Who is accessing Wikipedia’s health and medical pages? How can this population be described? Of those who access it, how is Wikipedia’s medical and health content encountered (e.g., passively or actively)?
  2. Are there benefits or concerns to having access to Wikipedia’s health content and if yes, what are they?
  3. Is there a relationship between how Wikipedia is accessed and the social, physical, personal, or cognitive factors that can place an individual on either end of the digital divide?
  4. Is there a relationship between individuals’ perceived benefits and concerns about Wikipedia’s health content and the social, physical, personal, or cognitive factors that can place an individual on either end of the digital divide?

Methods and design

Participants and recruitment

Participants will be eligible for inclusion in this study if they are at least 18 years of age at the time of interviewing and they have read or have had shared with them health or medical content from Wikipedia in the past year. There will be no geographic restrictions to participation.

Participant recruitment will be conducted through study advertisement, using paper and electronic recruitment posters. Paper posters will be posted in community-based settings where appropriate and permitted, such as local coffee shops or community centres. Electronic posters will be distributed by email through large-scale listservs such as Wikipedia’s community listserv.

Participants will be asked to consent to participation by completing either an electronic form on MS Forms or, if they choose to be interviewed and complete the survey in person, a print form. For electronic form submission, the researcher will complete a second form confirming that she has provided all the relevant information about the study and believes to the best of her knowledge that the participant has made an informed decision to consent or decline to consent. Electronic forms will be stored on the researcher’s McMaster OneDrive account. Paper forms will be stored in a locked filing cabinet in the researcher’s home office. All participants will be able to withdraw their consent at any time during the interview or while completing the survey.

All participants will be informed that they can withdraw their consent at any time during the interaction with the researcher. However, once the interaction is complete and the participant’s interview and survey data have been anonymized, it will not be possible to remove their data from the study.

Sampling

The researcher has one year to collect and analyze data for the study. Given the temporal limitations of the study, the researcher has set an achievable goal of interviewing and surveying a maximum of 25 participants. While larger scale studies are generally preferred, in-depth qualitative analysis of 25 or fewer purposively sampled participants will allow the researcher to record insightful findings and provide a solid foundation for future, more in-depth or broad-sweeping research.

The researcher will stop interviewing and surveying participants when 25 interviews/surveys have been conducted, or when the researcher reaches a point of data saturation, whichever occurs sooner.

Data collection, analysis, and storage

Survey data will be collected online. Participants will not be asked to provide identifying information in this survey. Each survey response will be assigned the same identification number as the interview. For example, the number WP-001 will be assigned to a participant’s interview recording as well as their survey data. The survey questions are provided in Appendix B. The researcher will conduct analysis of survey data using SPSS statistical software. The researcher will generate descriptive statistics to describe any and all variables related to the use of Wikipedia’s health content. This data will provide insight into RQ1.
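The shared identifier scheme and the descriptive statistics described above can be illustrated with a short sketch. The study itself uses SPSS; the Python below is only an illustration, and the variable names (`access_mode`, `home_internet`) and all response values are invented for the example rather than taken from the survey in Appendix B.

```python
from collections import Counter


def participant_id(n):
    """Format a sequence number as a study identifier, e.g. 1 -> 'WP-001'."""
    return f"WP-{n:03d}"


# Hypothetical survey responses, keyed by the same identifier that would
# label the participant's interview recording. All values are invented.
survey = {
    participant_id(1): {"access_mode": "active", "home_internet": "yes"},
    participant_id(2): {"access_mode": "passive", "home_internet": "yes"},
    participant_id(3): {"access_mode": "active", "home_internet": "no"},
}


def frequency_table(responses, variable):
    """Return {value: (count, percent)} for one categorical survey variable."""
    counts = Counter(r[variable] for r in responses.values())
    total = sum(counts.values())
    return {
        value: (count, round(100 * count / total, 1))
        for value, count in counts.items()
    }


access_summary = frequency_table(survey, "access_mode")
print(access_summary)
```

Because the identifier carries no personal information, the same key can link a survey record to its interview transcript without storing names.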

All interviews will be conducted either in-person or via Zoom video-conferencing software. For in-person interviews, the full interview will be audio-recorded on the researcher’s private recording device. For video-conference interviews, the full interview will be recorded in Zoom. The researcher will select the option to save audio recordings of the interviews directly to the researcher’s hard drive. Interviews will be transcribed, and the transcriptions will be uploaded to NVivo. Each interview will be assigned a unique identification number (e.g., WP-001). The interview guide is provided in Appendix A. Data analysis will begin after the first participant has completed the interview and survey. Analysis will continue on a rolling basis until the researcher has either reached information saturation or 25 interviews have been conducted, whichever occurs first. This process allows themes to emerge from the data organically and provides the researcher an opportunity to consider these themes in future interviews.
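The stopping rule (25 interviews or saturation, whichever occurs first) can be sketched as follows. The protocol does not define how saturation is judged, so the rule used here, no new codes in the three most recent interviews, is purely an assumption for illustration, as are the names `SATURATION_WINDOW` and `should_stop` and the example code labels.

```python
MAX_INTERVIEWS = 25      # cap stated in the protocol
SATURATION_WINDOW = 3    # assumption: the protocol does not specify a window


def should_stop(codes_per_interview):
    """Decide whether to stop recruiting.

    codes_per_interview is a list of sets, one per completed interview in
    chronological order, each holding the codes applied to that transcript.
    """
    if len(codes_per_interview) >= MAX_INTERVIEWS:
        return True
    if len(codes_per_interview) <= SATURATION_WINDOW:
        return False  # too few interviews to judge saturation
    earlier = set().union(*codes_per_interview[:-SATURATION_WINDOW])
    recent = set().union(*codes_per_interview[-SATURATION_WINDOW:])
    # Saturation (under this assumption): recent interviews add no new codes.
    return recent <= earlier


# Hypothetical code labels, invented for the example.
history = [{"trust"}, {"convenience"}, {"trust"}, {"convenience"}, {"trust"}]
print(should_stop(history))
```

In practice saturation is a qualitative judgement by the researcher; a rule like this would only serve as a prompt to consider stopping.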

The researcher will analyze interview data using NVivo software by employing a two-cycle open coding methodology using a variety of descriptive, in vivo, interpretation, feeling, belief system, assessment, and causation codes as they arise (Aurini, 2016, p. 195). The researcher is aware that open coding is often aligned with grounded theory methodology. Although this research is already grounded in a model of health information behaviour, the researcher has selected open coding for data analysis in order to identify any and all relevant findings with respect to how users engage with Wikipedia’s health and medical content. Since this topic has yet to be explored by any researcher, it is important to consider all data in the interview transcripts.

In addition to determining how the use of Wikipedia for health information aligns with the Longo model of health information behaviour, the analysis of interview transcripts has the potential to provide rich insight into how Wikipedia’s health and medical content is used by those who access it, whether individuals have concerns about it, what those concerns might be, whether individuals have benefitted from using Wikipedia’s health and medical content, and what those benefits might be.

Finally, RQ3 will be answered with data collected from the survey and interviews to identify whether any patterns or relationships emerge between how Wikipedia’s health and medical content is accessed (e.g., passively received or actively sought out) and the social, physical, personal, or cognitive factors that can place an individual on either end of the digital divide. RQ4 will also be answered with data collected from the survey and interviews to discern whether any patterns or relationships emerge between individuals’ perceived benefits and concerns about Wikipedia’s health content and the social, physical, personal, or cognitive factors that can place an individual on either end of the digital divide. This analysis will be conducted using Chi-Square tests in SPSS to test whether there is any relationship between how Wikipedia’s content is accessed (passively or actively) and demographic variables (RQ3) and to test whether there is any relationship between whether the individuals perceive health information on Wikipedia as concerning, beneficial, or both, and their demographic variables (RQ4).
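To make the Chi-Square step concrete, the sketch below computes a Pearson chi-square test of independence for a 2×2 table of access mode by one demographic variable. It is a plain-Python illustration, not the SPSS procedure used in the study; the counts are invented, and the function name `chi_square_independence` is hypothetical. The p-value uses the closed form for one degree of freedom only. The sketch also counts cells with an expected frequency below 5, a common rule of thumb that matters with a sample of 25 or fewer.

```python
import math


def chi_square_independence(table):
    """Pearson chi-square test of independence on a table of observed counts.

    table is a list of rows, each a list of non-negative counts.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    if 0 in row_totals or 0 in col_totals:
        raise ValueError("every row and column needs at least one observation")

    # Expected count under independence: row total * column total / n.
    expected = [[r * c / n for c in col_totals] for r in row_totals]
    statistic = sum(
        (obs - exp) ** 2 / exp
        for obs_row, exp_row in zip(table, expected)
        for obs, exp in zip(obs_row, exp_row)
    )
    dof = (len(row_totals) - 1) * (len(col_totals) - 1)

    # Closed-form upper tail of the chi-square distribution for dof == 1;
    # larger tables would need a statistics library for the p-value.
    p_value = math.erfc(math.sqrt(statistic / 2)) if dof == 1 else None

    cells_below_5 = sum(1 for row in expected for e in row if e < 5)
    return {
        "statistic": statistic,
        "dof": dof,
        "p_value": p_value,
        "cells_below_5": cells_below_5,
    }


# Invented counts. Rows: active vs. passive access.
# Columns: internet access at home, yes vs. no.
observed = [[10, 3],
            [4, 8]]
result = chi_square_independence(observed)
print(result)
```

When `cells_below_5` is greater than zero, the chi-square approximation may be unreliable and an exact test could be considered instead.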

Two years from the date of the final interview, all data from both the survey and interviews will be permanently erased from any software (e.g., NVivo, SPSS), hard disks, or cloud-based software and their associated temporary files on the researcher’s computer. All printed documentation, such as notes taken during interviews and print-administered surveys, will be shredded and recycled.

Limitations

The researcher has intentionally elected not to include in the study those who do not access Wikipedia. While this is an incredibly important population to consider in the exploration of Wikipedia’s use among the general population, and particularly in consideration of the broader impact of the digital divide, the researcher has decided to focus this investigation on how Wikipedia is used, accessed, and perceived among those who use it for health information. Gaps in access to Wikipedia are beyond the scope of this study, but could be explored in future research.

Timeline

Please provide in this section a short timeline with the main milestones and deliverables (if any) for this project.

Policy, Ethics and Human Subjects Research

This protocol has been reviewed and approved by the Hamilton Integrated Research Ethics Board (HIREB) under project ID 13392.

Author information

The author is a Wikipedia researcher and Wikipedia contributor. The author also supervises undergraduate student editing projects.

Funding

This study has been awarded funding through the Wikimedia Foundation's Rapid Grants Program.

Progress Report

August 2021: Interviews underway.

July 2022: Data analysis complete. Drafting manuscript for publication.

Results

Early findings and preliminary results will be shared here. Final findings will be submitted for publication to an appropriate peer-reviewed journal, such as the Journal of Consumer Health on the Internet, the Journal of Medical Internet Research, or the WikiJournal of Medicine.

Publications

The groundwork for this study was published in the Journal of Documentation in August 2021.[13] A full-text pre-print manuscript can be found at: http://hdl.handle.net/11375/26812.

The results were published in First Monday in August 2023.[14]

References

  1. Smith, Denise A. (2020). "Situating Wikipedia as a health information resource in various contexts: A scoping review". PLOS ONE 15 (2): e0228786. ISSN 1932-6203. PMC 7028268. PMID 32069322. doi:10.1371/journal.pone.0228786. 
  2. Heilman, James M.; West, Andrew G. (2015-03-04). "Wikipedia and Medicine: Quantifying Readership, Editors, and the Significance of Natural Language". Journal of Medical Internet Research 17 (3): e4069. PMC 4376174. PMID 25739399. doi:10.2196/jmir.4069. 
  3. Azzam, Amin; Bresler, David; Leon, Armando; Maggio, Lauren; Whitaker, Evans; Heilman, James; Orlowitz, Jake; Swisher, Valerie; Rasberry, Lane; Otoide, Kingsley; Trotter, Fred (2017). "Why Medical Schools Should Embrace Wikipedia: Final-Year Medical Student Contributions to Wikipedia Articles for Academic Credit at One School". Academic Medicine (in en-US) 92 (2): 194–200. ISSN 1040-2446. PMC 5265689. PMID 27627633. doi:10.1097/ACM.0000000000001381. 
  4. Maggio, Lauren A.; Willinsky, John M.; Costello, Joseph A.; Skinner, Nadine A.; Martin, Paolo C.; Dawson, Jennifer E. (2020-12-01). "Integrating Wikipedia editing into health professions education: a curricular inventory and review of the literature". Perspectives on Medical Education 9 (6): 333–342. ISSN 2212-277X. PMC 7718341. PMID 33030643. doi:10.1007/s40037-020-00620-1. 
  5. Murray, Heather; Walker, Melanie; Maggio, Lauren; Dawson, Jennifer (2018-06-01). "24 Wikipedia medical page editing as a platform to teach evidence-based medicine". BMJ Evidence-Based Medicine 23 (Suppl 1): A12–A13. ISSN 2515-446X. doi:10.1136/bmjebm-2018-111024.24. 
  6. Weiner, Shira Schecter; Horbacewicz, Jill; Rasberry, Lane; Bensinger-Brody, Yocheved (2019). "Improving the Quality of Consumer Health Information on Wikipedia: Case Series". Journal of Medical Internet Research 21 (3): e12450. PMC 6441860. PMID 30882357. doi:10.2196/12450. 
  7. Shafee, Thomas; Masukume, Gwinyai; Kipersztok, Lisa; Das, Diptanshu; Häggström, Mikael; Heilman, James (2017-11-01). "Evolution of Wikipedia’s medical content: past, present and future". J Epidemiol Community Health 71 (11): 1122–1129. ISSN 0143-005X. PMC 5847101. PMID 28847845. doi:10.1136/jech-2016-208601. 
  8. Shafee, Thomas; Masukume, Gwinyai; Kipersztok, Lisa; Das, Diptanshu; Häggström, Mikael; Heilman, James (2017-11-01). "Evolution of Wikipedia’s medical content: past, present and future". J Epidemiol Community Health 71 (11): 1122–1129. ISSN 0143-005X. PMC 5847101. PMID 28847845. doi:10.1136/jech-2016-208601. 
  9. Waller, V; Farquharson, K; Dempsey, D (2016). Qualitative social research: Contemporary methods for the digital age. SAGE. p. 29. 
  10. Waller, V; Farquharson, K; Dempsey, D (2016). Qualitative social research: Contemporary methods for the digital age. SAGE. p. 21. 
  11. Bodie, Graham D.; Dutta, Mohan Jyoti (2008-07-02). "Understanding Health Literacy for Strategic Health Marketing: eHealth Literacy, Health Disparities, and the Digital Divide". Health Marketing Quarterly 25 (1-2): 175–203. ISSN 0735-9683. PMID 18935884. doi:10.1080/07359680802126301. 
  12. Bodie, Graham D.; Dutta, Mohan Jyoti (2008-07-02). "Understanding Health Literacy for Strategic Health Marketing: eHealth Literacy, Health Disparities, and the Digital Divide". Health Marketing Quarterly 25 (1-2): 175–203. ISSN 0735-9683. PMID 18935884. doi:10.1080/07359680802126301. 
  13. Smith, D.A. (2021). "Wikipedia: an unexplored resource for understanding consumer health information behaviour in library and information science scholarship". Journal of Documentation ahead-of-print (ahead-of-print). doi:10.1108/JD-03-2021-0049. 
  14. Smith, Denise A. (2023-08-12). "“I’m comfortable with it”: User stories of health information on Wikipedia". First Monday. ISSN 1396-0466. doi:10.5210/fm.v28i8.12897.