Research:Explaining the Wikipedia reader gender gap
This research project focuses on identifying and summarizing academic and industry research related to a gender gap among Wikipedia readers, and generating a set of testable hypotheses related to the causes of that gender gap.
A recent large-scale survey by Wikimedia Research of Wikipedia readers across 13 Wikipedia language editions found a widespread and substantial gender gap, with women underrepresented among survey respondents.
A gender gap among Wikipedia editors has been observed since at least 2008, and has been confirmed by multiple research studies, using different methods, across several different Wikipedias. There have been a variety of reasons proposed for the editor gender gap, as well as potential consequences of that gap.
However, until the recent large-scale survey, there had been relatively little research on Wikipedia readers. The magnitude and persistence (across projects and time) of the gender gap found in the recent survey are both a surprise and a cause for concern: on average, roughly 67% of respondents across all 13 wikis studied identified as men.
This finding is surprising because, given both Wikipedia's immense reach and widespread popularity, and a general lack of evidence of gender disparity in overall internet usage, the Wikimedia Foundation did not approach this survey project with a hypothesis that a substantial gender gap would exist among readers, or that this gap would be observed across most or all projects surveyed.
This finding is a cause for concern for several reasons:
- A reader gender gap threatens the Wikimedia Foundation's mission of "empowering and engaging people around the world" with free knowledge. If roughly 50% of humanity is under-utilizing Wikipedia, it is important for the Wikimedia Foundation and the Wikimedia Movement to understand why, so that we can ensure we are doing all we can to make Wikipedia meet the information needs of women.
- A reader gender gap likely contributes to the gender gap among contributors, most of whom presumably start out as readers.
- A reader gender gap likely contributes to systemic knowledge gaps in a collaborative project where contributors write and curate content on a voluntary basis, based on topics that they have prior knowledge about and/or are interested in writing about.
This research project consists of a literature review. Candidates for review were selected from Google Scholar using combinations of the following keywords with "Wikipedia": read[ing | er], gender, women, gender gap, female. The abstracts of all results for these queries were screened and ones that seemed relevant were downloaded for subsequent review. The Research namespace on Meta.wikimedia.org was also searched for a subset of these keywords.
During the review, additional candidate research papers were identified and added to the set if they appeared within the set of articles already collected, and appeared relevant to the overall project goal.
After all works were reviewed, we identified common themes and key results, and generated hypotheses through an inductive and iterative process of qualitative analysis.
The list of reviewed works can be found here.
A. The reader gender gap has been well-documented
There is substantial evidence for a reader gender gap from previous research conducted across multiple geographies and Wikipedia languages. However, much of this work consisted of small-scale studies, often of particular populations (chiefly secondary school and college students), making it difficult to know whether the observed trends were representative of Wikipedia's worldwide readership. In addition, much of this research is a decade (or more) old, and it is fair to question how gender-mediated reading behaviors may have changed as Wikipedia's content, reputation, and popularity have grown.
Many of the Wikimedia Foundation's own reader surveys have either not asked about the gender of readers, or have not cross-tabulated the answers to use-related questions in a way that would allow them to identify gender gaps in overall frequency or type of usage. However, there are exceptions. The UNU-MERIT survey, conducted in 2008, found a substantial reader gap, with 69% of Wikipedia reader respondents identifying as male. The 2014 Wikimedia Global South Survey also found both a different overall usage rate and a substantial difference in frequency of use between male-identified and female-identified readers. Both of these surveys were conducted across multiple languages and geographies.
The largest-scale independent study to demonstrate a reader gender gap is the 2011 Pew Research Center survey of Americans. In that study, the main finding showed a small gap in the answer to the question "Do you use Wikipedia?" (56% men vs. 50% women). When participants were asked about frequency of use ("Did you use Wikipedia yesterday?"), the gender gap opened wider: 60% men vs. 40% women. The frequency of use response data was not published in the final report, but is available via the Pew website to account-holders. A smaller-scale survey study conducted in December 2015, delivered via banner on Greek Wikipedia, also found that only 38% of readers identified as women.
More recent research among student populations has also shown a gender gap in frequency of use. Kim et al. found that male students at a large public (American?) university were significantly more likely to use Wikipedia for "information seeking" than female students. This study was published in 2014. Lim and Kwon, in a study of University of Wisconsin students in 2008, also found that men reported using Wikipedia more frequently within the past semester than women. Other studies of college students have also found gender differences in likelihood of use. A survey of Amazon Mechanical Turk workers also found that women reported using Wikipedia less frequently than men.
In summary, there is a body of literature that shows that a reader gender gap exists, that it is most strongly reflected in frequency of use (rather than likelihood of occasional use), and that it has existed for over a decade. In view of this, perhaps the most surprising thing about this finding from the 2019 survey is that it was a surprise at all: given all the attention paid to gender-mediated contributor and content gaps by the Wikimedia Foundation, the broader academic community, and to some extent even the popular press, why have the general extent and causes of the Wikipedia reader gender gap not been a major area of focus for deep, large, and systematic studies?
B. Women tend to find Wikipedia less useful and rate it less favorably than men
The literature review contains several studies that ask people (mostly students) how useful Wikipedia is, or ask them to evaluate its quality in some other respect. Among these studies, those that broke out responses by gender showed that female students in these studies tended to rank Wikipedia lower by these measures than their male counterparts. Other studies have shown that, at least among college students, perceptions of Wikipedia's usefulness or value tend to increase over the course of their education.
Considered alongside the finding that men tend to use Wikipedia more than women in general, it is possible that the lower value evaluation of Wikipedia among women is at least in part due to the fact that they tend to use it less often, although the relationship here is probably more complex than simple one-way causation.
C. Trust and self-efficacy likely play a role
It is likely that trust plays a role in why women read Wikipedia less frequently than men do. Some studies have shown that women rate the quality of Wikipedia lower on average than men, or trust it less. One of the causes of this trust/perceived quality gap is likely to be the well-documented self-efficacy gap between men and women. In other words, women trust Wikipedia less because they tend to under-estimate their ability to assess the quality/veracity/usefulness of information, compared with similarly-skilled men. This gap has been demonstrated for internet skills in general.
In the case of Wikipedia, if a self-efficacy gap exists, Lim & Kwon hypothesize that it likely has something to do with the means of production. Because Wikipedia is written by effectively anonymous (from a reader's perspective) volunteers, it does not gain its authority as an information source from the reputation of the individual or organization that publishes the information, unlike traditional information sources (e.g. The New York Times, the University of Washington Libraries, Encyclopedia Britannica). The way someone is raised and treated in the home, in their cultural context, and in educational institutions has a substantial impact on how they come to trust their own capabilities (self-efficacy). This may extend to how they trust information that comes from non-traditional or non-authoritative sources. If women, on average, are less confident in their ability to vet the information they read on Wikipedia, they may be more likely to turn to more traditional information sources instead.
Some direct evidence exists to support the self-efficacy gender gap. Hinnosaar found that women Mechanical Turk workers were less likely than their male counterparts to rate themselves as sufficiently knowledgeable and competent to edit Wikipedia.
D. Certain kinds of content gaps likely play a role
It is likely that content gaps play a role in why women read Wikipedia less frequently than men do. There has been some research, as well as some high profile anecdotes, that show that in certain cases topics that are (again, on average) more likely to be of interest to women than men are less well represented on Wikipedia—in terms of whether content related to that topic exists at all, as well as in terms of how thoroughly it is covered.
Note that there is an important distinction to be made here between topics that are of interest to women, and articles about women. These are best considered as distinct content gaps. It has been amply demonstrated that, for example, there are fewer articles, and fewer high-quality articles, about notable women on Wikipedia than about notable men. However, it has not yet been established that women readers inherently and consistently prefer or are more likely to seek out articles about women (e.g. women scientists). By contrast, in the case of topics that are known to be more interesting to women (e.g. Hollywood movies that are reviewed more favorably by women than men, or topics covered in commercial media publications advertised to women), these content gaps can be expected to directly influence readership. If Wikipedia contains less of the stuff you're interested in reading, you're less likely to read it.
E. STEM participation differences may play a role
There is some evidence that the Wikipedia reader gender gap reflects established gender gaps in educational and career paths. Basically, Wikipedia content has a STEM bias, both because of the demographics of its contributors and (perhaps) because of the social expectations of the encyclopedia genre.
Women tend to be under-represented both in STEM majors at the college level and in STEM careers. Wikipedia tends to be used extensively as a resource for college coursework and professional reference. Some research comparing Wikipedia use to college student major has found that use is higher in STEM majors. It would make sense that the people who use Wikipedia in this way tend to skew male. However, not all of the survey findings support a "simple" correlation here, so it should be treated as a tentative hypothesis.
F. Information consumption preferences may play a role
There is some evidence that women may read Wikipedia less not (simply) because of the topics covered, or its authoritativeness compared to traditional media gatekeepers, but also because of the way information is presented on Wikipedia. Factors related to usability (findability, format, writing style) have been demonstrated to be among the most important factors in students' decisions to use Wikipedia. Evidence from another college student survey by Lim suggests that factors related to perceived information utility (including ease of use) were the most important predictors of Wikipedia use. Other survey studies also showed that the way information is presented on Wikipedia is more important than the accuracy, completeness, or currentness of the information.
Research on gender differences in health-related information-seeking behavior suggests that for women, the readability or understandability of health information on a website is one of the most important factors for deciding where to look. Wikipedia is a popular source for health information, but that information is not always presented in a clear and understandable way, and this may influence women's average propensity to use Wikipedia in this way.
Some research has suggested that women may find the way Wikipedia content is written to be less accessible or usable than men do. Other research suggests that women prefer more social or collaborative sources, such as forums and social media, when looking for certain kinds of information (e.g. health information), rather than news or encyclopedia articles. This finding is not yet well supported but deserves further study because it suggests that certain design choices may be an effective lever for encouraging women to read Wikipedia more, and because it may also relate to the hypothesis around trust in ways that have not yet been extensively studied.
G. Internet usage and information-seeking behavior may play a role
While I found no direct evidence that women use the internet less than men overall, there is substantial (if complex and contingent) evidence of differences in the type of things that men and women use the internet for. However, due to differences in methodology, theoretical foundations, and results, it is often unclear what the implications of these studies are for the question at hand: why women appear to read Wikipedia less frequently than men do.
This literature review did find some direct evidence that men use Wikipedia more for certain reasons: for example, male college students report that they use Wikipedia more for entertainment or idle reading purposes than female students.
H. Usage gaps can create vicious cycles
Research has shown that students who use the internet to seek out information make greater gains in information literacy compared to students who use the internet more for entertainment purposes or social reading activities (e.g. chatting). The principal gains come from an increase in metacognitive strategies: essentially "learning how to learn" and gaining competence at evaluating incoming information. If women are less likely than men to seek out information on sites like Wikipedia, this may increase their disadvantage in terms of information literacy. That said, Wu (2014) did not find any evidence that female students exhibited lower information literacy or used fewer metacognitive strategies related to internet information-seeking than their male counterparts, so it's not clear that women are less information literate than men in the first place.
There's also evidence that perceptions of Wikipedia quality and trust in Wikipedia both increase with use (see Findings B and C above). If women are less likely to use Wikipedia (see Finding A above), their perceptions of Wikipedia and trust in Wikipedia information will continue to lag behind those of men over time.
Hypotheses for the reader gender gap
- Hypothesis 1. Content gaps
- Women read Wikipedia less than men because, among topics that reflect a gender-mediated valence, Wikipedia's topical coverage is weighted towards men's interests over women's.
There is evidence that within particular topics, there are fewer articles relevant to women's interests, that those articles are shorter or of lower quality, and that the way the topical information is presented also exhibits a bias. There is also evidence that articles about women are underrepresented on Wikipedia, and this may also contribute to women reading Wikipedia less, either directly (if women tend to read articles about women more than men do) or indirectly (if women's trust in Wikipedia or perceptions of its quality are lowered when they notice these gaps).
- Hypothesis 2. Trust and self-efficacy gap
- Women read Wikipedia less than men because they on average trust Wikipedia less, and this trust gap is due (at least in part) to women being on average less confident in their own ability (self-efficacy) to evaluate the quality or correctness of information on Wikipedia, regardless of topic.
There is evidence that women trust Wikipedia less than men. There is also evidence that women have lower confidence in their own abilities than men, including their internet skills and their ability to evaluate the quality or correctness of information on the internet.
- Hypothesis 3. Information consumption preferences
- Women read Wikipedia less than men because women on average perceive the way information is presented on Wikipedia to be less usable or accessible than men do, regardless of topic.
Women have been shown to value different features than men do, such as readability and authoritativeness of source, in their consumption of certain kinds of encyclopedic information. Wikipedia may be perceived by women to be a less desirable information source both because Wikipedia is written primarily by men (and therefore reflects men's information consumption preferences), and primarily by amateur volunteers (and therefore doesn't have the same signals of authority as other online information sources).
- Hypothesis 4. STEM gaps
- Women read Wikipedia less than men because many common use cases for reading Wikipedia are associated with participation in educational or professional activities that themselves exhibit a gender gap in participation.
Wikipedia's content tends to be weighted towards STEM topics, and people in STEM majors and careers seem to use Wikipedia more frequently. If women are under-represented in these careers and majors, then they will be less likely to use Wikipedia.
Not under consideration
- Biological differences
- Women and men exhibit biological differences that lead to differences in information consumption habits or abilities.
All gender differences (i.e. behaviors, perceptions, metacognitive strategies) attested in the research covered in this literature review can be accounted for based on social or cultural differences—e.g. historical differences in child-rearing, education, and career opportunities. There is no reason to assume biological differences unless observed differences cannot be accounted for through social factors.
- Awareness gap
- Women are less likely to be aware of Wikipedia's existence or purpose than men are.
We found no evidence that suggested that women were systematically less likely to be aware of Wikipedia than men. While this may be the case in some languages and geographies, it cannot explain the consistent reader gap we found across all languages and geographies.
- Discretionary time gap
- Women have less discretionary time than men, and Wikipedia use is primarily a discretionary activity.
We found no evidence that women have systematically less leisure or discretionary time than men. While this may be the case in some languages and geographies, it cannot explain the consistent reader gap we found across all languages and geographies. Three studies looked for the influence of gender differences in discretionary time and did not find it: a survey of Greek Wikipedians; a survey of Wikipedia and an analysis of student internet skills; and a survey of reasons for not editing Wikipedia.
- Negative press coverage
- Women avoid Wikipedia because of mainstream media accounts of gender bias in Wikipedia content or gender bias in the contributor community.
Although there has been some high-profile negative press coverage related to 'toxic' community cultures on Wikipedia, the lack of gender diversity in the community, or the bias in coverage of men and women, it is unlikely that this could account for a reader gender gap that is consistently exhibited across languages and geographies.
- Search engine bias
- Most of Wikipedia's traffic comes from search engines, and women use search engines less than men do.
While this is intriguing and could be the subject of future research, the Wikimedia Foundation currently has no direct way to answer this question, and I found no evidence for it in my literature review.
References
- Glott, R., Schmidt, P., & Ghosh, R. (2010). Wikipedia survey--overview of results. In United Nations University: Collaborative Creativity Group. Retrieved from http://www.wikipediasurvey.org/docs/Wikipedia_Overview_15March2010-FINAL.pdf
- Zickuhr, K., & Rainie, L. (2011). Wikipedia, past and present. Pew Research Center. https://www.pewresearch.org/internet/2011/01/13/wikipedia-past-and-present/
- Protonotarios, I., Sarimpei, V., & Otterbacher, J. (2016). Similar gaps, different origins? Women readers & editors at Greek Wikipedia. AAAI Workshop - Technical Report, WS-16-16-, 80–87.
- Kim, K. S., Sin, S. C. J., & Tsai, T. I. (2014). Individual differences in social media use for information seeking. Journal of Academic Librarianship, 40(2), 171–178. https://doi.org/10.1016/j.acalib.2014.03.001
- Lim, S., & Kwon, N. (2010). Gender differences in information behavior concerning Wikipedia, an unorthodox information source? Library & Information Science Research, 32(3), 212–220. https://doi.org/10.1016/j.lisr.2010.01.003
- Selwyn, N., & Gorard, S. (2016). Students’ use of Wikipedia as an academic resource — Patterns of use and perceptions of usefulness. The Internet and Higher Education, 28, 28–34. https://doi.org/10.1016/j.iheduc.2015.08.004
- Hinnosaar, M. (2019). Gender inequality in new media: Evidence from Wikipedia. Journal of Economic Behavior & Organization, 163, 262–276. https://doi.org/10.1016/j.jebo.2019.04.020
- Shen, X.-L., Cheung, C. M. K., & Lee, M. K. O. (2013). What leads students to adopt information from Wikipedia? An empirical investigation into the role of trust and information usefulness. British Journal of Educational Technology, 44(3), 502–517. https://doi.org/10.1111/j.1467-8535.2012.01335.x
- Garrison, J. C. (2015). Getting a “quick fix”: First-year college students’ use of Wikipedia. First Monday, 20(10). https://doi.org/10.5210/fm.v20i10.5401
- Hargittai, E., & Shafer, S. (2006). Differences in Actual and Perceived Online Skills: The Role of Gender. Social Science Quarterly, 87(2), 432–448. https://doi.org/10.1111/j.1540-6237.2006.00389.x
- Menking, A., McDonald, D. W., & Zachry, M. (2017). Who Wants to Read This?: A Method for Measuring Topical Representativeness in User Generated Content Systems. In Proceedings of the 2017 ACM Conference on Computer Supported Cooperative Work and Social Computing - CSCW ’17 (pp. 2068–2081). New York, New York, USA: ACM Press. https://doi.org/10.1145/2998181.2998254
- Lam, S. T. K., Uduwage, A., Dong, Z., Sen, S., Musicant, D. R., Terveen, L., & Riedl, J. (2011, October). WP: clubhouse?: an exploration of Wikipedia's gender imbalance. In Proceedings of the 7th international symposium on Wikis and open collaboration (pp. 1-10). ACM.
- Schellekens, M. H., Holstege, F., & Yasseri, T. (2019). Female scholars need to achieve more for equal public recognition. 1–6. Retrieved from http://arxiv.org/abs/1904.06310
- Adams, J., Brückner, H., & Naslund, C. (2019). Who Counts as a Notable Sociologist on Wikipedia? Gender, Race, and the “Professor Test.” Socius, 5, 2378023118823946. https://doi.org/10.1177/2378023118823946
- Clark Blickenstaff, J. (2005). Women and science careers: leaky pipeline or gender filter? Gender and Education, 17(4), 369–386. https://doi.org/10.1080/09540250500145072
- Head, A., & Eisenberg, M. (2010). How today's college students use Wikipedia for course-related research. First Monday, 15(3).
- Lim, S. (2009). How and why do college students use Wikipedia? Journal of the American Society for Information Science and Technology, 60(11), 2189–2202. https://doi.org/10.1002/asi.21142
- Rowley, J., Johnson, F., & Sbaffi, L. (2017). Gender as an influencer of online health information-seeking and evaluation behavior. Journal of the Association for Information Science and Technology, 68(1), 36–47. https://doi.org/10.1002/asi.23597
- Wu, J.-Y. (2014). Gender differences in online reading engagement, metacognitive strategies, navigation skills and reading literacy. Journal of Computer Assisted Learning, 30(3), 252–271. https://doi.org/10.1111/jcal.12054
- Lee, Y. H., & Wu, J. Y. (2013). The indirect effects of online social entertainment and information seeking activities on reading literacy. Computers and Education, 67, 168–177. https://doi.org/10.1016/j.compedu.2013.03.001
- Hargittai, E., & Shaw, A. (2015). Mind the skills gap: the role of Internet know-how and gender in differentiated contributions to Wikipedia. Information Communication and Society, 18(4), 424–442. https://doi.org/10.1080/1369118X.2014.957711
- Collier, B., & Bear, J. (2012). Conflict, criticism, or confidence: an empirical examination of the gender gap in Wikipedia contributions. Proceedings of the ACM 2012 Conference on Computer Supported Cooperative Work, 383–392. https://doi.org/10.1145/2145204.2145265
See also
- Gender gap
- 2017 literature review on the gender gap
- Knowledge Gaps — Wikimedia Research 2030
- 2019 Wikipedia reader demographics survey
- Category:Reader surveys