Research:Knowledge Gaps Index/Taxonomy/Summary of Changes for Second Version

We summarize here the actions taken to address the feedback received for the first version of the taxonomy of knowledge gaps. The PDF of the second version can be found here: https://arxiv.org/abs/2008.12314

Major Taxonomy Revisions

  • Barriers: We received a number of requests to discuss barriers preventing the Wikimedia ecosystem from closing knowledge gaps. Defining an exhaustive list of the gaps' potential causes lies beyond the scope of this stage of the project. Still, throughout our literature review of knowledge gaps, we found evidence of elements that could potentially amplify or cause inequalities, and we describe them, for each dimension, in a “barriers” section. Examples include:
    • Added Notability Policy to Content barriers, and moved Verifiability and Neutrality from gaps to the barrier section;
    • Added Community Health to Contributor barriers;
    • Added Censorship to Readers and Contributors barriers;
    • Moved Internet Connectivity and Tech Skills from gaps to barriers.
  • Standardization: As suggested, to enhance consistency across dimensions, we restructured the taxonomy to be more standardized. Each dimension now has two main facets: Representation, which includes all gaps reflecting how well different communities of people and their identities are represented within Wikimedia, and Interaction, which includes all gaps related to how people access and interact with Wikimedia sites. We reorganized sections to be in the same order where possible across facets.
  • Metrics / Objectives: We received many valuable recommendations around metrics for knowledge gaps. While building metrics is outside of the scope of this taxonomy, we will definitely take into account these suggestions for the next stage of this project which will focus on identifying or defining metrics. Given the level of interest on this front, we made the following changes:
    • We clarified that the definition of metrics is outside the scope of the current version of the taxonomy, and expanded the “Metrics” section with some more details.
    • We acknowledged that the definition of the objectives for each facet was not mature enough for publication. Therefore, we removed facet Objectives and addressed this more directly in the “Metrics” section under Future Work.

Detailed Taxonomy Revisions

General Feedback on the Taxonomy

  • Why Gaps? We received questions about whether gaps are the most effective way to represent inequalities. While it is hard to claim that understanding gaps is the most effective way to understand inequalities, we chose this framing because of its familiarity and long history of discussion on wikis, e.g., gender gaps. To counteract the deficit narrative that a gap framing can imply, we strove not just to include details of what is missing but also to highlight the powerful work done by members of the community to address these gaps. Our recommendation is to use the taxonomy of knowledge gaps as one means of understanding gaps, not the only means.
  • Definition of knowledge gaps. We received requests to clarify the definition of knowledge gaps, which are usually defined as content gaps. To accommodate these requests, we changed the definition of knowledge gaps from “disparities in participation or coverage of a specific group of readers, contributors, or content” to “disparities in content coverage or participation of a specific group of readers or contributors”. The rationale for including readers and contributors in knowledge gaps stems from the point that "knowledge is socially constructed"; as a result, knowledge gaps cannot be understood without understanding the people who contribute to and access that knowledge.
  • Methodology. To further help clarify upfront how we constructed the taxonomy, the rationale and the methods behind this manuscript, we moved the “Methods” section before the main taxonomy sections.
  • When are gaps satisfactorily addressed? This is an important question. After the next stage of this project, where we recommend how the gaps can be measured, we will need to think about targets, which will determine when a gap is satisfactorily addressed. Here are a couple of basic principles we have in mind at this point in time:
    • Targets may differ depending on who is making decisions about addressing a gap type. For example, the targets for increasing the gender diversity of readers, content, or editors set by an organization or group focusing on gender diversity challenges may be different from targets set by other organizations in the Movement.
    • Shared targets may be rare and will be powerful. Where we can agree across the Movement on targets for specific gap types, we expect to see major mobilization around addressing those gaps, which can help us collectively push for bigger change on the gap types where we are most aligned.
    • Equitable decision making when setting the targets is key. In line with Wikimedia's strategy of knowledge equity, we will need to take into account that much knowledge and many communities have been left out by structures of power and privilege.
  • What types of knowledge are missing? In the “Future Work” section, we clarified important pieces of knowledge missing from this taxonomy, and provided reasons for these omissions, as well as plans to include those as part of future work.
  • What types of sources are missing? In the “Future Work” section, we clarified types of sources missing from this taxonomy as well as our positionality and background as the authors of this manuscript. Moreover, as suggested, we revised the bibliography section to offer a more structured and comprehensive list of the sources and references we used throughout the manuscript.
  • How do we build a KG index with gaps that cover such a wide range of themes? This is a good question and we do not know the answer to it at this stage. We have looked at how other organizations develop such composite indices, including the Global Innovation Index and the Gender Equality Index. We know that, given the complexity involved in developing such an index, it is very important that we have milestones along the way and that the intermediate milestones are useful for decision makers in the Movement. For example, we hope that the taxonomy of knowledge gaps, independent of whether we develop a knowledge gap index or not, will be of use as a framework to more comprehensively understand the knowledge gaps.


General Feedback on Barriers

  • Power: several suggestions pointed out power as a central theme that is missing (e.g., issues with European / North American contributors dominating smaller language editions). To address these, we added a subsection on the central role of power to the taxonomy as part of the “Future Work” section.
  • Barriers left for future work: while we included the discussion of some barriers to equity, we received many recommendations about further barriers to include as part of this manuscript. These include free knowledge awareness, on-wiki socio-cultural conditions, political freedom, religious oppression, and time. Since defining all possible barriers to reaching equity lies beyond the scope of this stage of the project, we will keep these in mind for future work and research on barriers.


General Feedback on Gaps

  • Income and education: given the feedback asking to have a more standard definition of income and education across dimensions, we merged these two gaps into a “socioeconomic status” gap which is uniform across readers, contributors, and content gaps.
  • Politics / Ethnicity / Religion Gap: to respond to the requests around including gaps in nationality, ethnicity, religion and politics, we improved the definition of the “cultural background” gaps, to reflect representation of people with different “ethnic backgrounds, political and religious beliefs”. These aspects of identity, while incredibly important, are so contextual and region-specific that they could defy easy measurement or interpretation at the global scale of the Wikimedia Movement.
  • Language: as requested, we added a language gap in content, reflecting the gap of language coverage in projects that are multilingual by nature, e.g. Wikidata. We also clarified potential downsides of reducing language gaps for contributors (colonization of smaller communities, e.g., Scots Wikipedia).
  • Geography: we received several recommendations around adding geography to all three dimensions, as well as expanding the notion of the geographic gap to take into account both within-country (locale) and cross-country gaps. We standardized the geographic gap so that it adds country differences to the locale gaps for readers and contributors, and adds locale differences to the country gap within content.
  • Sexual orientation: we welcomed the suggestion to add gaps reflecting differences in the coverage and participation of people with different sexual identities, and added a “Sexual Orientation” gap to all dimensions.


Readers Gaps

  • Kiwix: it was suggested to add Kiwix as part of readership. Unfortunately, we are not able to measure readership volume through Kiwix (see the “Measurability” guiding principle), hence its exclusion from the definition of readers. We do, however, mention Kiwix as a fundamental element in breaking the internet connectivity barrier.

Contributors Gaps

  • Movement Organizers and Developers: we received several recommendations to include Movement Organizers and technical contributors such as bot developers within the contributors gaps. To address this, we added a few lines and references to organizers in the introduction of the Contributors section, extended the “Role” gap, and included a small section on organizers and tool developers in the Future Work section of the taxonomy.
  • Role Gap: as suggested, we expanded the definition of the “Role” gap to include affiliates, board members, developers, organizers, type and number of projects contributed to, and time on the project.
  • Cultural Background gap: similar to readers, we added a “Cultural Background” gap to the Contributors dimension.


Content Gaps

  • Structured data: added discussion of categories, templates, and other annotations to the “Structured Data” gap, and clarified challenges associated with structured data and marginalized communities.
  • Multimedia: as suggested, we underlined the importance of multimedia, not only for accessibility but also as a different knowledge format that doesn't always have textual equivalents.
  • Verifiability: to address feedback about looking at verifiability as a form of barrier, we moved the “Verifiability” gap to the barriers section. Additionally, we expanded the barriers subsection about verifiability to reflect the issue of source availability in different languages, and added a reference to the analysis of sources by country.
  • Recency gap: to accommodate the suggestion to add a “recency” gap that takes into account the freshness of content on the sites, we added the “Age/Recency” gap to the Content dimension, along with relevant references to research on recency gaps.
  • Notability: as suggested, we added “notability” within the “Barriers” section in content.
  • Impactful topics: to clarify the confusion behind the “impactful topics” gap, we renamed it “important topics” to reflect research and initiatives around defining importance and value of articles and items in Wikimedia projects.