Research:Developing Metrics for Content Gaps (Knowledge Gaps Taxonomy)/Community Engagement

From Meta, a Wikimedia project coordination wiki

This page describes the points of engagement between the Wikimedia Research team and the communities during this research.

The project originates in the 2030 Strategic Direction goal of knowledge equity, as measuring content gaps is a way of monitoring progress towards it. As of July 2020, Wikimedia Research proposes the project, which aims to support the communities' work by selecting a clear set of metrics that explain the content gaps.

While the project is led by members of the Wikimedia Research team, it is built on the premise that the communities support and collaborate in the research process to make it successful. For this reason, the research has been carried out as open research [1], following the principles of opening the methodology, code, data, and other resources.

Four different points of engagement between research and communities have happened since the beginning of the project: project presentation, scope and definition, development, and closure.

Project presentation

September 2020

  • Wikimedia research:

Presentation of the Knowledge Gaps Index (Taxonomy) project.

  • Communities:

Wikimedians gave feedback on the potential risks and missing priorities in the approach on a discussion page and in video calls.

Conclusions and following steps

For the content dimension of the project, this feedback allowed us to prioritize some specific gaps: gender, sexual orientation, geography, cultural background, recency (time), topics for impact, and language.

Scope and definition

January 2021

  • Wikimedia research:

Presentation of the specific dimension of "content" along with the main detected content gaps. The content gaps part of the project was presented in an email to the Wikimedia Research mailing list and mentioned in Wikimedia Strategy 2030 calls.

  • Communities:

(Figure: Overview of stakeholders)

Wikimedians gave feedback on the potential risks and missing priorities in the approach on a discussion page and in video calls.

Conclusions and following steps

To identify the project stakeholders, we navigated through Meta and the different WikiProjects on Wikidata to find individuals and groups related to each gap.

All possible stakeholders are valuable for both understanding and bridging the content gaps. Nonetheless, we proposed investing more time in those that showed more maturity in bridging the gaps. This maturity often translates into more (i) problem awareness and discourse, (ii) human capital and organization, (iii) capacity and measurement, (iv) local and international scope, and (v) strategy and long-term goals.

This classification of stakeholders allows us to review their documentation and to approach them for interviews to explore their mindset and practices in more depth.

Taking this into account, we recognized the following types of stakeholders, ordered from higher to lower levels of maturity:

  • Multilanguage/Multiproject Contest/Wikiproject (e.g. Asian Month, subcontest with a gender scope organized by WikiDonne). [MC/WP]
  • Wikimedia Movement Affiliate (e.g. Wikimujeres). [A]
  • Multilingual Wikiproject (e.g. Systemic bias, Women in Red, etc.). [MWP]
  • Single-affiliate Project (e.g. “Her Art and Her Digital Story” by the affiliate Art+Feminism). [AP]
  • Single-language Project Contest/Wikiproject (e.g. BBC Women in Catalan Wikipedia). [C/WP]
  • Researcher on diversity. [R]

Once the main stakeholders were identified, their documentation (Meta pages and websites) was reviewed in more detail in order to understand their practices, needs and goals.

In addition, the Wikimedia Research team could start thinking about (1) how to map articles to a gap and (2) how to measure gaps according to the scientific literature.

Iterative development

March 2021

  • Wikimedia research:

Presentation of the content gaps and a series of questions to understand community needs, tasks and goals when bridging the gaps.

  • Communities:

Wikimedia affiliates and community leaders engaged in interviews (exploratory conversations): Viquidones/Wikimujeres, WikiDonne, Art+Feminism, Wikiesfera, Amical Wikimedia, Wikimedia LGBT+*. The conversations revolved around three topics:

  1. The mindset in general and the wording used to refer to the content gap.
  2. The specific spaces in which a gap or its manifestations can be identified.
  3. The process or processes followed to identify and bridge a gap.

* Conversations happened in the context of writing a paper on the LGBT+ content gap with one of its affiliate representatives.


Conversation highlights (summary)

  • There is good consensus on the main gender gap models (e.g. Wagner’s, Morgan-Martin Extent-Select-Framing, etc.), and interviewees use similar terms and hold similar views when referring to their editing tasks.
  • The concepts used are very well known in the gender gap community. There is maturity in the analysis of the problems. At the same time, the same fears and discussions with other Wikipedians about feminism and neutrality recur.
  • There is a perception that “women biographies” receive more warnings and are deleted more easily.
  • There is a need to understand how women are presented in the first line or snippet of their biographies independently of their partners, and how complete their articles are compared to their partners'.
  • We see interest in understanding as much as possible about the gender gap in all spaces (in all kinds of topics, on the main page, etc.). The sister projects are mentioned. It is interesting to think of them as “article subsections”.
  • There is consensus on the importance of visibility metrics on the “main page”, among others. Visibility is essential in order to obtain pageviews and to encourage editing more articles of a topic.
  • Notability is skewed towards popularity in mainstream media. For example, for women in science, the databases are clear. So, having metrics of relevance based on these authority databases would be useful.
  • From a gender perspective, it would be useful to have a metric that shows who is behind an article or a group of articles. When certain editors (admins or those with a large edit count) are behind the articles, this sometimes intimidates women, who then prefer not to edit there.

Conclusions

  1. Communities/User Groups embrace research to better articulate their discourse and to promote better practices for correcting biases and bridging the gaps.
  2. Stakeholders know which current metrics are valuable and which metrics could show the barriers that prevent them from being more effective.
  3. The use of collaboration strategies (e.g., red links list, contests, etc.) is paired with the creation of attainable milestones (e.g., 20,000 women in cawiki).

Following Steps

Having identified the most important needs and demands of gender gap stakeholders, we could proceed with the selection of the final metrics by triangulating community needs with other criteria.

Project closure

June 2021

  • Wikimedia research:

(Slides: presentation of the research process and results, PDF, 13 slides.)

Presentation of the content-gaps mapping and the chosen set of metrics to the communities (see slides) in a series of calls with Wikiesfera, WikiDonne, Art+Feminism and Viquidones/Wikimujeres.

  • Communities:

Wikimedians gave feedback on the suitability of the chosen metrics and highlighted those that were missing. These were the most repeated comments:

  1. Three metrics are clearly too few to explain a gap; there should be at least four or five.
  2. Page views are a necessary metric, as they show an external dimension (readers' interest).
  3. Refined metrics on references would be very useful to understand the gaps.


Conclusions

The selected metrics (Number of items - Selection, Extent-quality score, Main page - Visibility) satisfy the communities' expectations, but from the gender perspective three metrics seem insufficient: some aspects and dimensions remain unaddressed.
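
As an illustration only, the three selected metrics could be computed for one gap category along the following lines. This is a minimal sketch and not the project's implementation: the `Article` fields, the `gap_metrics` function, the use of a share instead of a raw count for selection, and the sample data are all hypothetical assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass
class Article:
    title: str
    in_gap: bool        # hypothetical flag: article is mapped to the gap category
    quality: float      # hypothetical quality score in [0, 1]
    on_main_page: bool  # hypothetical flag: article appeared on the main page


def gap_metrics(articles):
    """Return illustrative selection, extent, and visibility values for one gap."""
    gap = [a for a in articles if a.in_gap]
    featured = [a for a in articles if a.on_main_page]
    # Selection: share of all articles that belong to the gap category (assumed definition).
    selection = len(gap) / len(articles) if articles else 0.0
    # Extent: mean quality score of the articles in the gap category (assumed definition).
    extent = sum(a.quality for a in gap) / len(gap) if gap else 0.0
    # Visibility: share of main-page articles that belong to the gap category (assumed definition).
    visibility = sum(1 for a in featured if a.in_gap) / len(featured) if featured else 0.0
    return {"selection": selection, "extent": extent, "visibility": visibility}


# Made-up sample data, for illustration only.
sample = [
    Article("A", True, 0.5, True),
    Article("B", False, 0.75, True),
    Article("C", True, 0.25, False),
    Article("D", False, 1.0, False),
]
metrics = gap_metrics(sample)
print(metrics)
```

Adding further metrics requested by the communities, such as page views or reference-based measures, would mean adding fields to the hypothetical `Article` record and entries to the returned dictionary.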

Following Steps

Once the value of the three metrics for the five gaps has been confirmed and the methodology to implement them is ready, the project may proceed with an engineering phase to scale it to the 300 Wikipedia language editions.

References

  1. Was ist Open Science? online 23 June 2014 from OpenScience ASAP