Strategy/Wikimedia movement/2018-20/Recommendations/Iteration 1/Diversity/5

From Meta, a Wikimedia project coordination wiki

Recommendation # 5: Identifying the Wikimedia Editing and Community Diversity Barriers in Each Country and Introducing Them in Wikidata

Q 1 What is your Recommendation?

To identify indicators for the different barriers to editing Wikimedia projects and to representing knowledge about the different diversity groups (e.g. women, geographical groups, marginalized groups, etc.), and to store them in one place where they can be consulted (e.g. as properties/values in Wikidata, so that they can be linked to other stakeholders' projects).

Q 2 What assumptions are you making about the future context that led you to make this Recommendation?

  1. There is a causal relationship between human groups not editing and their knowledge, or knowledge about them, not being neutrally represented (indigenous groups, national minorities, women, etc.). It is therefore crucial to understand why they do not edit. We are aware of the different types of barriers, yet we do not know which communities and contexts are affected, or to what extent (this relates to “overcome related cultural, institutional, technological and behavioral barriers to inclusion and to knowledge equity”).
  2. Different regions face different challenges, or similar challenges with different intensity. It is therefore important to understand each obstacle in each territory. Evaluating a country and its primary subdivisions is usually a good basis for identifying general barriers. Identification matters because language- and country-specific barriers can point to important differences between territories. This relates to what we asked in the scoping question: Geographical location, socio-economic status, access to technology and formalized academic study can be barriers to inclusion. What kind of technological support and systems can be designed to help bridge gaps and give voice to more diverse groups of society?
  3. Taking this into account, we assume that identifying the barriers in each territory (country or region) is key to finding solutions (partnerships, technology donations, etc.) and prioritizing those which have a greater probability of success. The more we know about the different barriers and the solutions we applied in each, the more we will be able to replicate what works.
  4. Once the barriers and societal characteristics of each territory-language pair (e.g. literacy, digital divide) are identified, they can be used to assess the relationship between those characteristics and the current degree of representation of the territory and its cultural richness (e.g. biographies, traditions, language, etc.) within the Movement (from the community to the final projects’ content), and to estimate its potential to grow.

Q 3a What will change because of the Recommendation?

  1. The outcome of this recommendation (temporary project name “Barriers to Wikipedia”) is a clear mapping of barrier indicators in Wikidata (or another database and public site) for every country and region. There are (a) barriers to editing Wikimedia platforms, which tend to be related to infrastructure and education level, and (b) barriers that prevent the representation of certain concepts (whether about a territory, a marginalized group, etc.), which tend to be psychological or sociological. In fact, there are barriers at all levels: institutional, cultural, technological, educational, linguistic, psychological, etc.
  2. A barrier-based site “accessible to anyone” like this will provide a common place for those doing community building, partnerships, or any other work in the field to make strategic decisions in order to assist groups in need. For instance, when a user group or any stakeholder asks for a rapid grant, this information will be useful to take into account. It is therefore information that could be important for triggering action.
  3. At the moment, knowledge about the barriers is based on research and personal testimonials. For example, we found the following comment in community feedback: “A female user from Mozambique believes that the biggest barrier in her country is the lack of internet access and the limited or no knowledge about how [participation] works. She participated in a WikiGap Event, which was productive for her, especially for receiving editing information on-site.” Even though this testimonial is valuable, there is no way to obtain and verify this kind of data on a large scale. Some of the indicators we need would come from external stakeholders (e.g. illiteracy rates are provided by the United Nations). Others could be provided by community feedback. But having the full picture is important in order to detect the extent of each problem, the replicable solutions, etc. Otherwise, we risk repeating conversations with community members who report some factors for a specific region without ever moving on from this phase. We need to identify all the barriers and indicators (whether there are 20, 30 or 50) and make them accessible. The Movement is the perfect example that spreading knowledge has powerful effects. Part of the solution is understanding the problem well enough and making everybody in power aware of the situation with clarity.
  4. Having this data collected and accessible is also key to stimulating research that analyzes the causes of the lack of representation of content. Most importantly, it may better explain, for instance, for which countries we have represented the majority of their knowledge and which countries have significant content gaps, based on population, geographical extent, etc. One particular case is Arabic, a language with a large number of speakers and countries, and yet the content on these territories is very limited compared to English and English-speaking territories.

Q3b. How does Recommendation relate to the current structural reality? Does it keep something, change something, stop something, or add something new?

It does not have a direct impact on the structure, but it has strong potential to change dynamics and decision making.

Q4a. Could this Recommendation have a negative impact/change?

No, having this knowledge can only be positive. In fact, having sound data may stop circular discussions about the barriers.

Q4b. What could be done to mitigate this risk?

There is no risk of negative impact, therefore there is no need to mitigate any undesired effect of having this data and raising such awareness on the barriers.

Q5. Why this Recommendation? What assumptions are you making?

One specific model that may help us understand the impact of barriers on doing business is the ‘funnel model’. It is used especially in marketing to understand what it takes for users to turn into customers. This model assumes that the path from user to customer may be cut short by many obstacles (lack of information, price, site usability, etc.).

This model assumes that, without data on how the user goes from informed to persuaded and motivated to buy, it is not possible to make good decisions, move forward strategically and do business. Likewise, we also assume that barriers to editing come earlier than barriers to representing a specific kind of diversity, some of which are psychological, such as a diminished sense of self-worth in certain historically colonized groups.

Every diversity problem is a problem of representation and inclusion. There may be some general barriers that affect editing, but some are specific to a group. Some language speakers may not consider their local knowledge (e.g. their medicine, history, etc.) worth writing into an encyclopaedia, and in this case the barrier is a psychological one. Other underrepresented groups, such as those defined by gender or LGBT+ identity, may find other barriers, such as community harassment and lack of sources, among others.

Perhaps the biggest group to be represented in a language's Wikipedia is the cultural context of that language's own speakers (e.g. geography, biographies, traditions, history, etc.). This is what the Cultural Diversity Observatory project defined as Cultural Context Content (CCC), which for the first 40 languages takes up an average of 25% of the Wikipedia: for some, like English, it takes up 45%, and for others, like Catalan, 17%. Ideally, the most relevant part of each language's CCC should be translated into every other language, at least to achieve a minimal cultural diversity in each Wikipedia.

Let us take this as an example of the barriers and how identifying them could work. The following two figures show two funnels: (1) cultural context content representation in one language edition and (2) cultural context content sharing across language editions. The first funnel explains the process from editing Wikipedia to representing one's own cultural context, and the second explains the process from becoming aware of the content gaps in one's own Wikipedia to finally enriching it, and others, through the exchange of cultural context content.

In the first figure, centered on one language and territory, we can see: (A) Context Characteristics, (B) Society and Wikipedia (Barriers), (C) Engagement (Editors), and (D) the final Cultural Context Content created. To simplify: speakers enter the funnel, some of them get past the different barriers to become engaged editors, and these editors ultimately create the content. The good thing is that each of the four parts of the Wikipedia-context interaction can be measured.
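As a rough illustration of how such a funnel could be measured, the sketch below computes the share of people retained between consecutive stages. The stage names and counts are invented placeholders, not real statistics, and the function name `stage_retention` is hypothetical.

```python
def stage_retention(stages):
    """Return (from_stage, to_stage, share retained) for consecutive funnel stages.

    `stages` is an ordered list of (name, count) pairs, widest stage first.
    """
    rates = []
    for (prev_name, prev_count), (name, count) in zip(stages, stages[1:]):
        # Guard against an empty stage to avoid dividing by zero.
        share = count / prev_count if prev_count else 0.0
        rates.append((prev_name, name, share))
    return rates


# Placeholder counts for one language and territory (NOT real data).
example_funnel = [
    ("speakers", 1_000_000),
    ("speakers past contextual barriers", 200_000),
    ("engaged editors", 500),
    ("editors creating cultural context content", 100),
]

for src, dst, share in stage_retention(example_funnel):
    print(f"{src} -> {dst}: {share:.2%}")
```

Under these assumptions, the transition with the lowest share would be the first place to look for barriers, although real indicators would not necessarily form such a clean progression.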

At the moment, even though they could be further developed, we have statistics on editor engagement and also on cultural context representation. We lack all the indicators regarding the barriers. In the diagram we depicted them in four groups (society, knowledge, Wikipedia technology and bureaucracy, and community). The first two, society and knowledge, are those imposed by the language context (e.g. lack of Internet access or lack of sources in the language). The latter two are related to some of the sociotechnical characteristics of Wikipedia, which may be general findings (e.g. lack of usability) or may not be suited to that environment (e.g. lack of policies enabling certain kinds of content), etc. It is obvious that the contextual barriers come before the Wikipedia ones, but it is hard to assess the impact each of them has in each environment. The more barriers we identify, the better we can overcome them and get more editors to finally contribute to the project.

In the second figure, we have a similar funnel, this time oriented towards exporting and importing articles related to cultural context content between different Wikipedia language editions. Hence, in the diagram we see that (A) and (D) are almost interchangeable depending on whether editors are importing or exporting. Considering an importing process, (A) is the cultural context content in all language editions and (D) is the resulting cultural context content imported. In contrast, (B) Society and Wikipedia (Barriers) and (C) Engagement (Editors) present some differences because of the nature of this process.

For instance, in order to know the editors that go through the funnel and get to the other end, we need to measure their participation levels, the number of editors who edit across languages (multilingual), the number of editors local to a Wikipedia who at the same time edit in Wikidata, and finally, the number of editors who export their local content across other languages. These indicators do not always present a progression but they are indicative of the sharing potential for a language community.

In parallel, we have the barriers these editors need to overcome. We classified them in four groups (Wikipedia, Knowledge, Technology and bureaucracy). The first one is based on awareness of content gaps, which can be minimal (e.g. seeing that an interwiki link is missing), medium (e.g. knowing that the Global South has fewer articles) or advanced (e.g. following statistics on the percentage of articles missing for a particular culture or region, using gap browsing tools and even completing lists of relevant articles). The second group of barriers concerns the multilingual skills necessary to contribute; the third and the fourth are related to the translation tools and to community initiatives that encourage editors to cover articles from other languages (e.g. CEE Spring, Catalan Culture Challenge, etc.). These latter barriers seem easier to work on than those from the first diagram, on representation.

Q6. How is this Recommendation connected to other WGs?

It relates to partnerships and capacity building. We believe these areas may benefit from these data.

Q7. What is the timeframe of this Recommendation in terms of when it should be implemented? 2020, 2021, etc. Does it have an urgency or priority? Does this timeframe depend on other Recommendations being implemented before or after it?

The earlier the better.

Q8. Who needs to make a decision on this Recommendation?

Considering the centrality of the proposal, it might fall between community engagement, research, data analytics and languages.

Q10. What type of Recommendation is it?

The implementation is between simple and complicated. We need to create the properties in Wikidata to identify and map, for every human group in a country or a subdivision of it (based on an identity characteristic such as language, politics or geographical location), the barriers to both (1) Wikipedia access and (2) representation of group-related content (articles and points of view).

In Wikidata, at the moment, there are properties missing for the digital divide, educational level, economic development, language literacy, etc. Some of these statistics are generated by organizations such as the United Nations.
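To make the idea of a barrier indicator more concrete, here is a minimal sketch of how one indicator value could be recorded together with its source, and how missing values could be listed per territory. The class `BarrierIndicator`, its fields and the function `missing_indicators` are hypothetical names chosen for illustration, and the record shown carries a placeholder value, not a real statistic.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BarrierIndicator:
    territory: str  # country or subdivision
    barrier: str    # e.g. "language literacy"
    value: float    # indicator value as reported by the source
    source: str     # organization that published the figure
    year: int       # reference year of the figure


def missing_indicators(records, territories, barriers):
    """Return sorted (territory, barrier) pairs with no recorded indicator."""
    covered = {(r.territory, r.barrier) for r in records}
    return sorted(
        (t, b) for t in territories for b in barriers if (t, b) not in covered
    )


# Placeholder record: the value and source are invented for illustration.
records = [
    BarrierIndicator("Territory A", "language literacy", 0.0, "placeholder source", 2019),
]
gaps = missing_indicators(
    records,
    territories=["Territory A", "Territory B"],
    barriers=["digital divide", "language literacy"],
)
print(gaps)
```

Keeping the source and year next to each value is one possible way to let readers judge how reliable and current an indicator is.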

Q11. How will this Recommendation be implemented?

This project (“Barriers to Wikipedia”) would require several phases (not necessarily sequential):

  1. Identifying barriers (community conversations, research literature, etc.).
  2. Finding one or more indicators for each barrier at region or country level and evaluating the reliability of their sources.
  3. Storing the indicators in a repository (e.g. Wikidata).
  4. Displaying the barriers on a website where we can select each country and compare them (if they are stored in Wikidata, it would perhaps only be necessary to retrieve the properties concerning the barriers).
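If the indicators were stored as Wikidata properties, retrieving them for the comparison site in phase 4 could come down to a single SPARQL query. The sketch below only builds the query text and does not contact any service. The function name `build_barrier_query` is hypothetical, and `P0000` is a placeholder, not a real property; `wdt:P31` and `wd:Q6256` are Wikidata's "instance of" and "country".

```python
import re


def build_barrier_query(property_ids, language="en"):
    """Build a SPARQL query listing countries with the given indicator properties.

    Each property is OPTIONAL so that countries with missing data still appear.
    """
    for pid in property_ids:
        if not re.fullmatch(r"P\d+", pid):
            raise ValueError(f"not a Wikidata property ID: {pid!r}")
    variables = " ".join(f"?{pid}" for pid in property_ids)
    optionals = "\n".join(
        f"  OPTIONAL {{ ?country wdt:{pid} ?{pid} . }}" for pid in property_ids
    )
    return (
        f"SELECT ?country ?countryLabel {variables} WHERE {{\n"
        "  ?country wdt:P31 wd:Q6256 .\n"
        f"{optionals}\n"
        f'  SERVICE wikibase:label {{ bd:serviceParam wikibase:language "{language}" . }}\n'
        "}"
    )


# "P0000" stands in for a barrier property that would still have to be proposed.
print(build_barrier_query(["P0000"]))
```

Making every indicator optional in the query is a deliberate choice here: a country with no value for a barrier would still be listed, which makes the data gaps themselves visible.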

It has been pointed out that it might be useful to rely on Wikidata projects: creating "country projects" (or topic projects) that users from a language community can manage and develop. Example: WikiProject Italy or WikiProject Women.

Q12. What are the concerns, limiting beliefs, and challenges for implementing this Recommendation?

Some data may not be available for all the world (yet, this is already some sort of progress in the data we have!).

Good, repeated dissemination will be needed once all the values are introduced in Wikidata and an external website shows *only* the properties that apply to the barriers for each country.

Q13. How much money is needed to implement this recommendation?

The cost would basically be that of a researcher and communicator to gather the data, introduce it into Wikidata (proposing the new properties) and spread the results among communities.

Ideally, there should be a program dedicated to diversity in the future, in the same way that there is one dedicated to libraries. Identifying the barriers, finding the indicators and patrolling them would be one of its focuses.

Q14. How should the implementation of this Recommendation be monitored and evaluated? By who?

The same people who would use the information would monitor and evaluate this implementation, i.e. research, partnerships and community building groups.