Wikipedia Primary School/Languages
Wikipedia Primary School aims at providing content in the languages used by the different education systems.
It is important to mention that (according to what emerged in the feasibility study and during a workshop in June 2014 with people working in education in South Africa):
Languages used by the educational systems in Africa
The following table gives a rough overview of the 35 languages used (or permitted) in the educational systems of the 54 African countries.
At first glance, the languages of the former colonial powers (English, French, Portuguese, Spanish) are still dominant in the educational systems in Africa: 51 out of 54 countries use one of these languages, either as the only language or as one of the permitted languages. Their use mostly follows the former colonial allegiances of the various countries.
French and English are the two dominant languages, used in 26 and 24 countries respectively. All five former Portuguese colonies (Angola, Cape Verde, Guinea-Bissau, Mozambique, and São Tomé e Príncipe) use Portuguese in their educational systems, while Spanish is used only in Equatorial Guinea, the last Spanish colony to achieve independence.
Arabic is used in 12 countries: the whole of Northern Africa, several countries across the Sahara (Mauritania, Chad, and Sudan), the Horn of Africa (Djibouti, Eritrea, and Somalia), and Comoros. Mauritius allows Arabic to be chosen as an optional language, along with other “Asian languages.”
The only country which does not use any of these languages is Ethiopia, which recognizes Amharic as the national language and all regional languages as eligible for teaching in the educational system. Another notable case is South Africa, which recognizes all of its 11 official languages within the educational system.
Several countries recognize the use of local languages in schools, and some of them (Congo DR, Ghana, Namibia, and Uganda) explicitly allocate a fixed number of hours in the school curriculum to teaching those languages.
In general, local languages have a stronger role in primary education, while official languages become essential in further education.
General overview of involved Wikipedia projects in the languages used by the educational systems in Africa
After this initial step, we checked how many languages have a Wikipedia project. This “narrowed down” our general overview from 35 languages to 32 projects.
What comes to light is a stark difference between communities. English, French, Italian, Spanish, and Portuguese are the largest projects considered, all of them well over the 750,000-article mark. Those communities clearly benefit from the fact that their languages are also official languages in countries outside Africa and are spoken primarily in developed countries.
Another project with a six-digit number of articles is Arabic Wikipedia: although Arabic is spoken in a number of countries, most of them concentrated in the Middle East and Northern Africa (MENA) area, it has a smaller community than those of the Indo-European languages.
The highest-ranked African-language projects are Malagasy Wikipedia (Malagasy being one of the official languages of Madagascar, the other being French) and Yoruba Wikipedia (Yoruba being one of the four official languages of Nigeria, also spoken in neighbouring Togo and Benin). These two projects are also the only “African” ones that are well over the 30,000-article mark.
Among the 11 official languages of South Africa, the situation differs dramatically. English is, for obvious reasons, the most active version; the second is Afrikaans Wikipedia, which is competing with Swahili Wikipedia to reach the 25,000-article milestone and had the highest growth rate, in the period considered, of all African-language Wikipedias.
By contrast, the Northern Sotho, Southern Sotho (or Sesotho), Swati (or Swazi), Tsonga, Tswana, Venda, Xhosa, and Zulu Wikipedias occupy 8 of the 10 lowest positions: these projects range from 141 (Xhosa) to 686 (Northern Sotho) articles, for a combined total of 2,843 articles (slightly more than Somali Wikipedia alone).
Southern Ndebele is the only official South African language that has no Wikipedia project: a test project has been running on Wikimedia Incubator since May 29, 2009, but it is highly unlikely ever to launch officially, since only one test article has been written so far. Seychellois Creole (Seychelles) and Luba (Congo DR) likewise have no Wikipedia project, nor is there any test project for them in the Incubator.
Another aspect we considered, for statistical purposes, is the growth in the number of articles over roughly one month. The top 6 projects each grew by at least 2,000 articles, while the main African-language projects grew far more slowly (between 30 and 378 articles), with the odd exception of Malagasy Wikipedia, which lost one article over the month. The lower half of the table is basically not growing at all, with the extreme cases of Tsonga and Shona Wikipedia, which shrank by two and three articles respectively after the assessment started. Northern Sotho and Taqbaylit are the only two exceptions, although they grow at the same pace as Amharic, Somali, or Yoruba Wikipedia (50-60 articles in a month).
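The growth comparison described above amounts to subtracting two snapshots of article counts taken about a month apart. A minimal sketch in Python follows; the project names and figures are placeholders chosen for illustration, not the survey's data, and the function names are hypothetical.

```python
from typing import Dict


def monthly_growth(before: Dict[str, int], after: Dict[str, int]) -> Dict[str, int]:
    """Return the change in article count for every project present in both snapshots."""
    return {wiki: after[wiki] - before[wiki] for wiki in before if wiki in after}


def classify(delta: int) -> str:
    """Label a project by the sign of its change in article count."""
    if delta < 0:
        return "shrinking"
    if delta == 0:
        return "static"
    return "growing"


# Placeholder snapshots (hypothetical values, roughly one month apart).
snapshot_start = {"wiki_a": 1000, "wiki_b": 250, "wiki_c": 40}
snapshot_end = {"wiki_a": 1060, "wiki_b": 250, "wiki_c": 38}

growth = monthly_growth(snapshot_start, snapshot_end)

# List projects from fastest-growing to fastest-shrinking.
for wiki, delta in sorted(growth.items(), key=lambda item: item[1], reverse=True):
    print(f"{wiki}: {delta:+d} ({classify(delta)})")
```

Comparing absolute differences, as the text does, keeps very small projects from looking artificially dynamic; a percentage growth rate would be an alternative, but it is not what the survey reports.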
General overview of the articles
Our analysis focused on 126 articles: 71 articles on general topics likely to be taught in schools (such as History, Geography, Art, Citizenship, Music, Religion, and so on), plus 55 articles on the African states and Italy. Our aim was to check how many language versions had an article on each of these topics, regardless (for the moment) of its quality. The statistics are the following.
Although this could easily have been foreseen, these results largely confirm what was presented earlier: English, French, Italian, Spanish, Portuguese, and Arabic still hold the first six positions and, even if the order is slightly different, they have almost all of the articles we surveyed.
The situation worsens when we consider the African languages. What immediately stands out is that Afrikaans has the most of the articles we surveyed (despite having around 14,000 and 5,500 fewer articles overall than Malagasy and Yoruba Wikipedia respectively), as well as the relatively good results of Shona Wikipedia (a language primarily spoken in Zimbabwe). Even so, 13 of the 26 Wikipedias considered have fewer than 50% of the articles.
It should be noted that, before the survey, we assumed that all 26 projects would have at least 54 of the articles, 54 being the number of fully recognized states on the African continent. This assumption proved wrong: 9 versions (Hausa, Igbo, Malagasy, Sesotho, Taqbaylit, Tsonga, Tswana, Venda, and Xhosa Wikipedia) fail to reach even this minimum. While Tsonga and Malagasy Wikipedia fall short of the “minimum” by only 1 and 8 articles respectively, the other 7 lack more than half of the articles.
These results are even bleaker if we consider that most of the articles about African countries are, at best, “stubs.” This is particularly true of versions with fewer than 2,500 articles, where it is very common to find articles like “X is an African country,” often with an outdated infobox on the right (when present).
There are, however, some exceptions: some articles have been fairly well expanded (compared, of course, to the average for the project), mostly those about the country in which the language is most widely spoken (e.g. “Senegal” for Wolof Wikipedia, “Botswana” for Tswana Wikipedia, or “Rwanda” for Kinyarwanda Wikipedia).
We also checked for the presence or absence of an article about South Sudan. Far from being a political choice, this article was selected to gauge how “reactive” the communities are to (relatively) new events, such as the birth of a new country. Our data show that 14 of the 26 projects (the most notable being Somali and Wolof Wikipedia) had not created an article about the new country, even though it had been independent for more than a year.
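The coverage checks described above (share of surveyed topics present, the expected minimum, and the presence of a "marker" article such as South Sudan) can be sketched as simple set operations. The following Python sketch uses a tiny placeholder topic list and hypothetical project names; the survey itself used 126 topics and a minimum of 54, and the function and field names here are assumptions made for illustration.

```python
from typing import Dict, Set


def coverage_report(
    found_by_wiki: Dict[str, Set[str]],
    topics: Set[str],
    minimum: int,
    marker_topic: str,
) -> Dict[str, dict]:
    """Summarize, per project, how many of the surveyed topics have an article."""
    report = {}
    for wiki, found in found_by_wiki.items():
        hits = found & topics  # ignore articles outside the surveyed list
        share = len(hits) / len(topics)
        report[wiki] = {
            "count": len(hits),
            "share": share,
            "below_half": share < 0.5,             # fewer than 50% of the topics
            "below_minimum": len(hits) < minimum,  # misses the expected floor
            "has_marker": marker_topic in hits,    # e.g. the South Sudan article
        }
    return report


# Placeholder inputs (hypothetical; the survey used 126 topics and a minimum of 54).
topics = {"History", "Geography", "Music", "Senegal", "Botswana", "South Sudan"}
found_by_wiki = {
    "wiki_a": {"History", "Geography", "Music", "Senegal", "Botswana", "South Sudan"},
    "wiki_b": {"Senegal", "Botswana"},
}

report = coverage_report(found_by_wiki, topics, minimum=3, marker_topic="South Sudan")
for wiki, row in report.items():
    print(wiki, row)
```

Note that a check of this kind only records whether an article exists; it says nothing about quality, which is why the stub problem discussed above has to be assessed separately.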
We can draw some empirical considerations from what we saw. A Wikipedia community may be considered “active” on the basis of both its number of articles and its number of active users. The latter in particular affects the final outcome: a high number of articles may be reached by using scripts that automatically insert so-called “stubs”, but with a small community it will be harder to expand those articles, with obvious repercussions for the overall quality of the project.
Most of the African-language projects score low on both variables, and it may be said, empirically, that they face a vicious circle: there are few articles because there are few users, and few users because there are few articles. It is, however, far more difficult to understand why participation in those projects is so low.
Several factors may explain it, e.g. the number of people who speak the language, the presence or absence of linguistic institutions or academies, whether speakers of the language have access to the Internet (and, broadly speaking, why they do not), and the opportunities they have to learn and practise the language.
UNESCO says that 87% of the languages of instruction used in adult literacy and non-formal education programmes are African languages, and 70-75% of the languages of instruction used from nursery schools and kindergartens up to the early years of elementary school are African. The percentages, though, fall dramatically to 25% in secondary education and to 5% in higher education. This, too, should be taken into account in our empirical considerations.
A plausible additional hypothesis is that African users are more motivated to contribute to English, French, Spanish, or Portuguese Wikipedia, both because they speak the language and because the community is bigger and more active than that of their mother tongue. However, this hypothesis is very hard to confirm, given how difficult it is to identify the place of origin of a user.
Looking at the history of the main communities, there is a possible way to tackle the lack of participation: among the very first articles created on the Italian Wikipedia were the automatically inserted articles about the 8,100 municipalities of Italy. The presence of an article that “anyone can edit” about their municipality convinced many users to stay, and later proved to be one of the keys to the success of the Italian community.
Until now, replicating this pattern has been prevented by the impossibility of managing the data without a community: data can easily become outdated, and with no users watching the recent changes, “vandalism” and spam bots can easily take over the project.
A solution to this comes from Wikidata, a new Wikimedia Foundation project for creating a free database, officially launched on October 30, 2012. The new project will centralise access and management of structured data, such as links between Wikipedia projects (called “interwikis”) and statistical information, along with their sources. All languages for which there are Wikimedia projects will be taken into consideration.
At the moment, only links between Wikipedia projects can be included. The possibility of adding core data about any subject (e.g. for municipalities, number of inhabitants, area, ZIP code, coordinates...) will be available in Spring 2013, since the development team is still working on the technical features behind this. The idea of the developers is to create an entry for each article that every Wikipedia has (and for each article that Wikipedia will have) that can contain its main data.
Wikidata will thus make such data automatically available to every single Wikimedia project, as well as the sources from which these data are harvested. This means that in the future it will be extremely easy to create new “stubs” about, for example, Botswana municipalities in the Tswana language (the de facto official language of the State, along with de jure official English), without the “opportunity cost” of having to monitor the integrity of these data, since this is something that will be taken care of by the Wikidata community.
In other words, it will be possible to replicate the pattern used by Italian Wikipedia without its disadvantages. This does not mean the pattern will be followed – only that it may. Still, article bootstrapping is one of the possibilities that can be pursued.
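As an illustration of what article bootstrapping from structured data might look like, here is a minimal Python sketch that renders a one-paragraph stub from a record. Everything in it is an assumption made for illustration: the record class, its field names, the English sentence templates (a real project would use templates in the target language), and the placeholder values. It does not call Wikidata or any Wikimedia API.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class MunicipalityRecord:
    """Hypothetical structured record; field names are illustrative only."""
    name: str
    country: str
    population: Optional[int] = None
    area_km2: Optional[float] = None


def render_stub(record: MunicipalityRecord) -> str:
    """Build a short stub text, skipping any sentence whose data is missing."""
    sentences = [f"{record.name} is a municipality in {record.country}."]
    if record.population is not None:
        sentences.append(f"It has {record.population:,} inhabitants.")
    if record.area_km2 is not None:
        sentences.append(f"It covers an area of {record.area_km2:g} km2.")
    return " ".join(sentences)


# Placeholder record (invented name and values, not real data).
example = MunicipalityRecord(name="Exampleville", country="Exampleland", population=12345)
print(render_stub(example))
```

Because the text is regenerated from the central record, an update to the record (for example a new population figure) would flow into every stub built from it, which is the advantage over one-off scripted imports described above.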