Community Health Metrics/Polish Wikipedia Report

From Meta, a Wikimedia project coordination wiki

The Community Health Metrics project team and Wikimedia Poland members met in July 2021 to discuss possible forms of collaboration, in particular how research could support the work of the affiliate. Wikimedia Poland runs strong community engagement programs and takes care of a wide array of editors - experienced ones, but also newcomers.

However, with the available data, they cannot keep track of the state of the community or of the possible risks of stagnation or decline. For this reason, they presented us with the profiles of the “target users” they want to quantify and monitor over time. These are the following:

  • Your Everyday Advanced Editor: having an active account for at least two years, at least 100 edits/actions per year, no more than 2 months' absence from the project, no blocks of 1 week or longer.
  • Project Maintenance: at least 500 edits per year in strictly defined maintenance areas of Wikipedia - Village Pump, Articles for Deletion, Admin Noticeboard, etc.
  • TechWizard: a subgroup of Your Everyday Advanced Editors, with at least 100 edits a year in purely technical areas - bot requests, templates, MediaWiki namespace; running a bot would also count the user in this group.
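
As a rough illustration, the first profile can be read as a set of threshold checks. The sketch below is not Wikimedia Poland's implementation: the `EditorSummary` record and its field names are hypothetical, and treating "two years" as 730 days and "2 months" as 61 days are assumptions made only for this example.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class EditorSummary:
    """Hypothetical per-editor summary; field names are illustrative only."""
    first_edit: date         # date of the editor's first edit
    edits_last_year: int     # edits/actions in the last 12 months
    longest_gap_days: int    # longest absence in the last 12 months, in days
    longest_block_days: int  # longest block received, in days (0 if none)


def is_everyday_advanced_editor(editor: EditorSummary, today: date) -> bool:
    """Check the four criteria of the 'Your Everyday Advanced Editor' profile."""
    account_age_days = (today - editor.first_edit).days
    return (
        account_age_days >= 2 * 365          # active for at least two years (assumed 730 days)
        and editor.edits_last_year >= 100    # at least 100 edits/actions per year
        and editor.longest_gap_days <= 61    # no more than 2 months' absence (assumed 61 days)
        and editor.longest_block_days < 7    # no blocks of 1 week or longer
    )
```

The other two profiles could be expressed the same way, with the edit count restricted to the relevant maintenance or technical pages.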

These three profiles are valuable members of the active community, which Polish Wikipedia cannot afford to lose. Quite the opposite: they wish there were more of them, or at least that they were renewed over time.

In order to satisfy this informational need of the affiliate, we drew on these profiles to design an approach that examines different aspects of the active community of editors.

Vital Signs


We propose the creation of 6 indicators that we call Vital Signs. In Medicine, vital signs indicate the status of the body’s vital (life-sustaining) functions. These measurements are taken to help assess the general physical health of a person, give clues to possible diseases, and show progress toward recovery.

In the case of Wikipedia, Vital Signs relate to the community's capacity to grow or renew itself. Three of them are focused on the entire group of “active editors” creating content: retention, stability, and balance; the other three are related to more specific community functions: admins, specialists, and global community participation. We believe that measuring the current “active community capacities” in these areas can provide a valuable reference point for planning how to guarantee “openness” in these areas and, at the same time, for observing growth and renewal and for foreseeing and preventing future risks (e.g., bus factor).

Vital Signs Analysis


Wikipedia has reached its second decade, being the largest multilingual and collaborative free knowledge repository in human history. However, scientific studies over the past ten years have shown that Wikipedia has been unable to continue growing its active editor communities.

In the following graph, we see the number of active editors of a group of 15 Wikipedia language editions from the Central and Eastern Europe (CEE) region. They are all consolidated communities that have succeeded at establishing a stable base of productive editors. Although the numbers vary across the languages, they all present a similar stagnation after peaking between 2008 and 2012. Polish Wikipedia ranks third in the region by number of active editors (1,712 in August 2021).

Active Editors

An active editor makes at least 5 edits per month.

1. Retention


The first vital sign is retention. Community retention rate is computed as the percentage of new editors who survive 60 days after their first edit and edit again.
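
The definition above can be sketched in a few lines. This is a minimal illustration, not the project's actual computation: it assumes that "surviving 60 days and editing again" means having at least one edit dated 60 or more days after the first one, and the input format (a hypothetical mapping from editor name to a list of edit dates) is invented for the example.

```python
from datetime import date, timedelta


def retention_rate(edits_by_editor, window_days=60):
    """Percentage of new editors with an edit at least `window_days` after their first.

    `edits_by_editor` maps an editor name to a non-empty list of edit dates
    (hypothetical input format, assumed for this sketch).
    """
    if not edits_by_editor:
        return 0.0
    window = timedelta(days=window_days)
    retained = 0
    for dates in edits_by_editor.values():
        first = min(dates)
        # Retained if any later edit falls on or after first edit + window.
        if any(d - first >= window for d in dates):
            retained += 1
    return 100.0 * retained / len(edits_by_editor)
```

In practice the rate would be computed per cohort, for example for all editors whose first edit falls in a given year, so that it can be compared over time.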

In this graph, we see the number of editors by the year in which they made their first edit and by their total number of edits (binned in 1, 2-5, 6-10, 11-100, 101-500, 501-1000, and 1000+ edits). This graph tells us about the interest in becoming a new editor, which is stable and even growing over the years.

Some languages had an important peak of new editors in the very beginning and then decreased, while others sustained the attraction of newcomers at a similar level over the years.

In contrast, the retention rate is decreasing for all languages, which means that many of the editors we saw in the previous graph make a single edit and do not return.

In the following graph, we see the retention rate, which is several times smaller than it was ten years ago. This means that while new editors register and make their first edit, most do not edit again. Polish Wikipedia did not experience the biggest decrease (see German or Belarusian), but the percentage is clearly going down.

2. Stability


The second vital sign is stability. Community stability or continuity is the persistence of active editors as well as the succession of groups of editors over time.

You want to ensure that every month there are fresh editors who had not edited in the previous month, and that there are others who have been editing for many months.

In this graph, we see the number of active editors on a monthly basis; for each month, the color shows the number of months in a row they have been editing. Grey means the first month, dark green 2 months, and light green 3-4 months.

In Polish Wikipedia, fresh editors in a given month are about 35-40% (in grey), while in others like Macedonian and Russian, they tend to be 40-45%. This means that Polish Wikipedia is less volatile, which is positive. It engages fresh editors every month, but the active community is mostly composed of editors who continue one month after another.
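
The share of fresh editors discussed above can be illustrated with a small sketch. It is an assumption-laden example rather than the project's code: months are represented as integer indices (`year * 12 + month - 1`), each editor is a hypothetical set of the months in which they were active, and a "fresh" editor is one whose current streak of consecutive active months is exactly 1.

```python
def month_index(year, month):
    """Map a calendar month to an integer so consecutive months differ by 1."""
    return year * 12 + (month - 1)


def active_streak(active_months, month):
    """Number of consecutive active months ending at `month` (0 if inactive)."""
    streak = 0
    while (month - streak) in active_months:
        streak += 1
    return streak


def fresh_share(active_months_by_editor, month):
    """Percentage of editors active in `month` who were not active the month before."""
    streaks = [active_streak(m, month) for m in active_months_by_editor.values()]
    active = [s for s in streaks if s > 0]
    if not active:
        return 0.0
    return 100.0 * sum(1 for s in active if s == 1) / len(active)
```

The same streak values can be binned (1, 2, 3-4, 6+, 30+ months and so on) to obtain the color groups shown in the graph.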

Even though they are not present in the graph, younger or less settled communities sometimes have 60-70% of fresh editors. This may be seen positively as long as the overall community is growing. However, since this is not the case for some of them, it only indicates their difficulties in consolidating stable communities.

Ideally, you want a community with editors who can edit several months in a row up to a year (for matters of collaboration or for a Wikiproject, for keeping the memory of certain conflicts or situations, for offering mentorship, etc.). We see that the “stable community” that edits more than 6 months in a row is nearly 50% of the active community of editors; those who have edited more than 30 months (two and a half years) are almost a quarter. This means that these communities contain a core of very committed editors.

In the following graph, we zoom into three Wikipedia language editions (German, Belarusian and Polish) to see the percentages of the different groups of engaged editors.

3. Balance


The third vital sign is balance. Community balance is being able to maintain an equitable proportion of old and new editors.

We want to benefit from experience, but also be able to stay open to new generations. This is a key sign indicating renewal.

In the following graph, we see, for seven CEE language editions, the composition of the “very active editors” every year by lustrum of first edit (2001-2005, 2006-2010, 2011-2015, and 2016-2020). We could say that they are “generations”.

By definition, very active editors are those who make at least 100 edits per month. So the graph shows the yearly number of editors who have been “very active” for at least one month. The “everyday advanced editor” profile defined by Wikimedia Poland, by comparison, requires an active account for at least two years, at least 100 edits/actions per year, no more than 2 months' absence from the project, and no blocks of 1 week or longer.
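
The grouping into "generations" can be sketched as follows. This is only an illustration under stated assumptions: the lustra are taken as 2001-2005, 2006-2010, 2011-2015 and 2016-2020 (the boundaries used in the discussion of the results), and the inputs (a list of monthly edit counts per editor, a list of first-edit years) are hypothetical formats chosen for the example.

```python
from collections import Counter

# Generations by lustrum of first edit (assumed boundaries, see lead-in).
LUSTRA = [(2001, 2005), (2006, 2010), (2011, 2015), (2016, 2020)]


def is_very_active(monthly_edit_counts, threshold=100):
    """True if the editor made at least `threshold` edits in any single month."""
    return any(n >= threshold for n in monthly_edit_counts)


def generation(first_edit_year):
    """Label of the lustrum containing the year, or None if outside all lustra."""
    for start, end in LUSTRA:
        if start <= first_edit_year <= end:
            return f"{start}-{end}"
    return None


def generation_shares(first_edit_years):
    """Percentage of editors per generation, ignoring years outside the lustra."""
    labels = [generation(y) for y in first_edit_years]
    counts = Counter(label for label in labels if label is not None)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {label: 100.0 * n / total for label, n in counts.items()}
```

Applying `generation_shares` to the first-edit years of the editors who pass `is_very_active` in a given year gives one stacked bar of the kind described above.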

In this case, we preferred to follow the definition of “very active editor” already set by the community/Analytics team. It is a bit more restrictive in the sense that it requires the editor to make 100 edits in one month, but this also ensures that the editor can do this kind of focused work at least one month per year. We did not use any other requirement, since we already addressed regularity (absence of editing) in the previous Vital Sign and the number of very active editors who have been blocked is not significant to the analysis. Very active editors account for 80% of the edits made by humans every month, so we could say that they are a group of very valuable editors. This reaffirms our choice of not setting extra requirements to select the group of experienced editors.

The graph shows this third Vital Sign, which we call “Balance”. It reflects the “stagnation” in terms of growth, although 2020 seems a better year than 2019 in number of editors, maybe due to the global pandemic and the lockdowns that took place in many countries. More importantly, the graph also shows that the renewal of editors is occurring over time. Every year the percentage of editors who started editing in 2016-2020 grows, mostly at the expense of the 2006-2010 and 2011-2015 generations. The percentage of very active editors who started in 2001-2005 has barely varied: in Polish Wikipedia, it was 9.13% in 2020, 9.14% in 2019, 8.45% in 2018, and 10.18% in 2017. We could possibly say that a “founder factor” ensures more engagement.

Polish Wikipedia does not present percentages that are very different from the other CEE Wikipedia language editions. Compared to Macedonian and Belarusian, for example, the proportions between the different generations are more balanced, which we would intuitively say is better for the community. It may not be desirable that the “productivity” relies too much on an older generation, but it should not depend mainly on the newest one either. The active communities (or at least, the number of very active editors) do not grow, but at least they renew over time.

4. Specialists


The fourth vital sign is special functions (specialists). Community technical and coordination functions undertaken by editors are essential for the project.

This vital sign responds to the two other profiles provided by Wikimedia Poland (TechWizards and Project Maintenance). These two profiles were defined according to some “areas of work” in Wikipedia. The first was detailed as: a subgroup of Your Everyday Advanced Editors, with at least 100 edits a year in purely technical areas - bot requests, templates, MediaWiki namespace; running a bot would also count the user in this group. The second: at least 500 edits per year in strictly defined maintenance areas of Wikipedia - Village Pump, Articles for Deletion, Admin Noticeboard, etc.

For the sake of simplicity, and for consistency with the previous analysis, we define Tech Specialists and Project Coordinators/Maintenance as “very active editors” in the technical namespaces (that is, the Template and MediaWiki namespaces) and in the Wikipedia namespace, respectively. We believe that, for the technical editor, this selection may be broader than the one proposed as TechWizard, but it includes editors who might be able to take on tasks such as running a bot more easily than other editors. As for the Project Maintenance role, the profile is defined according to a group of pages that all have in common that they are in the Wikipedia namespace. Therefore, we preferred using only the namespace as a reference.

4.1 Technical editors


In the graph, we see the number of “Very active technical contributors”. For Polish Wikipedia, there were 25 in 2020, which is very few, especially when compared with the overall group of very active editors (864 in 2020) and the active editors in any month this year (1341 in August 2021). Community building around this group of contributors is highly encouraged, given how few they are. In Polish Wikipedia, the group of editors who started editing in 2016-2020 is even decreasing (4 in 2018 and 3 in 2019 and 2020).

Polish Wikipedia technical editors are much less balanced than the overall group of very active editors. The majority is from 2006-2010: 12 editors in 2020, which account for 52.17% of the very active technical editors that year. The prominence of older generations is also visible in other languages like Hungarian and Belarusian, and in other languages not in this graph. In the figure, we see that Czech, German and Ukrainian seem to renew more effectively. As in the previous analysis, we would encourage balanced groups, so that knowledge and projects can be passed from one generation to another.

4.2 Project coordinators


In the graph, we see the number of “Very active project coordinators”. For Polish Wikipedia, there were 52 in 2020, many more than the technical editors (27), but far fewer than the overall group of very active editors or active editors. Unlike with very active technical editors, we see that there is more renewal (the 2016-2020 generation is actually growing). However, Polish Wikipedia is not as balanced as Wikipedias such as Ukrainian or Czech in terms of different generations of project coordinators.

It is interesting to see that some languages like German or Polish show a decrease in the overall number of project coordinators, while in others like Ukrainian and Czech the number is growing. This may be due to many factors. Nonetheless, since the project coordinators are editors contributing to the Wikipedia namespace, we would say that this metric relates to how open a Wikipedia is in terms of inviting newer editors to the very centre of activity.

5. Admins


The fifth vital sign is admins. Admins have special rights and responsibilities in performing actions over content and fulfil a key function for the community.

In this composite graph, the left subgraph shows the admin flags granted per year, with the color representing the generation the admins belong to. Most flags were granted from 2011 to 2016, and to earlier generations. We see almost no flags granted to editors from the 2016-2020 generation, except in a couple of languages (Bulgarian and Czech). In the middle graph, we see the total number of admins: Polish has 16 admins.

On the very right, we see the number of active admins in August 2021. The percentage is the ratio of the number of active admins to the number of active editors. This percentage varies according to the language, but it indicates the “load” each admin is carrying, given that their task is to patrol the production and act when necessary. The lower the percentage, the higher the load they take.

In languages like German, active admins are 0.52% of the active community (a minimum of 5 edits per month). In Polish, we see there are 13 active admins, which represent 0.76% of the active editors. Other languages have higher proportions: Belarusian 1%, Bulgarian 5.40% and Macedonian 10.29%, even though the group is not very large. For smaller languages, the group of admins should at least be big enough not to put its continuity at risk.
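
The percentage used here is a simple ratio, and its inverse gives a more direct reading of the “load”. The sketch below only reproduces the arithmetic for the Polish figures; it assumes the denominator is the 1,712 active editors reported for August 2021 earlier in this report, and the function names are illustrative.

```python
def admin_share(active_admins, active_editors):
    """Active admins as a percentage of active editors."""
    return 100.0 * active_admins / active_editors


def editors_per_admin(active_editors, active_admins):
    """Average number of active editors per active admin (the 'load')."""
    return active_editors / active_admins


# Polish Wikipedia, August 2021: 13 active admins; 1,712 active editors (assumed denominator).
polish_share = admin_share(13, 1712)
polish_load = editors_per_admin(1712, 13)
print(f"{polish_share:.2f}% of active editors are active admins")
print(f"about {polish_load:.0f} active editors per active admin")
```

A lower share corresponds to a higher number of editors per admin, which is why the two readings move in opposite directions.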

Polish Wikipedia does not present an important risk given the number of active admins. However, given the low percentage in relation to active editors and also the lack of new flags granted (only 4 of the 16 were granted in the past five years), it would be advisable to grant new flags in order to ensure renewal.

6. Global


The sixth and last vital sign is global participation. Participating in the “global community” is key for communities to make their voice heard and to learn from others.

In this graph we see the number of active editors in Meta-wiki in August 2021 (usually between 1,200 and 1,900 in 2021) by their primary language. An editor's primary language edition is the one in which they have made the most edits. So, an editor may regularly edit German Wikipedia and, more sporadically, English Wikipedia. Since she has more edits in the German Wikipedia, that is her primary language edition.
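
The notion of primary language edition can be illustrated with a minimal sketch. The input format (a hypothetical mapping from wiki code to the editor's edit count) and the wiki codes are invented for the example, and the handling of ties is an assumption, since the text does not say how they are resolved.

```python
def primary_language(edit_counts):
    """Return the wiki in which the editor has made the most edits.

    `edit_counts` maps a wiki code to an edit count (hypothetical format).
    On a tie, the wiki listed first wins; this tie-break is an assumption.
    """
    if not edit_counts:
        return None
    return max(edit_counts, key=edit_counts.get)


# An editor who edits German Wikipedia regularly and English Wikipedia occasionally.
editor = {"dewiki": 500, "enwiki": 40}
print(primary_language(editor))
```

Counting, for every editor active in a given project, which wiki `primary_language` returns yields the breakdown by primary language shown in the graphs of this section.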

So, among the active editors in Meta-wiki, we see that only 6.41% of editors are “local” to that project, which makes sense because of its purpose: Meta-wiki is a coordination space across languages. The Meta-wiki active editors who have Polish Wikipedia as their primary language number 106 (1.36% of all editors). In this respect, Polish Wikipedia is comparable to Portuguese or Bengali. The total number of active editors in Polish Wikipedia is around 1,200, which means that a little less than 10% edit Meta-wiki, which is a reasonable proportion.

In the following graph, we repeated the same analysis of active editors by primary language, focused on Polish Wikipedia. In August 2021, there were 1341 active editors, of which 84.04% (1,127) had Polish Wikipedia as their primary language. As can be seen in the graph, the remaining 15.96% come from English (6.63%), German (2.08%), Russian (1.12%), and other Wikipedia language editions. These editors possibly take on tasks that require little linguistic skill, but still, they add value to Polish Wikipedia.

In the following and last graph, we can see the number of active editors in CEE Wikipedia language editions in August 2021, and the percentage of editors who are from the same language (primary). While Polish, Russian and German have very high percentages of primary editors, other languages like Romanian, Slovak or Belarusian are below 60%. Even though the non-primary editors surely make some useful edits, one could say that in terms of developing Wikiprojects and taking responsibility, the real community tends to be the one composed of primary editors.