Research:The sudden decline of Italian Wikipedia

From Meta, a Wikimedia project coordination wiki

This page documents a planned research project.
Information may be incomplete and change before the project starts.


The problem

The statistics of the Italian Wikipedia in 2014 became increasingly alarming.


While there are regular fluctuations in the number of editors and edits, for many years Italian Wikipedia mostly had over 2800-3000 active editors per month. However, the number of edits, both registered and unregistered, suddenly collapsed around July 2013. April 2014 was the 9th consecutive month with 2600 or fewer active editors.

Despite this change in editing, total page views seem stable (though their share of the total across all languages decreased from ~3.1 % to ~2.8 %), and there is no obvious indicator of the source of the problem.

The top line shows it.wiki is the Wikipedia with the highest share of registrations from mobile.

Registrations are also roughly constant, though on July 25 the share of registrations from mobile quadrupled, from about 5 % to about 20 % of the total.

A closer look

Edits on Italian Wikipedia until the first months of 2014.

We can take a closer look at the edit activity levels of registered users in the main namespace, for each class of editors by number of edits per month. Let's compare the "suspicious" period (from August 2013 to the latest data) with the equivalent period one year earlier, to avoid seasonality (and to include the standard January peak in both cases).
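As a minimal sketch of how such activity classes can be counted, assume a list of (user, month) pairs with one entry per main-namespace edit by a registered user. The record layout and the function name are hypothetical and do not describe any existing dump or tool.

```python
from collections import Counter

# Minimum edits per month that define each activity class.
THRESHOLDS = (1, 3, 5, 10, 25, 100, 250)

def users_per_class(edits, thresholds=THRESHOLDS):
    """Count, for each month, the users with at least N edits in that month.

    `edits` is an iterable of (user, month) pairs, one per edit, e.g.
    ("Alice", "2013-08"). This layout is an assumption for illustration.
    Returns {month: {threshold: number_of_users}}.
    """
    per_user_month = Counter(edits)  # (user, month) -> edit count
    result = {}
    for (user, month), n in per_user_month.items():
        counts = result.setdefault(month, {t: 0 for t in thresholds})
        for t in thresholds:
            if n >= t:
                counts[t] += 1
    return result
```

Note that the classes are cumulative: a user with 12 edits in a month is counted in the 1+, 3+, 5+ and 10+ classes alike.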

Number of registered users by minimum number of main-namespace edits per month:

Min. edits/month    2013-08–2014-04        2012-08–2013-04        Difference
                    Average    St. dev.    Average    St. dev.    Absolute    Relative
1                   7328       388         7513       380         -185        -2.47 %
3                   3541       186         3899       198         -359        -9.20 %
5                   2490       134         2833       140         -343        -12.12 %
10                  1591       85          1862       89          -271        -14.56 %
25                  927        42          1090       48          -163        -14.94 %
100                 372        20          449        26          -76         -17.03 %
250                 175        17          208        14          -33         -15.81 %
New wikipedians     402        67          491        44          -89         -18.12 %

Reading the table: the number of users with at least one edit in the given month is almost constant, and the small decrease may be insignificant. However, despite requiring only 4 more edits, the 5+ class shows a sharp decrease of about 12 %, which also accounts for most of the decrease in the 3+ class (343 of 359 users). Higher classes are even worse, at around -15 %, and the editors who "graduate" to their 10th edit ("new wikipedians") are down 18 %.
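The reading above can be checked roughly with the figures in the table. The sketch below recomputes the relative change and adds a Welch t statistic as a crude indication of significance. It assumes that each standard deviation is the sample standard deviation of the 9 monthly values in its period (August to April) and that months are independent, which monthly editor counts probably are not; neither assumption is stated in the source.

```python
import math

# (average, st. dev.) of monthly user counts, copied from the table:
# first pair is 2013-08..2014-04, second pair is 2012-08..2013-04.
TABLE = {
    "1": ((7328, 388), (7513, 380)),
    "3": ((3541, 186), (3899, 198)),
    "5": ((2490, 134), (2833, 140)),
    "10": ((1591, 85), (1862, 89)),
    "25": ((927, 42), (1090, 48)),
    "100": ((372, 20), (449, 26)),
    "250": ((175, 17), (208, 14)),
    "New wikipedians": ((402, 67), (491, 44)),
}
MONTHS = 9  # August to April inclusive, in both periods (assumed sample size)

def compare(recent, earlier, n=MONTHS):
    """Return (relative change in %, Welch t statistic) for one class."""
    (m1, s1), (m0, s0) = recent, earlier
    relative = 100 * (m1 - m0) / m0
    t = (m1 - m0) / math.sqrt(s1**2 / n + s0**2 / n)
    return relative, t

for label, (recent, earlier) in TABLE.items():
    rel, t = compare(recent, earlier)
    print(f"{label:>16}: {rel:+6.1f} %  t = {t:+.1f}")
```

Under these assumptions the 1+ class gives a t statistic of about -1, while the 5+ class gives about -5, which is consistent with calling the first decrease possibly insignificant and the second one sharp. Relative changes recomputed from the rounded averages can differ from the table in the last digit.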

Observations to be verified:

  1. the latter value, being the worst one, suggests there is a decrease in the capacity to convert new editors to higher classes of activity, rather than (or in addition to) a deactivation of editors previously active in the respective classes;
  2. however, such a decrease in very active editors seems unlikely to be caused only by a shortage of new editors: probably a considerable pool of very active editors shifted to lower classes of activity, also "covering" part of the decrease in those classes.

Both observations suggest that the number of active editors is going to get even worse in the long run, until it aligns with the (currently) ~20 % lower capacity to make new editors active.
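The long-run argument can be illustrated with a toy model, which is only a sketch and not an estimate: each month a fixed fraction of the active pool becomes inactive and a fixed number of new editors join, so the pool settles at inflow / attrition. The attrition rate below is an invented value; only the two inflow figures come from the "new wikipedians" row of the table.

```python
def simulate_pool(start, inflow, attrition, months):
    """Editor pool under constant monthly inflow and proportional attrition.

    Each month a fraction `attrition` of the pool becomes inactive and
    `inflow` new editors join. The steady state is inflow / attrition.
    """
    pool = start
    history = [pool]
    for _ in range(months):
        pool = pool * (1 - attrition) + inflow
        history.append(pool)
    return history

ATTRITION = 0.05                    # hypothetical value, NOT estimated from data
old_inflow, new_inflow = 491, 402   # "new wikipedians" averages from the table

old_steady = old_inflow / ATTRITION
history = simulate_pool(old_steady, new_inflow, ATTRITION, months=120)
relative_drop = 100 * (history[-1] - old_steady) / old_steady
limit = 100 * (new_inflow - old_inflow) / old_inflow
print(f"after 10 years: {relative_drop:+.1f} % (limit {limit:+.1f} %)")
```

In this model the relative size of the eventual drop equals the relative drop in inflow whatever the attrition rate is; the rate only sets how long the pool takes to get there, which is why the decline would continue well after the inflow has stabilised.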

Causes

As for causes, no plausible hypotheses to test have been found so far; see below for some ideas.

Desktop page views, however, show a similar decrease of 19.4 % over the period (possibly "eaten" by mobile). If we could prove that contributions are directly proportional to desktop views only, this alone would be enough to explain the drop in activity; but how to prove it? In that scenario, the only hope would be to bring visitors back to the desktop site!
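One possible first step towards testing the proportionality hypothesis, sketched here under the assumption that monthly desktop page views and monthly edit counts are available as two aligned lists (the function name and inputs are hypothetical): fit a line through the origin and look at how stable the edits-per-view ratio is. A stable ratio would be necessary for proportionality but would not by itself prove that views drive edits.

```python
import statistics

def proportionality_check(desktop_views, edits):
    """Check how well `edits` tracks `desktop_views` month by month.

    Returns the least-squares slope of a line through the origin
    (edits = k * views) and the coefficient of variation of the
    monthly ratio edits / views. Under strict proportionality the
    ratio is constant, so the coefficient of variation is near 0.
    """
    if len(desktop_views) != len(edits) or len(edits) < 2:
        raise ValueError("need two series of equal length (>= 2 months)")
    k = sum(v * e for v, e in zip(desktop_views, edits)) / sum(
        v * v for v in desktop_views
    )
    ratios = [e / v for v, e in zip(desktop_views, edits)]
    cv = statistics.stdev(ratios) / statistics.mean(ratios)
    return k, cv
```

Running the same check against total (desktop plus mobile) views would show whether desktop views alone track edits better than overall traffic does.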

Scratchpad

Need to shortlist some hypotheses that can be worked on/verified.

  • Maybe pageviews are not really stable? Filtering bots etc., however, is time-consuming and error-prone (cf. File:2013_Wikimedia_traffic_trends.pdf, no disaggregation though).
    • In particular, Italy is known to have one of the highest numbers of mobile phones per capita in the world. On a typical day, 12.5 M desktop users and 14.5 M mobile users are online (half of the latter on mobile only).[1] [2]
    • Does it.wiki have an unusually high share of page views on its mobile version? Stats show a peak of 190+ M views in January 2014: +80 M in 6 months and +81 % in 1 year, vs. +50 % for most of the top 7; es.wiki shows a similar jump.
    • The proportion of our traffic coming from mobile is however still an open question in the analytics department.
    • How to verify whether increased mobile penetration is directly correlated with reduced editing activity?
  • The number of views is not everything. Italians are on the Internet less frequently and for less time; shorter visits probably mean less editing. In 2013, the number of people active daily went down by about 1 million compared to the peak in April/May 2013.[3]
  • Two things we know about the mobile site are that:
    • phone users who use the desktop site are (in proportion) much more likely to edit than the phone users who use the mobile site (even though desktop users are still more represented as editors than as visitors);
    • automatically redirecting iPad users to the mobile site (which forbids unregistered editing) halved the number of total edits and editors from iPad on en.wiki: basically, the unregistered edit(or)s disappeared, and the registered edit(or)s were constant.

Global things with local flavour

  • (Big thing in July 2013, #1.) Any way to check VisualEditor effects?
    • Asked about latency.
    • Easy to compare in DB: editing activity of articles where VisualEditor has ever worked. The 122605 articles which were edited at least once with VisualEditor before 2014-06-02 have had an average share of 38±4 % monthly ns0 edits in 2007–2014 without obvious patterns; 43±4 % after VE (2013-08–2014-05) and 37±3 % before VE (2007-01–2013-06). Given the recentism-biased comparison, this doesn't say anything.
    • Are editors who made their first edit with VE more or less likely to do more edits, possibly on articles where VE is not enabled?
    • Does it take more or fewer revisions for them to do the same edits (e.g. to correct syntax mistakes)? (This can't explain higher classes though.)
    • Did VE harm productivity of unregistered power users? They don't have preferences, they can't turn it off; while registered power users don't use VE. Most VE edits are from IPs and most IP edits are from VE: VE matters a lot for the drop of unregistered editing activity.
  • (Big thing in July 2013, #2.) Lasting effects of frontend performance? The second half of 2013 was very slow.[4] Things got better in 2014, but activity did not recover.
  • Frontend performance may be worse for Italian Wikipedia? (Only one-time numbers available.)
    • Nothing it.wiki-specific found so far.
    • Found bugzilla:65988, doesn't happen where $wgULSIMEEnabled is false. Unlikely to explain it.wiki's difference.
    • Can't find any frontend.assets.*.*.itwiki.* metric in graphite? [5] [6]
  • (Big thing in July 2013, #3.) SUL2 has had completely mysterious effects. Rumor has it that users get logged out frequently; the effects of the Special:CentralAutoLogin calls to login.wikimedia.org are unknown.
  • (Big thing in August 2013, #4.) HTTPS for all registered users. Fear of being tracked by the NSA? Slowdown from SSL negotiation?
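The before/after comparison of edit shares mentioned for VisualEditor in the list above can be sketched as follows. The input is assumed to be a mapping from month to a pair (edits on the article subset, all ns0 edits); this layout and the function name are hypothetical, and the source does not say how its 38±4 %, 43±4 % and 37±3 % figures were computed.

```python
import statistics

def share_summary(monthly, start, end):
    """Mean and st. dev. (in %) of the monthly share between two months.

    `monthly` maps "YYYY-MM" to (edits_on_subset, all_ns0_edits); months
    are compared as strings, which works for the zero-padded ISO format.
    At least two months must fall in the range for the st. dev. to exist.
    """
    shares = [
        100 * subset / total
        for month, (subset, total) in monthly.items()
        if start <= month <= end and total > 0
    ]
    return statistics.mean(shares), statistics.stdev(shares)
```

With real data, `share_summary(monthly, "2007-01", "2013-06")` and `share_summary(monthly, "2013-08", "2014-05")` would give the "before VE" and "after VE" figures. The same caveat applies as in the text: articles selected because they were edited recently are biased towards recent activity.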

Discarded ideas

  • slow-parse.log could be checked to see if it.wiki has some blatantly slower parsing for some reason (makes editing slower).
    • Local template gurus have been among the fastest adopters of Lua modules, hopefully this didn't make things slower!
    • Data on talk shows huge variance. Needs analysis but unlikely to be the culprit.
  • Big social/content changes somewhere on the wiki? Nothing obvious: the peak in bot edits was just a migration of interwiki links and a bunch of cosmetic wikitext changes.
    • The abuse filter didn't change substantially: it prevented slightly more edits, but mostly through an anti-blanking filter which according to sysops has virtually zero false positives; the number of users hit per month is not unprecedented.
    • Article deletions were stable at around 5 thousand per month; no changes to custom wiki CSS/JS happened which could affect a significant share of page views or action=edit.

Action items

  • Identify and produce any missing semi-standard closer-look stats. Editor retention would be interesting but there is no tool for it?[7]
    • Optionally, also editor retention of (new) accounts that used VisualEditor vs. those that didn't.
    • Should be easy to count new 1+ 24h editors by month. WikiStats has an approximation, i.e. the number of editors with 1+ edit in a given month.
    • Some other data is being collected by Incola (with DB queries?) and analysed by Alexmar983, among which: 1st->2nd edit survival and retention of 20+ edits users, 24h-(1|2)-edits users. Will be checked and summarised here later.
  • Check dates for GuidedTours/GettingStarted and the Welcome notification.
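As a minimal sketch of the 1st->2nd edit survival statistic mentioned in the list above, assume each registered user's full edit history is available as (user, datetime) pairs. The layout, the function name and the 30-day window are all assumptions; the data actually being collected may be defined differently.

```python
from collections import defaultdict
from datetime import datetime, timedelta

def second_edit_survival(edits, window=timedelta(days=30)):
    """Count, per month of first edit, the new editors who edited again.

    `edits` is an iterable of (user, datetime) pairs covering each user's
    full history (an assumed layout). A user "survives" if a second edit
    follows the first within `window`.
    Returns {"YYYY-MM": (new_editors, survivors)}.
    """
    by_user = defaultdict(list)
    for user, ts in edits:
        by_user[user].append(ts)
    cohorts = defaultdict(lambda: [0, 0])
    for stamps in by_user.values():
        stamps.sort()
        first = stamps[0]
        cohort = cohorts[first.strftime("%Y-%m")]
        cohort[0] += 1
        if len(stamps) > 1 and stamps[1] - first <= window:
            cohort[1] += 1
    return {month: tuple(c) for month, c in cohorts.items()}
```

Cohorts whose window extends past the end of the data are incomplete and should be left out of any comparison. Setting `window=timedelta(hours=24)` gives a variant closer to the 24h statistics mentioned above.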

Lines of research

  • Verify the two hypotheses in #A closer look.
  • Investigate the 1+ class of users further:
    • how many were at their first edit and, of those,
    • how many made it on mobile and
    • how many made it on VisualEditor.

If we don't have ideas for a possible cause, this should at least give us an idea where to look.
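The breakdown of the 1+ class proposed above could be computed along these lines. This is a sketch only: it assumes each edit comes as (user, ISO timestamp string, set of tag names) with every user's full history included, and the default tag names are assumptions about how mobile and VisualEditor edits are marked, to be replaced with whatever the actual data uses.

```python
def first_edit_breakdown(edits, month, mobile_tag="mobile edit", ve_tag="visualeditor"):
    """Break down the users active in `month` by their first edit.

    `edits` is an iterable of (user, timestamp, tags) with ISO timestamp
    strings ("YYYY-MM-DD...") and a set of tag names per edit, covering
    each user's full history. Layout and tag names are assumptions.
    """
    first = {}      # user -> (timestamp, tags) of the earliest edit seen
    active = set()  # users with at least one edit in `month`
    for user, ts, tags in edits:
        if user not in first or ts < first[user][0]:
            first[user] = (ts, tags)
        if ts.startswith(month):
            active.add(user)
    newcomers = [u for u in active if first[u][0].startswith(month)]
    return {
        "active": len(active),
        "first_edit": len(newcomers),
        "first_edit_mobile": sum(mobile_tag in first[u][1] for u in newcomers),
        "first_edit_ve": sum(ve_tag in first[u][1] for u in newcomers),
    }
```

Comparing these counts month by month across the two periods would show whether the composition of the 1+ class changed even though its size stayed almost constant.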

See also

(Un)Related work: