Research:Portuguese Wikipedia trends and behavior/Editor Trends
Preliminary results, mainly plots
- For summary page, see PT-WP General Editor Trends
- Introduction to WikiPride plots: Research:WikiPride
The number of editors who make their first edit in a given month has been roughly constant: since the beginning of 2007, there have been about 3000 new editors each month.
More than 1 edit per month
The number of editors in the Portuguese Wikipedia went through a period of strong growth from the beginning of 2004 until the beginning of 2008, though the growth was not exponential as it was in the English Wikipedia. The number of editors making at least one edit then saturated at about 6000 editors per month and has stayed fairly constant since. The color histogram shows that new editors continue to make up a significant percentage of all editors. About 50% of all editors in a given month are people who also made their first edit in that month. Over 50% of all editors editing in June 2011 made their first edit in the first half of 2011.
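The share of new editors quoted above can be computed from a plain edit log. The following is a minimal sketch, not the code behind the plots: it assumes the log is available as `(editor, month)` pairs with months written as `YYYY-MM`, and the function name and sample data are hypothetical.

```python
from collections import defaultdict

def new_editor_share(edits):
    """Return {month: fraction of that month's editors whose first edit is in that month}.

    `edits` is an iterable of (editor, month) pairs, month as 'YYYY-MM'
    (an assumed input format; zero-padded months sort correctly as strings).
    """
    first_month = {}
    active = defaultdict(set)
    for editor, month in edits:
        active[month].add(editor)
        if editor not in first_month or month < first_month[editor]:
            first_month[editor] = month
    return {
        month: sum(1 for e in editors if first_month[e] == month) / len(editors)
        for month, editors in active.items()
    }

# Hypothetical sample: four editors are active in June 2011, two of them are new.
sample = [
    ("ana", "2011-05"), ("ana", "2011-06"),
    ("rui", "2011-06"), ("rui", "2011-06"),
    ("eva", "2011-04"), ("eva", "2011-06"),
    ("leo", "2011-06"),
]
shares = new_editor_share(sample)
print(shares["2011-06"])  # 0.5
```

A value of 0.5 for a month corresponds to the "about 50%" reading of the color histogram.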
|Bytes added by editors with more than 1 edit in a given month|
|Number of editors with more than 1 edit in a given month|
|Number of edits by editors with more than 1 edit in a given month|
More than 5 edits per month
The plot for active editors strongly resembles the one for the total number of editors. The number of active editors has been constant at about 2000 each month, so about a third of all editors who edit in a given month contribute more than 5 edits. While the total editor plot shows that about 40% of all editors in a given month are people who also made their first edit in that month, the percentage for active editors is about 25%. This percentage is going down a little faster for active editors than for all editors, which is probably explained by active editors being more loyal: they stick around longer and so make up a larger percentage of older editors. About 40% of all active editors editing in June 2011 made their first edit in the first half of 2011.
|Bytes added by editors with more than 5 edits in a given month|
|Number of editors with more than 5 edits in a given month|
|Number of edits by editors with more than 5 edits in a given month|
More than 100 edits per month
The number of very active editors has also been stable since the beginning of 2008, at around 300 editors in a given month. It is interesting to note that the combined contribution of this cohort is very large: the bytes added and number of edits plots for this cohort differ only slightly from those for all editors (i.e. more than 1 edit). This is confirmed in the next section, where we look only at editors who have made less than 100 edits in a given month.
|Bytes added by editors with more than 100 edits in a given month|
|Number of editors with more than 100 edits in a given month|
|Number of edits by editors with more than 100 edits in a given month|
Less than 100 edits per month
It is also interesting to look at all editors who make less than 100 edits in a given month. They constitute the majority of all editors. Out of about 7000 editors, only 300 have more than 100 edits in a given month, so about 95% of all editors are in the less than 100 edits cohort. They contribute about 20% of all edits (40000 out of a total of 200000), and a little less than 20% of all the bytes added (15MB out of 80MB per month).
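The percentages above follow from simple ratios, which can be checked with a few lines. This is only a sanity check on the quoted figures; it assumes a monthly total of 200000 edits, which is what the stated 20% share implies for 40000 edits, and the helper name is hypothetical.

```python
def share(part, total):
    """Percentage of `total` made up by `part`."""
    return 100.0 * part / total

editors_total = 7000       # all editors in a month (figure from the text)
editors_over_100 = 300     # editors with more than 100 edits (figure from the text)
edits_under_100 = 40000    # edits by the less than 100 edits cohort
edits_total = 200000       # assumed total, implied by the 20% share
mb_under_100 = 15          # MB added by the less than 100 edits cohort
mb_total = 80              # MB added by all editors

editor_share = share(editors_total - editors_over_100, editors_total)
edit_share = share(edits_under_100, edits_total)
byte_share = share(mb_under_100, mb_total)

print(round(editor_share, 1))  # 95.7
print(edit_share)              # 20.0
print(byte_share)              # 18.75
```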
|Bytes added by editors with less than 100 edits in a given month|
|Number of editors with less than 100 edits in a given month|
|Number of edits by editors with less than 100 edits in a given month|
Line plot of the total number of editors with less than 100 edits in a given month
Please add any comments you might have, or ideas for different analyses or metrics to look at.
Number of editors by activity
There are quite a few editors with more than 1000 edits in a month. What kind of work do they do? In the WikiPride plots we can see that they contribute more than 30% of all bytes and 40% of all edits. Are there bots that we don't filter?
Number of edits by activity
Bytes added by activity
Bytes added per editor
The number of bytes added per editor has been fairly constant since the beginning of 2006, for all activity cohorts. Note that the y-axis is log scaled, so a >1000 editor is contributing roughly 1000 times more content than a 2-5 editor and has consistently done so.
Bytes added per edit
It is interesting to note that the number of bytes added per edit is about the same for all activity cohorts.
Edits per editor
The number of edits per editor in each activity cohort has been stable. This follows directly from the fact that the number of editors in each cohort has been stable, but it also shows that there is no tendency for one cohort to become less or more prevalent.
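The three per-cohort metrics discussed above (bytes added per editor, bytes added per edit, edits per editor) can be derived from one month of per-editor totals. The sketch below is illustrative only: the cohort boundaries are an assumption (the text names only the "2-5" and ">1000" cohorts), and the function names and sample data are hypothetical.

```python
import bisect
from collections import defaultdict

# Inclusive upper bounds of assumed cohorts; only "2-5" and ">1000" appear in the text.
BOUNDS = [1, 5, 100, 1000]
LABELS = ["1", "2-5", "6-100", "101-1000", ">1000"]

def cohort(edit_count):
    """Map a monthly edit count (>= 1) to its activity cohort label."""
    return LABELS[bisect.bisect_left(BOUNDS, edit_count)]

def cohort_metrics(monthly):
    """monthly: iterable of (editor, edits, bytes_added) for a single month."""
    totals = defaultdict(lambda: [0, 0, 0])  # editors, edits, bytes per cohort
    for _editor, edits, bytes_added in monthly:
        t = totals[cohort(edits)]
        t[0] += 1
        t[1] += edits
        t[2] += bytes_added
    return {
        label: {
            "bytes_per_editor": b / n,
            "bytes_per_edit": b / e,
            "edits_per_editor": e / n,
        }
        for label, (n, e, b) in totals.items()
    }

# Hypothetical month: two occasional editors and one very active editor.
month = [("a", 3, 1500), ("b", 5, 2500), ("c", 1200, 600000)]
metrics = cohort_metrics(month)
print(metrics["2-5"]["bytes_per_edit"], metrics[">1000"]["bytes_per_edit"])  # 500.0 500.0
```

In this made-up sample the bytes added per edit are equal across cohorts while the bytes added per editor differ by a large factor, which is the pattern described in the plots.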
All vs. New Editors
(Not sure about the quality of the data yet; please use with caution.) The green line is the average bytes added per edit for editors who started editing in the previous three months. The blue line is the average bytes added per edit for all editors. The data for new editors with less than 100 edits is quite noisy. Why is that? A spike indicates that there were many new editors who made less than 100 edits in that month but contributed a lot of content in a few edits.
The takeaway from the second graph, for editors with more than 100 edits per month, is that active editors don't seem to be less productive in the early stages of their editing lives.
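The two lines compared above can be sketched as follows. This is a hedged illustration rather than the actual analysis code: it assumes per-editor monthly rows of the form `(editor, first_edit_month, month, edits, bytes_added)` and interprets "the previous three months" as a first edit in the current month or the two months before it. All names and sample values are hypothetical.

```python
def months_between(a, b):
    """Whole months from 'YYYY-MM' string a to 'YYYY-MM' string b."""
    ya, ma = map(int, a.split("-"))
    yb, mb = map(int, b.split("-"))
    return (yb - ya) * 12 + (mb - ma)

def bytes_per_edit(rows, month, window=3):
    """Return (average for all editors, average for new editors) in `month`.

    An editor counts as new if their first edit is less than `window`
    months before `month` (an assumed reading of the three-month rule).
    """
    all_edits = all_bytes = new_edits = new_bytes = 0
    for _editor, first, m, edits, added in rows:
        if m != month:
            continue
        all_edits += edits
        all_bytes += added
        if 0 <= months_between(first, month) < window:
            new_edits += edits
            new_bytes += added
    avg_all = all_bytes / all_edits if all_edits else None
    avg_new = new_bytes / new_edits if new_edits else None
    return avg_all, avg_new

# Hypothetical rows for June 2011: "ana" and "leo" count as new editors.
rows = [
    ("ana", "2011-05", "2011-06", 10, 4000),
    ("rui", "2010-01", "2011-06", 30, 6000),
    ("leo", "2011-06", "2011-06", 2, 3000),
    ("eva", "2011-02", "2011-06", 8, 1000),
]
avg_all, avg_new = bytes_per_edit(rows, "2011-06")
print(avg_all)  # 280.0
```

With few edits in the new-editor group, a single large contribution (here the 3000 bytes in 2 edits) moves the average a lot, which is one way the noise in the green line can arise.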
Very active new editors
The number of editors who are very active in their first months has fallen at an approximately constant rate since a sharp rise in 2006, with periodic (annual) oscillations. This category of editors is very important to analyze, since its members are likely to become very active editors in the future.
WikiPride : Activity Histograms
All simpler plots above are extracted from the WikiPride graphs below.
|Activity Histogram for bytes added to the Portuguese Wikipedia|
|Activity Histogram for the number of editors in the Portuguese Wikipedia|
|Activity Histogram for the number of edits to the Portuguese Wikipedia|
In the English Wikipedia, there is a noticeable uptick in policy discussions on the Wikipedia namespaces (4 & 5), which happened during the exponential growth phase of the English Wikipedia. In the Portuguese Wikipedia, this uptick happened at the same time as in the English Wikipedia, at a very early stage of growth of the Portuguese Wikipedia. This indicates that the policies were translated from English (where they were introduced as a reaction to the explosion of new editors). It would be interesting to analyse whether this has influenced the growth of the Portuguese Wikipedia negatively (as it is assumed it has for the English Wikipedia).
There seems to be a particularly large contribution to the user talk pages (namespace 3) in the Portuguese Wikipedia. A constant ~35% of all content added to the PT Wikipedia is on user talk pages, which is more than in the English Wikipedia (~25%). Why is that?
There is a huge increase in user talk page activity in August 2006. What is the reason for this? Is there a bot that we haven't identified yet? Was there a significant policy change? (see talk page)