Research talk:Monthly anonymous edits (2011)

From Meta, a Wikimedia project coordination wiki


This research makes some assumptions about the quality and number of edits. On criteria ranging from the speed at which vandalism gets reverted to the number of unreferenced biographies of living people and the number of typos, the average quality of the pedia has been rising. Arguably this is crowding out new editors, who are less likely to come across the typo or vandalism that in previous years would have been the error that lured them into editing. As for the number of edits needed to maintain quality as the real world changes, my suspicion is that this is a tiny fraction of the amount of editing that the pedia currently receives. At an editing rate of circa 10 million edits per 50 days, the 3.6 million Wikipedia articles will benefit from an average of 9 or 10 edits per annum (though many of these will be to talkpages etc). Maintaining the existing quality level by updating articles as people die, elections happen and popular culture evolves is a small subset of those edits; maintaining quality by reverting vandalism is probably a much larger subset of the editing. WereSpielChequers 16:37, 29 June 2011 (UTC)

I agree with your supposition that large revisions by IP editors are frequently vandalism. Small-scale revisions can also be vandalism, and can be as minimal as removing the l from public in descriptions such as "public school". Analysing IP edits by frequently edited articles will almost certainly produce an anomaly among some of our more popular pages, as these will include some of the articles that are semiprotected, which stops IPs and new accounts from editing them. WereSpielChequers 17:18, 29 June 2011 (UTC)


A number of reasons could explain the fall in IP editing, including:

  1. Since IPs lost the right to create new articles, there has been a drift from IP editing to logged-in editing.
  2. Many millions of IP addresses have now been blocked, so users of those IPs need to log in.
  3. There is a theory that IP editing is a stage that new editors go through before they create accounts, so a decline in IP editing fits in with the general decline in newbies.
  4. There is a theory that vandalism and typos encourage readers to edit and fix things. As AWB users deal with most typos, and anti-vandal bots revert most vandalism pretty much in real time, our readers are being crowded out from making such minor fixes.
  5. The increased efficacy of the edit filters is believed to be preventing much of the vandalism that would otherwise have occurred.

WereSpielChequers 17:18, 29 June 2011 (UTC)

Edit Length

I find this astounding: ... the anonymous revisions spike at revisions of about 2500 bytes in length. Given that a byte is roughly one character, you're saying (if I understand you right) that the median anonymous edit is the addition of 2,500 characters of text. That's not my experience at all, and I've looked at tens of thousands of IP edits over the last five years. As support, I note that when I looked at Recent Changes a moment ago, this is what I found for the sizes of the most recent ten IP edits: +27, +13, 0, -324, +83, -9,394 [blanking a section of an article], -6, 0, +159, +4. That's exactly what I'd expect, and I invite you to do the same - or just do a histogram of a snapshot of 100 recent IP edits, and the difference should be startling compared to your graphs.

I can only speculate that (a) if you separated out (say) the first five years of editing from the last five, you'd see a huge difference (long, anonymous additions were common in early years but almost totally absent in more recent years); or (b) the data used fails to distinguish between negative changes (say, blanking an entire article, which would be in the thousands of bytes of change) and positive changes (increasing the size of an article); or (c) something else is very, very wrong with the data or its processing. -- John Broughton 17:19, 2 August 2011 (UTC)
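
The check suggested above, together with possibility (b), can be sketched in a few lines of Python. This is only an illustration built on the ten edit sizes quoted in the comment: the bin edges and the names `bin_label` and `histogram` are assumptions introduced here, not part of the original analysis, and ten edits are far too few to say anything about the research data itself.

```python
from collections import Counter
from statistics import median

# Signed byte changes of the ten recent IP edits quoted in the comment above.
deltas = [27, 13, 0, -324, 83, -9394, -6, 0, 159, 4]

# Hypothetical bin edges in bytes, chosen only for illustration.
BIN_EDGES = [0, 10, 100, 1000, 10000]


def bin_label(size):
    """Label of the first bin whose upper edge is >= size (size must be >= 0)."""
    for low, high in zip(BIN_EDGES, BIN_EDGES[1:]):
        if size <= high:
            return f"{low}-{high}"
    return f">{BIN_EDGES[-1]}"


def histogram(values):
    """Count how many values fall into each bin."""
    return Counter(bin_label(v) for v in values)


# Reading 1: only genuine additions (positive changes) are counted.
additions = histogram(d for d in deltas if d > 0)

# Reading 2: the sign is discarded, so a blanking counts as a large "edit".
absolute = histogram(abs(d) for d in deltas)

print("median signed change:  ", median(deltas))
print("median absolute change:", median(abs(d) for d in deltas))
for low, high in zip(BIN_EDGES, BIN_EDGES[1:]):
    label = f"{low}-{high}"
    print(f"{label:>12}  additions={additions[label]}  absolute={absolute[label]}")
```

In this small sample the only edit above 1,000 bytes is the section blanking, so it appears in the top bin only when the sign is discarded. Running the same comparison on a larger snapshot of Recent Changes would show whether that effect is big enough to matter.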

Looking at the graph, the spike is not the median at all; where did you read this? The result could still be considered strange, but (b) is exactly how I understood the data should be interpreted. --Nemo 16:30, 28 September 2011 (UTC)


Declerambaul, could you add some URLs documenting the open access and open data status of this project? You can add them directly to the project template with the open-data-url and open-access-url params. --EpochFail (talk) 22:54, 9 February 2017 (UTC)