- @Bluerasberry: I have also created documentation for the Pageviews Analysis tool. What you have here at Traffic reporting is excellent, as it encompasses all tools available and explains the concepts behind them. However, we needed a centralized wiki page for feedback on Pageviews Analysis and its sister tools. I have not finished the documentation, but feel free to copy whatever you want from it to the traffic reporting page. There is also some unsupported documentation (not monitored by maintainers) on the English Wikipedia that might interest you: see w:Wikipedia:Pageview statistics and w:Wikipedia:Web statistics tool. Best, MusikAnimal (WMF) (talk) 18:29, 26 June 2016 (UTC)
Timeline of changes for counting Wikimedia traffic
I wrote about this a little in 2015 at "Is Wikipedia traffic dropping?". Now I am again seeing reasons to interpret the way we count article traffic.
- January 2008 to early 2016 - we used stats.grok.se, also called Henrik's tool. There is not much documentation. In hindsight it seems that this tool counted pageviews from desktop computers, including bot views, omitting pageviews with en:HTTP Secure, and not counting redirects.
- 2015 - WMF switched Wikimedia projects to access through HTTP Secure. As a result, the counts reported by external web metrics monitoring services dropped in the transition, while the Wikimedia internal counts did not.
- Between April and November 2018, the WikiTech team published a new definition of a pageview, documented at wikitech:Analytics/Pageviews and Research:Page view.
- From May 2015 the Wikimedia Foundation designated the Analytics Cluster described at wikitech:Analytics/Systems/Cluster as the place to get official pageview data. Getting data from it is still a technical process beyond the ability of most users.
- ? I am not sure when this page was created; it had a November 2015 revision. Page Views for Wikimedia, All Projects, Both sites, Normalized is an odd product which summarizes all traffic reports for all Wikimedia projects going back to 2009.
- February 2016 - Pageviews Analysis release. This is the first time we have an official pageview tool for a general audience presented by the Wikimedia Foundation. This tool is simple enough for any typical Wikipedia contributor to access.
- ?? A problem remains that I do not see documentation about redirects. A redirect can be an alternative name, a misspelling, or any term which does not have its own Wikipedia article but which leads readers to another article.
- ?? I thought at some point there was an effort to exclude bot views from these counts. I forget when that happened. I also do not know how this compares with other websites and counting systems, which might include bot views. I do not know the industry standard about this.
At en:Wikipedia:WikiProject Medicine/Popular pages there is data for English Wikipedia's medical content. I do not know how this is calculated, but supposedly it includes redirects. I think it comes from en:User:Community Tech bot.
I am posting all this to sort my thoughts and the differences. I do not know if anyone ever retroactively calculated pageviews to the current correct standard, to do things like exclude bots, combine mobile and desktop, and to include redirects. Blue Rasberry (talk) 16:47, 7 June 2018 (UTC)
- No, HTTPS was not excluded from the data stats.grok.se used. The main FAQ is at w:en:User:Killiondude/stats.
- The treatment for bots changed when the new "definition" at Research:Page view was adopted, see there.
- Redirects are important indeed. The documentation links wikitech:Analytics/Data Lake/Traffic/Pageviews/Redirects, which AFAIK is still accurate. Usually data consumers will want to group pageviews from all redirects; this feature is mentioned in toollabs:pageviews/faq/.
- --Nemo 12:42, 8 June 2018 (UTC)
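To illustrate what "grouping pageviews from all redirects" means in practice, here is a minimal sketch. It assumes you already have per-title view counts and a redirect-to-target mapping in hand; the function name `group_views_by_target`, the titles, and the numbers are all hypothetical and are not taken from any Wikimedia tool or dataset.

```python
from collections import defaultdict


def group_views_by_target(views, redirects):
    """Sum per-title view counts onto the article each title resolves to.

    views:     dict mapping page title -> view count (hypothetical input)
    redirects: dict mapping redirect title -> target article title
    """
    totals = defaultdict(int)
    for title, count in views.items():
        # A title that is not a redirect counts toward itself.
        target = redirects.get(title, title)
        totals[target] += count
    return dict(totals)


# Made-up counts for illustration only; not real traffic data.
views = {
    "Aspirin": 1000,
    "Acetylsalicylic acid": 120,  # alternative name
    "Asprin": 30,                 # misspelling
    "Ibuprofen": 800,
}
redirects = {
    "Acetylsalicylic acid": "Aspirin",
    "Asprin": "Aspirin",
}

grouped = group_views_by_target(views, redirects)
print(grouped)
```

This only resolves one level of redirect and ignores the bot and mobile/desktop distinctions discussed above; those would need separate handling.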