Pageviews Analysis

Developers: MusikAnimal, Kaldari, Marcel Ruiz Forns
Initial release: February 4, 2016
Written in: JavaScript, PHP
Browser support: Chrome, Firefox, Safari 9+, IE11+
License: MIT
Source code: GitHub
URL: https://pageviews.wmcloud.org
Shortcuts: WM:PAGEVIEWS, Meta:PAGEVIEWS
2019 Coolest Tool Award Winner in the category Reusable

Pageviews Analysis is a suite of tools to analyze page view and unique device statistics for Wikimedia Foundation wikis. There are eight tools in the suite: Pageviews, Langviews, Topviews, Siteviews, Massviews, Redirect Views, Userviews, and Mediaviews. See below for further documentation.

The tools use data provided by Wikimedia's RESTBase API, based on the definitions documented under Research:Page view and Research:Unique Devices. The original Pageviews Analysis tool was created in February 2016, forked from an API demo created by Marcel Ruiz Forns, and today is maintained by Community Tech.
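
As a rough illustration of the kind of request that sits behind these tools, the sketch below builds a URL for the per-article endpoint of the Wikimedia pageviews REST API. The helper name `per_article_url` and its default arguments are assumptions made for this example; this is not code taken from Pageviews Analysis itself.

```python
from datetime import date
from urllib.parse import quote

API_ROOT = "https://wikimedia.org/api/rest_v1/metrics/pageviews/per-article"


def per_article_url(project, article, start, end,
                    access="all-access", agent="user", granularity="daily"):
    """Build a per-article pageviews URL (hypothetical helper for illustration)."""
    # Page titles use underscores instead of spaces and must be percent-encoded.
    title = quote(article.replace(" ", "_"), safe="")
    # The API expects dates in YYYYMMDD form.
    fmt = "%Y%m%d"
    return "/".join([API_ROOT, project, access, agent, title, granularity,
                     start.strftime(fmt), end.strftime(fmt)])


url = per_article_url("en.wikipedia.org", "Barack Obama",
                      date(2016, 2, 1), date(2016, 2, 20))
print(url)
```

The `access` and `agent` arguments correspond to the platform and user agent filters mentioned throughout this page.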

List of tools

  • Pageviews (FAQ, URL structure) – The main Pageviews tool allows you to analyze pageviews of up to 10 pages during a given time frame. You can filter statistics based on the platform and user agent. If you want to view data on more pages, consider using Massviews.
  • Langviews (FAQ, URL structure) – Langviews lets you see page views statistics for a given article across all languages. Supported projects are Wikipedia, Wikivoyage, Wikibooks, Wikinews, Wikiquote, Wikisource, and Wikiversity. You can filter statistics based on the platform and user agent.
  • Topviews (FAQ, URL structure) – Topviews shows you the most viewed pages of a given project. The data can be filtered by date range and platform. More information on this tool can be found below.
  • Siteviews (FAQ, URL structure) – Siteviews allows you to view pageviews or unique devices across all pages of a given project. You can filter statistics based on the platform and user agent, and compare up to 10 different projects. Additionally, you can use the "Metric" option to show unique devices instead of raw pageviews.
  • Massviews (FAQ, URL structure) – Massviews allows you to import a list of pages and analyze their page views. For performance reasons, you can only import up to 20,000 pages. Statistics can be filtered based on the platform and user agent. More information on this tool can be found below.
  • Redirect Views (FAQ, URL structure) – Redirect Views lets you view pageviews stats of a given page and all of its redirects. This is useful because when redirects are viewed, a pageview is not registered for the target page. This tool can also suggest which redirects are the most popular. Statistics can be filtered by platform and user agent.
  • Userviews (FAQ, URL structure) – Userviews lets you view pageviews stats of all pages created by the given user. Statistics can be filtered by platform and user agent. Note that the tool may be slow, or in rare cases become unresponsive, when querying for users with a very large edit count.
  • Mediaviews (FAQ, URL structure) – Mediaviews allows you to compare playcounts of audio and video files on Wikimedia Commons.

Features

Each tool varies with available features, but they may include the following:

Chart types

You can visualize the data using one of six different chart types. Your preference will be remembered on the same browser and computer. By default, the tool shows a bar chart when viewing data for a single article, and a line chart when viewing data for multiple articles. The Line, Bar and Radar charts visualize data over time for one or more articles. The Pie, Doughnut and Polar Area charts compare the total number of pageviews for each article. For this reason, only Line, Bar and Radar are available if you are viewing data for a single article.

Settings

The settings panel allows you to change the search method, and provides options for localization and chart preferences. Some options may not be available for some tools. Your settings are remembered on the same browser and computer.

Search method

"Autocompletion" is the default option and works the same way as the search does on the wikis. When you type a page name, it will try to intelligently correct typos and resolve some redirects. For instance, searching for Barak Obama (a common misspelling) will show the actual article title, Barack Obama, in the search results. If you want to see the page views stats of the redirect page itself, change your search method to "Autocompletion with redirects". This will allow you to pick the redirect page Barak Obama. Using "No autocompletion" will allow you to type in any arbitrary page name, including nonexistent pages. However, due to API limitations, viewing data on deleted pages is not possible.

Localization

"Format numerical data" adds delimiters to large numbers, for instance "1,000,000" instead of "1000000". You can turn this off if, for instance, you need to copy and paste the data somewhere without the delimiters.

"Localize date format" formats all dates to match your computer settings. If you are seeing the wrong date format, first look in your computer's settings to see if you have the correct region/locale set to your preference. You can disable date formatting to show the standard ISO 8601 format, YYYY-MM-DD.

Chart preferences

"Automatically use logarithmic scale" tells the tool to compare the pageviews data and use a logarithmic scale if it detects large variations, so as to make the fluctuations more readable. Turning this option off will permanently disable this functionality unless you explicitly enable it using the "Logarithmic scale" option above the chart. If you want to share a link to pageviews with someone without the log scale on, add autolog=false to the end of the URL. See the URL structure documentation for more information.

"Always show y-axis starting at zero" will always show data starting at zero on the y-axis. For instance, if the range of pageviews is from 1,500 to 5,000, the tool will start the y-axis at around 1,000 to better emphasize variations in the data.

"Remember chart preference" is defaulted-on. Turn it off so that you can selectively change the chart using the "Chart type" button, and when you refresh the page it will revert to the normal logic of showing a bar chart for a single article, and a line chart when viewing data for multiple articles.

"Use Bézier curve on line charts" is purely an aesthetic preference. Enabling this will make the lines of a line chart smooth and uniform. From an analytics perspective, this is not ideal as it implies there was data between the points, when data is only collected on a per-day basis and not individual hours.

Permalink

By default the tool uses a date range relative to the current date, such as the past 20 days. If you want a permanent link to the data you see, click on "Permalink" to reload the page, and the URL will use the exact dates. You could also simply right-click and copy the URL of the "Permalink" button.
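
The difference between a relative range and exact dates can be sketched as follows. This is an illustration only: the helper is hypothetical, and how the tool chooses the last day of a relative range is not stated here, so the end date is passed in explicitly.

```python
from datetime import date, timedelta


def exact_range(last_day, days):
    """Return (start, end) ISO dates covering `days` days ending on last_day."""
    start = last_day - timedelta(days=days - 1)
    return start.isoformat(), last_day.isoformat()


print(exact_range(date(2016, 2, 20), 20))  # ('2016-02-01', '2016-02-20')
```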

Download

You can download the data you see as CSV, JSON, or a PNG image. If you are working with spreadsheets, you'll likely want to use CSV. If you want to use the data for a website, JSON might be a more appropriate format. PNG is good if you want to use an image of the chart on another website, for instance. Finally, you can also print the visible chart from the interface. Note that for CSV, totals and averages are omitted, as it may be preferable to use the spreadsheet software to do these computations.
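
Since the CSV export leaves out totals and averages, they can be recomputed afterwards. The sketch below assumes a layout with a "Date" column followed by one column per page; that layout and the sample data are assumptions for illustration, so adjust the column names to match the file you actually download.

```python
import csv
import io


def totals_and_averages(csv_text):
    """Return {page: (total, daily_average)} for a Date-plus-pages CSV."""
    reader = csv.DictReader(io.StringIO(csv_text))
    # Every column other than the (assumed) "Date" column is a page.
    pages = [name for name in reader.fieldnames if name != "Date"]
    totals = {page: 0 for page in pages}
    days = 0
    for row in reader:
        days += 1
        for page in pages:
            totals[page] += int(row[page])
    return {page: (total, total / days if days else 0)
            for page, total in totals.items()}


sample = "Date,Cat,Dog\n2016-02-01,100,80\n2016-02-02,140,120\n"
result = totals_and_averages(sample)
print(result)  # {'Cat': (240, 120.0), 'Dog': (200, 100.0)}
```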

Begin at zero

This option shows the y-axis starting at zero. By default, the tool may start the y-axis at a higher value to better emphasize variations in the data; for instance, if the range of pageviews is from 1,500 to 5,000, the y-axis starts at around 1,000. Use the "Begin at zero" option to bypass this logic.
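
One way to picture the choice of axis minimum is the sketch below. The rounding rule is an illustrative heuristic chosen so that 1,500 maps to 1,000; it is an assumption for this example and not the algorithm the tool or its charting library actually uses.

```python
def y_axis_min(values, begin_at_zero=False):
    """Pick a lower bound for the y-axis (illustrative heuristic only)."""
    lowest = min(values)
    if begin_at_zero or lowest <= 0:
        return 0
    # Round the smallest value down to one significant digit,
    # e.g. 1,500 -> 1,000 and 4,200 -> 4,000.
    magnitude = 10 ** (len(str(int(lowest))) - 1)
    return (int(lowest) // magnitude) * magnitude


print(y_axis_min([1500, 3200, 5000]))                      # 1000
print(y_axis_min([1500, 3200, 5000], begin_at_zero=True))  # 0
```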

Logarithmic scale

If there is a large spike in page views during a given date range, the data will be shown using a logarithmic scale. For instance, if someone recently died, there might be a large spike of page views on the day they died, while page views on the other dates are significantly lower. Without a log scale, you may be unable to see the smaller fluctuations in the data. You can turn off the log scale manually at any time; it will then not be shown automatically again until you refresh the page.
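
The idea of detecting a spike that dwarfs the rest of the series can be sketched as below. The rule used here (largest value at least ten times the median) and its threshold are hypothetical; the page does not state which criterion the tool applies.

```python
from statistics import median


def should_use_log_scale(daily_views, ratio_threshold=10):
    """Guess whether a spike dwarfs the rest of the series (hypothetical rule)."""
    typical = median(daily_views)
    if typical <= 0:
        return False
    return max(daily_views) / typical >= ratio_threshold


quiet = [900, 1000, 1100, 950, 1050]
spike = [900, 1000, 250000, 950, 1050]
print(should_use_log_scale(quiet))  # False
print(should_use_log_scale(spike))  # True
```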

Totals

Below the chart you will see the totals for each page. This includes the total page views during the given date range, and the average daily page views. Depending on the tool, additional links will be shown. For Pageviews, the "All languages" link will bring you to the Langviews tool, where you can get page views data on that page for each language within the given project. "Redirects" will show pageviews for the page and all of its redirects. In the Siteviews tool, the "Most viewed pages" link will open the Topviews tool for the given project, showing the most viewed articles.

Topviews

Topviews shows you the most viewed pages of a given project. The data can be shown on a monthly or daily basis, and filtered by platform.

False positives

Some pages may have inexplicably high view counts. These are likely false positives, and unfortunately they are unavoidable. They could surface because someone used an automated program to scrape the page, for instance. One tactic to identify false positives is to compare desktop views with mobile web views. Mobile web views should be comparable to, if not higher than, desktop views, so if mobile web views are very low, the page might be a false positive.
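
The desktop versus mobile web comparison can be expressed as a simple check. The function name and the 10% cut-off are assumptions chosen for illustration; treat the result as a hint for further analysis rather than a verdict.

```python
def looks_like_false_positive(desktop_views, mobile_web_views,
                              min_mobile_share=0.1):
    """Flag pages whose mobile web share of views is suspiciously low."""
    total = desktop_views + mobile_web_views
    if total == 0:
        return False
    return mobile_web_views / total < min_mobile_share


print(looks_like_false_positive(50000, 200))  # True
print(looks_like_false_positive(4000, 6000))  # False
```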

You can also open a page in the Pageviews app for further analysis. Just click on the view count on the right side.

Excluded pages

You can hover over an entry and click the ✖ to remove it from view, which is useful for hiding false positives. By default, all non-mainspace pages are hidden.

Massviews

Massviews allows you to import a list of pages and analyze their page views. For performance reasons, you can only import up to 20,000 pages. There are several methods to import pages:

  • Category – Import all the pages that are in a given category. To use, enter the full wiki URL, such as https://en.wikipedia.org/wiki/Category:Folk_musicians_from_New_York. You can also toggle to show the subject page rather than talk pages. This is most useful for categories generated by WikiProject banners, for instance FA-class New York City articles.
  • Wikilinks – Enter the full URL of any page, and all of the links to other pages on that page will be processed. This is the easiest way to create and see pageviews of a list of articles you have.
  • PagePile – A tool that lets you create and save a list of pages. Use the PagePile tool to create the list, make note of the ID, and enter that ID in Massviews to process the pageviews statistics.
  • Subpages – Enter the full URL of any page, and the target and all of its subpages will be processed. This is useful for instance to get total pageviews for all of your userspace, or for all chapters of a Wikibook.
  • Template – Show page view stats on pages that transclude the given template. Use the full wiki URL of the template, such as https://en.wikipedia.org/wiki/Template:Infobox_Olympic_games.
  • External link – Show page view stats on pages that contain the given external link. You can use a pattern to search for multiple similar links. See the MediaWiki documentation for more information.
  • Hashtag – Show page view stats on pages that have edits containing the given hashtag in the edit summary. This feature is primarily for outreach purposes. The hashtag search functionality is made possible by Wikipedia social search. See the documentation for more information.
  • Quarry – Show page view stats on pages returned by a SQL query in Quarry. Use the numeric ID of the Quarry dataset. The pages should be in the "page_title" column.
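
Several of these import methods take a full wiki URL. As a sketch of what such a URL carries, the hypothetical helper below splits it into the project domain and the page title; it assumes the common /wiki/ path form shown in the examples above and is not taken from the Massviews source.

```python
from urllib.parse import unquote, urlsplit


def split_wiki_url(url):
    """Split a full wiki URL into (project domain, page title)."""
    parts = urlsplit(url)
    prefix = "/wiki/"
    if not parts.path.startswith(prefix):
        raise ValueError(f"not a /wiki/ page URL: {url}")
    # Decode percent-encoded characters in the title.
    title = unquote(parts.path[len(prefix):])
    return parts.netloc, title


project, title = split_wiki_url(
    "https://en.wikipedia.org/wiki/Category:Folk_musicians_from_New_York")
print(project)  # en.wikipedia.org
print(title)    # Category:Folk_musicians_from_New_York
```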

Feedback / bug reports

You can provide feedback or report an issue using any of the following:

  • Talk:Pageviews Analysis – This is the on-wiki home for feedback. Bug reports may be migrated to Phabricator.
  • Phabricator – The primary issue tracker. If you create a new task, please tag it with "Tool-Labs-tools-Pageviews" so we can find it. "Pageviews-API" is not for the Pageviews tool.
  • GitHub – GitHub is where the code is hosted. Pull requests and other technical advice are welcomed with open arms.