Talk:Pageviews Analysis/Archives/2023/2

From Meta, a Wikimedia project coordination wiki

Mediaviews analysis on toolforge

The Wikimedia file information page offers a Mediaviews analysis on Toolforge. This video consistently gets over 1,000 requests every day. The Wikipedia pages where the video is embedded are accessed much less often. How does this tool count?

https://pageviews.wmcloud.org/mediaviews/?project=commons.wikimedia.org&platform=&agent=user&referer=all-referers&range=latest-20&files=Die_Globen_von_Peter_Anich.ogg Presseteam Uni Innsbruck (talk) 07:41, 22 February 2023 (UTC)

@Presseteam Uni Innsbruck It counts actual plays, and not simply views to a page that has the video on it (such as a Wikipedia article or the File page on Commons). wikitech:Analytics/Data Lake/Traffic/Mediacounts has more information on this metric. Best, MusikAnimal (WMF) (talk) 20:11, 11 April 2023 (UTC)
Thanks for the ref! Presseteam Uni Innsbruck (talk) 09:22, 18 April 2023 (UTC)

Inconsistent values for page

I am collecting pageviews data for different endangered species through Wikimedia's REST API and came to the Pageviews Analysis site to double-check the results I was getting. I am using Google Chrome version 111.0.5563.65.

Searching for the monthly pageviews from 2016 to 2020 for the scientific name of the tiger, Panthera tigris, returns a peak of 4,000+ views around May-Jul 2016. I have enabled the option to include redirects both in the settings and in the search query.

https://pageviews.wmcloud.org/pageviews/?project=en.wikipedia.org&platform=all-access&agent=user&redirects=1&start=2016-01&end=2020-12&pages=Panthera_tigris

However, when I search for Tiger, there is a peak of 700,000 views in Mar 2020.

https://pageviews.wmcloud.org/pageviews/?project=en.wikipedia.org&platform=all-access&agent=user&redirects=1&start=2016-01&end=2020-12&pages=Tiger

These should show the same result, no? Panthera tigris redirects to Tiger, after all. Most significantly, however, when you search for the two terms together, Panthera tigris suddenly returns nearly the same number of pageviews and the same peaks as Tiger, which it did not do previously.

https://pageviews.wmcloud.org/pageviews/?project=en.wikipedia.org&platform=all-access&agent=user&redirects=1&start=2016-01&end=2020-12&pages=Tiger%7CPanthera_tigris

What is the reason for this difference? I could understand the fact that Panthera tigris and Tiger are technically two different pages, one of which results in a redirect to Tiger, but why is it that when they are queried together, they are the same? This seems to point to inconsistencies in the data, or in the ways in which it is being visualized.

Reddalisa (talk) 13:16, 28 March 2023 (UTC)

Hi,
if you want the original log data for pageviews, you can use https://dumps.wikimedia.org/other/pageview_complete/monthly/ or the data in DKMF for the years 2016, 2017, 2018, 2019, and 2020. Dušan Kreheľ (talk) 21:28, 29 March 2023 (UTC)
@Reddalisa See https://pageviews.wmcloud.org/pageviews/faq/#redirects. The pageviews pipeline will register the pageview for the redirect and not the target page, which (as odd as it may sound) is expected behaviour. The reason why they look similar if you compare side by side is probably because you have the "Include redirects" option enabled at the bottom-left. Here are the results with that option disabled, and as you can see the redirect has significantly fewer pageviews. Hope this helps, MusikAnimal (WMF) (talk) 20:09, 11 April 2023 (UTC)
@Reddalisa: Here is an example from the pageviews logs, where sk:Krehel is a redirect to sk:Kreheľ:
User pageviews in 2022 (source)
Krehel: 2
Kreheľ: 139

--Dušan Kreheľ (talk) 22:08, 29 March 2023 (UTC)
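
For anyone reproducing the comparison in this thread through the REST API, here is a minimal sketch in Python. It assumes the public per-article pageviews endpoint on wikimedia.org and assumes that "Include redirects" amounts to adding each redirect's count to the target's count. The helper names `per_article_url` and `combine_with_redirects` are hypothetical and are not part of the Pageviews Analysis tool. The numbers in the usage example are the sk:Krehel and sk:Kreheľ figures quoted above.

```python
from urllib.parse import quote

# Assumed base of the Wikimedia REST API per-article pageviews endpoint.
API_ROOT = "https://wikimedia.org/api/rest_v1/metrics/pageviews/per-article"


def per_article_url(project, article, start, end,
                    access="all-access", agent="user", granularity="monthly"):
    """Build the request URL for one title (hypothetical helper).

    start and end are YYYYMMDD strings. A redirect title and its target
    are separate titles, so each one needs its own request.
    """
    title = quote(article.replace(" ", "_"), safe="")
    return f"{API_ROOT}/{project}/{access}/{agent}/{title}/{granularity}/{start}/{end}"


def combine_with_redirects(target_views, redirect_views_list):
    """Add the views of each redirect to the views of the target, per period.

    Each argument maps a period label to a view count. This mirrors the
    assumed behaviour of "Include redirects"; it is not the tool's code.
    """
    combined = dict(target_views)
    for series in redirect_views_list:
        for period, views in series.items():
            combined[period] = combined.get(period, 0) + views
    return combined


# Usage with the 2022 figures from the example above.
total = combine_with_redirects({"2022": 139}, [{"2022": 2}])
print(total)  # {'2022': 141}
print(per_article_url("en.wikipedia.org", "Panthera tigris", "20160101", "20201231"))
```

Requesting Tiger and Panthera_tigris separately and comparing them before and after combining should show whether a difference comes from the redirect handling or from the data itself.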

Good day. My new articles no longer appear in the list of articles on the article analysis page. It keeps saying "An error occurred while processing pageviews api - not found". What am I doing wrong? Link to the page: https://pageviews.wmcloud.org/userviews/?project=uk.wikipedia.org&platform=all-access&agent=user&namespace=0&redirects=0&start=2023-03-20&end=2023-04-09&sort=views&direction=1&view=list&user=Genamb Genamb (talk) 16:39, 16 April 2023 (UTC)

@Genamb: For the current UTC day, the data will be released when it ends. Dušan Kreheľ (talk) 17:11, 16 April 2023 (UTC)
@Genamb In this case it means that there were no pageviews for the specified date range. I need to fix this so that it shows zeros instead of an error. Unfortunately, fixing it is not as simple as it may seem. Sorry for the confusion! MusikAnimal (WMF) (talk) 04:14, 20 April 2023 (UTC)
@MusikAnimal (WMF): He wanted pageviews for a page that had only just been created. From that point of view, I think it should remain as it is. The day is not over yet, so we have no statistics. So the error is fine. Dušan Kreheľ (talk) 08:10, 20 April 2023 (UTC)
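
The "zeros instead of an error" idea from this exchange can be sketched on the client side as follows. This is only an illustration, not the tool's code. It assumes the per-article API returns items with a `timestamp` of the form YYYYMMDDHH and a `views` count, and that a "not found" response is treated by the caller as an empty item list. The function name `zero_filled_daily` is hypothetical.

```python
from datetime import date, timedelta


def zero_filled_daily(start, end, items):
    """Return {YYYYMMDD: views} for every day from start to end inclusive.

    items is the (possibly empty) list of API items. Days with no item
    get 0, so an empty list yields a series of zeros rather than an error.
    """
    by_day = {item["timestamp"][:8]: item["views"] for item in items}
    series = {}
    day = start
    while day <= end:
        key = day.strftime("%Y%m%d")
        series[key] = by_day.get(key, 0)
        day += timedelta(days=1)
    return series


# Usage: a "not found" response is passed in as an empty list.
print(zero_filled_daily(date(2023, 4, 7), date(2023, 4, 9), []))
```

Note that this cannot distinguish "no views" from "data for the day not yet published", which is the concern raised in the last reply above.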

Userviews Analysis: missing page

Hallo! My Userviews-Analysis page for itwiki should also list the article it:Ahmad Mamduhi, because I am the initiator and hitherto only contributor of that article. (Reply in: de, en, eo, es, it) — Super nabla (talk) 08:09, 26 May 2023 (UTC)

@Super nabla you created this article just yesterday, it seems like page view data from May 24 onwards is not yet available. Johannnes89 (talk) 11:37, 26 May 2023 (UTC)
Comment: Alright, thanks @Johannnes89. Super nabla (talk) 11:43, 26 May 2023 (UTC)

Median and average number of views

Pageviews displays both the median and the average number of views in some cases, but only the average in others. Are there criteria that need to be met for the median to display? Is there a reason for not displaying the median as well as the average in all cases? Nurg (talk) 00:26, 17 June 2023 (UTC)

@Nurg It only shows if it detects a "spike" in the graph, in which case the median is more meaningful. I don't see any reason to show it all the time, though, or at least when the "logarithmic scale" option is chosen. MusikAnimal (WMF) (talk) 22:07, 18 June 2023 (UTC)
Thanks @MusikAnimal (WMF). Is there a benefit to not always showing the median? Would it place too much load on servers, for example? I would think that if the median were always shown, the logic that decides whether or not to show it could be removed, reducing the amount of code executed and the load. I generally don't switch to logarithmic scale, as the default arithmetic scale shows what I am interested in. But when there is a small spike that does not reach the threshold for showing the median, we miss out on that moderately useful info. Personally I see numerous examples in the world at large where the mean is given when the median would be better. At least here we get both when it matters a lot, but I think it would be better to always show both, unless there is a real downside to doing so. Nurg (talk) 01:51, 19 June 2023 (UTC)
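
To illustrate why the median is more informative when a series contains a spike, here is a small Python sketch. The spike rule (mean at least twice the median) and the name `summarize` are assumptions made for illustration only. The thread does not state how Pageviews Analysis actually detects a spike, and the view counts are invented.

```python
from statistics import mean, median


def summarize(views, spike_ratio=2.0):
    """Return the mean, the median, and a hypothetical "show median" flag.

    The flag is set when the mean is at least spike_ratio times the
    median, which is an assumed stand-in for the tool's spike detection.
    """
    avg = mean(views)
    med = median(views)
    show_median = med > 0 and avg / med >= spike_ratio
    return {"mean": avg, "median": med, "show_median": show_median}


# Invented daily counts: one day with a large spike, and a flat series.
spiky = summarize([100, 110, 90, 105, 5000])
flat = summarize([100, 110, 90, 105, 95])
print(spiky)
print(flat)
```

In the spiky series a single day pulls the mean far above the typical day, while the median stays close to it. In the flat series the two values coincide, so showing both adds little but also costs little to compute.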