Talk:PetScan

From Meta, a Wikimedia project coordination wiki

German instructions

Hi; please see w:de:Wikipedia:Technik/Labs/PetScan. Greetings --PerfektesChaos (talk) 19:22, 29 March 2016 (UTC)

operating manual in german?
To use the tool I visited PetScan/de, which suggested to me that there would be a German-language operating manual. A pity. Which language would I have to choose to get German? --2003:DE:3E1:CE01:4517:A3FB:2ADF:C616 19:59, 26 August 2017 (UTC)

Improvement suggestions

Some suggestions for things that would make this tool even more useful for some queries:

  1. An option to only select pages that are (not) subpages.
  2. Each line of the output could include a link to the CatCycle tool to find out how that page is in that category.
  3. A "Get the categories containing the individual page" option like Quick-Intersection has.
  4. If the page is a (hard) redirect then the link in the output should go to the redirect page (rather than follow the link to its target).
  5. The ability to select pages (in particular namespaces) that are in no categories (e.g. any pages in the Template namespace that are not in Category:Wikipedia_templates - this always returns zero results).

DexDor (talk) 06:59, 31 March 2016 (UTC)

Feature requests

  1. Generate reports on all pages linked from an arbitrary page
  2. Collect and report pageview metrics

Right now PetScan presents reports based on categories. A problem with this is that this creates incentive for organizations to game categories in a way that conflicts with Wikimedia community volunteers. Organizations, including every organization that has a Wikipedian-in-residence or a partnership with a Wikimedia chapter, want reports of this sort. If reports can only be generated through categories, then that creates pressure to adapt the public category system to reflect financial pressures from organizations. This in turn positions all outreach projects including GLAM, Wikimedia chapters, and all institutional partnerships to be against Wikimedia community processes. To fix this, then as an alternative to generating reports through categories, also give an option to generate reports from pages which are private lists of links in userpages.

Going further - this tool already does a great job at listing all Wikipedia pages in categories. Given a list of articles, it would be extremely useful to be able to get metrics on those Wikipedia articles. Steiner's Wikipedia Tools for Google Chrome already does everything PetScan does but in Google Sheets, but still, that tool is data overkill and it is hard for people without good spreadsheet skills to get only the right amount of data. If this PetScan could be combined with the output of en:User:Vipul's WikipediaViews.org then this would be immensely useful to developing institutional partnerships with the Wikipedia community.

Blue Rasberry (talk) 13:56, 17 April 2016 (UTC)

new pages

The option "Only pages created during the above time window (overrides "last revision")" leads to a crash; no result is produced. Игорь Темиров (talk) 06:30, 21 April 2016 (UTC)

Manual localizations

The "Manual" link in the tool opens PetScan/en, which is fine. But if "Interface language" in the tool is changed, e.g. to "ru", then "Manual" opens a dead page, PetScan/ru. I think redirects could be made from all language subpages to the main PetScan page or to this English page. Alternatively, internationalization templates could be set up, like those at the top of Wikidata/Development etc. --Vladis13 (talk) 10:12, 26 May 2016 (UTC)

Add free text search

Would it be possible to cross-search on a category and a piece of text on all articles in that category? For instance: have PetScan search all articles in en:Category:American racehorses for the word "California" and produce a list of those articles. I see no way to do that as it is. Gorthian (talk) 05:00, 5 June 2016 (UTC)

Does not work for me

I don't get it, and my tests fail. I see (say) three tabs "cats, props, templates", but it is unclear whether they work "and" or "sole". I cannot even replicate the regular "this template's transclusion pages" [WLH] ("do it!" result: 0; WP result: 2500). -DePiep (talk) 19:10, 7 June 2016 (UTC)

User:Magnus Manske, I may be having the same problem as User:DePiep. I'm getting "0 results" on this query, which I believe is identical to the one that worked for me last week. I want the list of pages that are in w:en:Category:Unknown-importance_medicine_articles, but not in w:en:Category:Unassessed medicine articles. There should be hundreds (maybe ~1,000) pages in the results. WhatamIdoing (talk) 01:45, 23 July 2016 (UTC)
I don't understand what DePiep is even trying to do; an example would be nice. As for WhatamIdoing, your query looks for articles, but the categories contain talk pages. Extend your query to talk pages (on "Page properties"), and it works as expected. --Magnus Manske (talk) 14:39, 23 July 2016 (UTC)
Thanks. I've saved a copy of this link. WhatamIdoing (talk) 06:44, 24 July 2016 (UTC)
  • In regular enwiki, I can create a 'What links here' list for a template page. It lists transclusions and links for that page. However, with PetScan I can not create such a list. -DePiep (talk) 16:39, 25 July 2016 (UTC)

2 years old file returned even though max_age=96

This query returns me all pictures uploaded via the Android app in the last 4 days.

It works well, except I noticed this false positive which was uploaded in 2014 (but someone changed its categories yesterday).

Could someone add to the manual a description of how max_age works? And is there another keyword to get only files that have been modified in the last 4 days, excluding updates? Thanks! Syced (talk) 04:22, 16 June 2016 (UTC)

Add/Remove Statements on wikidata

I used Autolist 2 to add/remove statements on Wikidata items but cannot find how to do the same in PetScan. It's too complicated. I can generate a list of items by category but can't understand how to add/remove statements on the items in that list. Please let me know.--Nizil Shah (talk) 06:36, 2 August 2016 (UTC)

+1. I cannot figure out in which cases I have editing form, and in which I have only wikified list. For example, how should I edit PSID=121595? --Infovarius (talk) 23:13, 4 August 2016 (UTC)
Hello User:Nizil Shah, User:Infovarius, I was facing the same problem: I wanted to add a Wikidata property to already existing items in a PetScan result. User:Vesihiisi (thanks again) found the solution: On the "Other sources" tab, select "Wikidata" in the "Use wiki" section. It's set to "Automatic" by default, but "Wikidata" will make the editing form appear. What I am still looking for is a way to remove from the PetScan result those items which already have the property you want to add (e.g. if the PetScan result is a long list, but only a few items actually have to be changed). It seems that filtering with SPARQL (...FILTER NOT EXISTS { ?item wdt:P463 wd:Q299015 }....) does not work, because this selects ALL items, not only those from the PetScan result, so the query stops with a timeout error. --M2k~dewiki (talk) 12:40, 16 August 2016 (UTC)
User:Vesihiisi also had a solution to this problem: This one's kinda tricky -- I do these sorts of queries like this (autorun). I.e. I put the value in the "Uses items/props : None" field on the Wikidata tab, but I don't actually use P463 anywhere in the query... It means that any items that link to Q414163 from any property will be excluded, not only those that pair it with P463. It just happens to work great in this particular case --M2k~dewiki (talk) 13:47, 16 August 2016 (UTC)
Hi User:Nizil Shah, User:Infovarius, User:M2k~dewiki & User:Vesihiisi. I figured out a way to do this using a combination of Vesihiisi's methods and the SPARQL box. For instance, for the PSID here [1], it finds people in category "American accordionists," who DO have instrument = accordion but DO NOT have gender = male. In this example I don't use the "Uses items/props" box, only the SPARQL box under "Other sources" and enter SELECT ?item WHERE { ?item wdt:P1303 wd:Q79838 . MINUS { ?item wdt:P21 wd:Q6581097 } }. So the first part of the SPARQL is the property/item pair you want, and the second is what to exclude. Sweet kate (talk) 17:03, 24 October 2016 (UTC)
To have multiple conditions, you can string them together like this: SELECT ?item WHERE { ?item wdt:P1303 wd:Q79838 . ?item wdt:P31 wd:Q5 . MINUS { ?item wdt:P21 wd:Q6581097 } MINUS { ?item wdt:P1303 wd:Q5994 } }. Sweet kate (talk) 17:11, 24 October 2016 (UTC)
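The two queries quoted in this thread follow one pattern: each required property/value pair becomes a triple, and each excluded pair becomes a MINUS block. A minimal Python sketch that assembles such a query string for pasting into the SPARQL box; the function name `build_query` is hypothetical and is not part of PetScan.

```python
def build_query(required, excluded):
    """Build a SPARQL query selecting items that have every (property, value)
    pair in `required` and none of the pairs in `excluded`.

    Hypothetical helper for illustration; it only formats a string.
    """
    # One triple pattern per required pair, e.g. "?item wdt:P1303 wd:Q79838 ."
    triples = [f"?item wdt:{prop} wd:{value} ." for prop, value in required]
    # One MINUS block per excluded pair
    minus = [f"MINUS {{ ?item wdt:{prop} wd:{value} }}" for prop, value in excluded]
    return "SELECT ?item WHERE { " + " ".join(triples + minus) + " }"


# The multi-condition example from this thread
query = build_query(
    required=[("P1303", "Q79838"), ("P31", "Q5")],
    excluded=[("P21", "Q6581097"), ("P1303", "Q5994")],
)
print(query)
```

Keeping the pairs in lists makes it easy to add or drop a condition without hand-editing braces.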
How can qualifiers be added to a property with PetScan? Sweet kate, User:Infovarius, User:M2k~dewiki & User:Vesihiisi, help me. I have no technical knowledge.--Nizil Shah (talk) 06:58, 9 December 2016 (UTC)

Missing pages

As I see, PetScan should be able to return a list of missing pages (red links) if checkbox «Show only redlinks to main (article) namespace» has been checked at the «Output» tab. But returned results are just the same as without checking the box at all, e.g. pages returned are existing articles. Did I do something wrong?

Second, I should enter «be_x_old» code into the «Language» field to run on be-tarask.wiki. It's ok, but after each run the text in this window is replaced by «be x old» («_» replaced by « ») which is quite uncomfortable. --Renessaince (talk) 15:33, 5 August 2016 (UTC)

@Renessaince: I just had the same problem. Most likely it is because you did not click the "Show redlinks" option first. I do have one request for redlink searching: can the output include number of missing links as a column? This is how the Missing Topics tool used to behave and it's a really key thing to include when building these kinds of links (because a missing article with 100 links is of higher priority than one with, say, 20). Thanks. Sillyfolkboy (talk) 23:59, 29 August 2016 (UTC)
Got it now, thanks.
Now there is another issue about this tool: for w:be-tarask: it works only if I enter value «be_x_old» into the «Language» field. Neither «be-x-old» nor «be-tarask» fits for this tool, and using «be_x_old» looks inappropriate because of the wrongly generated links in the output, e. g. be_x_old instead of be-x-old. --Renessaince (talk) 08:41, 31 May 2017 (UTC)

Further question. In the generated list of missing articles there are two articles which actually exist: Пераклады Бібліі на беларускую мову and ВНУ Беларусі (the second one is a redirect). What's wrong with them? --Renessaince (talk) 13:34, 5 June 2017 (UTC)

Adding coordinates from templates

I was hoping that I could use PetScan to find pages that have coordinates but whose WD entries don't, and fill P625 with those values easily. The only option I found was to extract the coordinates from the template used and manually copy/paste from the PetScan result list. See [2]

If copy/paste is the only option, it would be good for PetScan to use a format that is accepted in WD directly. Currently I have to reformat manually for WD to accept the values.

I would love to do the same for Coats of Arms, location maps and other template fields of course.

Thanks. --Aeroid (talk) 06:56, 26 August 2016 (UTC)

Change statements

Is it possible to change statements using Petscan? --Epìdosis 12:59, 4 November 2016 (UTC)

has no claim

I cannot filter a list by "noclaim": https://petscan.wmflabs.org/?psid=590599. How to do it? --Infovarius (talk) 14:08, 14 November 2016 (UTC)

Default Namespaces

When calling with parameters from a link, the Namespaces default of article is not switched on. I also cannot see how to add a parameter to force just articles to be listed. This is causing problems with statistics pages at Wikivoyage. --Traveler100 (talk) 19:59, 7 December 2016 (UTC)

Labels not in English

How can labels of items be displayed in another language? I've tried changing the interface language and the wiki language on the first page, but in vain... --Infovarius (talk) 10:29, 12 December 2016 (UTC)

In the similar thread above, #Manual localizations, there has been no answer for almost a year. It seems the authors are not interested in localization. --Vladis13 (talk) 23:39, 3 January 2017 (UTC)

Can anyone write out the steps to find items whose label is not available in the ml language? E.g. Category:American feminist writers.

I would like to get the English names of items that have no label in Malayalam. --Akbarali (talk) 06:11, 30 August 2018 (UTC)

Wikidata + Sitelink - Template

I am maybe being dim here, but I can't seem to do the following. What I want is to combine:

To get the first it seems I can't just use the Wikidata tab (which is only a filter?) but I need to write a query. Which is fine -- I can even specialise to extract items which have one and only one P1367.

The second seems to be done automatically, whether I want it or not. (Actually it's probably easy enough to turn off or change with the right checkboxes on the "Other sources" tab.)

The third I am having more trouble with. I can generate a list of pages which *do* have the template easily enough, using the Templates tab. But I can't seem to use the tab to filter away pages which *don't* have the template.

What's the right way to do this? Jheald (talk) 12:56, 22 February 2017 (UTC)

  • I now know how to do this. Some things I've learned:
    1. Template results can be excluded by saving them as a Pagepile, then using "Sparql NOT Pagepile" in 'Other sources' -- 'Combination' to exclude them. ("Sparql NOT Templates" doesn't work, because "Templates" isn't understood as an input source). So if this is what you need, first use the 'Templates' screen to produce a list of all pages that do have the template, save it as a Pagepile, then use 'Combination' to exclude it.
    2. It is important that the SPARQL query does not include the underscore character, particularly not in variable names. Such queries will successfully run once, but when PetScan stores them when you switch to another screen, it turns all the underscores into spaces, and the query then no longer works. So "?membershipStmt" is an okay name for a variable; but "?membership_Stmt" is not.
    3. It is important to use DISTINCT in the SELECT statement in the query. If there are two hits to the same item, they are not merged, but Petscan only finds a matching wiki page once -- the other hit is returned unlinked.
    4. Choose 'Use wiki' = 'From categories' to get the output as a list of Wiki pages (suitable eg to put into AWB), otherwise the list will be of Wikidata items. This works (and is needed) even if you have made no other use of the categories screen -- it still specifies the reference wiki.
-- Jheald (talk) 12:27, 20 January 2018 (UTC)
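Points 2 and 3 in the list above can be checked mechanically before a query is pasted into PetScan. The following is a minimal Python sketch, not part of PetScan; the helper names `check_query` and `strip_underscores` are hypothetical, and the regular expressions only cover simple queries such as the ones discussed on this page.

```python
import re


def check_query(sparql):
    """Return a list of warnings for the two pitfalls described above."""
    warnings = []
    # Point 2: underscores are turned into spaces when the query is stored
    if "_" in sparql:
        names = re.findall(r"\?\w*_\w*", sparql)
        warnings.append(f"underscore found (variables affected: {names})")
    # Point 3: without DISTINCT, duplicate hits are not merged
    if not re.search(r"\bSELECT\s+DISTINCT\b", sparql, flags=re.IGNORECASE):
        warnings.append("SELECT without DISTINCT")
    return warnings


def strip_underscores(sparql):
    """Remove underscores from variable names only,
    e.g. ?membership_Stmt becomes ?membershipStmt."""
    return re.sub(r"\?\w+", lambda m: m.group(0).replace("_", ""), sparql)


risky = "SELECT ?item WHERE { ?item p:P463 ?membership_Stmt }"
for warning in check_query(risky):
    print(warning)
print(strip_underscores(risky))
```

The rename only touches tokens that start with `?`, so prefixed names and string literals are left alone; an underscore elsewhere in the query still triggers the first warning.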

Multiple projects ?

Hi,

Is it possible to cross categories from multiple projects ? For instance s:en:Category:Authors and s:fr:Catégorie:Auteurs.

Reasoning: as PetScan is often used to import data to Wikidata, it could be useful to check the consistency of the data across different projects beforehand, in order to avoid importing contradictory data into Wikidata.

Cdlt, VIGNERON * discut. 15:49, 28 February 2017 (UTC)

Anniversaries

I want to make a list of people, connected with Ukraine, who celebrate anniversaries.

For that I have to make a lot of PetScan requests (10 for every century) from the Ukrainian Wikipedia like:

Українці Ukrainian people
Народились 6 березня March 6 births
Народились 1907 1907 births
Українці Ukrainian people
Народились 6 березня March 6 births
Народились 1917 1917 births
Українці Ukrainian people
Народились 6 березня March 6 births
Народились 1927 1927 births
Українці Ukrainian people
Народились 6 березня March 6 births
Народились 1937 1937 births

Instead of many requests I would like to make one like

Українці Ukrainian people
Народились 6 березня March 6 births
Народились 1907 .or. Народились 1917 .or. Народились 1927 .or. Народились 1937 .or. Народились 1947 … 1907 births .or. 1917 births .or. 1927 births .or. 1937 births .or. 1947 births …

What can be recommended? Probably data from Wikidata could be used? If Yes, then How?

--Perohanych (talk) 07:57, 6 March 2017 (UTC) P.S. I am aware that in English and in German Wikipedias there are no categories like March 6 births, but in the Ukrainian Wikipedia we do have such categories.

You can get this directly from Wikidata, provided there are items for the people and they have a birthday and nationality set. Query is here; it will automatically use the current year, let me know if you need other years. I tried a mixed Wikipedia/Wikidata PetScan query but it does not return any results. I guess the Wikidata birthdays are incomplete. This is a list of Ukrainian people without birthdays on Wikidata. --Magnus Manske (talk) 10:01, 7 June 2017 (UTC)
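The year categories in the request above follow a fixed ten-year step, so the list can be generated rather than typed. A small Python sketch under stated assumptions: the reference year 2017 is taken from the date of the post, the function names are hypothetical, and how the resulting categories are combined inside PetScan is not covered here.

```python
def anniversary_years(current_year, earliest, step=10):
    """Years before `current_year`, not earlier than `earliest`, that lie a
    whole multiple of `step` years back from `current_year`."""
    first = current_year - ((current_year - earliest) // step) * step
    return list(range(first, current_year, step))


def birth_categories(years):
    # "Народились <year>" is the Ukrainian Wikipedia category pattern quoted above
    return [f"Народились {year}" for year in years]


years = anniversary_years(2017, earliest=1907)  # 2017 is an assumed reference year
for name in birth_categories(years):
    print(name)
```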

Categories

Hi,

I am having a problem: when I put more than one category into Categories:Categories, I get zero results. I follow the manual, which says one per line (e.g. Towns in Kladno County). Am I doing something wrong? When I enter just one category it works, but when I enter more it doesn't.--Juandev (talk) 19:10, 18 April 2017 (UTC)

Perhaps you selected Combination: Subset instead of Combination: Union? --FriedhelmW (talk) 20:08, 18 April 2017 (UTC)

Cool, thx. Now it works.--Juandev (talk) 12:47, 19 April 2017 (UTC)

I tried several times to get a list by using a category, but it does not work. Can anyone help me? I need to get "List of schools in the United Arab Emirates". I entered this (List of schools in the United Arab Emirates) in the Categories box. The Wikipedia link is as follows: https://en.wikipedia.org/wiki/List_of_schools_in_the_United_Arab_Emirates --Akbarali (talk) 02:57, 7 September 2017 (UTC)

The term you entered into the Categories box is not a category, it is a Wikipedia article. Go to the "Templates&links" tab and put it into the field "Linked from / All of these pages", and it will work. OMHalck (talk) 11:43, 28 November 2017 (UTC)

Not updating

Results do not appear to be updating. Some queries have been returning the same results for the last 2 days even though they should have changed. --Traveler100 (talk) 08:22, 29 April 2017 (UTC)

As bug reports go, this is indeed better than "stuff doesn't work", but not significantly so. --Magnus Manske (talk) 10:02, 7 June 2017 (UTC)

Creator

When creating new items, PetScan doesn't fill labels... It's a step back from http://tools.wmflabs.org/wikidata-todo/creator.html. --Infovarius (talk) 08:58, 21 June 2017 (UTC)

Orphans

Is it possible to find orphan articles in a category? For example, articles from Spanish Wikipedia in the "Matemáticas" category with level 4 without articles linking to them. The problem is that the Spanish template for "orphan" is going to be deleted on Spanish Wikipedia, and I want (if possible) a similar method for doing this. Thanks, Juan Mayordomo (talk) 17:11, 10 July 2017 (UTC)

Modules used

At the moment when you do a search on petscan, you can check for the use of a template. Would be great to implement this same behavior with Modules. --Zackmann08 (talk) 21:24, 5 September 2017 (UTC)

Article and talk page

How could I look for articles belonging to a category whose talk pages don't belong to another category? I need this query to list all articles belonging to a Portal whose talk pages don't have any associated wikiproject assessments. Any help will be appreciated. Djiboun (talk) 22:04, 26 September 2017 (UTC)

Suggested feature

The ability to tell PetScan to ignore the contents of certain templates, e.g. links in NavBoxes, would be a good feature to have. As you can probably guess, my results are being polluted by NavBox links. The reverse would be good too: to only scan inside a specified template and ignore the contents of the rest of the article. - X201 (talk) 08:14, 20 October 2017 (UTC)

Gallery output

Where can I find the gallery output from Catscan? Do I have to use a different tool now? --Ailura (talk) 18:53, 26 November 2017 (UTC)

Wishlist item: Sorting by number of languages

Thanks for a truly awesome tool! If I were to have one wish for further functionality, it would be the ability to sort the results by the number of language links from each article, ie the size of the language list in the sidebar when viewing Wikipedia on a desktop. Like the number of incoming links and the size of the article (which are among the current sort criteria), this would be a useful proxy for the importance of a topic, but with an added weight on how internationally known a subject is. OMHalck (talk) 11:51, 28 November 2017 (UTC)

Parameters to call

So this is almost what I want

call PetScan

But how do I get Combination to be Union and the page property Namespace tick switched on (articles only)? --Traveler100 (talk) 20:37, 17 June 2018 (UTC)

Using magic words?

I have found more than a few pages for different organisms where the talk page is a redirect and the article is not. I thought to run a PetScan query to search for articles including various taxonomy templates which have talk pages that include #REDIRECT. However, I can't seem to add magic words anywhere. Is there a way to do this? --NessieVL (talk) 18:53, 16 August 2018 (UTC)

How do I find new pages only?

Can anybody tell me how I can find only new pages with this tool? Ticking "Only pages created during the above time window" does not in fact work for that. Ymnes (talk) 19:00, 18 August 2018 (UTC)

out of order

PetScan hasn't worked for at least a week. There's only a white page without any content. Does anybody know why? Many thanks, best regards, Aspiriniks (talk) 11:01, 10 November 2018 (UTC)

It's working fine for me. Nihlus 11:09, 10 November 2018 (UTC)
Maybe depending on the browser? In my case: Iceweasel, which is nearly identical with Firefox. -- Aspiriniks (talk) 11:26, 10 November 2018 (UTC)

I'm trying it for the first time, in Firefox, and either I'm doing something wrong or it isn't working for me. To take a simple example: https://petscan.wmflabs.org/?psid=6710931 isn't finding Commons:Alaska_Theatre_of_Sensations,_A-Y-P,_1909.jpg (or anything else).

Would whoever responds to this please ping me, since I don't keep a watchlist on Meta? Thanks in advance. - Jmabel (talk) 05:51, 7 December 2018 (UTC)

be-tarask

"be-tarask" does not work --Чаховіч Уладзіслаў (talk) 19:39, 9 December 2018 (UTC)

Limit output?

Hi, is there a way to limit/paginate the output? If I try to search for images in certain categories and select "Thumbnail" output, I encounter performance problems with big result sets (file count > 1000). Paginating the output may be a solution, but I didn't find a way to do this. --Fl.schmitt (talk) 11:25, 17 December 2018 (UTC)

@Fl.schmitt: Yes, I think there is a way. In the "Output" tab select "limit"=100. It will show the first 100 images. If you want the last 100, select "Sort order"=descending. That way you can only reach 200 images. It would be nice if there were an "offset" parameter to select 100 images starting at a given image (e.g. limit=100, offset=100 would select images 101 to 200). Perhaps @Magnus Manske: can develop this functionality? --JotaCartas (talk) 19:59, 31 May 2019 (UTC)
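To make the limit, sort order, and offset arithmetic concrete, here is a small Python sketch that pages through a plain list. It is an illustration only: `page` is a hypothetical helper, `limit` and `descending` stand in for the "limit" and "Sort order" settings mentioned in this thread, and `offset` is the suggested setting, which the thread does not confirm exists in PetScan.

```python
def page(results, limit, offset=0, descending=False):
    """Return one page of `results`.

    `offset` models the suggested (hypothetical) setting; it skips that many
    entries before `limit` entries are taken.
    """
    ordered = list(reversed(results)) if descending else list(results)
    return ordered[offset:offset + limit]


images = [f"File {n}" for n in range(1, 1001)]  # stand-in for 1000 results

first = page(images, limit=100)                  # File 1 to File 100
last = page(images, limit=100, descending=True)  # File 1000 down to File 901
second = page(images, limit=100, offset=100)     # File 101 to File 200
print(first[0], first[-1], last[0], last[-1], second[0], second[-1])
```

With only limit and sort order, the 800 entries between the first and last page stay out of reach, which is the gap an offset would close.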

API

Powerful tool. Does it have any API with documentation, so I can pass from my tool a list of 90 000 articles and filter it?--Alex Blokha (talk) 21:55, 13 March 2019 (UTC)

Variables

It would be nice if it worked with variables, if possible, like Google's * (asterisk), which stands for any string.--Juandev (talk) 16:40, 16 April 2019 (UTC)

Previously working query now returning 0 results

It seems like starting ~10 minutes ago, all my queries have been (promptly) returning 0 results. For example here is a simple query that keeps all settings at their default and just searches for pages in the category "Individual eagles". It should return 6 results (and was doing so earlier today), but now it's returning 0. I'm wondering if this is some temporary server issue? Are others seeing the same thing? Colin M (talk) 17:47, 3 June 2019 (UTC)

@Colin M: Yes, I am having the same problem, but after 3 or 4 attempts I get the correct result. --JotaCartas (talk) 17:58, 3 June 2019 (UTC)
Fixed Ah, cool, it's working again for me too. I guess it was just gremlins. Colin M (talk) 18:05, 3 June 2019 (UTC)

Zero results

PetScan used to work flawlessly, but now returns no results for me, similar to the situation described in the comments directly above me. I thought the situation might resolve itself like it did for Colin M, but it has not. Any insight into why this might be happening?--MainlyTwelve (talk) 16:03, 13 June 2019 (UTC)

Yes, the problem has been getting worse from day to day. At the moment I launch PetScan ... I make some attempts and I will go treat another subject for 5 minutes ... I return ... some more tries ... and I leave again. Sometimes it's only after 30 minutes that I get the result. --JotaCartas (talk) 17:14, 13 June 2019 (UTC)
@MainlyTwelve: , please read ... Some tools on Toolforge may break on or after 3 June because of database changes. Maintainers should update their tools to use the new schema ... in Commons:Commons:Village pump/Technical#Tech News: 2019-24 --JotaCartas (talk) 19:36, 13 June 2019 (UTC)
@JotaCartas: Thank you! I will read it now.--MainlyTwelve (talk) 19:41, 13 June 2019 (UTC)
@JotaCartas:Forgive my ignorance, does that mean we're waiting on the Maintainers? Is there anything I can do to help?--MainlyTwelve (talk) 19:43, 13 June 2019 (UTC)
@MainlyTwelve: The maintainers are expert software developers who maintain sites like Toolforge, which hosts some tools (like PetScan) used in all the Wikipedias, so... we have to wait. --JotaCartas (talk) 20:41, 13 June 2019 (UTC)
@JotaCartas: Ah, I see. I will wait. Thanks again!--MainlyTwelve (talk) 20:44, 13 June 2019 (UTC)
@MainlyTwelve: Not at all. Here is the correct link to the Technical News where I read the report of the problem: Commons:Commons:Village pump/Technical#Tech News: 2019-23. Regards --JotaCartas (talk) 20:57, 13 June 2019 (UTC)
Any idea when this will be fixed? --Traveler100 (talk) 08:26, 12 July 2019 (UTC)