PetScan
PetScan is a powerful tool for generating lists. The criteria for the desired list are specified in the PetScan form. See also the brains behind this.
Introduction
PetScan is a tool that lets users generate lists of pages from Wikipedia (and its sister projects) or of Wikidata items that match given criteria, and mine and analyze data from Wikimedia projects. A search can target all pages in a given category, or all items that have a given property. PetScan can also combine temporary lists (here: "sources") in various ways to create new lists. The sources include:
Wiki(m/p)edia pages
The search criteria are defined in the tabs "Categories", "Page properties", and "Templates&links". For example, you can search for articles in a given category tree, by a given template or link, restricted to a given namespace, by the time of the last edit or of page creation, etc.
Other sources
In this tab, you can add sources, e.g. Wikidata SPARQL (WDQS) queries or PagePile lists. You can also specify how multiple sources are combined; by default, only their intersection (i.e. the pages that occur in all sources) is returned as the final result. You can also specify which wiki your list should refer to, for example if you combine Wikipedia and Wikidata results.
Wikidata
In this tab, you can annotate or filter your results further, e.g. return only those Wikidata items that have no statements. Using these filters converts your list to Wikidata.
Output
Here, you can set options for your list, e.g. the format (web page, wiki, PagePile, etc.). You can also filter your results further, e.g. with regular expressions applied to page or item titles. You can also replace the result list with a ranked list of missing topics ("redlinks").
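The default combination described under "Other sources" is a plain set intersection. The following is a minimal local sketch of that idea, not PetScan's implementation; the page titles and the `combine_sources` helper are made up for illustration.

```python
# Hypothetical page titles returned by two sources (illustrative data only).
category_pages = {"Helsinki", "Tampere", "Turku"}
sparql_pages = {"Tampere", "Turku", "Oulu"}


def combine_sources(sources):
    """Return the pages present in every source (the default combination)."""
    sources = [set(s) for s in sources]
    if not sources:
        return set()
    return set.intersection(*sources)


result = combine_sources([category_pages, sparql_pages])
print(sorted(result))
```

Only titles that appear in both sets survive, which is why adding a further source can only shrink a default result, never grow it.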
Defining a query
The fields that can be set in the query form are as follows:
| Field | Purpose | Default | Notes |
|---|---|---|---|
| Language | Select the language of the project, e.g. "en" for English and "de" for German. Select "commons" for Wikimedia Commons. | "en" | |
| Project | The Wikimedia project to search (wikipedia, wiktionary, wikiversity, etc.) | "wikipedia" | NOTE: If you choose "Commons", be sure to go to the "Page properties" tab and check the "File" namespace to get useful results. |
| Depth | How deep to descend into the category tree. With a value of 0, no subcategories are included. | "0" | |
| Categories | List of categories, one per line, without the "Category:" prefix. | Empty | Appending a vertical bar (`\|`) and a number to a category name sets the depth for that category tree, overriding the general depth setting. Specifying a category narrows the results to pages related to a particular topic. |
| Negative categories | List of categories as above. Only articles that do not belong to these categories are accepted into the results. | Empty | |
| Combination | How the above categories should be used. The options currently available are "subset" or "union". | "subset" | |
| Namespaces | The namespaces to use as potential pages | Articles | |
| Redirects | | Either | |
| Templates | Use only pages that have all, any, or none of the given templates. Enter one template per line, without the "template:" prefix. Each box may be qualified by selecting "Use talk page instead". | Empty | This option seems to be compatible only with templates defined in the "template:" namespace. It cannot be used with templates defined in the "User:" namespace, nor in the "Creator:" or "Institution:" namespaces that are used at Wikimedia Commons. |
| Linked from | | | |
| Last edit | Show pages whose last edit was or was not made by a bot, by an anonymous user, or is flagged | Either, either, either | |
| Last change | Date or time period of the last change on the page, in the format YYYYMMDDHHMMSS (shorter allowed) | | "Only pages created during the above time window" allows you to look for the first change instead |
| Size | File size, or size range, in bytes | Empty | Allows selection of articles whose files are greater than one cutoff and/or less than another cutoff |
| Links | Number, or range, of internal links on the page | Empty | Allows selection of articles with many or few links |
| Redlinks | | | |
| Top categories | Feature which is not yet available. | | |
| Sort | Feature which is not yet available, which would set sorting criteria for output. | | |
| Manual list | Accepts a list of (namespace-prefixed) page names or Wikidata items from the specified project | | The tricky part is specifying the project with the correct code (e.g. `wikidatawiki` for Wikidata). |
| Wikidata | Get the Wikidata item, if available. | | |
| Format | Output format of the search results. HTML: web pages; CSV: values in quotation marks, separated by commas; TSV: tab-separated values; WIKI: as a wikitable; PHP: as a PHP file; XML: as an XML file | | |
| Do it! | Hit this to run the submission you have defined. | | |
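Two of the field formats above are easy to get wrong by hand: the per-category depth override (a vertical bar and a number after the category name) and the YYYYMMDDHHMMSS timestamp. The sketch below shows one way to prepare such values locally. The helper names are hypothetical and the category name is only an example; nothing here calls PetScan.

```python
from datetime import datetime


def parse_category_line(line, default_depth=0):
    """Split 'Name|depth' into (name, depth); fall back to the general depth."""
    name, sep, depth = line.strip().rpartition("|")
    if sep and depth.strip().isdigit():
        return name.strip(), int(depth)
    return line.strip(), default_depth


def petscan_timestamp(moment):
    """Format a datetime as YYYYMMDDHHMMSS."""
    return moment.strftime("%Y%m%d%H%M%S")


print(parse_category_line("Lakes of Finland|3"))
print(parse_category_line("Lakes of Finland"))
print(petscan_timestamp(datetime(2015, 8, 16, 12, 0, 0)))
```

A line without a numeric suffix is returned unchanged with the general depth, which mirrors the behaviour described in the "Categories" row.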
Know-how
PetScan ID (PSID)
As of 2016-04-04, every query that gets run in PetScan is recorded (anonymously!) and assigned a unique, stable, numeric identifier called PSID. You can use the PSID to
- run this PetScan query as an input in tools that support PSID (such as WD-FIST)
- fill in a "short URL": `https://petscan.wmflabs.org/?psid=PSID` will run the query with PSID, with all its settings
- expand programmatically on a previous query, by "overwriting" parameters: `https://petscan.wmflabs.org/?format=wiki&psid=PSID` will run the same query as before, but the output format will be wiki (instead of default HTML, or whatever was chosen originally).
Notes:
- Only the query will be stored, not its results!
- Large queries (e.g. with many manual items) will not be stored. In that case, no PSID will be shown.
- Results with an empty checkbox have possible matches within the Wikidata set.
- The interwiki link petscan: can be used to generate shortcuts for permanent queries, e.g. [[petscan:PSID]]
- Recorded queries are not deduplicated, so a new PSID is generated each time, unless an existing PSID is called without modification.
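The PSID short URL and the parameter "overwriting" described above can be assembled with the standard library. This is a small sketch: the base URL and the `psid` and `format` parameters come from the text above, while the `psid_url` helper and the PSID value 12345 are made up for illustration.

```python
from urllib.parse import urlencode

PETSCAN_BASE = "https://petscan.wmflabs.org/"


def psid_url(psid, **overrides):
    """Build a PetScan URL that reruns a stored query, optionally overriding parameters."""
    params = dict(overrides)
    params["psid"] = psid
    return PETSCAN_BASE + "?" + urlencode(params)


print(psid_url(12345))                 # rerun the stored query unchanged
print(psid_url(12345, format="wiki"))  # same query, wiki output
```

Because a modified query is stored as a new one, running the second URL can be expected to yield a new PSID, as noted in the list above.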
Create Wikidata items for Wikipedia articles that don't have one yet (Creator functionality)
- Set up a query that returns a list of Wikipedia (or other, non-Wikidata project) pages, or paste a list into "Other sources/Manual list"
- Under the "Page properties" tab, "Redirects=No" is now selected automatically; you can change it back if you really want redirects in your list!
- Under the "Wikidata" tab, select "Only pages without item" for the "Wikidata" option
- Run query
- Your results will have additional elements next to the "results" header (unless you are not logged into WiDaR, in which case you will see an appropriate link instead)
- All pages for which there is no exact match in any label or alias on Wikidata are checked by default.
- You can check/uncheck boxes manually now, if required.
- You can add default statements into the statements box, which will be added to all your new items. So, if you only create items for people, add `P31:Q5`. You can add multiple statements this way (one per line). Do note that P/Q must be in upper case; otherwise it will fail quietly.
- You can add default descriptions to new items, such as `Dde:"some description"` for a German description.
- Click the green "Start QS" button. This will open a new page.
- You can click "Run" to run a batch in your browser, or "Run in background" to run them from a Wikimedia server. See Help:QuickStatements for more information.
As of July 2020, "Run in background" has various bugs (for example, duplicated items may be created). Use frontend mode if possible!
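Since lower-case P/Q in the statements box fails quietly, it can help to check the lines before pasting them. The sketch below covers only the item-valued `P<number>:Q<number>` form shown above; whether the box accepts other value types is not stated here, so anything else is simply reported. The `check_statements` helper is hypothetical.

```python
import re

# Upper-case property, colon, upper-case item, e.g. P31:Q5.
STATEMENT = re.compile(r"P\d+:Q\d+")


def check_statements(text):
    """Return the non-empty lines that are not of the form 'P<number>:Q<number>'."""
    bad = []
    for line in text.splitlines():
        line = line.strip()
        if line and not STATEMENT.fullmatch(line):
            bad.append(line)
    return bad


print(check_statements("P31:Q5\np21:q6581097"))
```

An empty list means every line matched the upper-case pattern.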
Add/remove statements for Wikidata items
It is possible to add or remove statements for Wikidata items with PetScan. For this it is crucial that you choose "Wikidata" in "Other sources -> Use Wiki". Then you will see the command box next to the number and can continue as described in the previous section.
Referrer
(V2 only) If you open PetScan from another tool to let the user create a query, you can pass the referrer_url and referrer_name (defaults to referrer_url) parameters. referrer_url should have a {PSID} string which will be replaced with the PSID the user sees. Once a query was run, a box at the top of the page will prompt the user to return to the original tool, using the PSID-modified referrer_url.
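A calling tool could build the PetScan link and later fill in the PSID roughly as follows. This is a sketch under assumptions: `referrer_url`, `referrer_name`, and the `{PSID}` placeholder come from the paragraph above, but the helper names and the example.org address are hypothetical, and how PetScan itself decodes the parameters is not covered here.

```python
from urllib.parse import urlencode, urlsplit, parse_qs


def petscan_with_referrer(referrer_url, referrer_name=None):
    """Build a PetScan URL carrying the referrer parameters."""
    if "{PSID}" not in referrer_url:
        raise ValueError("referrer_url must contain the {PSID} placeholder")
    params = {"referrer_url": referrer_url}
    if referrer_name is not None:
        params["referrer_name"] = referrer_name
    return "https://petscan.wmflabs.org/?" + urlencode(params)


def fill_psid(referrer_url, psid):
    """Replace the {PSID} placeholder with an actual PSID."""
    return referrer_url.replace("{PSID}", str(psid))


link = petscan_with_referrer("https://example.org/tool?psid={PSID}", "Example tool")
print(link)
print(fill_psid("https://example.org/tool?psid={PSID}", 42))
```

`urlencode` percent-encodes the braces and the nested URL, so the referrer address survives as a single parameter value.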
Examples
Articles within a WikiProject
Request from this manual's talk page: find all mainspace articles within "WikiProject UK geography". Starting from the default PetScan form, add "WikiProject UK geography" to the first box in the Templates row, and select "Use talk pages instead" just below. Here is the query, filled out. Hit "Do it!" at the bottom. When run on 16 August 2015, the query required 1.5 seconds and yielded a list of 21408 articles. The list displays below the submission form (which remains in place), so you have to scroll down to see the results.
Dablinks within a WikiProject
Editors working on disambiguation seek to enlist members of a content area WikiProject, specifically WikiProject Canada, to help. A PetScan report is designed to find all articles having ambiguous links that are within the given WikiProject. Criteria applied:
- Articles having ambiguous links are within "Category:All articles with links needing disambiguation", so paste "All articles with links needing disambiguation" into the PetScan Categories field.
- Depth is set arbitrarily to 9, meaning that articles as far as 9 subcategories down from the "needing disambiguation" parent category will be found. (Searching to that depth is not necessary in this case but doesn't hurt.)
- Articles within WikiProject Canada have "Template:WikiProject Canada" on their talk pages, so paste "WikiProject Canada" into PetScan's "Has any of these templates" field, and just below select "Use talk pages instead" as a qualifier.
- Only regular articles, not disambiguation pages, are wanted, and disambiguation pages are distinguished by having template:disambiguation, so paste "Disambiguation" into PetScan's "Has none of these templates" field, and make sure "Use talk pages instead" is not selected.
- These criteria are implemented by this PetScan submission form, filled out. To submit the query, select "Do it!" at the bottom.
- When submitted on 16 August 2015, the query took 31 seconds to run, and results were a list of 255 articles. The results show BELOW the PetScan submission form, which remains in place, so you may see no change on your screen. You have to know to scroll down to find the results! That request was run with default Output format "HTML".
- To obtain the results in a Wikitable, in order to share them at a subpage of the WikiProject, the request could be revised to select Format "WIKI". This time the results, in wikitable markup, replace the PetScan submission form on your screen.
- To make a more useful list for disambiguators, in which clicking any item opens DabSolver, a multi-step process can be followed. Here the results were saved in tab-separated format instead and brought into Excel; a column was then composed that concatenated simple text strings with the results, and that column was copy-pasted. The results were pasted to the English-language Wikipedia page w:Wikipedia:Canadian Wikipedians' notice board/ArticlesNeedingDisambiguation2015-08-17 and were also posted within a scrolling window in discussion at the WikiProject Canada talk page. --Doncram (talk) 19:50, 24 August 2015 (UTC) link adjusted. DexDor (talk) 06:58, 29 March 2016 (UTC)
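The spreadsheet step in the last item (concatenating fixed text around each result) can also be done with a few lines of Python. This is only a sketch: it assumes the tab-separated output has a header row with a column named `title` and that titles use underscores for spaces, neither of which is confirmed above. The DabSolver link format is not given here, so plain wikilinks are generated instead.

```python
import csv
import io


def tsv_to_wikilist(tsv_text, title_column="title", prefix="* [[", suffix="]]"):
    """Wrap each title from tab-separated text in a fixed prefix and suffix."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    lines = []
    for row in reader:
        title = row[title_column].replace("_", " ")
        lines.append(f"{prefix}{title}{suffix}")
    return "\n".join(lines)


# Made-up sample in the assumed layout.
sample = "number\ttitle\n1\tToronto\n2\tNova_Scotia\n"
print(tsv_to_wikilist(sample))
```

Changing `prefix` and `suffix` is the equivalent of editing the text strings in the spreadsheet column.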
Detecting pages that have an anomalous combination of namespace and category/ies
PetScan can be used to find pages that are in a category (or combination of categories) that is not appropriate for pages in a particular namespace - e.g. Wikipedia administration pages that are in a category that should only contain encyclopedic articles. This can then be fixed (e.g. by moving an article to the correct namespace or by editing a discussion to insert a missing ":" where a category is being referred to). The first step in this process is to identify (using PetScan) categories that cause incorrect categorization (e.g. Wikipedia administration categories that are in article categories).
Find uncategorized photo contributions in Commons in a given language
(Based on Grants:Learning patterns/Treasures or landmines: detecting uncategorized, language-specific uploads in Commons. See the motivation and full explanation there! Thank you to wikimedia user User:Spiritia and other contributors/commenters there for contributing this! )
Run a query using PetScan with the following settings:
- Language = commons
- Project = wikimedia
- Depth = 1
- Categories = Uncategorized files
- Combination = ☑ Subset
- Namespaces = ☑ File
- Templates: Has all of these templates = <your language code>
- Format: ☑ Extended data for files ☑ File usage data
The English language code is "en"; the Romanian language code is "ro". To find uncategorized photos uploaded by users using Romanian language, a version of the query (with html output, and without autorun) is:
As of 15 March 2016, after hitting "run" the query requires about 105 seconds to finish, and yields 1748 uncategorized photos.
Notes:
- The "Language =" field is not used to select the desired language; the desired language code is set in the "Template" field instead.
- The language code is case-sensitive in the query! So for example use "ro" not "RO".
- To generate the results there, Format: ☑ Wiki was chosen, instead of the default output of Html.
Enjoy! Thanks again to User:Spiritia especially!
Items with no statements
The option "Has no statements" can be used to find:
- items without statements for a category at Wikipedia (sample: en:Category:United States geography stubs)
- items without statements for an entire Wikipedia language version (sample: "sowiki")
Steps to import the template, some with PetScan.
Get the sitelinks for a certain project from a SPARQL query
- Indicate the project on the 'Categories' tab. E.g. `de` for Language and `wikipedia` in Project to use the German language edition of Wikipedia.
- In Other sources enter your SPARQL query
- Make sure to select From categories from the Use wiki options
- Press Do it
This could be useful to get the pageviews of a specific set of pages, based on a SPARQL query. You can save this to a Pagepile (check the Output tab), then enter that Pagepile ID in Massviews Analysis (select 'Page Pile' from the Source dropdown).
Get a list of Wikidata items with exclusions based on a SPARQL query
Let's say you got a list of people with Wikidata ID's (QIDs) that you want to add an occupation (P106) of 'jewellery designer' (Q2519376) to, maybe with a tool like QuickStatements. However, you don't want to add this occupation to items that already have that occupation. Here's how to do that with PetScan:
- Have your list of QIDs in a text file, with each QID on a new line
- In the tab 'Other sources', paste this text into the field called 'Manual list'
- In the form 'Wiki' enter the string `wikidatawiki`
- In the field 'SPARQL' enter your SPARQL query. In this example, this query will give all humans with an occupation of 'jewellery designer':
select ?item where { ?item wdt:P31 wd:Q5; wdt:P106 wd:Q2519376. }
- Finally, you want to do the exclusion, so in the field 'Combination' enter the string `manual NOT sparql` to get all QIDs from the manual list, but *without* the items from the SPARQL query.
- Click "Do it!"
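The `manual NOT sparql` combination is a set difference, and its result is what you would feed to QuickStatements. The sketch below reproduces that logic locally on made-up QIDs and then formats pipe-separated QuickStatements commands; the helper names are hypothetical, and the pipe-separated command form is an assumption about QuickStatements version 1 syntax that should be checked against Help:QuickStatements.

```python
def exclude(manual_qids, sparql_qids):
    """Keep manual QIDs that the SPARQL result does not contain, preserving order."""
    already = set(sparql_qids)
    return [qid for qid in manual_qids if qid not in already]


def quickstatements(qids, prop="P106", value="Q2519376"):
    """Format one 'item|property|value' command per QID."""
    return "\n".join(f"{qid}|{prop}|{value}" for qid in qids)


manual = ["Q100", "Q200", "Q300"]  # hypothetical QIDs from the manual list
sparql = ["Q200"]                  # hypothetical items that already have the occupation

todo = exclude(manual, sparql)
print(quickstatements(todo))
```

Only the items still lacking the occupation get a command, which is the point of running the exclusion first.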
Bug reports, feature requests, code base
See also
External links
- Training video from EduWiki 2023
- Wiki World Heritage User Group: Capacity Building PetScan Training 2021