Talk:Mix'n'match/Manual

Translate

Can I mark this for translation? :) (My understanding is that it originally comes from the Italian, so I am not sure how well in sync the two are.)

Jean-Fred (talk) 13:53, 21 June 2015 (UTC)

@Jean-Frédéric: Yes! I originally marked the Italian version for translation precisely because I wanted it to be translated into as many languages as possible. I still need to expand both the English and Italian versions anyway: something is still missing, and the two versions are not in sync with each other. --Sannita (ICCU) (talk) 13:16, 10 July 2015 (UTC)
@Sannita (ICCU): Done! I moved the page to avoid /en/en pages ;) Jean-Fred (talk) 17:25, 14 July 2015 (UTC)

How does one get a new dataset into the tool?

I have a new dataset in table format where each row may pertain to a Wikidata item. In the description for Mix'n'match I cannot find information on how to upload this data or which format it should be in. How does one do it? Is Magnus Manske the only one who can do it? — Finn Årup Nielsen (fnielsen) (talk) 15:13, 25 October 2016 (UTC)

The dataset is related to Runeberg: https://github.com/fnielsen/dasem/blob/master/notebooks/runeberg.ipynb Finn Årup Nielsen (fnielsen) (talk) 15:26, 25 October 2016 (UTC)
If you have data in a tab-separated file (or can copy it from an Excel sheet etc.), please try the import function. This is the preferred method. If you can't get it done, contact me. --Magnus Manske (talk) 13:22, 26 October 2016 (UTC)
That import.php page was what I was looking for. Thanks! — Finn Årup Nielsen (fnielsen) (talk) 13:44, 26 October 2016 (UTC)

How should one treat items where the relevant property has already been set? Must they be excluded, or can they be included (so that the completion statistics are correct)? — Finn Årup Nielsen (fnielsen) (talk) 13:46, 26 October 2016 (UTC)

You can include them in your list, and Mix'n'match will automatically link them via the "Sync" process. At least that's been my experience. ArthurPSmith (talk) 18:02, 13 June 2017 (UTC)
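
For anyone preparing a file for the tabbed import mentioned in this thread, here is a minimal Python sketch of turning table rows into tab-separated text. The column order (entry ID, name, description) is an assumption for illustration, not something confirmed here, so check the import page for the layout it actually expects. The helper names `clean` and `to_tsv` and the sample rows are made up.

```python
def clean(field):
    """Collapse tabs, newlines and repeated spaces so a field stays in one cell."""
    return " ".join(str(field).split())


def to_tsv(rows):
    """Serialize rows of (entry_id, name, description) as tab-separated lines.

    The three-column layout is an assumption for this sketch.
    """
    return "".join("\t".join(clean(f) for f in row) + "\n" for row in rows)


# Hypothetical rows; real data would come from your own table.
rows = [
    ("r1", "Example Author", "writer, 1850-1900"),
    ("r2", "Another\tAuthor", "poet\nand translator"),
]
tsv_text = to_tsv(rows)
```

Stray tabs or line breaks inside a field would shift or split a row, which is why every field is normalized before joining.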

Etiquette after uploading a catalog?

Hi, I recently uploaded the LDS Women's Manuscripts catalog. I approved the automatic matches after checking them, and I went through the other pages and matched the ones that have pages. I'm fairly certain that most of the other items will never have a Wikidata item. Should I go through and manually click the "N/A" button for them? There are a few people who may be able to have their own Wikidata page, mostly because they are connected to famous people. Should I go ahead and make a new Wikidata item for them (even if they may never get their own Wikipedia page)? Or should I wait to see if other catalogs might have entries on them as well? Thanks, Rachel Helps (BYU) (talk) 22:30, 20 November 2017 (UTC)

See d:Project:Scope. It doesn’t matter which tool you use. --Mormegil (cs) 13:43, 24 November 2017 (UTC)
I guess I've been working on Wikipedia so long that I'm not sure I understand the notability criteria for Wikidata. "Can be described using publicly available references" would describe every individual in the database. Is that right? Rachel Helps (BYU) (talk) 18:13, 30 November 2017 (UTC)
In theory yes, but you can use your judgement. The parent, spouse or child of someone already in Wikidata is certainly appropriate to add, even if the new person will never meet Wikipedia's notability requirements. In these cases it's especially helpful if you build the links between the items in both directions ("father" on the daughter, "child" on the father) when you create the new item. - PKM (talk) 20:07, 3 December 2017 (UTC)
Thanks, I'll use links to other items as a guideline. Rachel Helps (BYU) (talk) 16:52, 4 December 2017 (UTC)

New way to add datasets

User:Magnus Manske describes a new way to add datasets to Mix'n'match by scraping web pages in this blog post from 26 November 2017. - PKM (talk) 20:15, 3 December 2017 (UTC)

Changes to the Mix'n'Match landing page

There are changes to the Mix'n'Match landing page to improve performance. Instead of a long alphabetical list, the landing page now shows Catalog Groups and Catalogs by Class. You can also search by a keyword (for example, "fashion") to find catalogs that interest you.

We should update the manual to show these changes. - PKM (talk) 20:20, 3 December 2017 (UTC)

How to fix a catalog?

SANU member has a linking error, because the SANU site changed the links to their records. I updated the property, but the change is not reflected in Mix'n'match. How can I change the catalog setup? Милан Јелисавчић (talk) 13:11, 16 May 2018 (UTC)

Widar tool not available

At this precise moment the Widar tool is not available, and as a consequence I'm not able to use Mix'n'match for now. --Robby (talk) 19:58, 6 June 2018 (UTC)

Actually all tools on labs are down now for some reason. Stryn (talk) 20:02, 6 June 2018 (UTC)
Everything seems to be back to normal. Robby (talk) 21:22, 6 June 2018 (UTC)
The tools outages were expected due to planned server maintenance, but did last a bit longer than we had hoped. --BDavis (WMF) (talk) 23:01, 6 June 2018 (UTC)

Error when uploading tsv file: ERROR: Can not find user in mix'n'match. Perform at least one action there!

Hi - attempting to upload a vocabulary as a tsv file. Please advise on next steps.

Resolved. You first need to edit an existing catalogue so that Mix'n'match knows you as a user. For example, create a new item: choose a catalogue, choose "unmatched", click on "create new item", then remove that item.

Visual match tries to call https

Hi,

Visual match on ISFDB [1] doesn't work because it tries to call the site over https, even though they do not have a certificate. Eru (talk) 06:47, 14 October 2018 (UTC)

Update: ISFDB will add a certificate, but there is no ETA. Eru (talk) 07:20, 21 October 2018 (UTC)

Removing a catalog

Is there any way to remove a catalog I just created?

It seems that "edit catalog" is not working as expected!

Please remove catalog 2362

Please delete https://tools.wmflabs.org/mix-n-match/?#/catalog/2362. I created it but created a better one immediately after. Thank you. Trilotat (talk) 18:54, 7 April 2019 (UTC)

Follow

Hi,

I’m struggling to use “Follow” − I have now failed twice to create a catalogue (2433 and 2403), where I only got the first page :)

I think I’m confused about what should go in the Level URL (“A URL pattern with $1 as a placeholder for a partial URL match from the RegEx”) and what should go in the « URL pattern » (“A URL pattern, with $1 for the value of level 1, $2 for level 2 etc.”) − are these the same $1 then?

  • For RegioWiki, I did not save the config (and I don’t think there’s a way to retrieve it, short of getting it from the database), but I am fairly sure I put a list-like URL in the “URL Pattern”

Anyone more successful than me? Ping @Trivialist and 99of9: who also discussed this on the Project chat some time ago.

I’d be more than happy to document it, once I have a working example.

Jean-Fred (talk) 19:07, 15 May 2019 (UTC)

@Jean-Frédéric: I wouldn't use a follow on indiedb. I'd use a first range index page=(1,1254) (usually I would go further to allow for more to be scraped later, but here page=9000 returns the same as 1254 again, so the list would end up with lots of duplicates). You should think of follow as the stem of a tree rather than the start of a chain. --99of9 (talk) 11:37, 16 May 2019 (UTC)
That’s also what I usually use in such circumstances, but I thought it could be an easy testbed for 'follow' ^_^ Jean-Fred (talk) 12:46, 16 May 2019 (UTC)
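
To illustrate the range approach suggested above, here is a rough Python sketch of what a range level amounts to conceptually: one URL per page index, with $1 replaced by the index, plus a guard against the duplicates that appear when pages past the end repeat the last page. This is not how Mix'n'match itself is implemented. The listing URL, the helper names and the sample entries are hypothetical; only the bounds 1 and 1254 come from the discussion.

```python
def expand_range(url_pattern, start, end):
    """Return one URL per index in [start, end], substituting it for $1."""
    return [url_pattern.replace("$1", str(i)) for i in range(start, end + 1)]


def dedupe_by_id(entries):
    """Keep the first (entry_id, name) pair seen for each ID, preserving order."""
    seen = set()
    unique = []
    for entry_id, name in entries:
        if entry_id not in seen:
            seen.add(entry_id)
            unique.append((entry_id, name))
    return unique


# Hypothetical listing URL; the real site's URL scheme may differ.
urls = expand_range("https://example.org/games?page=$1", 1, 1254)

# Entries scraped from a repeated last page would collapse to one copy each.
scraped = [("a1", "Game A"), ("b2", "Game B"), ("b2", "Game B")]
unique_entries = dedupe_by_id(scraped)
```

Stopping the range at the true last page avoids the repeats in the first place; the de-duplication step only matters if the range overshoots.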

Regex issue

I’m trying to make a Mix’n’match catalog for Anime News Network anime ID (P1985) and I’m going crazy.

It’s a simple setup based on A-Z keys, yielding pages like this one, which has all entries for that key on one page.

I want to exclude the ones in italic because they are alternate titles (so while Mix’n’match will not duplicate the entries, the name might end up being one of the alternate, not-so-well-known titles).

Here’s my regex: (?:<b>)?(?!<i>)<a class="HOVERLINE" href="/encyclopedia/anime\.php\?id=(\d+)"><font color="#[\w\d]{6}">(.+?)\s*?\((.*?)\)</font></a>(?!</i>)(?:</b>)?<br>

But Mix’n’match preview gives:

ID | Name | Description
6862 | 009-1 | TV
10468 | 07-Ghost | TV
7242 | 10 Chikan Association | OAV) 10 count (TV) 10 Little Gall Force (OAV
1600 | 10 Tokyo Warriors | OAV) 100 Sleeping Princes and the Kingdom of Dreams: The Animation (TV
2765 | 100 stories | TV) 100% (OAV
19094 | 100% Pascal-sensei | TV
  • The first entry is skipped.
  • The regex is greedy in ways I don’t fully understand.

I tried to play with various negative-lookahead constructs, which all work in my tests but not on Mix’n’match (I’m not too sure which precise regex implementation Mix’n’match uses, so that may be the cause too). Also, it seems that Mix’n’match does not allow start/end of line markers (^ and $).

(Also, interestingly, near the Regex field there is “This regular expression matches 41 entries in the HTML below” but the test yields 39 entries ¯\_(ツ)_/¯)

If anyone has any idea… Jean-Fred (talk) 17:20, 6 August 2019 (UTC)
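
One possible explanation, sketched in Python (whose `re` module may well behave differently from whatever engine Mix’n’match uses): the lookahead `(?!<i>)` placed directly before `<a` can never fail, because text that starts with `<a` cannot also start with `<i>`. That leaves `(?!</i>)` as the only italic check, and when it fails on an italic entry the lazy `(.*?)` is free to keep growing across the following entries until it finds a `)</font></a>` that is not followed by `</i>`. Negated character classes stop that growth at the first tag. The sample markup below uses invented IDs and titles, and puts all entries on one line because `.` does not match newlines in Python by default.

```python
import re

# The pattern from the post, unchanged.
ORIGINAL = re.compile(
    r'(?:<b>)?(?!<i>)<a class="HOVERLINE" href="/encyclopedia/anime\.php\?id=(\d+)">'
    r'<font color="#[\w\d]{6}">(.+?)\s*?\((.*?)\)</font></a>(?!</i>)(?:</b>)?<br>'
)

# Variant: [^<] keeps name and description inside a single <font> element,
# so a failed (?!</i>) check cannot be worked around by stretching the
# description over the next entries.
TIGHTER = re.compile(
    r'<a class="HOVERLINE" href="/encyclopedia/anime\.php\?id=(\d+)">'
    r'<font color="#\w{6}">([^<]+?)\s*\(([^<)]*)\)</font></a>(?!</i>)'
)

# Hypothetical markup: a plain entry, an italic (alternate title) entry,
# and a bold entry.
SAMPLE = (
    '<a class="HOVERLINE" href="/encyclopedia/anime.php?id=1">'
    '<font color="#000000">Alpha (TV)</font></a><br>'
    '<i><a class="HOVERLINE" href="/encyclopedia/anime.php?id=2">'
    '<font color="#000000">Beta Alt (OAV)</font></a></i><br>'
    '<b><a class="HOVERLINE" href="/encyclopedia/anime.php?id=3">'
    '<font color="#000000">Gamma (movie)</font></a></b><br>'
)

original_hits = ORIGINAL.findall(SAMPLE)
tighter_hits = TIGHTER.findall(SAMPLE)
```

On this sample the original pattern starts a match at the italic entry and swallows the bold one into its description, much like the preview rows above, while the tighter variant returns only entries 1 and 3. Whether the same holds in the Mix’n’match preview would need to be tried there.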

Setting a scraper-based catalog to automatically update?

Last week I made three catalogs with (what I thought was) the exact same process. They all completed the first scrape nicely and are working well. However, two of them now say "This catalog is regularly updated through automated web page scraping" and one (LezWatch.TV actors) is missing that line. Can I change a setting somewhere to have this one auto-update as well? Sweet kate (talk) 16:04, 12 August 2019 (UTC)

prefers-color-scheme: dark

Hi Magnus, wikimedia.css contains a "prefers-color-scheme: dark" media query, and with it the page is not readable. It is very hard to read in Chrome on macOS. It would be much better to add, inside the prefers-color-scheme: dark query, .entry_row { background-color: #000; } (or similar): the white background of the odd rows does not work, because a very light font colour is not readable on a white background :). Thanks! --Frettie (talk) 23:10, 11 November 2019 (UTC)

User:-1

Hi, who is User:-1? For example, they have apparently imported the Oxford Reference Authority. But this Wikidata user is not registered. Dsp13 (talk) 11:43, 4 February 2022 (UTC)

How do I update a catalog?

I would like to edit Adventure Game Studio Game so that it only matches adventure games. Matthias (talk) 18:28, 23 May 2023 (UTC)

Hey @Matthias M.:
  • You can update a scraper catalog in scraper/new
  • You can update non-scraper catalogs in import
That said, as far as I know, you can only hint matches using the type, typically for us P31=Q7889. I don’t think you can hint it to have P136=adventure game. I don’t think that would be desirable anyway − we have quite a gap in the data there, with 35% of Q7889 items not having any P136, so you would potentially miss quite a few matches. Jean-Fred (talk) 09:44, 25 May 2023 (UTC)
Opening https://mix-n-match.toolforge.org/#/scraper/new and entering an existing catalog ID does not pre-fill anything. Can someone with admin rights please enter d:Property:P11909 into https://mix-n-match.toolforge.org/#/catalog_editor/5954? Matthias (talk) 09:38, 23 July 2023 (UTC)

Missing AI

Is it possible to add some more AI to this feature? For the occupation "badminton player" I deleted around 100,000 suggestions while accepting only a meagre 30 or so items. So please, if the sole occupation is badminton player, delete all of the suggestions except those from Prabook and CWG. What would be needed instead is babelnet, flashscore, thesportsorg, globalsportsarchive, Google knowledge. --Florentyna (talk) 19:12, 19 August 2023 (UTC)