Mix'n'match/Manual

From Meta, a Wikimedia project coordination wiki

Mix'n'match is a tool by Magnus Manske that contains lists of entries from external sources. It lets you match them to Wikidata entries, identifying those that already exist in Wikidata and those that do not yet have items - "think of it as an enhanced list of red links".

It currently contains more than 2500 catalogs, such as the Oxford Dictionary of National Biography (complete), the Australian Dictionary of Biography (complete), or the National Portrait Gallery catalog (52.5% matched).

This makes it easy to see which items are missing from a given Wikipedia, or which language has the best coverage of a given topic.

How does it work?

Mix'n'match sorts entries into five categories:

Example of the statistics of one of the catalogs.
  1. Manually matched: a user matched this catalog entry to a Wikidata item (this includes entries imported from Wikidata);
  2. Automatically matched: the system guessed a possible Wikidata match for the entry, but a person needs to confirm or reject it;
  3. Not in Wikidata: this catalog entry is known not to match any Wikidata entry;
  4. Not applicable to Wikidata (N/A): the entry was marked as not relevant to Wikidata (for example, it is a duplicate, a placeholder, a redirect, or simply an unsuitable topic);
  5. Unmatched: this entry is not matched yet, and there is no automatic suggestion.

The goal is, naturally, to mark as many entries as possible as "manually matched" (or else to confirm that there is no possible match in Wikidata). To use the tool, you need a registered account on any Wikimedia project and must grant permission to the WiDaR tool.

Once you have authorized WiDaR, you can start in one of two ways:

  • Search for a specific name using the search box in the header bar. This will bring you to a search result page.
    • See also List mode below for how to use the list of results.
    • In the search result page, you can also limit the search to a specific catalog.
    • You may also search for a Qid; this will return all entries that the item is matched to. Searching by external ID is not supported.
    • The search result page is not guaranteed to contain all entries matching a specific name; in particular, the list may be truncated if there are too many results.
  • Select a specific catalog and then go to a catalog page.

On a catalog page, you will see the number of entries in each category and the history of the number of matches. Clicking a specific category brings you to the List mode. You will also see an "Action" menu, which includes the following:

  • Fully matched, Preliminarily matched, Unmatched, No Wikidata, Not applicable to Wikidata – links to List mode for all entries in this specific category.
  • Multiple matches – links to List mode for all preliminarily matched entries with multiple automatically-suggested matches.
  • Site stats
  • Download
  • Match mode - see below.
  • Recent Changes in this catalog
  • Aliases
  • Jobs
  • Search only in this catalog
  • Names in other catalogs
  • Manually sync catalog
  • Catalog editor
  • Mobile matching
  • Visual tool
  • Find images
  • Changes last week
  • Catalog report

Semi-automatic mode (Game mode)

Example of game mode.

If you choose the semi-automatic mode, the top of the page shows the catalog identifier (Catalog ID), the catalog title (Catalog Name), and possibly a short description, likewise taken from the catalog (Catalog description). This should help you work out what or who the entry is.

Further down, you have four options:

  • Set Q (blue button): if you have identified which Wikidata item matches the catalog entry, you can paste the Q number into this box.[1]
  • No Wikidata entry (orange button): if you are sure that the entry does not match any Wikidata item.
  • N/A (red button): for cases where there will never be an appropriate Wikidata item for this entry.
  • Skip (grey button): in case of doubt or uncertainty, go to the next entry.

In case of doubt or uncertainty, or if there is no matching Wikidata item but you do not want to create one immediately, you may skip to the next entry by clicking "Next entry".

If the entry is preliminarily matched, you have two choices:

  • Confirmed (green button): confirms that the proposed match is correct.
  • Remove (red button): confirms that the proposed match is incorrect. The entry will then become unmatched and may be matched to another (potentially new) item.

If there are multiple automatically-suggested matches, only the first of them is shown, and will be used if "Confirmed" is clicked. You may browse or select other matches using the link to the right of the entry name.
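The two choices for a preliminarily matched entry can be read as a small state transition. The sketch below is only an illustration of the categories described in this manual: the `Status` enum and the `confirm` and `remove` functions are hypothetical names, not part of Mix'n'match, and it assumes that a confirmed match ends up in the manually matched category.

```python
from enum import Enum


class Status(Enum):
    """The five entry categories described in this manual."""
    MANUALLY_MATCHED = "manually matched"
    AUTO_MATCHED = "automatically matched"
    NOT_IN_WIKIDATA = "not in Wikidata"
    NOT_APPLICABLE = "not applicable to Wikidata (N/A)"
    UNMATCHED = "unmatched"


def confirm(status):
    """'Confirmed': accept a preliminary match (assumed to become manually matched)."""
    if status is not Status.AUTO_MATCHED:
        raise ValueError("only a preliminary match can be confirmed")
    return Status.MANUALLY_MATCHED


def remove(status):
    """'Remove': reject a preliminary match; the entry becomes unmatched."""
    if status is not Status.AUTO_MATCHED:
        raise ValueError("this sketch only models removing a preliminary match")
    return Status.UNMATCHED


if __name__ == "__main__":
    print(confirm(Status.AUTO_MATCHED).value)
    print(remove(Status.AUTO_MATCHED).value)
```

After a removal the entry is unmatched again, so it can later be matched to a different (possibly newly created) item.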

Further down are some suggested links from en.wikipedia, each with a link to the corresponding item on Wikidata. If the correct item is present there, you can just click on the link to the right (e.g. "Q384941") and this will register a match. If the correct item is not among the suggestions, you can still search through Google across all versions of Wikipedia or Wikisource, or on Wikidata.

When you make a connection between a catalog entry and a Wikidata item, the system will update Wikidata automatically. This will appear as an edit in your list of contributions.

(Note that some Mix'n'match entries may belong to a catalog that does not yet have a property configured; if you are working on one of these, the match will be saved and Wikidata will be updated later, where appropriate.)

List mode (formerly Manual mode)

Example of list mode.

A list of entries will be shown when:

  • You click a specific category (e.g. "Unmatched") in a catalog page - all entries in this category will be shown with fifty entries per page.
  • You browse a search result page.

This was formerly known as manual mode, which could also show fifty entries drawn from all categories; that option has been removed.

On the first line of each list item, you will see the name and (where available) the description of the entry. Each card will also show the status of the entry.

Items to be validated manually (in red)

For items with no suggested match, the second line will present various links that will allow you to make an automatic search on Wikipedia, on Wikidata or Google (limiting the results only to Wikipedia or Wikidata), or even create the item. In the right column, you will have three choices:

  1. Set Q (green link): clicking here brings up a dialog box where you can enter the number of the Wikidata item (with or without the Q in front of the number).
  2. New item (red link): clicking here will create a new item on Wikidata for that entry, which will automatically get its name, description (if present) and ID from the catalogue.
  3. N/A (yellow link): clicking here will confirm that the entry should not exist on Wikidata, and can be discarded.

If you have provided a Wikidata item number, the system will automatically update the corresponding Wikidata entry using WiDaR, as in match mode.
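Because the Set Q dialog accepts the item number with or without the leading Q, it can help to normalize such input before reusing it elsewhere (for example in a spreadsheet or script). The following is a minimal sketch of that idea; `normalize_qid` is a hypothetical helper, not the tool's own parser, and it is deliberately stricter than the dialog may be.

```python
import re


def normalize_qid(text):
    """Return 'Q<digits>' for inputs such as 'Q123', 'q123' or '123'.

    Assumption: a valid item number is a positive integer without
    leading zeros, optionally prefixed by 'Q' or 'q'.
    """
    match = re.fullmatch(r"\s*[Qq]?([1-9][0-9]*)\s*", text)
    if match is None:
        raise ValueError(f"not a Wikidata item number: {text!r}")
    return "Q" + match.group(1)


if __name__ == "__main__":
    for raw in ["Q123", "123", " q42 "]:
        print(raw, "->", normalize_qid(raw))
```

Rejecting anything that is not clearly an item number keeps a typo from silently turning into a wrong match.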

Automatically matched items (in purple)

For items with an automatically suggested match, the second line will have a link to Wikidata along with an auto-generated summary of the Wikidata entry. In the right column, you will have the following options:

  1. Confirm (green link): clicking here confirms that the proposed match is correct.
  2. Remove (red link): clicking here confirms that the entry does not exist on Wikidata (but could exist in the future).

Sometimes, a list of alternative matches is available.

Again, if you have confirmed a match, the system will make the corresponding edit on Wikidata through WiDaR.

Manually validated items (in green)

For items which have already been matched, the second line will have a link to Wikidata along with an auto-generated summary of the Wikidata entry, or have "Not applicable to Wikidata" shown.

In the right column will be the name of the user who made the match, along with a red "Remove" link. Use this link only if you believe that the match made by someone else is wrong. If the match is correct, leave everything as it is and move on.

Note that while making a match causes the Wikidata item to be updated, removing a match (currently) does not. If you remove a match on an item, you may want to open that Wikidata item in a new tab and remove the property there as well - otherwise, it may find its way back into mix'n'match in the future.

Creation candidates

Many entries from catalogs are not (yet!) on Wikidata. Some may not meet the criteria for a Wikidata item, but others are listed in several catalogs, and thus have several external sources, which helps their "noteworthiness" significantly. Entries that have the same name in multiple (>=3) catalogs, but have no associated Wikidata item, can be found via Creation candidates.

An example of creation candidates.

The listed entries have the usual search options, to ensure that no item already exists on Wikidata. One can then create a new Wikidata item, with the (English) label pre-filled. Then, the new item can be matched to the applicable entries via Set Q. One can also search Commons for that label; sometimes, an image of that person already exists there!

Caution: Just because these entries have the same name, does not mean they all refer to the same entity. Please check carefully with the individual catalogs!
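The selection rule for creation candidates (the same name in three or more catalogs, with no associated Wikidata item) can be sketched as follows. This is an illustration only, not the tool's actual query: the entry fields `name`, `catalog` and `qid` are hypothetical, and only entries without a Wikidata item are counted.

```python
from collections import defaultdict


def creation_candidates(entries, min_catalogs=3):
    """Return names that occur, without a Wikidata item, in at least
    `min_catalogs` different catalogs."""
    catalogs_by_name = defaultdict(set)
    for entry in entries:
        if entry["qid"] is None:  # skip entries already matched to an item
            catalogs_by_name[entry["name"]].add(entry["catalog"])
    return sorted(
        name
        for name, catalogs in catalogs_by_name.items()
        if len(catalogs) >= min_catalogs
    )


if __name__ == "__main__":
    sample = [
        {"name": "Ada Example", "catalog": 1, "qid": None},
        {"name": "Ada Example", "catalog": 2, "qid": None},
        {"name": "Ada Example", "catalog": 3, "qid": None},
        {"name": "Bo Sample", "catalog": 1, "qid": None},
        {"name": "Bo Sample", "catalog": 2, "qid": None},
    ]
    print(creation_candidates(sample))
```

A shared name is only a hint: as the caution above says, each candidate still has to be checked against the individual catalogs before an item is created.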

Tips for matching

When matching entries to Wikidata items please bear the following tips in mind:

  • Don't guess: guessing will introduce errors into the data. If in doubt, follow the link on the catalogue entry, and check other catalogs at the bottom of the entry or other information (e.g. coordinate location). You can always skip entries and let someone else match them, or move to a different catalogue that you know more about.
  • Don't be afraid to create new items: if it isn't exactly the same concept, please create a new item. It is much easier to merge two items after the matching has finished than to separate one item into two. E.g. a World Heritage site for a city often does not cover the same area as the city itself, so a new item should be made.
  • Don't match to disambiguation items: Wikidata items exist for Wikipedia disambiguation pages. These items act as a list of links, rather than a concept to be matched to. E.g. Bambaia (Q4853316) should not be matched; Agostino Busti (Q395600) should be.
  • Don't match from disambiguation items: some authority databases have disambiguation or alias pages.
    • E.g. RKD Artists used to have an entry for "Bambaia" that was wrongly mapped to Wikidata. (Now RKD Bambaia properly redirects to RKD Augustino Busti.)
    • Never match to GND "undifferentiated names".
  • Check the automatic matches: whilst the automatic matching is often correct, it can still get confused between similarly named items.
  • N/A status is exclusively for entries that can never, ever be a Wikidata item, or for known duplicates within the same catalog.
  • Use the 'jobs' option: The 'action' drop-down menu on any catalogue has a 'jobs' option. This gives you a list of tasks that will help with matching. For example, 'auxiliary matcher' will check the dataset for additional identifiers such as VIAF IDs and check them against existing records in Wikidata. If the automatching process has thrown up a lot of low-quality matches, there is the option to 'purge automatches'.

Sorting the catalog list

By default, the catalog list is sorted alphabetically. The sort_mode parameter can take one or several keywords to alter this:

  • sort_mode=groups groups catalogs by type/subject area, largest groups first, sorted alphabetically within the respective group. Completed catalogs have their own group at the end
  • sort_mode=groups,by_easiest same as above, but "easiest" (#auto-matched+2*#unmatched) to complete first
  • sort_mode=by_easiest,no_complete ungrouped sorting, but "easiest" to complete first, hiding completed catalogs (as they would be "easiest" by default)
  • sort_mode=groups,complete_inline grouped, but with completed catalogs in their respective subject area.
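The "easiest" score used by by_easiest is given above as #auto-matched + 2*#unmatched. The sketch below shows how such a score could order catalogs; it is an illustration, not the tool's code. The field names are hypothetical, and it assumes that a lower score means "easier" and that a completed catalog is one whose score is 0.

```python
def easiest_score(auto_matched, unmatched):
    """Remaining-work estimate: #auto-matched + 2 * #unmatched."""
    return auto_matched + 2 * unmatched


def sort_by_easiest(catalogs, no_complete=False):
    """Order catalogs by ascending score (assumed: lowest is easiest).

    With no_complete=True, catalogs with a score of 0 are hidden,
    since they would otherwise always sort first.
    """
    scored = [
        (easiest_score(c["auto_matched"], c["unmatched"]), c) for c in catalogs
    ]
    if no_complete:
        scored = [(score, c) for score, c in scored if score > 0]
    # sorted() is stable, so ties keep their original order
    return [c for score, c in sorted(scored, key=lambda pair: pair[0])]


if __name__ == "__main__":
    catalogs = [
        {"name": "A", "auto_matched": 100, "unmatched": 10},  # score 120
        {"name": "B", "auto_matched": 0, "unmatched": 0},     # score 0
        {"name": "C", "auto_matched": 5, "unmatched": 20},    # score 45
    ]
    print([c["name"] for c in sort_by_easiest(catalogs, no_complete=True)])
```

Weighting unmatched entries twice reflects that they need a search as well as a decision, whereas an automatic match only needs to be confirmed or removed.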

If your favourite catalog is "unknown" or in the wrong group, please let Magnus Manske (talk) know.

Creating new catalogs

You can create a new catalog and either provide a list of mapping candidates (best to paste them from a spreadsheet) or create a scraper to automatically harvest mapping candidates. Otherwise, ask Magnus Manske (talk) to import a catalog for you.

Tips

  • The field Wikidata property is for when a property exists for external identifiers. You can propose an external identifier property at Wikidata:Property proposal.
  • Create detailed descriptions for the Entry description field where possible. This often makes it much easier for people to match the catalogue, leading to fewer incorrect matches and higher data quality.
  • You can add aliases to items to help with the matching process. To import aliases, go to the catalogue and use the drop down 'action' menu in the top right. The 'aliases' option takes you to a page where you can import alternative labels for entries in the mix'n'match dataset. It will need to be in a tab separated format, and will use the dataset's external IDs for matching.
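The alias import expects tab-separated data keyed by the dataset's external IDs. A minimal sketch for producing such text from a list of pairs follows. The column order (external ID first, then one alias per row) is an assumption, so check the import page for the exact layout it expects; `aliases_to_tsv` is a hypothetical helper.

```python
import csv
import io


def aliases_to_tsv(aliases):
    """Serialize (external_id, alias) pairs as tab-separated lines.

    Assumption: one alias per row, external ID in the first column.
    """
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
    for external_id, alias in aliases:
        writer.writerow([external_id, alias])
    return buffer.getvalue()


if __name__ == "__main__":
    rows = [("12345", "J. Smith"), ("12345", "John Smith"), ("67890", "A. Jones")]
    print(aliases_to_tsv(rows), end="")
```

Using the csv module rather than joining strings by hand means that an alias containing a tab or a quote character is escaped instead of corrupting the row.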

Managing catalogs

There is a catalog editor, accessible at mix-n-match/#/catalog_editor/<id> for the catalog creator and a subset of users (“catalog editors”). There it is possible to change some of the catalog properties (name, description, URL, type, language and Wikidata property) and to disable a catalog.

Scraper-based catalogs can be updated by following the catalog creation process and entering an existing "Catalog ID".

References

  1. You can paste the Q number either as "Q123" or as "123". The software also accepts other characters, such as parentheses or commas, as long as the Q number provided is valid.

Links