Monument data in Ukraine, import to Wikidata
Done by Wikimedia Sverige in September 2023, in collaboration with Wikimedia Ukraine.
Summary
96496 Wikidata items have a WLM ID statement whose value starts with the prefix UA-, marking them as belonging to the Wiki Loves Monuments Ukraine dataset. [Query]
Out of these:
- 3362 (3.4 %) have an article in Wikipedia in Ukrainian. [Query]
- 30025 (31 %) have a sitelink to Wikimedia Commons. [Query]
- 35719 (37 %) have an image. [Query]
- 17100 (17 %) and counting have coordinates. [Query]
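The text of the queries behind the [Query] links is not reproduced on this page. As a rough sketch of what such a count could look like, the Python helper below builds a SPARQL query string for the Wikidata Query Service. It assumes that the WLM ID property is P2186, that image is P18 and that coordinate location is P625; the function name and its parameters are made up for illustration.

```python
from typing import Optional

WLM_ID = "P2186"  # assumed property ID for "Wiki Loves Monuments ID"


def build_count_query(prefix: str = "UA-", extra_property: Optional[str] = None) -> str:
    """Return a SPARQL query counting items whose WLM ID starts with `prefix`.

    If `extra_property` is given (e.g. "P18" for image), only items that
    also have a statement for that property are counted.
    """
    lines = [
        "SELECT (COUNT(DISTINCT ?item) AS ?count) WHERE {",
        f"  ?item wdt:{WLM_ID} ?wlm_id .",
        f'  FILTER(STRSTARTS(?wlm_id, "{prefix}"))',
    ]
    if extra_property:
        lines.append(f"  ?item wdt:{extra_property} [] .")
    lines.append("}")
    return "\n".join(lines)


print(build_count_query())                      # all items in the dataset
print(build_count_query(extra_property="P18"))  # items that also have an image
```

The resulting string can be pasted into the query service as-is; swapping `"P18"` for `"P625"` would give the count of items with coordinates under the same assumptions.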
On this page we can see the distribution of their instance of statements, as well as the coverage of some properties. Note that there's an ongoing import of coordinate statements over the weekend of September 22.
How this was done
The work was done in two major stages. First, we identified the existing Commons categories containing a {{Monument Ukraine}} template, extracted the ID from the template, and added it to the Wikidata item linked to the category. If no such Wikidata item existed, a new one was created.
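To illustrate the ID extraction in the first stage, here is a minimal Python sketch. It assumes that the ID is the first positional parameter of {{Monument Ukraine}} and that the WLM ID on Wikidata is that ID with "UA-" prepended; both are assumptions, and the sample ID is made up. It is not the script that was actually used.

```python
import re
from typing import Optional

# Matches {{Monument Ukraine|<id>}} and captures the first positional parameter.
# The template may carry further parameters after another "|".
TEMPLATE_RE = re.compile(
    r"\{\{\s*Monument[ _]Ukraine\s*\|\s*([^|}]+?)\s*(?:\||\}\})",
    re.IGNORECASE,
)


def extract_monument_id(wikitext: str) -> Optional[str]:
    """Return the monument ID from the first {{Monument Ukraine}} call, or None."""
    match = TEMPLATE_RE.search(wikitext)
    return match.group(1) if match else None


def to_wlm_id(monument_id: str) -> str:
    """Prefix the national ID with the country code (assumed format)."""
    return f"UA-{monument_id}"


sample = "{{Monument Ukraine|12-345-6789}}\n[[Category:Example]]"  # made-up ID
monument_id = extract_monument_id(sample)
if monument_id is not None:
    print(to_wlm_id(monument_id))
```

A regular expression is enough for a simple template call like this one; nested templates or named parameters would call for a proper wikitext parser.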
In the second stage, a local copy of the Monuments Database was ingested into OpenRefine. The database contains 95760 Ukrainian monuments. The dataset was reconciled to Wikidata and used as a base to either add the WLM ID to existing items or create new ones. By far, the most common course of action was to create a new item.
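The decision taken for each reconciled row can be sketched in plain Python. In practice this happened inside OpenRefine, so the record type, field names and action labels below are hypothetical and only mirror the logic described above: add the WLM ID when a matching item exists, otherwise create a new item.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class MonumentRecord:
    wlm_id: str                        # e.g. "UA-12-345-6789" (made-up value)
    name: str
    matched_qid: Optional[str] = None  # set when reconciliation found an item


def plan_edit(record: MonumentRecord) -> dict:
    """Decide what to do with one reconciled row."""
    if record.matched_qid:
        # An item already exists: only the WLM ID needs to be added.
        return {"action": "add_wlm_id", "item": record.matched_qid, "wlm_id": record.wlm_id}
    # No match: a new item is created with the WLM ID attached.
    return {"action": "create_item", "label": record.name, "wlm_id": record.wlm_id}


rows = [
    MonumentRecord("UA-12-345-6789", "Example church", matched_qid="Q123"),
    MonumentRecord("UA-12-345-6790", "Example kurgan"),
]
for row in rows:
    print(plan_edit(row))
```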
Note that we found a number of Commons categories with IDs that do not currently exist in the Monuments database, e.g. https://commons.wikimedia.org/wiki/Category:Mass_grave_of_Jews_killed_by_nazis_in_Dnipro, hence the difference in numbers between the Wikidata items and the Monuments database.
Observations and issues
While we did the "rough" work (setting up basic items for the 90k-ish monuments), a lot of polishing could be done by members of the Ukrainian community to make the dataset even more useful. There are also a number of issues that the reader should be aware of when judging our work.
Instance of
The assignment of instance of statements (church building, building, kurgan…) was done in batches, based on WMSE's ability to read Ukrainian. Items that didn't fall into clearly identifiable batches got assigned instance of = cultural property.
Coordinates
We've had a dialogue with Wikimedia Ukraine on how best to handle the geographic coordinates of the monuments, considering the ongoing destruction of cultural heritage. We've been advised not to import the coordinates of monuments located in the territories occupied by Russia. Based on this, the monuments in the following areas have had / are currently having their coordinates imported from the Monuments database to Wikidata (whenever available, of course – a large number of monuments do not have coordinates on Wikipedia / in the Monuments database):
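- Вінницька область (Vinnytsia Oblast)
- Волинська область (Volyn Oblast)
- Житомирська область (Zhytomyr Oblast)
- Закарпатська область (Zakarpattia Oblast)
- Івано-Франківська область (Ivano-Frankivsk Oblast)
- Київська область (Kyiv Oblast)
- Кіровоградська область (Kirovohrad Oblast)
- Львівська область (Lviv Oblast)
- Одеська область (Odesa Oblast)
- Полтавська область (Poltava Oblast)
- Рівненська область (Rivne Oblast)
- Тернопільська область (Ternopil Oblast)
- Хмельницька область (Khmelnytskyi Oblast)
- Черкаська область (Cherkasy Oblast)
- Чернігівська область (Chernihiv Oblast)
- Чернівецька область (Chernivtsi Oblast)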
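The rule above amounts to an allowlist check. A minimal sketch in Python, assuming each database row carries its oblast name in Ukrainian plus optional latitude and longitude (the function and parameter names are hypothetical):

```python
from typing import Optional

# Oblasts whose monument coordinates are imported, as listed above.
ALLOWED_OBLASTS = frozenset({
    "Вінницька область", "Волинська область", "Житомирська область",
    "Закарпатська область", "Івано-Франківська область", "Київська область",
    "Кіровоградська область", "Львівська область", "Одеська область",
    "Полтавська область", "Рівненська область", "Тернопільська область",
    "Хмельницька область", "Черкаська область", "Чернігівська область",
    "Чернівецька область",
})


def should_import_coordinates(
    oblast: str, lat: Optional[float], lon: Optional[float]
) -> bool:
    """Import only when the oblast is on the list and coordinates exist."""
    return oblast in ALLOWED_OBLASTS and lat is not None and lon is not None


print(should_import_coordinates("Львівська область", 49.84, 24.03))
print(should_import_coordinates("Львівська область", None, None))
```

Using an allowlist rather than a blocklist means that a row with an unexpected or misspelled oblast name is skipped by default, which is the safer failure mode here.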
Heritage designation
No heritage designation statements were added, as they were not present in the Monuments database. Ideally, we would like to add them, but this would require advice from someone well versed in the organization of cultural heritage in Ukraine. For example, some applicable Wikidata items already seem to exist, such as historic heritage of National importance in Ukraine and historic heritage of local significance in Ukraine.
SDC
Now that the monuments are on Wikidata, we and everyone else can add depicts statements to the photos from the competitions! This is a great application of the work that's been done.
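As an illustration of what adding a depicts statement involves, the sketch below builds the request parameters for the Wikibase `wbcreateclaim` API action on Commons. It assumes that depicts is P180 and that the file is addressed by its MediaInfo ID (M followed by the page ID); the IDs in the example are made up, and the authenticated session and CSRF token that a real request needs are left out.

```python
import json

DEPICTS = "P180"  # "depicts" property on Wikimedia Commons


def depicts_claim_params(media_id: str, monument_qid: str) -> dict:
    """Build request parameters for adding one depicts statement to a file."""
    if not media_id.startswith("M") or not monument_qid.startswith("Q"):
        raise ValueError("expected IDs like 'M123' and 'Q456'")
    value = {"entity-type": "item", "numeric-id": int(monument_qid[1:])}
    return {
        "action": "wbcreateclaim",
        "format": "json",
        "entity": media_id,
        "property": DEPICTS,
        "snaktype": "value",
        "value": json.dumps(value),
    }


# Made-up IDs: file M123 depicts the monument item Q456.
print(depicts_claim_params("M123", "Q456"))
```

For larger batches, existing tools for editing structured data on Commons are likely a better fit than hand-rolled API calls.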
Other
A large number of Commons categories serve as a bucket for multiple monuments, e.g. https://commons.wikimedia.org/wiki/Category%3AVolodymyrskyi_Park
Maintenance queries
- [Query] Items with instance of = cultural property. These should get a more concrete statement.
- [Query] Items that have a Commons category statement but no image statement.
- [Query] Multiple items that share the same WLM ID. These can exist when multiple Commons categories contain the same ID, either by mistake or because the categories are genuine duplicates. With fewer than 250 problematic item pairs, this is possibly a doable task for local Wikimedians to tackle.
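The last check can also be run offline on an exported list of results. The sketch below assumes the export is a sequence of (item ID, WLM ID) pairs; the function name and the sample values are made up.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def find_shared_ids(rows: Iterable[Tuple[str, str]]) -> Dict[str, List[str]]:
    """Map each WLM ID used by more than one item to the items that share it."""
    items_by_id = defaultdict(set)
    for qid, wlm_id in rows:
        items_by_id[wlm_id].add(qid)
    return {
        wlm_id: sorted(qids)
        for wlm_id, qids in items_by_id.items()
        if len(qids) > 1
    }


rows = [
    ("Q1", "UA-12-345-0001"),
    ("Q2", "UA-12-345-0002"),
    ("Q3", "UA-12-345-0001"),  # same WLM ID as Q1
]
print(find_shared_ids(rows))
```

Each entry in the result is a candidate for a merge or for correcting the ID on one of the items.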