Wikimedia CEE Meeting 2019/Programme/Submissions/Structured Data on Wikimedia Commons

Title of the submission

Structured Data on Wikimedia Commons

Type of submission (lecture, panel, workshop, lightning talk, roundtable, poster)
  • workshop
Author(s) of the submission

Jan Lochman

Username(s)

Juandev

Affiliation

-

Topic(s)
  • Education
Abstract (up to 100 words)

Wikimedia Foundation developers have recently been deploying more and more structured data features on Wikimedia Commons. The main aim is to break the language barrier created by English-only categories on Wikimedia Commons. This workshop will explain what Structured Data are, why they are being deployed to Wikimedia Commons and how to benefit from them. Participants will learn how to fill in structured data on Wikimedia Commons and how to search for files.

How will this session be beneficial for the communities in the region of Central and Eastern Europe?

Participants of the CEE Meeting can share this knowledge with their communities, which will help improve Wikimedia Commons.

Special requirements
Slides or further information

syllabus

Documentation

Interested attendees

If you are interested in attending this session, please sign with your username below. This will help reviewers decide which sessions are of high interest. Sign with a hash and four tildes (# ~~~~).

Notes

  • What are Structured Data?
    • captions and structured data on Wikimedia Commons (found under every file)
    • this is happening because improvements to categories were requested (not all users read English); better categorization can improve search
    • the developers decided to use the Wikidata infrastructure to fix that problem (better search options, multilingual-friendly)
  • Why are you here and what do you expect?
    • learning how to teach people to search if they do not understand English (the majority of files on Wikimedia Commons are uploaded with English descriptions)
    • learning how to overcome technical barriers
    • possible use of bots
    • how to upload photos with text in multiple languages in a short time (a major issue when mass-uploading photos)
    • categories are more systematic, but can be confusing
    • using an AI tool that is still being developed: it will suggest text for the photo in several languages, and you can then choose which suggestions you want
  • issues with using merged things rather than using them separately; structural issues; merging bots and tools
  • Difference between caption and description: a caption should be short and easily translatable, whereas a description can be longer
    • expected benefits: easier search, AI support for tagging files (in all languages), and more properties (expected in the near future)
  • In the future, search will use a query search with a query helper (should be working by the end of 2019)
  • To work with Wikidata you need to learn the Wikidata language (if you type in your own language, Wikidata will automatically find the matching English entry)
  • Tagging on Wikimedia Commons:
    • for now, tag files the way we do with categories
    • tag the major details only
    • be as specific as you can be
    • you can use similar tags if the ones you need do not exist
    • tags that have no translation in the user's language will still be shown in English; the others will be shown automatically in the preferred language
  • the question of defining what is relevant and what is not; this needs to be defined because of the tags
  • Wikidata serves as a database for many other projects, not just Wikipedia
  • keep some categories as required
  • if you click on something, it should be copied to the description page. When mass-uploading photos, it might be possible to add languages (if you want several) and have them applied automatically to all the photos, so you do not have to do it manually for each one.
  • Wakeup: AC/DC – Highway to Hell from https://www.youtube.com/watch?v=gEPmA3USJdI ;-)
  • Tools for working faster:
    • AC/DC gadget: mass application of the same tag(s) to several photos of the same category
    • Cat-a-lot gadget: you can select which images you want to manipulate in bulk; selected files can be added to or removed from a given category
  • Feedback:
    • the principles of the project and the discussion were productive
    • there is a long way to go before this is useful for photographers
    • depending on the audience, it could be a bit simpler
    • explain what Wikimedia Commons is first
    • briefly explain what Wikidata is (items, properties, values)
    • advise that you can put more than one property on Commons
    • the question of whether the AI will tag files for users: yes, and there will be somebody to check the tags
    • useful once we can add more properties
    • AI will be used for already uploaded images as well (tags will still need to be approved by the user)
    • darktable/dtMediaWiki-plugin, concerning structured data support: the problem is the way data is entered; there is a difference between entering categories and entering structured data; make it more user-friendly (not just P and the number)
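
Two points from the notes above, showing untranslated tags in English and searching files by their tags, can be illustrated with a small Python sketch. It rests on several assumptions: the function names `pick_label` and `depicts_search_url` are hypothetical, the fallback rule is a simplification (the real language fallback on Commons may go through other languages before English), and the search relies on the `haswbstatement` search keyword with the "depicts" property P180. Q146 (house cat) is only an example item. The code builds a request URL but does not send it.

```python
from urllib.parse import urlencode

COMMONS_API = "https://commons.wikimedia.org/w/api.php"


def pick_label(labels, preferred_lang, fallback_lang="en"):
    """Return the label in the preferred language, else the fallback, else None.

    `labels` maps language codes to label text, e.g. {"en": "cat", "cs": "kočka"}.
    This is a simplified fallback rule, not the full MediaWiki fallback chain.
    """
    if preferred_lang in labels:
        return labels[preferred_lang]
    return labels.get(fallback_lang)


def depicts_search_url(item_id, property_id="P180"):
    """Build a Commons search API URL for files carrying a given statement.

    P180 is the "depicts" property; `item_id` is a Wikidata item such as "Q146".
    """
    params = {
        "action": "query",
        "list": "search",
        "srsearch": f"haswbstatement:{property_id}={item_id}",
        "srnamespace": 6,  # File namespace
        "format": "json",
    }
    return f"{COMMONS_API}?{urlencode(params)}"


if __name__ == "__main__":
    labels = {"en": "cat", "cs": "kočka"}
    print(pick_label(labels, "cs"))  # label exists in Czech
    print(pick_label(labels, "pl"))  # no Polish label, falls back to English
    print(depicts_search_url("Q146"))
```

Because the search is keyed on the item identifier rather than on an English category name, the same query works regardless of the language the user typed the tag in.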