Grants:Project/Fjjulien/Modelling and Populating Performing Arts Data in Wikidata/Final



Report under review
This Project Grant report has been submitted by the grantee, and is currently being reviewed by WMF staff.



Welcome to this project's final report! This report shares the outcomes, impact and learnings from the grantee's project.

Part 1: The Project

Summary

Project Goals

Goal 1: An expanded and validated model for populating performing arts related items in Wikidata.

Based on committee work and on population activities, we defined sets of recommended properties and sample items reflecting good practices in Wikidata. We enhanced the class hierarchy for organizations and we worked on a typology of roles for persons. Whenever possible, we stated equivalent classes and equivalent properties in classic RDF ontologies.

Goal 2: An increase in the accuracy and the number of performing arts items in Wikidata.

We uploaded more than 1,600 theatre items. We delivered 18 workshops to 324 participants and we observed a high level of engagement among many participants who had never contributed to Wikidata before. While there is still a lot of performing arts information to populate in Wikidata, we observed many anticipated and unanticipated impacts suggesting a growing enthusiasm for Wikidata in the arts community.

Project Impact

Targets

  1. In the first column of the table below, please copy and paste the measures you selected to help you evaluate your project's success (see the Project Impact section of your proposal). Please use one row for each measure. If you set a numeric target for the measure, please include the number.
  2. In the second column, describe your project's actual results. If you set a numeric target for the measure, please report numerically in this column. Otherwise, write a brief sentence summarizing your output or outcome for this measure.
  3. In the third column, you have the option to provide further explanation as needed. You may also add additional explanation below this table.
| Planned measure of success (include numeric target, if applicable) | Actual result | Explanation |
|---|---|---|
| 100 person items uploaded | 1129 person items uploaded | |
| 100 organization items uploaded | 129 organization items uploaded | |
| 300 work items uploaded | 379 work items uploaded | While the work items were created, it was not possible to add as many statements about contributors and cast members as we would have hoped because too many person items didn't exist yet in Wikidata. |
| 4 presentation decks, in English and French | 4 presentation decks, in English and French; 37 video tutorials, in English and French; presentation decks and tutorials are accessible on the Linked Digital Future website and on YouTube | Grant advisors had expressed a desire for video tutorials. We delivered them. |
| 12 workshops; 190 total participants | 18 workshops; 1 webinar; 4 LODEPA workshops and working group events; 1 gathering with Indigenous artists; 2 office hours events; 394 total participants; 324 total participants at workshop series; 171 unique participants at workshop series | We underpromised and overdelivered. |
| Enrichment of items | Workshop participants created 1450 items and made 12,700 edits | |
| 8 participants in core modelling advisory group; participants from 4 different countries | 14 participants in core modelling advisory group; participants from 4 different countries; one Indigenous advisor joined the group | LaCogency hired two additional Wikidata experts to help with both the advisory group and the workshops |

Other modelling outcomes:
  • Enhancements to WikiProject Performing arts:
    • 29 recommended properties for persons, with examples from well modelled items and with a set of notes for usage.
    • 34 recommended properties for organizations, with examples from well modelled items and with a set of notes for usage.
    • A typology of occupations, positions and roles was created.
    • The WikiProject was translated into French.
  • Enhancements to WikiProject Cultural venues:
    • 32 recommended properties for cultural venues, with examples from well modelled items and with a set of notes for usage.
    • The WikiProject was translated into French.
  • The community decided that WikiProject Theatre should be merged with WikiProject Performing arts. Recommended properties for persons and organizations were carried over from WikiProject Theatre to WikiProject Performing arts.

Unanticipated outcomes: We are observing spillover effects from the project:
  • The Performing Arts Aotearoa group wants to undertake similar work.
  • Dance+Words, a Canadian initiative dedicated to enhancing dance-related articles in Wikipedia, now plans to integrate Wikidata in its activities.
  • In addition to French, WikiProject Performing arts is now being translated into Turkish, Polish, and Bahasa Indonesia.
  • Associations in other sub-sectors of the arts are undertaking equivalent activities. In Québec, Productions Rhizome recently received a provincial grant to enhance the presence of writers and their works across Wikimedia sister projects.
  • Discussions are being held with Statistics Canada and other stakeholders about uploading the Open Database of Cultural and Art Facilities to Wikidata.


Story

Wikidata is an "acquired taste". At first, it felt rather unappealing and foreign to workshop participants and to domain experts in the advisory committee: structured data and performing arts creation felt like two very different worlds to most performing arts practitioners. But then, as they started making their first statements, they realized how an RDF triple resembles a sentence, how one-to-many relationships (e.g., multiple "instance of" values) are well suited to describing the multifaceted nature of social constructs, and how these endless series of relationships mimic how we structure information in our own brains. Midway through the workshop series, participants had moved beyond the geeky features of Wikidata and were starting to enjoy its profoundly human nature and way of functioning. By the end of the series, their questions were at times touching upon complex modelling issues.
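To make the sentence analogy concrete, here is a minimal sketch of statements written as subject, predicate, object triples. The item "Example Theatre Company" and the person "Jane Doe" are hypothetical and used only for illustration; the class and property labels with identifiers are the ones cited elsewhere in this report.

```python
from collections import defaultdict

# Hypothetical item, used only for illustration (not a real Wikidata entity).
ITEM = "Example Theatre Company"

# Each triple reads like a short sentence: subject, predicate, object.
triples = [
    (ITEM, "instance of", "theatrical troupe (Q742421)"),
    (ITEM, "instance of", "nonprofit organization"),  # second facet of the same entity
    (ITEM, "artistic director (P8938)", "Jane Doe"),  # hypothetical person
]


def as_sentence(triple):
    """Render a triple as a readable subject -> predicate -> object line."""
    subject, predicate, obj = triple
    return f"{subject} -> {predicate} -> {obj}"


def values_by_predicate(statements):
    """Group objects by (subject, predicate) to expose one-to-many relationships."""
    grouped = defaultdict(list)
    for subject, predicate, obj in statements:
        grouped[(subject, predicate)].append(obj)
    return dict(grouped)


for t in triples:
    print(as_sentence(t))
```

Grouping by predicate shows that a single item can carry several "instance of" values at once, which is the one-to-many pattern participants found natural once they started editing.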

The following testimony from one committee participant offers a beautiful illustration of the initiatory journey experienced by some participants:

« Je n’interprète plus du tout les choses de la même façon. Wikidata, c’est ce qu’on fait dans la vie de manière intuitive. Et qu’on fait, là, en toute conscience. »
("I no longer see things the same way. [Organizing information in] Wikidata, that's the kind of thing we do all the time unconsciously. And that we now do in full awareness.")

Survey(s)

Motivations of the respondents
Level of knowledge prior to participation

The survey was sent to participants who had attended at least one of the 9 workshops offered in English and French. The number of responses was low (12), but the views expressed by respondents matched what we heard from most participants during the last workshops and in email exchanges at the end of the project. We attribute the low response rate to the current period, in which digital overload is causing widespread fatigue. Despite this, the number of unique participants and the rate of repeat attendance demonstrate a very high level of commitment from the community gathered around the workshops throughout the program. In initiating this series of workshops, our goals were to raise awareness within the performing arts community of the importance of taking ownership of their metadata, to increase digital literacy in terms of metadata and digital discoverability, and to encourage as many people as possible to contribute to Wikidata themselves on a regular basis.

From the survey responses, we can draw the following general findings:

  • Respondents recognize the importance of mastering and contributing to metadata.
  • Respondents regard Wikidata as an effective way to support their digital discoverability.
  • Wikidata is increasingly known in the performing arts community.
  • 83% of respondents have acquired the skills needed for the community itself to keep producing and maintaining quality performing arts metadata.
Description of user's experience with the workshop series and Wikidata


Three interesting outputs or outcomes that the survey revealed:

  • The performing arts community in Quebec shows rapidly growing digital maturity.
  • Respondents greatly appreciate having access to online resources from the workshops to support them in their actions related to their organization.
  • 100% of respondents say they want to continue contributing to Wikidata.


For more details, here is the link to the survey details: Survey report.

Other

Stacked area chart showing increases in seven performing arts occupations. The increase among actors/actresses is particularly large.
This chart shows the increases in the number of Wikidata person items that have performing arts occupations between March 11, 2019 and March 31, 2021.

Over the course of the project, the total count of performing arts person items increased by 17.9%. While this increase cannot be directly attributed to the project, we are proud to have contributed to it.
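As an illustration of how such counts might be obtained, the sketch below builds a SPARQL query for the Wikidata Query Service that counts person items with a given occupation, and computes a percentage increase between two counts. This is not necessarily how the chart above was produced: the function names are hypothetical, and a live query only returns the current count, so the dated snapshots behind the chart would have to come from saved results.

```python
import re


def count_people_with_occupation(occupation_qid: str) -> str:
    """Return a SPARQL query counting humans (Q5) whose occupation (P106) is the given item."""
    if not re.fullmatch(r"Q[1-9]\d*", occupation_qid):
        raise ValueError(f"not a Wikidata item ID: {occupation_qid!r}")
    return (
        "SELECT (COUNT(DISTINCT ?person) AS ?count) WHERE {\n"
        "  ?person wdt:P31 wd:Q5 ;\n"
        f"          wdt:P106 wd:{occupation_qid} .\n"
        "}"
    )


def percent_increase(before: int, after: int) -> float:
    """Relative growth between two snapshot counts, in percent."""
    if before <= 0:
        raise ValueError("the earlier count must be positive")
    return (after - before) / before * 100


# Q33999 is the Wikidata item for "actor".
print(count_people_with_occupation("Q33999"))
```

Running the same query at two dates and passing both counts to `percent_increase` gives a figure comparable to the one reported above, per occupation.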

Methods and activities

Modelling activities

  • We assembled a committee of Wikidata and domain experts. The committee met six times between June 2020 and March 2021. Specific tasks were delegated to committee members in between meetings.
  • We modelled performing arts persons, with a focus on occupations, positions in organizations, and roles in works.
  • We modelled performing arts organizations, with a focus on the class hierarchy, on legal information, and on key personnel.
  • We modelled cultural venues, highlighting the conceptual distinction between buildings and the organizations managing them.
  • We based our modelling decisions on:
    • Current usage in Wikidata
    • Concept(s) described in linked Wikipedia articles
    • Experience (and challenges) populating performing arts information in Wikidata
    • Modelling approaches in classic RDF ontologies (aiming for harmonization)
  • We documented recommended properties and classes, along with usage notes and examples, in the related WikiProjects.
  • We notified WikiProject participants of significant modelling decisions or of proposals requiring broader community input.
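The class hierarchy work described above can be pictured as a graph of "subclass of" statements. The following is a minimal sketch under stated assumptions: the edges are illustrative rather than an export of the actual Wikidata hierarchy, the classes without identifiers are placeholders, and `ancestors` is a hypothetical helper that walks the chain transitively, much as a `wdt:P279*` path does in SPARQL.

```python
# Illustrative "subclass of" edges (child -> parents). These are assumptions for
# the sketch, not a dump of the real hierarchy.
SUBCLASS_OF = {
    "theatrical troupe (Q742421)": ["performing arts group (Q105815710)"],
    "dance troupe": ["performing arts group (Q105815710)"],  # placeholder class
    "performing arts group (Q105815710)": ["organization"],  # placeholder parent
}


def ancestors(cls, edges):
    """Return every class reachable by following "subclass of" edges upward."""
    seen = set()
    stack = [cls]
    while stack:
        current = stack.pop()
        for parent in edges.get(current, []):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def is_subclass_of(cls, candidate, edges):
    """True if `candidate` is a direct or indirect superclass of `cls`."""
    return candidate in ancestors(cls, edges)


print(sorted(ancestors("theatrical troupe (Q742421)", SUBCLASS_OF)))
```

A single superclass gives every specific class one shared ancestor, so a query or a constraint written against the superclass covers all of them.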

Data population activities

  • The upload of Conseil québécois du théâtre's dataset was performed in October 2020.
  • Here is a sample organization item.

Training activities

  • We designed and delivered four introductory workshops and five hands-on workshops, in English and French.
  • Presentation slides and workshop recordings were made available on the project website.
  • Workshop recordings were edited for dissemination on CAPACOA's YouTube channel. A short recap of each workshop was also produced.

New activities

  • Since September 2020, CAPACOA has been co-chairing a Wikipedia/Wikidata Working Group as part of the Linked Open Data Ecosystem for the Performing Arts initiative. This international working group provides a forum for discussing use cases for Wikidata. It is also an opportunity for us to promote the project at the international level.
  • Members of the advisory committee undertook a consultative process with Indigenous arts and culture practitioners to assess whether and how their information should be populated in Wikidata. This process began towards the end of the project and is still ongoing at the time of submitting this report.

Project resources

Please provide links to all public, online documents and other artifacts that you created during the course of this project. Even if you have linked to them elsewhere in this report, this section serves as a centralized archive for everything you created during your project. Examples include: meeting notes, participant lists, photos or graphics uploaded to Wikimedia Commons, template messages sent to participants, wiki pages, social media (Facebook groups, Twitter accounts), datasets, surveys, questionnaires, code repositories... If possible, include a brief summary with each link.

Project resources
Name Description Link
Main WikiProject:Performing_arts Page of the WikiProject (EN) Web pages
Main WikiProject:Performing_arts the French translation Project in French
Related WikiProject for Cultural_venues Page of the related WikiProject Web pages
WORKSHOPS
Survey related to the workshops Detailed survey report Report's link
Outreach dashboard CAPACOA workshops campaign Link to the detailed dashboard
Wikidata Atelier 1: Initiation (FR) Part 1 of 3 Video
Part 2 of 3 Video
Part 3 of 3 Video
Wikidata Workshop 1: Introduction (EN) Part 1 of 3 Video
Part 2 of 3 Video
Part 3 of 3 Video
RECAP Video
SLIDES Slides
Wikidata Atelier 2: Contribuer (FR) Part 1 of 4 Video
Part 2 of 4 Video
Part 3 of 4 Video
Part 4 of 4 Video
RECAP Video
SLIDES Slides
Wikidata Workshop 2: Contributing to WD (EN) Part 1 of 4 Video
Part 2 of 4 Video
Part 3 of 4 Video
Part 4 of 4 Video
RECAP Video
SLIDES Slides
Wikidata Atelier 3: introduction au Service de requête de Wikidata (FR) Part 1 of 3 Video
Part 2 of 3 Video
Part 3 of 3 Video
RECAP Video
SLIDES Slides
Wikidata Workshop 3: Introduction to Wikidata Query Service (EN) Part 1 of 3 Video
Part 2 of 3 Video
Part 3 of 3 Video
RECAP Video
SLIDES Slides
Wikidata Atelier 4: Wikidata et Wiki Commons (FR) Part 1 of 5 Video
Part 2 of 5 Video
Part 3 of 5 Video
Part 4 of 5 Video
Part 5 of 5 Video
RECAP Video
SLIDES Slides
Wikidata Workshop 4: Wikidata and Wiki Commons (EN) Part 1 of 5 Video
Part 2 of 5 Video
Part 3 of 5 Video
Part 4 of 5 Video
Part 5 of 5 Video
RECAP Video
SLIDES Slides
Wikidata Workshop 5 to 9 (EN) Agenda and references for the workshops Workshop 5 to 9 - support
Wikidata Workshop 5 to 9 (FR) Agenda and references for the workshops Workshop 5 to 9 - support
WEBSITES
CAPACOA/LDF Website Pages related to the project (FR) Web pages
CAPACOA/LDF Website Pages related to the project (EN) Web pages
Blog on the LDF website Blog post Post (EN) + Post (FR)
The CQT website Pages related to the project Web pages only in French
ADVISORY COMMITTEES
1st Committee Agenda and minutes Document
2nd Committee Agenda and minutes Document
3rd Committee Agenda and minutes Document1 + Document2 + Document3
4th Committee Agenda and minutes Document1 + Document2
5th Committee Agenda and minutes Document1 + Document2 (spreadsheet)
6th Committee Agenda and minutes Document
LODEPA Wikidata/Wikipedia Working Group Agenda and minutes September 22 + November 19 + February 11

MAIN MODELLING ACTIVITIES
The class item for performance (Q35140) Moved references to distinct concepts, mapped it to external ontologies, and referenced the subclass statements. performance (Q35140)
The performing arts group superclass Subclass statements were added to 11 performing arts class items Performing arts group (Q105815710) + Resulting Graph
Subclasses of "performing arts buildings" and subclasses of "event venue" The typology of performing arts venues and buildings is fairly clean Resulting Graph + Resulting Graph2 + Documented discussion on the WikiProject Cultural venues talk page

Artistic director (P8938) We proposed a new property. The artistic director is a key executive in performing arts organizations and it needed to be represented with its own property Artistic director + Discussion page
Canadian Business Number (P8860) We proposed a new property. Although the Canadian BN is currently not easy to retrieve, its high prevalence makes the BN a useful identifier for disambiguating legal entities, especially in domains where global unique persistent identifiers do not exist or do not have broad adoption in Canada Canadian Business Number + Discussion page
DATASET INGESTIONS
Final count of the CQT's ingestion Specific queries around the CQT dataset

Learning

What worked well

What didn’t work

  • The modelling and upload of works didn't go as anticipated. We were still able to upload works and to deliver a workshop on this topic. However, we were not able to deliver a mature modelling strategy for works/productions.
    • First of all, IFLA undertook consolidation of LRMER with CIDOC-CRM and FRBRoo, and their draft documentation announced the deprecation of FRBRoo F20 Performance Work and F25 Performance plan – two fundamental classes around which all our modelling of works was supposed to happen. Considering we had promised to deliver modelling outputs harmonized with RDF ontologies, this presented a problem. We got in touch with the Consolidation Editorial Group for more information, but their initial answers have not provided any indication of how they intended to model the relationship between core WEMI works and performance works.
    • Second, a W3C community group dedicated to Performing Arts Information representation started their own modelling activities. It seemed wise not to duplicate efforts, but rather to work along with this PAIR W3C community group to model works.
    • Third, even though we had previously uploaded person items, we were short of existing person items to make statements about cast members and contributors to works/productions.
  • Interwiki links between Wikidata and Wikipedia can be quite difficult to reconcile. Wikidata and Wikipedia have different cultures. Whereas Wikidata requires conceptual clarity, Wikipedia does not care as much if an article describes more than one concept. This leads to the following problems.
    • Wikidata class items linked to many Wikipedia articles are very difficult to grasp. We found a lot of conceptual divergences in Wikipedia articles that relate to the same item (see, for example, theatrical troupe (Q742421) and this documentation). This makes it quite challenging to harmonize descriptions across languages and to define the right class hierarchy. In the case of performing arts organizations, the easiest solution to establishing a unified class hierarchy was to create a superclass (performing arts group (Q105815710)) that was harmonized with RDF ontologies rather than linked with Wikipedias.
      • One potential solution to enhance conceptual clarity in Wikipedia articles for classes of entities is the use of Wikidata-powered authority control templates. This practice should be encouraged among Wikipedians, since it opens up an opportunity for Wikipedians to consider authority files stated in Wikidata.
    • Wikidata named entity items linked to Wikipedia articles also present challenges. For example, many Wikipedia articles describe both a building and the organization that manages it. This results in Wikidata items that describe two entirely distinct named entities - and includes links to external identifiers for both. Separating these Wikidata items into two distinct items is difficult and time-consuming. While certain statements can easily be attributed to the right entity, others require verification. Among other things, checking external identifiers and asserting which entity each one relates to can be difficult, since some base registers can be hard to make sense of for humans.
    • Sometimes, it is necessary to split a Wikidata item describing both a building and an organization into two distinct items. But then, there is the question of which entity keeps the original Q number and which one gets the new one. This can break interwiki links or create inaccurate links. For example, this can be a problem when Wikimedia Commons holds content about the building, but the building is assigned a new Q number: links are broken.
      • One potential solution to enhance conceptual clarity in Wikipedia articles about named entities is the use of infoboxes. Whether or not these infoboxes are powered by Wikidata, the creation of an infobox requires the user to choose a template for a specific type of article. This forces a conceptual choice between one type/class or another.
    • Other potential solutions to achieve greater ontological precision in Wikipedia could include:
      • Wikipedia articles should be clear in their introductory sentence as to what concept they are describing.
      • Consider constraints to make certain categories mutually exclusive (the same would be needed for Wikidata "instance of" statements).
      • Make it possible to flag conceptually ambiguous Wikipedia articles that describe more than one concept or named entity.
      • Articles that aren't conceptually clean should not be linked to Wikidata or else should be linked with a warning.
      • One-to-many relationships between Wikipedia and Wikidata could also be considered when a Wikipedia article describes distinct entities. Many-to-one relationships are also needed when different articles describe the same authority record or economic classification.
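The item-splitting problem described above can be sketched as a triage of statements into three buckets: those that clearly belong to the building, those that clearly belong to the organization, and those that need manual verification. This is a minimal sketch, not a tool used in the project. The property groupings are assumptions chosen for illustration, and all statement values are placeholders.

```python
# Assumed groupings, for illustration only.
BUILDING_PROPERTIES = {
    "P625",  # coordinate location
    "P84",   # architect
}
ORGANIZATION_PROPERTIES = {
    "P8938",  # artistic director
    "P8860",  # Canadian Business Number
    "P1454",  # legal form
}


def split_statements(statements):
    """Sort (property, value) pairs into building, organization, and review buckets."""
    building, organization, needs_review = [], [], []
    for prop, value in statements:
        if prop in BUILDING_PROPERTIES:
            building.append((prop, value))
        elif prop in ORGANIZATION_PROPERTIES:
            organization.append((prop, value))
        else:
            # Ambiguous properties (e.g. inception, external identifiers)
            # must be checked by a person before being attributed.
            needs_review.append((prop, value))
    return building, organization, needs_review


# Placeholder statements from a hypothetical conflated item.
conflated_item = [
    ("P625", "<coordinates>"),
    ("P84", "<architect>"),
    ("P8938", "<artistic director>"),
    ("P8860", "<business number>"),
    ("P571", "<inception date>"),  # building opening or organization founding?
]

building, organization, needs_review = split_statements(conflated_item)
print(len(building), len(organization), len(needs_review))
```

The review bucket is where most of the effort goes: a date of inception or an external identifier may describe either entity, which is why splitting items was time-consuming in practice.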

Next steps and opportunities

  • While modelling activities around performing arts works/productions are being undertaken by IFLA and by the W3C PAIR Community group, further efforts should be made to populate person, organization, and venue items in anticipation of structural needs for work statements (i.e., stating the producing company, the cast members, the contributors, etc.). Unions and industry associations sometimes hold large datasets that could be uploaded.
  • There are opportunities to use more Wikidata-powered infoboxes in Wikipedia. We have started exploring this opportunity as part of LODEPA (Linked Open Data Ecosystem for the Performing Arts) working group meetings. Work on infobox templates is being proposed for a future project.
  • Wikidata/Wikipedia performing arts communities could benefit from some form of consolidation and better coordination. The LODEPA community could be a vehicle for this.
  • The vast majority of performing arts workers will never become regular Wikidata editors. For these cultural workers, monthly Wikidata training workshops such as the ones we offered during this project grant represent far too large a time commitment. In order to get the maximum number of Wikidata edits out of one-time participants, we think we could try out a data “clinic” approach in partnership with industry associations (loosely inspired by the model of blood donation clinics in the workplace, but held online). These clinics could provide step-by-step instructions for creating or editing a specific type of item so that participants could create/edit their own person or organization item over the course of the clinic. Such a clinic approach may offer a stronger value proposition ("what's in it for me") for cultural workers and for their associations than an introductory workshop series: it may provide us with a better carrot to get the discussion started (and hopefully recruit a larger number of regular editors). Organization items in particular could benefit from a clinic approach. There are few databases of performing arts organizations, and basic organization information (legal name, legal form, business number, date of inception, etc.) can be quite difficult to find on the Web. The clinic approach may help fill that particular information gap.
  • It is necessary to continue the consultative process to define which aspects of Indigeneity should be represented in Wikidata and how. This needs to be done with Indigenous people, in ways that respect their right to self-determination and the principle of data sovereignty. A first gathering happened over the course of this project. Among our next steps, we intend to employ the user acceptance testing methodology to understand what is working and not working for Indigenous participants when going through the process of creating and editing a Wikidata item (see this call for participation). We will make sure to share our findings back with the Wikidata community.

Part 2: The Grant

Finances

Actual spending

Revenues (CAD)

| Revenue | Initial budget | Revised budget | Actuals (March 31, 2021) |
|---|---|---|---|
| Province of Québec | $25,000.00 | $25,000.00 | $25,000.00 |
| Wikimedia Foundation | $33,500.00 | $33,500.00 | $33,500.00 |
| Canada Council for the Arts | $50,000.00 | $50,000.00 | $50,000.00 |
| CAPACOA | $13,100.00 | $15,812.00 | $12,384.00 |
| Conseil québécois du théâtre | $3,000.00 | $3,000.00 | $3,000.00 |
| In-kind - Bern University of Applied Sciences | $0.00 | $0.00 | $2,428.00 |
| Total Revenues | $124,600.00 | $127,312.00 | $123,884.00 |

Expenses (CAD)

| Expense | Initial budget | Revised budget | Actuals (March 31, 2021) | WMF Grant (CAD) |
|---|---|---|---|---|
| Research and working group coordination | $33,300.00 | $33,300.00 | $33,300.00 | |
| Fees for working group members | $16,410.00 | $22,630.00 | $20,810.00 | $11,500.00 |
| Fees for modelling and implementation in Wikidata | $10,400.00 | $10,400.00 | $10,400.00 | |
| Synchronization and ingest of data | $5,000.00 | $5,000.00 | $5,000.00 | $5,000.00 |
| Knowledge transfer: documentation and development of training materials | $14,720.00 | $14,720.00 | $14,720.00 | $14,000.00 |
| Knowledge transfer: workshops over web conference | $17,290.00 | $17,290.00 | $17,290.00 | $3,000.00 |
| Fees for implementation in Conceptual model for linked data in the performing arts | $6,000.00 | $6,000.00 | $6,000.00 | |
| Translation costs | $3,000.00 | $3,000.00 | $1,164.00 | |
| Travel expenses - domestic | $500.00 | $0.00 | $0.00 | |
| Travel expenses - international | $3,280.00 | $0.00 | $0.00 | |
| Fees for participants in consultation with Indigenous artists | $0.00 | $0.00 | $500.00 | |
| In-kind - Bern University of Applied Sciences | $0.00 | $0.00 | $2,428.00 | |
| Salaries - CQT | $5,000.00 | $5,000.00 | $5,000.00 | |
| Administration - CQT | $500.00 | $500.00 | $500.00 | |
| Salaries - CAPACOA | $8,200.00 | $8,200.00 | $8,200.00 | |
| Administration - CAPACOA | $1,000.00 | $1,000.00 | $1,000.00 | |
| Total Expenses | $124,600.00 | $127,040.00 | $126,312.00 | $33,500.00 |
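The expense totals can be cross-checked with a short script. The sketch below copies the figures from the expense table and sums each column. It assumes that a blank "WMF Grant" cell means zero, and the row labels are abbreviated for readability.

```python
# (label, initial budget, revised budget, actuals, WMF grant), all in CAD.
# Blank WMF Grant cells in the table are assumed to be 0.
EXPENSES = [
    ("Research and working group coordination", 33300, 33300, 33300, 0),
    ("Fees for working group members", 16410, 22630, 20810, 11500),
    ("Modelling and implementation in Wikidata", 10400, 10400, 10400, 0),
    ("Synchronization and ingest of data", 5000, 5000, 5000, 5000),
    ("Knowledge transfer: documentation", 14720, 14720, 14720, 14000),
    ("Knowledge transfer: workshops", 17290, 17290, 17290, 3000),
    ("Implementation in conceptual model", 6000, 6000, 6000, 0),
    ("Translation costs", 3000, 3000, 1164, 0),
    ("Travel expenses - domestic", 500, 0, 0, 0),
    ("Travel expenses - international", 3280, 0, 0, 0),
    ("Consultation with Indigenous artists", 0, 0, 500, 0),
    ("In-kind - Bern University of Applied Sciences", 0, 0, 2428, 0),
    ("Salaries - CQT", 5000, 5000, 5000, 0),
    ("Administration - CQT", 500, 500, 500, 0),
    ("Salaries - CAPACOA", 8200, 8200, 8200, 0),
    ("Administration - CAPACOA", 1000, 1000, 1000, 0),
]

# Totals as stated in the "Total Expenses" row.
REPORTED_TOTALS = (124600, 127040, 126312, 33500)


def column_totals(rows):
    """Sum the four numeric columns across all expense rows."""
    return tuple(sum(row[i] for row in rows) for i in range(1, 5))


computed = column_totals(EXPENSES)
for name, got, expected in zip(
    ("Initial", "Revised", "Actuals", "WMF grant"), computed, REPORTED_TOTALS
):
    status = "matches" if got == expected else "differs from"
    print(f"{name}: {got:,} {status} reported {expected:,}")
```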

Remaining funds

Do you have any unspent funds from the grant?

  • No

Documentation

Did you send documentation of all expenses paid with grant funds to grantsadmin(_AT_)wikimedia.org, according to the guidelines here?

  • Yes

Confirmation of project status

Did you comply with the requirements specified by WMF in the grant agreement?

  • Yes

Is your project completed?

  • Yes

Grantee reflection

We’d love to hear any thoughts you have on what this project has meant to you, or how the experience of being a grantee has gone overall. Is there something that surprised you, or that you particularly enjoyed, or that you’ll do differently going forward as a result of the Project Grant experience? Please share it here!

This project has been an exciting journey for the project team and for advisory committee members. If at times Wikidata felt a bit chaotic, we found this chaos to be creative rather than destructive. Wikidata allows for structure and clarity as well as flexibility. It is both rigorous and organic.

We also very much enjoyed the possibility of sharing monthly updates on our progress tab. This facilitated our reporting, internally as well as externally with other funders.

Thanks again for the support of the Wikimedia Foundation! Fjjulien (talk) 20:39, 30 April 2021 (UTC)