Grants:Project/Fjjulien/Modelling and Populating Performing Arts Data in Wikidata

status: selected
Modelling and Populating Performing Arts Data in Wikidata
summary: This project will improve performing arts institutions’ capacity to populate performing arts data in Wikidata. The project will involve modelling, translation, populating and training activities.
type of grant: online programs and events
amount: $33,500 CAD
type of applicant: organization
grantee: Fjjulien
contact: frederic.julien@capacoa.ca
volunteer: 24.204.153.203
created on: 22:42, 28 January 2020 (UTC)


Project idea

What is the problem you're trying to solve?

Because of the intangible nature of live performance, the performing arts have resisted being modelled and translated as data. Recent initiatives in Belgium, Switzerland and Canada have led to the development of a performing arts ontology, parts of which have been implemented in Wikidata. However, modelling issues remain outstanding, Wikidata classes and properties are used inconsistently for performing arts items, and there are still very few items about the performing arts in Wikidata.

What is your solution?

We will strike a working group tasked to resolve known modelling issues and to identify a set of recommended properties and values for key performing arts entities. We will then develop learning resources to teach performing arts institutions how to use Wikidata, and we will deliver training in a series of workshops via web conference.

Project goals

  1. An expanded and validated model for populating performing arts related items in Wikidata.
  2. An increase in the accuracy and the number of performing arts items in Wikidata.

Project impact

How will you know if you have met your goals?

Goal 1: An expanded and validated model for populating performing arts related items in Wikidata.

Modelling Outputs/Deliverables
  • Sets of recommended Wikidata properties, constraints and values for: performing arts venues, organizations, artists and performance works.
  • A robust super-class/sub-class schema for the above entities (including controlled vocabularies for the most common attributes of the above entities)
  • Accurate English and French labels and descriptions for the above entities and properties
  • Alignment of Wikidata with classic RDF ontologies, with Schema.org, and with the Conceptual model for Linked Data in the Performing Arts.

Goal 2: An increase in the accuracy and the number of performing arts items in Wikidata.

Population Outputs/Deliverables
  • At least 300 performance works are added to Wikidata (as part of batch upload)
  • At least 100 performing arts organizations are added to Wikidata (as part of batch upload)
  • At least 100 people with occupations in the performing arts are added to Wikidata (as part of batch upload)
  • Qualitative enrichment of items for performance works, organizations, artists and venues (as a result of our training activities)

Note: We will run a series of SPARQL queries at the beginning of the project to set benchmarks against which we can measure population and enhancement of items resulting from our training activities.
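As an illustration of what such a benchmark query might look like, here is a minimal sketch in Python. It assumes the public Wikidata Query Service endpoint and counts items that are instances of a given class or of any of its subclasses. The function names, the User-Agent string and the choice of class are illustrative assumptions, not the queries the project actually ran.

```python
import json
import urllib.parse
import urllib.request

# Public Wikidata Query Service endpoint.
WDQS_ENDPOINT = "https://query.wikidata.org/sparql"


def build_count_query(class_qid: str) -> str:
    """Build a SPARQL query counting items that are instances (P31)
    of class_qid or of any of its subclasses (P279*)."""
    return (
        "SELECT (COUNT(DISTINCT ?item) AS ?count) WHERE { "
        f"?item wdt:P31/wdt:P279* wd:{class_qid} . "
        "}"
    )


def parse_count(results: dict) -> int:
    """Extract the integer count from a SPARQL JSON results document."""
    return int(results["results"]["bindings"][0]["count"]["value"])


def fetch_count(class_qid: str) -> int:
    """Run the query against the live endpoint (requires network access)."""
    params = urllib.parse.urlencode(
        {"query": build_count_query(class_qid), "format": "json"}
    )
    request = urllib.request.Request(
        WDQS_ENDPOINT + "?" + params,
        # Hypothetical User-Agent; replace with real contact details.
        headers={"User-Agent": "benchmark-sketch/0.1 (example)"},
    )
    with urllib.request.urlopen(request) as response:
        return parse_count(json.load(response))
```

Calling `fetch_count("Q43099500")` at the start and at the end of the project would give a before/after count for performing arts productions; comparable queries could be written for organizations, people and venues once the recommended classes are agreed.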

Training Outputs/Deliverables

Outcomes

Our two goals converge towards the same long-term outcome:

  • Performing arts institutions have an increased capacity to populate data in Wikidata according to a standardized conceptual model.

The standardization resulting from our modelling activities is likely to have ripple effects beyond Wikidata, as other knowledge bases adopt the same model. This semantic interoperability will in turn make it easier for performing arts institutions to donate data to Wikidata and to keep datasets in sync.

Moreover, as the same sets of classes, properties and values are used in a consistent fashion, Wikidata will deliver more value to data users as well.

Ultimately, if we succeed, Wikidata will become a key base register in a decentralized linked data ecosystem for the performing arts.

Do you have any goals around participation or content?

Both goals directly relate to increasing participation in Wikidata.

Here are our metrics for measuring participation:

  • Number of participants in core modelling working group: 8
  • Number of participating countries: 4
  • Number of participants at training and outreach activities: 190
  • Number of these participants who will have created a Wikimedia account as a result of our activities: 40

Other metrics relative to the number of Wikidata items will be added after we've run queries to establish benchmarks.

Project plan

Activities

Tell us how you'll carry out your project. What will you and other organizers spend your time doing? What will you have done at the end of your project? How will you follow-up with people that are involved with your project?

Modelling Activities

  • Strike an international Working Group on Wikidata and Performing Arts;
  • Research and address modelling issues; identify recommended properties, values and constraints;
  • Implement data modelling deliverables into Wikidata;
  • Edit the WikiProject Performing arts, WikiProject Cultural venues, and WikiProject Theatre to include the modelling deliverables;
  • Translate the WikiProjects into French;
  • Synchronize the Conceptual Model for Linked Data in the Performing Arts with the WikiProjects.

The working group will include a mix of domain experts and ontology development experts. This working group will be tasked to address modelling problems and to deliver a set of modelling deliverables (see above). In addition to their advisory role, working group members will also assist with outreach at the international level. The working group will meet 6-8 times over web conference to model key performing arts entities, one by one. They will be assisted by a team of consultants who will do preparatory work prior to each meeting. Working group members will be expected to read briefing documents prior to each meeting. They will also be expected to validate final decisions after action items have been carried out by the consultants.

Modelling activities will focus on four key classes of performing arts entities: venues, organizations, people and works.

Many modelling issues need to be addressed. The issues related to items confounding architectural structures (buildings, venues) and organizations are well documented in the WikiProject Performing Arts: in the Flanders Arts Institute series alone approx. 500 organization/venue items still need to be "separated". There are also many unaddressed questions around the performing arts work itself. For example, performing arts production (Q43099500) conflates work and event sub-classes whereas some institutional datasets model these as distinct classes of entities. This presents a challenge for properties, such as cast member (P161), whose values can differ from one performance to another. We also need to ensure that all main performing arts genres, occupations, and organization types fit within robust superclass/subclass hierarchies to enable more accurate query results. Finally, alignment with FRBRoo and with Schema.org will need to be carefully considered to ensure that the recommended classes and properties foster interoperability of performing arts knowledge bases and discoverability of works being performed at live events.
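To illustrate why robust superclass/subclass hierarchies matter for query accuracy, here is a minimal Python sketch. The class labels and edges are hypothetical and only stand in for "subclass of" (P279) statements; they are not the hierarchy the working group will recommend.

```python
from collections import defaultdict

# Hypothetical "subclass of" edges (child, parent). Labels are illustrative only.
SUBCLASS_OF = [
    ("contemporary dance", "dance"),
    ("ballet", "dance"),
    ("dance", "performing arts genre"),
    ("puppetry", "theatre"),
    ("theatre", "performing arts genre"),
]


def descendants(root, edges):
    """Return every class reachable from root by following
    subclass-of edges downwards (the transitive closure)."""
    children = defaultdict(set)
    for child, parent in edges:
        children[parent].add(child)
    found = set()
    stack = [root]
    while stack:
        node = stack.pop()
        for child in children[node]:
            if child not in found:
                found.add(child)
                stack.append(child)
    return found


# With a complete hierarchy, a query on the top class reaches every genre.
complete = descendants("performing arts genre", SUBCLASS_OF)

# If one link is missing, everything below it silently drops out of the results.
broken_edges = [e for e in SUBCLASS_OF if e != ("dance", "performing arts genre")]
incomplete = descendants("performing arts genre", broken_edges)
```

In this toy example, `complete` contains all five genres, while `incomplete` contains only "theatre" and "puppetry": a single missing subclass statement hides the whole dance branch from a query on the top-level class.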

Translation activities will include translation of WikiProjects as well as labels and descriptions of any classes and properties recommended in the WikiProjects.

Finally, we will implement all modelling decisions of the working group into the Conceptual Model for Linked Data in the Performing Arts. By ensuring that the conceptual model and Wikidata are fully mapped, we will set an international standard for conceptual and data modelling in the performing arts.

Validation and Population Activities

  • Acquire and populate sample data from the theatre sector, dance sector and presenting sector to validate and pressure test data modelling decisions.
  • Ingest the database of productions by Conseil québécois du théâtre (CQT) into Wikidata.

The CQT's Coup d'oeil dataset includes information about 236 theatre productions (performance works), as well as on the organizations and stage directors behind these productions (we do not have an exact count of unique organizations and unique people in the dataset). There is currently very little information in Wikidata about theatre companies and even less about performance works. This batch upload will therefore fill a critical gap in Wikidata. The batch upload will be performed by LaCogency and Antoine Beaubien, who have previous experience with ingests of movie datasets. Once the ingest is complete, we will provide training to theatre stakeholders so they can enrich the uploaded items with statements linking to items about dramatic works. This will be done as part of our training activities. Engagement with the uploaded data may take different forms: event listings may query the data to enrich their event descriptive metadata; research may be conducted on the theatre sector in Quebec.
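The proposal does not say which tool will be used for the batch upload. As one possible approach, the sketch below converts rows of a dataset into QuickStatements (version 1) commands, which create an item, set its French label and add an "instance of" (P31) statement. The row structure, the `label_fr` key and the sample title are hypothetical, and using Q43099500 as the target class is an assumption that depends on the working group's modelling decisions.

```python
def to_quickstatements(rows, class_qid="Q43099500"):
    """Turn dataset rows into tab-separated QuickStatements V1 commands.

    Each row is a dict with a hypothetical "label_fr" key holding the
    French title of a production.
    """
    lines = []
    for row in rows:
        # Double quotes delimit strings in QuickStatements, so swap any
        # quotes found inside the label for apostrophes.
        label = row["label_fr"].replace('"', "'")
        lines.append("CREATE")                      # create a new item
        lines.append(f'LAST\tLfr\t"{label}"')       # French label
        lines.append(f"LAST\tP31\t{class_qid}")     # instance of
    return "\n".join(lines)


# Made-up sample row, not taken from the CQT dataset.
sample_rows = [{"label_fr": "Titre fictif"}]
commands = to_quickstatements(sample_rows)
```

A real ingest would also need to check for existing items before creating new ones and to add references to each statement; those steps are left out of this sketch.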

Training and Outreach Activities

  • Develop training materials informed by the working group activities and deliverables.
  • Hold 12 interactive web conferences (6 in English and 6 in French) to help performing arts institutions build a capacity to populate Wikidata (and Wikimedia Commons).
  • Present project outcomes at one international conference.

The training materials will be presentation decks in English and French. These presentation decks will be made available under a CC0 licence.

The web conferences will be offered on a monthly basis between September 2020 and March 2021. The format will be a short introductory presentation, followed by one-on-one guidance as participants create Wikidata items and add/edit statements. Each web conference will focus on a single class of entities but questions on any class of entities will be welcome. In addition to serving training and outreach purposes, these web conferences will also be an opportunity to validate our model with data provided by participants.

Budget

Revenues

Revenue | Amount (CAD)
Province of Québec | $25,000.00
Wikimedia Foundation | $33,500.00
Canada Council for the Arts | $50,000.00
CAPACOA | $12,670.00
Conseil québécois du théâtre | $3,000.00
Total Revenues | $124,170.00

Expenses

Expense | Amount (CAD) | WMF Grant (CAD)
Research and working group coordination | $33,300.00 |
Fees for working group members | $15,980.00 | $11,500.00
Fees for modelling and implementation in Wikidata | $10,400.00 |
Synchronization and ingest of data | $5,000.00 | $5,000.00
Knowledge transfer: documentation and development of training materials | $14,720.00 | $14,000.00
Knowledge transfer: workshops over web conference | $17,290.00 |
Fees for implementation in Conceptual model for linked data in the performing arts | $6,000.00 |
Translation costs | $3,000.00 |
Travel expenses - domestic | $500.00 |
Travel expenses - international | $3,280.00 | $3,000.00
Salaries - CQT | $5,000.00 |
Administration - CQT | $500.00 |
Salaries - CAPACOA | $8,200.00 |
Administration - CAPACOA | $1,000.00 |
Total expenses | $124,170.00 | $33,500.00

Grant amount requested from Wikimedia Foundation: CAD$33,500

Community engagement

The WikiProject Performing Arts and the WikiProject Theatre will be our main forums for engaging with the Wikidata community. Two active contributors are already part of our core working group. Other contributors will be invited to review and to comment upon modelling recommendations.

One of our team members chairs the Wikidata Club for Wikimedia Canada. This community will be invited to our training and outreach activities.

The réseau des agents de développement numérique is a particularly active Canadian community of practice around culture and digital transformation. They will be closely involved in all training and outreach activities.

While many working group members and team members are based in Canada, we are firm in our intent of giving this project an international scope. Core working group members are active participants at international performing arts and semantic web conferences, and they will use these networks to raise awareness of the project. In addition, we will seek a presentation opportunity at one international conference (tbc).

Blog posts

Get involved

Participants

Confirmed Core Working Group Members

Participant | Organization | Country | Role | User
Frédéric Julien | CAPACOA | Canada | Project Lead | Fjjulien
Joana Neto Costa | Conseil québécois du théâtre | Canada | Project Lead | Joananetoc
Jean-Robert Bisaillon | Espace Temps | Canada | Field expert | Youyouca
Anju (Singh) Christofferson | CultureBrew.Art | Canada | Field expert |
Denise Bolduc | Creative director | Canada | Field expert |
Jenny Fewster | AusStage / Flinders University | Australia | LOD/Ontology expert | Fewster
Beat Estermann | Bern University of Applied Sciences | Switzerland | LOD/Ontology expert | Beat_Estermann
Bart Magnus | Packed | Belgium | LOD/Ontology expert | Beireke1
Gregory Saumier-Finch | Culture Creates | Canada | LOD/Ontology expert | Saumier
Kathleen Smith | DancePlusWords | Canada | Field expert |

Beat Estermann and Bart Magnus have extensive experience in modelling performing arts information and populating items in Wikidata. Both have been key contributors to the WikiProject: Performing Arts and the WikiProject: Theatre.

Jenny Fewster has worked on performing arts databases since the early 1990s. Since 2003, she has been the project manager for AusStage, the Australian national online resource for live performance research.

Gregory Saumier-Finch is Chief Technical Officer at Culture Creates, a company developing RDF-based solutions for arts organizations. Culture Creates is leading the development of Artsdata.ca, a Canadian knowledge graph for the performing arts.

Frédéric Julien is Director of Research & Development at the Canadian Arts Presenting Association (CAPACOA). He leads the Linked Digital Future initiative, which involves a range of research, prototyping and digital literacy activities to foster discoverability, digital collaboration and digital transformation in the performing arts.

Joana Neto Costa is agente de développement numérique at Conseil québécois du théâtre. She is part of a network of agents de développement numérique responsible for assisting the arts sector in its digital transformation.

At least two more domain experts will be brought into the working group. Additional domain experts may be invited to work on specific entities.


Supporting Team

Participant | Organization | Country | Role | User
Véronique Marino | LaCogency | Canada | Project coordinator | Vero Marino
Antoine Beaubien | Consultant | Canada | IT/Wikidata Expert | Antoine2711
Simon Villeneuve | Consultant | Canada | IT/Wikidata Expert | Simon Villeneuve
Miguel Tremblay | Consultant | Canada | IT/Wikidata Expert | Dirac


Véronique Marino, Antoine Beaubien, Simon Villeneuve and Miguel Tremblay will provide support with coordination, research and implementation. They have previous experience modelling and ingesting independent movie data into Wikidata. Miguel also led the WikiProject Weather observations.

  • Volunteer

Hi there, this project seems incredible and although I have little experience with Wikimedia and the Wiki platform, I'm very interested in learning more. I work with artists and cultural institutions for better discoverability. Maxime 24.204.153.203 18:23, 8 July 2020 (UTC)

Hi Maxime,
Thanks for your interest in the project. If you would like, we invite you to participate in our progressive workshop series. We are holding workshops in English and French every month. You may register to upcoming workshops or watch recordings from past workshops on this website: https://linkeddigitalfuture.ca/wikidata/.
Fjjulien (talk) 19:08, 23 September 2020 (UTC)

Community notification

  • Post to Wikidata Chat (English), also notifying the Wikiproject "cultural heritage".
  • Post to GLAMwiki Global Telegram group.
  • Post to two Canadian arts + digital transformation Facebook groups.
  • CAPACOA's newsletter (2,400 subscribers).

Endorsements

Do you think this project should be selected for a Project Grant? Please add your name and rationale for endorsing this project below! (Other constructive feedback is welcome on the discussion page).

  • The Quebec council of theater (CQT) believes in this project. CQT believes that this project will contribute to the establishment of international data standards for the performing arts and their adoption by anyone who publishes data in Wikidata anywhere in the world.
    Ultimately, this project will also contribute to the discoverability of the performing arts, since Wikidata is a knowledge base prized by search engines and recommendations.
    At the time of filing of this project (February 2020), there are only 21 organizations working in the performing arts in Canada in Wikidata. We are rather confident that this number will increase by at least 100% over the course of the project. Joana Neto Costa
  • Support Great initiative and good to see a proposal that uses WMF funds to attract a greater amount in matched funding. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 22:50, 18 February 2020 (UTC)
  • Support I endorse this project, which promises substantial advances in establishing a linked open data ecosystem for the performing arts, drawing both on Wikidata and the classical linked open data approach. I personally see the following reasons to support the project:
      • it proactively addresses current data modelling issues by involving key players at the international level;
      • it is driven by the arts sector itself and has an important outreach component that aims to involve many arts organizations in curating Wikidata entries;
      • it puts data quality above quantity, thus laying a solid foundation for further work to build upon it;
      • it foresees a substantial amount of co-funding alongside the WMF grant.
    Once it is up and running, there are several ways to expand on this project:
      • We can promote the use of Wikidata-driven infoboxes on Wikipedia. In the context of the Sum of All GLAMs project, we have started to implement performing arts related infoboxes (example). This practice can be further promoted throughout the Wikipedia communities when we have high-quality, properly sourced Wikidata entries for relevant organizations and venues.
      • Once the Swiss Archive of the Performing Arts has published its performance history database as linked open data (coming soon™), I can give out one or several student assignments that directly plug into this project and complement it from the Swiss side.
      • The outreach part of the project can be transferred to other countries; at the Bern University of Applied Sciences efforts are under way to build up a network of institutions and organizations that are able to roll out such activities across a variety of countries.
    --Beat Estermann (talk) 22:58, 18 February 2020 (UTC)
  • Support This project can give a boost to a more standardised modelling of performing arts data. Wikidata is a great place to search consensus on this issue that has long been a topic of discussion in the field. It can bring "real" linked open data a step closer in this area. It can show the benefits of a layer of shared data on top of existing databases, a layer that can be enriched by the community at large. The international cooperation in this project is crucial because it brings together different perspectives and languages that need to be aligned. Beireke1 (talk) 08:32, 19 February 2020 (UTC)
  • hugely important! Hope this comes through.
    Best wishes from Flanders Arts Institute, Belgium. --tom Ruettet (talk) 14:00, 19 February 2020 (UTC)
  • Support This project fulfills a need for performing arts and cultural heritage in Canada and for performing arts as a whole. It builds into existing initiatives in Wikimedia projects and I fully endorse and support this work. Smallison (talk) 15:23, 24 February 2020 (UTC)
  • Support I support the project and can help as part of the working group, with data harmonization, and with data munging/upload/OpenRefine/bots etc. I'm a long-term ontology, GLAM and LOD expert and an active Wikidata contributor --Vladimir Alexiev (talk) 15:23, 1 March 2020 (UTC), Chief Data Architect at Ontotext, Bulgaria
  • Support As a venue active in presenting in the performing arts field, the need for improved access to usable open data that can connect each discipline and role within the performing arts industry is great. A universal platform such as Wikidata offers far more promise than initiatives developed within geographic or specific disciplines. Glenn Brown, Sanderson Centre for the Performing Arts

COVID-19 planning

This project was originally designed to take place almost exclusively online. All core components of our work plan - modelling, population and training activities - are planned to take place online.

We were only planning two offline activities.

1- a kick-off meeting with Canadian participants

This meeting will now be held online rather than in person. This won't alter or harm our project goals.

2- a presentation at an international conference

Presentations at conferences are valuable outreach activities but they are in no way the only means to share the project's outcomes with an international audience. We can share project outcomes in journals, magazines and newsletters. If the restrictions on mass gatherings last beyond the duration of the project, we can reasonably anticipate that online gatherings will be organized in lieu of live events, in which case we might have more opportunities to reach an international audience at a lower cost.

If anything, the COVID pandemic may positively impact the project. The performing arts sector is deeply affected by public health orders on gathering and social distancing. As a result, many performing arts workers are short of work and are looking for productive ways to use their confinement time. CAPACOA received several inquiries about this Wikidata project from people and organizations who are eager to start populating data in Wikidata. We intend to kick off the project as quickly as possible and we may start to deliver training web conferences as early as June (rather than in September as initially planned). In such a scenario, we would deliver a larger number of web conferences and this would increase our knowledge transfer fees.