GLAM Wiki 2023/Program/Tags/Content uploads & workflows

Logo GLAM Wiki Conference 2023 ID: 2026 Open Refine Wikimedia Commons training session
Facilitators/Speakers: Spinster Time block: Morning Beginning: 10:30
Location: 401 Duration: 1 hr 15 min
Description:

A workshop on using OpenRefine to edit and upload files on Wikimedia Commons, with a focus on structured data. This workshop is for participants with an intermediate level: some prior knowledge of / experience with structured data on Wikimedia Commons is expected, and some experience with OpenRefine is desired.

After this session, attendees will have basic knowledge about using OpenRefine to edit and upload files to Wikimedia Commons, with a focus on structured data.

Experience level: Intermediate
Keywords: Capacity building & training, Linked data, Content uploads & workflows
Notes: #GLAMWiki232026
Next session: Music on Wikimedia projects


Logo GLAM Wiki Conference 2023 ID: M001 Music on Wikimedia projects
Facilitators/Speakers: Akbar Ali, Rute Correia Time block: Morning Beginning: 11:45
Location: Auditorium Duration: 30 min
Description:

In this session, you will learn about the Malayalam Mala Project (from Kerala, India) and One Year of Wiki Loves Música Portuguesa: how a thematic campaign helped create a strategic path for GLAM at Wikimedia Portugal.

The Malayalam Mala Project is a Wikimedia initiative to collect and preserve the lyrics of Malayalam malas, which are traditional songs of praise sung by Muslims in Kerala. Malas are a rich part of Kerala's cultural heritage, and they provide a unique way to learn about the history and traditions of the region. The Malayalam Mala Project is part of the GLAM project in Kerala, which is a larger initiative to digitize and preserve Kerala's cultural heritage. The Malayalam Mala Project collected the lyrics from a variety of sources, including manuscripts, published works, and oral tradition. The lyrics will be transcribed into Unicode and uploaded to Wikisource. To bring these malas to life, the project encouraged singers from the local community to record and share their renditions of these odes. These audio recordings will be made available on Wikimedia Commons, ensuring wider accessibility and preserving these traditional melodies for future generations.

The campaign Wiki Loves Música Portuguesa launched in November 2022. The project was developed as an aggregator: on one hand, it would leverage the ongoing Wikimedian residency at NOVA FCSH; on the other hand, the wide scope of the project could enhance other partnerships for Wikimedia Portugal, particularly regarding cultural heritage. Indeed, the initiative sparked multiple partnerships with other institutions, including Portugal's national public service broadcaster, amongst others, revolutionizing the work at WMPT. In this session, we will showcase our biggest achievements, reflect on the biggest challenges we faced and get feedback from the GLAM Wiki community on how we can do better.

Experience level: Beginner
Keywords: Content uploads & workflows, Wikimedia campaigns, Partnership building (GLAM Wiki collaborations, etc.), Good practices on digital heritage, Oral history and documentation
Notes: #GLAMWiki23M001
Next session: Co-Creating an AI Responsive GLAM-Wiki Curriculum


Logo GLAM Wiki Conference 2023 ID: 2306 DPLA's Digital Asset Pipeline: How we uploaded 4 million images of cultural heritage to Commons (so far)
Facilitators/Speakers: Dominic Byrd-McDevitt Time block: Beginning:
Location: Duration:
Description:

This talk will be an in-depth technical treatment of the Digital Public Library of America's digital asset pipeline—which is responsible for uploading 4 million images (by November, estimated) to Wikimedia Commons, as well as adding nearly 100 million structured data statements. DPLA is a national aggregator of cultural heritage metadata in the United States. DPLA's project has allowed it to become the largest overall contributor to Wikimedia Commons, and generate hundreds of millions of pageviews for its participating institutions. This presentation is a companion to DPLA's other proposal, which is primarily a discussion of the issues of strategy and movement capacity relating to the program—and this proposal is specifically to provide detailed information about how the technology actually works.

I will provide an overview of DPLA's organizational structure and its aggregation initiative, which makes all this possible. I will give a walkthrough of the DPLA Wikimedia Commons project and how it works. I will then spend the bulk of the presentation discussing the actual operation of our Wikimedia account and how we have accomplished what we have. Our bot is a set of scripts written in Python, which use pywikipediabot. We run the bot on an AWS server, with a script that uses aggregated metadata from our partners to determine which items are eligible for Commons and downloads them to S3. We must also transform the data from DPLA's data model to wikitext for upload, using a crosswalk. This wikitext is becoming increasingly ephemeral (hopefully someday unnecessary) as we transition to Structured Data on Commons. A separate data synchronization script is run periodically across all of DPLA's uploads, and adds or updates the metadata from the source in the form of structured data statements, so that the data can be displayed in Lua-powered templates.
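The crosswalk step described above can be illustrated with a minimal sketch. It is not DPLA's actual code: the source field names (`title`, `date`, `creator`, `source_url`), the mapping itself, and the choice of the Commons `{{Information}}` template are all assumptions made for illustration only.

```python
from typing import Mapping

# Hypothetical crosswalk: source-record field -> {{Information}} parameter.
# The real DPLA data model and target template are not given in the abstract.
CROSSWALK = {
    "title": "description",
    "date": "date",
    "creator": "author",
    "source_url": "source",
}


def record_to_wikitext(record: Mapping[str, str]) -> str:
    """Render one metadata record as an {{Information}} template call."""
    lines = ["{{Information"]
    for source_field, template_param in CROSSWALK.items():
        value = record.get(source_field, "").strip()
        if value:
            # A raw pipe would be parsed as a parameter separator,
            # so replace it with the {{!}} magic word.
            value = value.replace("|", "{{!}}")
            lines.append(f" |{template_param} = {value}")
    lines.append("}}")
    return "\n".join(lines)


# Made-up example record, for illustration only.
example = {
    "title": "Main Street, looking north",
    "date": "1912",
    "creator": "Unknown photographer",
    "source_url": "https://example.org/item/123",
}
print(record_to_wikitext(example))
```

Keeping the mapping in a plain dictionary means the wikitext layer can shrink field by field as more metadata moves into structured data statements.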

I hope this case study will provide insights for others trying to replicate any piece of this workflow on their own project.

Participants will leave the session with:

1. Technical aspects of bulk Wikimedia Commons upload from GLAM collections
2. Adding cultural heritage metadata as Structured Data on Commons, and running continuous updates
3. How iterative approaches allow technical projects to scale up over time

Experience level: Advanced
Keywords: Content uploads & workflows, Free, Libre & Open Source Software (FLOSS) for cultural heritage, Tech, platforms & tools
Notes: #GLAMWiki232306
Next session: Wikisource workshop


Logo GLAM Wiki Conference 2023 ID: 2257 Using Wikidata integration on the Wikimedia projects to enhance GLAM-WIKI content sharing
Facilitators/Speakers: Mike Peel, João Peschanski Time block: Morning Beginning: 9:00
Location: 401 Duration: 45 min
Description:
Slides
Releasing media content from GLAM on Commons has been really successful and important over many years. However, it's important that it doesn't exist there in a vacuum, but instead gets integrated into the rest of the Wikimedia projects so that it is visible and widely used. We highlight the use of Wikidata as an excellent strategy to do this. In particular, we highlight the way that content is then automatically reused and visible across many projects, e.g., various language Wikipedia infoboxes and lists (using Listeria), as well as Commons category infoboxes. The same also applies to metadata added directly to Wikidata, which can be used, e.g., in references (using Cite Q), dramatically increasing its visibility. We will cover other tools that can be used to manipulate and display media using Wikidata and its query service, both describing them and providing a how-to guide for implementing them in your projects. We also mention how developers can integrate with Wikidata within their applications and websites. We invite questions and conversations to identify future opportunities for Wikidata integration.
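As a rough illustration of using the Wikidata query service for GLAM content, the sketch below builds a SPARQL query for items in a given collection (property P195) that have an image (P18), and turns it into a request URL for the public endpoint. The helper names and the placeholder QID are assumptions for illustration; the session abstract does not specify any particular query.

```python
from urllib.parse import urlencode

WDQS_ENDPOINT = "https://query.wikidata.org/sparql"


def collection_images_query(collection_qid: str, limit: int = 50) -> str:
    """Build a SPARQL query for items in a collection that have an image."""
    if not (collection_qid.startswith("Q") and collection_qid[1:].isdigit()):
        raise ValueError(f"not a Wikidata item ID: {collection_qid!r}")
    return (
        "SELECT ?item ?itemLabel ?image WHERE {\n"
        f"  ?item wdt:P195 wd:{collection_qid} ;\n"
        "        wdt:P18 ?image .\n"
        '  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }\n'
        "}\n"
        f"LIMIT {int(limit)}"
    )


def query_url(sparql: str) -> str:
    """Return a GET URL that asks the query service for JSON results."""
    return WDQS_ENDPOINT + "?" + urlencode({"query": sparql, "format": "json"})


# "Q12345" is only a placeholder; substitute your institution's item ID.
sparql = collection_images_query("Q12345", limit=10)
print(sparql)
print(query_url(sparql))
```

The resulting URL can be fetched with any HTTP client; the same query text can also be pasted into the query service's web interface.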

Participants will gain an understanding of Wikidata and its uses for distributing, querying and visualising GLAM-WIKI content

Experience level: Beginner
Keywords: Capacity building & training, Content uploads & workflows, Tech, platforms & tools
Notes: #GLAMWiki232257
Next session: DPLA's Digital Asset Pipeline: How we uploaded 4 million images of cultural heritage to Commons (so far)


Logo GLAM Wiki Conference 2023 ID: 2355 Using the Campaign Product Event Registration Tools
Facilitators/Speakers: Alex Stinson Time block: Morning Beginning: 10:30
Location: 411 Duration: 45 min
Description:

The Campaign Product team has begun deploying features of Event Registration and Discovery to Meta Wiki. With the first deployment of features in 2023, this workshop will show how to use the tool, give folks a chance to express interest in it, and help give feedback on key features. To learn more you can visit the Campaign Product tools.

Participants will understand how the Event Registration feature works, learn about upcoming event discovery features, and learn how to ask for access to the first version of the tool.

Slides

Experience level: Beginner
Keywords: Content uploads & workflows, Tech, platforms & tools, Wikimedia campaigns
Notes: #GLAMWiki232355


Logo GLAM Wiki Conference 2023 ID: 2304 Wikidocumentaries for collaboration
Facilitators/Speakers: Tuukka Hastrup Time block: Morning Beginning: 11:15
Location: 410 Duration: 45 min
Description:

Wikidocumentaries is a website that anyone can use to navigate Wikimedia content in a visual way. It brings together data, images and texts related to any topic based on connections through Wikidata. Materials from connected external media repositories are displayed together with Wikimedia content, and they can be showcased in many ways, for example as photo collections, maps and graphs.

We invite collaboration to integrate Wikidocumentaries with other Wikimedia and GLAM projects. For some years now, Wikidocumentaries has displayed upload banners on pages about items in the Finnish and Indian Wiki Loves Monuments contests. This practice requires that the cultural heritage designations have been added to Wikidata.

This year we have worked with Structured Data on Commons as part of the Google Summer of Code program. Wikidocumentaries now makes it possible for anyone to select and upload suitable images from external media repositories to Wikimedia Commons. The workflow places the image in the correct Commons category and adds some basic Structured Data on Commons statements. This functionality was implemented by our GSoC intern Zexi Gong and it integrates with the Finnish national aggregator Finna. The workflow can be extended to more cultural heritage repositories with public APIs.
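To make "basic Structured Data on Commons statements" concrete, here is a minimal sketch of the JSON that a Wikibase edit request carries for a statement. It is not the Wikidocumentaries implementation: the abstract does not say which properties the workflow adds, so the use of "depicts" (P180) and the helper names are assumptions.

```python
import json
from typing import Iterable


def depicts_claim(item_qid: str) -> dict:
    """Build one 'depicts' (P180) statement in Wikibase JSON form."""
    if not (item_qid.startswith("Q") and item_qid[1:].isdigit()):
        raise ValueError(f"not a Wikidata item ID: {item_qid!r}")
    return {
        "mainsnak": {
            "snaktype": "value",
            "property": "P180",  # assumption: 'depicts' as the example property
            "datavalue": {
                "type": "wikibase-entityid",
                "value": {
                    "entity-type": "item",
                    "numeric-id": int(item_qid[1:]),
                    "id": item_qid,
                },
            },
        },
        "type": "statement",
        "rank": "normal",
    }


def edit_payload(depicted_qids: Iterable[str]) -> str:
    """Serialize statements as the JSON 'data' value of an edit request."""
    return json.dumps({"claims": [depicts_claim(q) for q in depicted_qids]})


# Q42 is used purely as a placeholder item ID.
print(edit_payload(["Q42"]))
```

A payload like this would be sent, with authentication and an edit token, as the `data` parameter of the `wbeditentity` API action against the file's MediaInfo entity, whose ID is the letter M followed by the Commons page ID.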

Faceted browsing and filtering of Commons images based on SDC statements was enabled by Tuukka Hastrup.

Wikidocumentaries started with the aim to also serve knowledge from the margins. Such information isn’t always in the scope of the more established Wikimedia projects, but Wikidocumentaries can fetch such data from other Wikibases in addition to Wikidata and visualize all the data together. This gives e.g. heritage communities an opportunity to overlay and aggregate their data with that of Wikimedia projects. One direction planned with our partners is to develop and deploy password-protected Wikibases that would allow a heritage community to share or protect assets at will.

We wish to propose Wikidocumentaries as a platform for collaborative reuse scenarios. With that in mind, we invite the participants to identify potential opportunities for collaboration as well as blocker issues. This is also an opportunity to discuss media reuse scenarios within the Wikimedia projects more generally.

Meta page: Wikidocumentaries

Experience level: Intermediate
Keywords: Content uploads & workflows, Re-use & re-interpretation of digital heritage, Tech, platforms & tools
Notes: #GLAMWiki232304


Logo GLAM Wiki Conference 2023 ID: M006 Wikimedians in Residence: Sharing experiences
Facilitators/Speakers: Alice Kibombo, Gorana Gomirac, Federico Colman Time block: Afternoon Beginning: 16:00
Location: 410 Duration: 1 hr
Description:

With the growing acknowledgment of the contribution of cultural institutions and allied partners to the Wikimedia Movement, many of the former have opted for the expertise of a Wikimedian in Residence for a number of reasons. From the outside looking in, a Wikimedian in Residence is an abstract figure and the need for them may raise more questions than answers. From helping to create and foster sustainable relationships between the host institution and the Wikimedia Movement, to customising the nature of participation and contribution by and to the community, no two residences are the same. Through a moderated session, at least four WiRs will share their experiences at host institutions and what motivated the need for their presence. This will help the audience:

  • Understand who a Wikimedian in Residence (WiR) is, and the need for and role of WiRs in their work with partner institutions
  • Review the goals, outcomes, and lessons learned from specific WiR projects in their respective talks
  • Discover and share best practices for contributing quality content at scale
  • Discuss and propose future types of collaboration with identified external partners.
Experience level: Beginner
Keywords: Partnership building (GLAM Wiki collaborations, etc.), Content uploads & workflows
Notes: #GLAMWiki23M006