Grants:Programs/Wikimedia Community Fund/Rapid Fund/Open Speaks:Building Wikimedia Project-Specific Media from Film Archives (ID: 22766407)/Final Report

From Meta, a Wikimedia project coordination wiki
Open Speaks: Building Wikimedia Project-Specific Media from Film Archives
Rapid Fund Final Report

Report Status: Under review

Due date: 17 March 2025

Funding program: Rapid Fund

Report type: Final

Application

This is an automatically generated Meta-Wiki page. The page was copied from Fluxx, the web service of Wikimedia Foundation Funds where the user has submitted their midpoint report. Please do not make any changes to this page because all changes will be removed after the next update. Use the discussion page for your feedback. The page was created by CR-FluxxBot.

General information

  • Applicant username: Psubhashish
  • Organization name: N/A
  • Amount awarded: 4985.69
  • Amount spent: 6150.92 USD (8856.62 in local currency)

Part 1: Project and impact

1. Describe the implemented activities and results achieved. Additionally, share which approaches were most effective in supporting you to achieve the results. (required)

The grant helped launch OpenSpeaks/Archives, a digital repository for low-resource languages. By bringing 20 video files in five Indigenous and/or low-resource languages to life, the Archives enriched Wikipedia, Wiktionary, and other Wikimedia projects in nearly 25 languages. The project also helped co-create Wiki Loves Languages, a language diversity-focused edit-a-thon led by a Wikimedian colleague on my home wiki. The insights gained will enrich OpenSpeaks, a resource that has supported many archivists, including a WMF-funded project. By identifying gaps in language support and engaging the Wikimedia developer community, the project made six previously unsupported languages available for subtitles on Commons and other Wikimedia projects. Several small tools were also developed; they could grow into independent projects that aid language archivists.

The grant contributed to Wikipedia, Wikidata, Wiktionary, and Commons. Archived documentary footage was relicensed (CC BY-SA 4.0), edited, subtitled by language experts, and translated into at least one neighbouring major language and English, enabling multilingual dissemination. Among the languages, Kusunda was near-extinct, and this project contributed to its ongoing revival; Bonda is endangered; Ho and Van Gujjari are vulnerable; and Baleswari-Odia is largely undocumented in audio-visual media. Additionally, the project co-led an edit-a-thon on Odia Wikipedia, creating 75 new articles about diverse languages and their communities.

2. Documentation of your impact. Please use space below to share links that help tell your story, impact, and evaluation. (required)

Share links to:

  • Project page on Meta-Wiki or any other Wikimedia project
  • Dashboards and tools that you used to track contributions
  • Some photos or videos from your event. Remember to share access.

You can also share links to:

  • Important social media posts
  • Surveys and their results
  • Infographics and sound files
  • Examples of content edited on Wikimedia projects

Links:

OpenSpeaks Archives project page (https://meta.wikimedia.org/wiki/OpenSpeaks/Archives) on Meta-Wiki, detailing outcomes, list of Wikipedia/Wikidata/Wiktionary entries created and improved

Media files uploaded on Commons (in 2024 — https://commons.wikimedia.org/wiki/Category:Files_uploaded_from_OpenSpeaks_Archives_in_2024 and 2025 — https://commons.wikimedia.org/wiki/Category:Files_uploaded_from_OpenSpeaks_Archives_in_2025)

Additionally, share the materials and resources that you used in the implementation of your project. (required)

For example:

  • Training materials and guides
  • Presentations and slides
  • Work processes and plans
  • Any other materials your team has created or adapted and can be shared with others

OpenSpeaks (https://en.wikiversity.org/wiki/OpenSpeaks), a set of resources and practical guides

BBC Accessibility guidelines for subtitles: https://www.bbc.co.uk/accessibility/forproducts/guides/subtitles/

3. To what extent do you agree with the following statements regarding the work carried out with this Rapid Fund? You can choose “not applicable” if your work does not relate to these goals. Required. Select one option per question. (required)

Our efforts during the Fund period have helped to...
A. Bring in participants from underrepresented groups: Strongly agree
B. Create a more inclusive and connected culture in our community: Strongly agree
C. Develop content about underrepresented topics/groups: Strongly agree
D. Develop content from underrepresented perspectives: Strongly agree
E. Encourage the retention of editors: Neither agree nor disagree
F. Encourage the retention of organizers: Agree
G. Increased participants' feelings of belonging and connection to the movement: Agree
H. Other (optional)

Part 2: Learning

4. In your application, you outlined some learning questions. What did you learn from these learning questions when you implemented your project? How do you hope to use these learnings in the future? You can recall these learning questions below. (required)

You can recall these learning questions below:

  • How does improving Wikipedia entries related to Indigenous and low-resourced languages and speakers with audiovisual recordings contribute to long-term linguistic assertion?
  • How do audio-visual linguistic resources help language activists?
  • How do open knowledge projects like Wikipedia help low-resource languages and communities?

Q: How does improving Wikipedia entries related to Indigenous and low-resourced languages and speakers with audiovisual recordings contribute to long-term linguistic assertion?

A: Improving Wikipedia entries about Indigenous and low-resourced languages and speakers with audiovisual recordings draws more attention to these languages and cultures, helps disseminate them, and encourages the speakers to promote them.

The five languages this project focused on have limited resources for their use, sustainment, and growth. Wikipedia and Wikimedia projects can contribute to their growth in two key ways: a) by becoming informational and educational resources for readers, and b) by acting as a community activation tool for potential new Wikimedians.

There was a significant dearth of information about these focus languages, their speakers, and the speakers' cultures on Wikipedia and other Wikimedia projects. The dominance of larger languages and the knowledge gap on Wikipedia are directly related, and both must be addressed[1][2]. For instance, when we started this project there was not a single article about the Van Gujjar people or their language, Van Gujjari. Wikipedia is widely regarded as a curated information compendium, and such a lack of information discourages communities. New Wikipedia, Wikidata, and Wiktionary entries were created in a volunteer capacity, and this project contributed images and videos to enrich these entries.

Moreover, languages are more than text. Audiovisual information adds significant depth and context to encyclopedic knowledge, bringing more attention and support to languages. We hypothesize that representing communities, their languages, and their cultures in Wikimedia projects creates a ripple effect: language-speaking communities, their diasporas, and others find more reliable cultural and linguistic information about the communities. Conversely, underrepresentation of Indigenous and vulnerable communities leads to disempowerment.[3] This project addressed this issue by inviting native, fluent speakers to subtitle media in which native speakers converse with other native speakers and with some fluent and semi-fluent non-native interviewers. We argue that native-language content on Wikimedia projects can greatly increase media representation, because Wikipedia's content is widely reused thanks to its popularity and openness (Creative Commons license), thereby contributing to the assertion of Indigenous and other low-resourced language communities.

Footnotes:

1. Vrana, Adele Godoy; Sengupta, Anasuya; Bouterse, Siko (2020-10-13). "16 Toward a Wikipedia For and From Us All" (https://wikipedia20.mitpress.mit.edu/pub/myb725ma/release/2). PubPub. Retrieved 2025-01-18.

2. Crampton, Katie (2021-04-03). "What does decolonisation mean for Wikipedia?" (https://wikimedia.org.uk/2021/04/what-does-decolonisation-mean-for-wikipedia/). WMUK. Retrieved 2025-01-18.

3. Pollard, Bryan. "More Than News: Indigenous Media Empowers Native Voices and Communities" (https://www.americanindianmagazine.org/story/Indigenous-media). NMAI Magazine. Retrieved 2025-01-17.

Q: How do audio-visual linguistic resources help language activists?

A: Language activists use audio-visual linguistic resources in several ways: as language learning and teaching aids, as advocacy tools, and to complement language and cultural information online. Audio-visual resources hosted online can be disseminated widely with little effort, especially when shared via large WhatsApp, Telegram, or Facebook groups. Showing how a language is used in a natural environment greatly helps language learning and teaching, and videos are quite effective as pedagogical tools[4]. This project helped create and enhance Wiktionary word lists (see Bonda: https://en.wiktionary.org/wiki/Appendix:Bonda_word_list, and Kusunda: https://en.wiktionary.org/wiki/Appendix:Kusunda_word_list) with descriptive videos that show how certain words are pronounced and used in sentences.

Footnotes:

4. "Using Videos for Teaching Language". Academic Technology Solutions. 2018-10-24. Retrieved 2025-01-18.

Q: How do open knowledge projects like Wikipedia help low-resource languages and communities?

A: Open knowledge projects like Wikipedia help promote low-resource languages and communities, bringing more attention and contributions to related topics. Such projects also help speaker communities disseminate trustworthy information within the community and diaspora, thereby helping promote their languages. Activists, advocates, and scholars use such resources for language promotion, dissemination, and research, and to organise and obtain more resources for their work.

Except in some contexts, readers widely trust information provided via open knowledge projects like Wikipedia. Wikipedia is also a popular online resource: its articles often appear as the first link in a web search. However, Wikipedia and similar projects lack adequate information about many low-resource languages and communities. Hosting videos on Wikimedia Commons, subtitling them in multiple languages, and embedding them in external projects has become easier, opening new opportunities beyond Wikipedia. For instance, because the videos come from previously published, notable movies, they can be used independently on Wikisource or Wiktionary, as well as on Wikipedia and Wikidata, without citing additional reliable sources, and not merely as representational media.

5. Did anything unexpected or surprising happen when implementing your activities? This can include both positive and negative situations. What did you learn from those experiences? (required)

The implementation became much more complex and elaborate than originally planned as the project progressed. The challenges included coordinating with four translators and training them in subtitling; creating subtitles offline, in two to three languages for each video; syncing subtitle track changes with a video editing workflow; converting extremely large video files to lighter ones that could be sent via messenger apps; cleaning up audio from noisy outdoor recordings; a complex, unpredictable editing workflow that exhausted the budget; and extra overall work that left less time for the promised volunteer activities. For several videos, the editing workflow demanded two to three times the budgeted time. I also could not edit as many Baleswari-Odia videos as I had planned; I plan to volunteer my time and upload them gradually. On the other hand, the opportunities included creating new and improving existing Wikipedia, Wikidata, and Wiktionary entries in a volunteer capacity using the uploaded media files, collaborating with scholars of four Indigenous and low-resource languages, and finding ways to address on-ground challenges.

One unanticipated event was a highlight—an edit-a-thon focused on low-resourced languages and speaker communities on Odia Wikipedia that Wikimedian Aliva Sahoo led. While brainstorming with her, I shared how so many languages are spoken in Odisha, yet there is little information on Odia Wikipedia.

I plan to write in detail on Diff about my strategy for addressing the above-mentioned issues. In short, I should have been more conservative and proposed fewer videos and voluntary contributions. Since no automatic subtitling or machine translation is available for the target languages, manual subtitling, multilingual translation, editing, and review were cumbersome. I did not anticipate how challenging and time-consuming it would be to edit and sync subtitles (time-coded transcriptions) with the videos. I was also unaware that one of the translators could only subtitle offline; this unanticipated challenge required a makeshift solution. There were many small yet critical technical tasks, such as converting large video files into smaller ones to send to translators via messenger apps so they could review them and identify gaps. Many of the issues solved, as well as those identified but still open, will help improve OpenSpeaks, an open and public repository for language documentation.

6. What is your plan to share your project learnings and results with other community members? If you have already done it, describe how. (required)

a. Diff post (in draft)

b. I plan to improve OpenSpeaks, an award-winning resource I founded in 2017, to support language activists and archivists in documenting Indigenous and low/medium-resource languages.

c. I plan to apply for further grants to hire a developer and fine-tune some of the tools I created for specific functions. They are not production-ready and are mostly command-line-based. These tools include an offline subtitle editor, a video converter to send production media files for review via messaging apps, and dummy subtitle creation by identifying gaps in speech recording.
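
To illustrate the last tool mentioned in item c, dummy subtitle creation from gaps in a speech recording, here is a minimal Python sketch. It is not the project's actual tool: the function names, the SRT output format, the placeholder text, and the 0.5-second minimum length are all assumptions made for this example. It also assumes the silent gaps have already been detected by some audio tool and are supplied as (start, end) pairs in seconds.

```python
from typing import List, Tuple

Span = Tuple[float, float]  # (start, end) in seconds


def speech_segments(silences: List[Span], duration: float,
                    min_len: float = 0.5) -> List[Span]:
    """Return the spans between silences that last at least min_len seconds."""
    segments = []
    cursor = 0.0  # end of the most recent silence seen so far
    for s_start, s_end in sorted(silences):
        if s_start - cursor >= min_len:
            segments.append((cursor, s_start))
        cursor = max(cursor, s_end)
    if duration - cursor >= min_len:
        segments.append((cursor, duration))
    return segments


def srt_timestamp(seconds: float) -> str:
    """Format seconds as an SRT timestamp, HH:MM:SS,mmm."""
    ms = round(seconds * 1000)
    h, ms = divmod(ms, 3_600_000)
    m, ms = divmod(ms, 60_000)
    s, ms = divmod(ms, 1000)
    return f"{h:02d}:{m:02d}:{s:02d},{ms:03d}"


def dummy_srt(segments: List[Span],
              placeholder: str = "[untranscribed]") -> str:
    """Build SRT text with one placeholder cue per speech segment."""
    blocks = []
    for index, (start, end) in enumerate(segments, 1):
        blocks.append(
            f"{index}\n{srt_timestamp(start)} --> {srt_timestamp(end)}\n"
            f"{placeholder}\n"
        )
    return "\n".join(blocks)


if __name__ == "__main__":
    # Hypothetical input: two silent gaps in an 8-second recording.
    gaps = [(0.0, 1.0), (4.5, 5.0)]
    print(dummy_srt(speech_segments(gaps, duration=8.0)))
```

A translator working offline could then replace each placeholder with a transcription, keeping the pre-filled timings as a starting point.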

Part 3: Metrics

7. Wikimedia Metrics results. (required)

In your application, you set some Wikimedia targets in numbers (Wikimedia metrics). In this section, you will describe the achieved results and provide links to the tools used.

Metric | Target | Result | Comments and tools used
Number of participants | 40 | 25 | Shared with various communities individually, discussed with participants of an in-person event, reached various Wikipedia communities via village pump as the videos were uploaded
Number of editors | 6 | 6 | Three Wikimedians involved in the Odia Wikipedia edit-a-thon, both advisors who are Wikimedians in two Wikipedias, one translator who is a Wikimedian.
Number of organizers | 5 | 5 | Five translators for five languages, including the project lead
Wikimedia project Target Result - Number of created pages Result - Number of improved pages
Wikipedia 10 7 30
Wikimedia Commons 30 22 25
Wikidata 20 15 10
Wiktionary 2 3
Wikisource 0 0
Wikimedia Incubator 1
Translatewiki
MediaWiki
Wikiquote
Wikivoyage
Wikibooks
Wikiversity
Wikinews
Wikispecies
Wikifunctions or Abstract Wikipedia

8. Other Metrics results.

In your proposal, you could also set Other Metrics targets. Please describe the achieved results and provide links to the tools used if you set Other Metrics in your application.

Other Metrics name Metrics Description Target Result Tools and comments
2 GLAM collaborations: Two major GLAM collaborations are underway: all projects to be acquired by an archive with international acquisition and another online language archive
2 New GLAM project creation:

An open knowledge-focused language archive launched; Wiki Loves Languages (https://meta.wikimedia.org/wiki/Wiki_Loves_Languages) co-launched with maiden edit-a-thon

3 Training provided:

Three participating community-based researchers trained for offline subtitling following EBU-TT 1.0, a widely used international subtitling standard

5 Tools created:

Multiple command-line and graphical user interface tools were created (though they are not production-ready and require additional work) and used for collaborative editing, subtitling, video production, and distribution

8 Language support on-wiki and other sites:

a. 2 Phabricator bugs reported: Reported the missing languages in various Wikimedia lists, after which Wikimedia community developers resolved them. These fixes benefitted both this project and another WMF-supported project. (See https://phabricator.wikimedia.org/T381934 and https://phabricator.wikimedia.org/T383785)

b. Amara.org language addition: Project's focus languages added to the Amara.org language list, benefitting all future subtitles in these languages

9. Did you have any difficulties collecting data to measure your results? (required)

No

9.1. Please state what difficulties you had. How do you hope to overcome these challenges in the future? Do you have any recommendations for the Foundation to support you in addressing these challenges? (required)


Part 4: Financial reporting

10. Please state the total amount spent in your local currency. (required)

8856.62

11. Please state the total amount spent in US dollars. (required)

6150.92

12. Report the funds spent in the currency of your fund. (required)

Provide the link to the financial report https://docs.google.com/spreadsheets/d/17G6T6DzX2G8jxPuDKs7RjzGjJeXFfrAg/edit?usp=sharing&ouid=102044103954646713033&rtpof=true&sd=true


12.2. If you have not already done so in your financial spending report, please provide information on changes in the budget in relation to your original proposal. (optional)

The financial report has updated the actual number of hours required to finish the project:

a) 45 hours instead of the estimated 40 hours for experts
b) 55 hours instead of the estimated 40 hours for live audio/video subtitle editing
c) 25 hours instead of the estimated 10 hours of live audio/video editing
d) 10 hours to create technical tools for audio, video, and subtitle editing, which were not planned. The project demanded these tools, but the budget had no provision for them, and the need did not come up during planning.

All other fields show actual expenses, which in some cases differ slightly from the estimates because costs and forex rates had changed by the time the money was actually spent.

The WMF grants staff is aware of the extra out-of-pocket expenses due to budget constraints.

13. Do you have any unspent funds from the Fund?

No

13.1. Please list the amount and currency you did not use and explain why.

N/A

13.2. What are you planning to do with the underspent funds?

N/A

13.3. Please provide details of how you hope to spend these funds.

N/A

14.1. Are you in compliance with the terms outlined in the fund agreement?

Yes

14.2. Are you in compliance with all applicable laws and regulations as outlined in the grant agreement?

Yes

14.3. Are you in compliance with provisions of the United States Internal Revenue Code (“Code”), and with relevant tax laws and regulations restricting the use of the Funds as outlined in the grant agreement? In summary, this is to confirm that the funds were used in alignment with the WMF mission and for charitable/nonprofit/educational purposes.

Yes

15. If you have additional recommendations or reflections that don’t fit into the above sections, please write them here. (optional)


Review notes

Review notes from Program Officer:

N/A

Applicant's response to the review feedback.

N/A