Grants:Programs/Wikimedia Community Fund/Supporting existing infrastructure and developing new tools for Wikisource
This is an automatically generated Meta-Wiki page. The page was copied from Fluxx, the grantmaking web service of Wikimedia Foundation where the user has submitted their application. Please do not make any changes to this page because all changes will be removed after the next update. Use the discussion page for your feedback. The page was created by CR-FluxxBot.
Have you received grants from the Wikimedia Foundation before?
- Did not apply previously
Have you received grants from any non-wiki organization before?
Which organization(s) did you receive grants from?
What is your organization or group's mission?
- 1. To improve the index page content on Wikisource by integrating structured data from Wikidata and avoiding data duplication across Wikibase.
- 2. To increase the searchability of video content over the internet.
- 3. To improve the comprehension of video content available in unknown languages for users.
- 4. To make the video content accessible to users who are facing hearing impairment.
If you would like, please share any websites or social media accounts that your group or organization has. (optional)
Please state the title of your proposal. This will also be a title for the Meta-Wiki page.
- Supporting existing infrastructure and developing new tools for Wikisource
Where will this proposal be implemented?
Indicate if it is a local, international, or regional proposal and if it involves several countries? (optional)
If you have answered regional or international, please write the country names and any other information that is useful for understanding your proposal.
What are the main challenges you are trying to solve and your proposed solution?
- Earlier, Wikidata-Wikisource integration modules were created as a part of the Improving Wikidata-Wikisource Integration project. These modules display the metadata of a book and generate its categories on the Wikisource index page by pulling the necessary information from the respective Wikidata item. They were deployed on the Punjabi, Tamil, Bengali, and Indonesian Wikisources. However, the modules have to be deployed manually on the MediaWiki configuration page of each language Wikisource, because certain modifications are needed depending on the constraints of that Wikisource. The central idea of the project is to improve the integration between Wikisource and Wikidata. Through this project, we aim to convert the existing modules into MediaWiki extensions, which makes them easy to install on Wikisource for all languages. As a part of the project, we would also like to develop a few more features, such as categorizing the index pages of a book on Wikisource and automatically adding header template information and a license template to the main page of the book on Wikisource, using the relevant information available on the respective Wikidata item.
Our second challenge is that many videos on Wikimedia Commons don’t have transcript files. There are also many endangered languages that have no written literary texts but do have content in video format. Using a third-party transcription tool is not feasible, as tracking edits becomes difficult and multiple logins are required. Wikimedia doesn’t have a tool that helps its editors transcribe these videos. We aim to build a transcription tool that can be integrated with the workflows of Wikisource and Wikimedia Commons. This tool will let editors transcribe a video that is present on Wikimedia Commons and generate a transcript file on Wikisource along with a subtitle file within TimedText on Wikimedia Commons. Transcribing a video helps in preserving and documenting the contents of endangered languages. This will in turn create engagement with these language communities and help Wikisource grow in many languages. Transcripts also improve users' comprehension of videos in languages they do not know, and they increase the searchability of a video, because search engines can index text. The time-synced text on videos makes the content accessible to users who are hard of hearing and provides an experience equivalent to that of viewing the video with sound.
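The two outputs described above (a plain transcript for Wikisource and a time-synced subtitle file for TimedText) can be derived from the same list of timed segments. The following is a minimal sketch, not the tool's actual design: it assumes the subtitle file uses the SubRip (SRT) layout, and the `Segment` structure and function names are hypothetical, introduced only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    """One transcribed span of a video (hypothetical structure)."""
    start: float  # seconds from the beginning of the video
    end: float    # seconds from the beginning of the video
    text: str


def srt_timestamp(seconds: float) -> str:
    """Format seconds as HH:MM:SS,mmm, the timestamp form used by SubRip."""
    total_ms = round(seconds * 1000)
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, ms = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def to_srt(segments: list[Segment]) -> str:
    """Build the subtitle text: numbered cues, each with a time range."""
    cues = []
    for index, seg in enumerate(segments, start=1):
        time_range = f"{srt_timestamp(seg.start)} --> {srt_timestamp(seg.end)}"
        cues.append(f"{index}\n{time_range}\n{seg.text}\n")
    return "\n".join(cues)


def to_transcript(segments: list[Segment]) -> str:
    """Build the plain transcript: the same text without any timing."""
    return "\n".join(seg.text for seg in segments)
```

Keeping a single list of segments as the source for both outputs means an editor's correction to one line of text would show up in the transcript and in the subtitles alike.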
What is the main objective of your proposal?
- The objectives of the project are:
- 1. Improve the Wikidata-Wikisource integration by converting the existing Wikidata-Wikisource integration modules to extensions.
- 2. Develop features that categorize the index pages of books on Wikisource and add header template information and a license template to the main page of books on Wikisource.
- 3. Create a transcription tool for adding transcripts to video files that is integrated with the workflows of Wikisource and Wikimedia Commons.
Describe your main strategies to achieve this objective and the main activities you will be developing as part of these strategies.
- Implementation Strategy: This project consists of two parts. The first is improving the Wikidata-Wikisource integration, and the second is creating a video transcription tool from scratch. The two parts are independent of each other and will be developed by two developers individually. For the first part, we will convert the existing integration modules to MediaWiki extensions. We will use beta Wikisource to avoid interrupting users, and the workflow is as follows. We will implement a feature flag on Wikisource and fill the respective fields of the index page with data retrieved via the Wikibase PHP API from the corresponding Wikidata item. To make adoption easy for users and to integrate Wikidata with Index pages, we will develop Lua-based modules to modify the existing template. We will also develop a tool for mass migration of old Index pages to the Wikidata-integrated format and a user interface for editing Wikidata items from within the Index page itself.
Link to the detailed version of plan: shorturl.at/afkoC
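To illustrate the step of filling index page fields from a Wikidata item, here is a minimal sketch in Python. It is not the planned implementation, which the proposal says will use the Wikibase PHP API and Lua modules. The sketch works on an entity that has already been fetched in the standard Wikibase JSON layout (`claims`, `mainsnak`, `datavalue`). The property IDs are existing Wikidata properties, but the index field names and all function names are assumptions made for this example.

```python
# Wikidata property IDs mapped to index page fields. The field names on the
# left are hypothetical; a real Wikisource may name its index fields differently.
FIELD_PROPERTIES = {
    "title": "P1476",     # title
    "author": "P50",      # author
    "publisher": "P123",  # publisher
    "year": "P577",       # publication date
}


def first_value(entity: dict, prop: str):
    """Return the value of the first statement that has one, or None."""
    for statement in entity.get("claims", {}).get(prop, []):
        snak = statement.get("mainsnak", {})
        if snak.get("snaktype") == "value":
            return snak["datavalue"]["value"]
    return None


def render(value):
    """Turn a Wikibase value into a short string for an index field."""
    if value is None:
        return None
    if isinstance(value, str):
        return value
    if "text" in value:   # monolingual text, e.g. a title
        return value["text"]
    if "time" in value:   # e.g. "+1865-11-26T00:00:00Z"; assumes a 4-digit CE year
        return value["time"][1:5]
    if "id" in value:     # item reference; resolving its label needs another lookup
        return value["id"]
    return None


def index_fields(entity: dict) -> dict:
    """Build the index page fields from one Wikidata entity."""
    return {
        field: render(first_value(entity, prop))
        for field, prop in FIELD_PROPERTIES.items()
    }
```

Fields whose property is missing from the item come back as `None`, so a template could fall back to locally entered data during a migration.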
In the second part, we will first create the necessary helper libraries and integrate them with the backend framework (Flask or Express) and Wikimedia OAuth (so that users can edit). In the later steps, we will encode the video, build the frontend (using ReactJS or VueJS), and use Material-UI to create the application user interface. We will also parse TimedText from Wikimedia Commons for videos with unfinished transcript files. To make the tool accessible to every Wikimedian, we will integrate the tool’s i18n with TranslateWiki. Finally, we will use worker queues to serve the heavy requests received from users and deploy the tool on Cloud VPS. Link to the detailed version of plan: shorturl.at/dhxT2
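Parsing existing TimedText for videos with unfinished transcripts could look roughly like the sketch below. It assumes the subtitle text follows the SubRip (SRT) layout and treats "unfinished" as meaning stretches of the video with no cue, which is one possible reading. The function names and the `min_gap` threshold are hypothetical.

```python
import re

# HH:MM:SS,mmm (a dot is also accepted as the millisecond separator)
TIMESTAMP = re.compile(r"(\d+):(\d{2}):(\d{2})[,.](\d{3})")


def to_seconds(stamp: str) -> float:
    """Convert a subtitle timestamp to seconds."""
    match = TIMESTAMP.fullmatch(stamp.strip())
    if match is None:
        raise ValueError(f"not a subtitle timestamp: {stamp!r}")
    hours, minutes, seconds, millis = (int(part) for part in match.groups())
    return hours * 3600 + minutes * 60 + seconds + millis / 1000


def parse_srt(text: str) -> list[dict]:
    """Split subtitle text into cues with start, end, and text."""
    cues = []
    # Cues are separated by blank lines.
    for block in re.split(r"\n\s*\n", text.replace("\r\n", "\n").strip()):
        lines = block.split("\n")
        if lines and lines[0].strip().isdigit():
            lines = lines[1:]  # drop the cue counter line
        if not lines or "-->" not in lines[0]:
            continue  # skip blocks without a time range
        start, end = lines[0].split("-->", 1)
        cues.append({
            "start": to_seconds(start),
            "end": to_seconds(end),
            "text": "\n".join(lines[1:]).strip(),
        })
    return cues


def untranscribed_gaps(cues: list[dict], duration: float, min_gap: float = 1.0):
    """Return (start, end) spans of the video that no cue covers."""
    gaps = []
    cursor = 0.0
    for cue in sorted(cues, key=lambda c: c["start"]):
        if cue["start"] - cursor >= min_gap:
            gaps.append((cursor, cue["start"]))
        cursor = max(cursor, cue["end"])
    if duration - cursor >= min_gap:
        gaps.append((cursor, duration))
    return gaps
```

A list of uncovered spans like this could let the interface jump an editor straight to the parts of a video that still need transcribing.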
Activities: The timeline and activities are as follows:
- January-February 2022: Scoping. Discussion with advisors and developers regarding the strategy, with modifications if necessary.
- March 2022: First part development. Start working on converting the modules to MediaWiki extensions.
- April 2022: Final development and sprint review. Finish the making of extensions and features, and test them.
- May to July 2022: Second part development. Start working on building the video transcription tool.
- August 2022: Final development and sprint review. Finish the making of the tool and testing.
- September 2022: Wrap-up. Communicating results, final report, tool and extension documentation.
Please state if you will be carrying out any of these activities within your programs? Select all that apply.
- Organizing Meet-up online, Organizing Meet-up offline
Are you running any in-person events or activities?
Please state if your proposal aims to work on any of the identified content knowledge gaps?
Please state if your proposal includes any of these areas or thematic focus.
- Culture, heritage or GLAM, Open Technology
Will your work focus on involving participants from any underrepresented communities? Select all categories that apply.
- Linguistic / Language
Please tell us more about your target participants.
- The target participants for our project will be Wikisource communities. For the MediaWiki extension, we will inform the administrators of the Wikisource communities and engage interested members to participate during the final development of the extension. The extension will be tested on beta Wikisource, and feedback from participants will be recorded. Once the extension is judged successful, the administrators of Wikisource communities will be informed, and the extension will be deployed across Wikisource platforms. For the transcription tool, we will engage with interested members from the Wikisource communities and provide access to the tool during the final development stage for testing purposes. Once the testing is done, we will consider the tool successful if it produces an output transcript on Wikisource and the timed text is synchronized with the narration in the video.
Do you have plans to work with other Wikimedia communities, groups or affiliates in your country, or in other countries, to implement this proposal?
Please tell us about these connections online and offline and how you have let Wikimedia communities know about this proposal.
- We will work with a few Indic language Wikisource communities to test the tool and the extension that are being developed under the project. Our engagement with these communities is for testing purposes. We are planning to conduct an outreach activity, and communities will be notified about it once the tool is in the final development stage.
Will you be working with other external, non-Wikimedian partners to implement this proposal?
Please describe these partnerships.
How do you hope to sustain or expand the work carried out in this proposal after the grant?
- Once the MediaWiki extension is deployed across all the Wikisource platforms, we will conduct research, including a community survey, and gather input to identify opportunities for improving the integration extension and for adding new features to it.
In the case of the transcription tool, we will initially come up with a user policy for communities to request permission to access the tool for outreach activities. If the tool consistently receives a large number of requests for more than 6 months, we will scale up the tool by placing a request for better resources or migrating the tool to third-party cloud tools depending on the feasibility. Feedback from users who participated in our outreach activity will be taken into consideration to modify the tool.
Do you have the team that is needed to implement this proposal?
- List of project participants:
- 1. Nivas - Project Coordinator
- 2. Krishna Chaitanya Velaga and SGill (WMF) - Project Advisors
- 3. SWilson (WMF) - Technical Advisor
- 4. Jay Prakash - Technical Advisor or Developer (depending on his availability)
- 5. Sohom Datta - Developer
- 6. One Developer - To be hired.
In what ways do you think your proposal most contributes to the Movement Strategy 2030 recommendations. Select a maximum of three options that most apply.
- Increase the Sustainability of Our Movement, Improve User Experience, Innovate in Free Knowledge
Please state if your organization or group has a Strategic Plan that can help us further understand your proposal. You can also upload it here. (optional)
Learning, Sharing, and Evaluation
What do you hope to learn from your work in this fund proposal?
- 1. Plan and develop technical products that meet the specific project objectives.
- 2. Collect feedback on these products from the communities at the end of the project.
- 3. Identify opportunities to improve these products in the next iterations.
- 4. Develop necessary documentation for users to understand the working of the tool.
Enter a description of the metric and a number in the target field. If the metric does not apply to you, enter N/A for not applicable.
|Number of participants|
|Number of editors|
|Number of organizers|
If for some reason your proposal will not measure these core metrics please provide an explanation.
- Since the major focus of the project is to develop technical products, these core metrics do not apply. Instead, to measure the success of the MediaWiki extension, we will first communicate the results of the tests performed on beta Wikisource and then deploy the extension across all the Wikisource communities after informing the administrators of those communities. To test the transcription tool, we are planning to conduct an outreach program and engage editors from Indic Wikisource communities from the final development stage of the tool onwards. The aim of the program is to test the tool by transcribing videos and to improve it through users’ feedback. If the final results are up to the mark, we will keep the communities informed with regular updates on the tool's development. Once the tool is fully developed, we will create the necessary documentation for its usage and conduct workshops explaining how to use it, inviting other interested communities.
Enter a description of the metric and a number in the target field. If the metric does not apply to you, enter N/A for not applicable.
|Number of editors that continue to participate/retained after activities||N/A||N/A|
|Number of organizers that continue to participate/retained after activities||N/A||N/A|
|Number of strategic partnerships that contribute to longer term growth, diversity and sustainability||N/A||N/A|
|Feedback from participants on effective strategies for attracting and retaining contributors||N/A||N/A|
|Diversity of participants brought in by grantees||N/A||N/A|
|Number of people reached through social media publications||N/A||N/A|
|Number of activities developed||N/A||N/A|
|Number of volunteer hours||N/A||N/A|
What other information will you be collecting to learn about the impact of your work? (optional)
What tools would you use to measure each metric selected?
- Since the project is about developing technical products, there are no metrics and no tool is required.
How do you hope to share these results so that others can learn from them?
- Create a video of our experience, Make a short presentation of the experience, Create a training workshop to show others what we learned, Share results with our communities, Develop learning material for other users, Share it on Meta-Wiki
What is the amount you are requesting from WMF? Please provide this amount in your local currency. If you are thinking about a multi-year fund, please provide the amount for the first year.
- 2867040 INR
What is this amount in US Currency (to the best of your knowledge)?
- 39820 USD
Do you want to apply for multi-year funding?
If you have calculated it, please provide an estimate of the year 2 or year 3 request.
Please share your budget for this proposal.
- We will be hiring two developers, one for the conversion of modules to MediaWiki extensions and the other for the development of the transcription tool. We have considered 1200 hours for the development of these products over a period of 7 months. We have also planned two development sprints (one in the month of April and the other in the month of August) during the project period. The purpose of these sprints is to review the development work and determine additional changes by testing these products.
The detailed budget of the project is as follows:
- Developers: 1200 hrs * $21 (1200 hours of development over 7 months for two developers) = $25,200
- Project Coordinator: 300 hrs * $15 (300 hours over 8 months) = $4,500
- Development sprints: 2 * $1,500 (two sprints covering travel, stay, and food costs) = $3,000
- Unforeseen expenses: $3,500 (for unexpected expenses during the project period)
- Fiscal sponsor: $3,620 (10% of overall expenses)
- Total budget = $39,820
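As a cross-check of the budget arithmetic, the short sketch below recomputes the total from the hours and rates listed. The variable names are only illustrative. Note that 300 coordinator hours at $15 come to $4,500, which is the figure that makes the stated total of $39,820 add up, and that the implied exchange rate is derived here from the two requested amounts (2,867,040 INR and 39,820 USD) rather than stated in the application.

```python
# Line items in USD, computed from the hours and rates in the budget.
line_items = {
    "Developers": 1200 * 21,           # 1200 hours at $21
    "Project Coordinator": 300 * 15,   # 300 hours at $15
    "Development sprints": 2 * 1500,   # two sprints at $1,500 each
    "Unforeseen expenses": 3500,
}

subtotal = sum(line_items.values())
fiscal_sponsor = subtotal // 10        # 10% of the expenses above
total = subtotal + fiscal_sponsor

# Exchange rate implied by the INR and USD amounts requested.
inr_per_usd = 2_867_040 / total

for name, amount in line_items.items():
    print(f"{name}: ${amount:,}")
print(f"Fiscal sponsor: ${fiscal_sponsor:,}")
print(f"Total: ${total:,} at {inr_per_usd:.2f} INR per USD")
```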
What do you do to make sure there is a good management of funds?
How will you contribute towards creating a supportive environment for participants using the UCOC and Friendly Space Policy?
- Participants come from diverse backgrounds, so to create a safe and inclusive space and minimize the risk of harm, it is important to introduce the Friendly Space Policy and the Universal Code of Conduct at events. Participants will be informed about the Friendly Space Policy before the event begins, and we will ask them to follow it strictly for the duration of the event. We will introduce the participants to the Universal Code of Conduct, explain their responsibilities and the behaviour expected of them, and ask them to contact us in case of any inconvenience or trouble at the event. We will also display short versions of the Universal Code of Conduct and the Friendly Space Policy at the event venue. Any participant violating these guidelines will be removed from the event immediately, and necessary action will be taken against them.
Please use this optional space to upload any documents that you feel are important for further understanding your proposal.
- Staffing plan or organogram:
- Other public document(s):
By submitting your proposal/funding request you agree that you are in agreement with the Application Privacy Statement, WMF Friendly Space Policy and the Universal Code of Conduct.
- Please add any feedback to the grant discussion page only. Any feedback added here will be removed.