Community Resources and Partnerships/India Rapid Project/Improving User Experience, Code Quality, and Robustness in LinguaLibre V3
Applicant
- Main Wikimedia username. (required)
Pushkar7077
- Organization
N/A
- If you are a group or organization leader, board member, president, executive director, or staff member at any Wikimedia group, affiliate, or Wikimedia Foundation, you are required to self-identify and present all roles. (required)
N/A
- Describe all relevant roles with the name of the group or organization and description of the role. (required)
Main proposal
- 1. State the title of your proposal. This will also be the Meta-Wiki page title.
Improving User Experience, Code Quality, and Robustness in LinguaLibre V3
- 2. and 3. Proposed start and end dates for the proposal.
2025-12-13 - 2026-03-15
- 4. What is your tech project about, and how do you plan to build the product?
Include the following points in your answer:
- Project goal and problem you solve
- Product strategy or project roadmap
- Technical approach (infrastructure, tech stack, key tools and services)
- Integrations or dependencies (if any)
Project Overview
Lingua Libre will soon go live for editors, but it currently faces performance and usability issues such as unstable uploads, a lack of clear feedback for users, and inconsistent code quality.
The project aims to improve the user experience of LinguaLibre and implement high-quality code standards to ensure fewer bugs, easier maintenance, and smoother future development.
Lingua Libre is a tool that allows editors to maintain an audiovisual corpus of words in multiple languages, helping preserve and expand linguistic knowledge—especially for minority languages.
Goals
- Implement resilient background uploads that work within the current 380 words/hour rate limit of the Commons upload API.
- Improve user experience by adding pop-ups for error and success messages, resolving existing bugs, and refining the user interface.
- Establish and enforce code quality standards and best practices.
- Optimize application performance wherever possible.
Product Strategy and Roadmap
Below are the planned tasks with estimated hours. Completing these will significantly advance the project’s goals.
- Resilient upload using Toolforge jobs: 70 hours
- This task arises from Wikimedia's strict upload rate limit of 380 words/hour.
- Users on LinguaLibre upload thousands of recordings at once, but they often have to wait for the limit to refresh and upload in batches themselves.
- This causes a very bad user experience and often leads to lost recordings.
- The task is to let users submit all recordings at once, with the server itself managing the upload of 380 words/hour through Toolforge jobs.
- Recreate Create Locutor step: 30 hours
- Use the browser's built-in form validation instead of the current custom validations, which were written to fit the design at the expense of correctness.
- Manage “create” and “update” flows correctly.
- Add robust error handling.
- Properly capture and store location data.
- Add error handling and user notifications: 60 hours
- Introduce pop-ups for user feedback (especially errors).
- Currently, the application hangs when something fails; fixing this requires significant frontend refactoring.
- Fix publish step (publishing recordings to Commons): 35 hours
- Introduce a bulk upload API instead of looping single uploads.
- Disable the “Publish” button during upload.
- Fix progress bar issues (currently splits into multiple mini-bars).
- Fix text encoding errors: 4 hours
- Resolve issues with special characters in word and recording names.
- Save recording settings locally: 3 hours
- Retain user preferences for the recording configuration between sessions to improve UX.
- Refactor backend: 35 hours
- Refactor serializers to only manage data formatting, with business logic moved to APIs and services.
- Add error handling throughout to avoid 500 status code errors.
- Implement DRY principles for word generators: 5 hours
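The resilient upload task in the list above can be illustrated with a minimal sketch, assuming a sliding one-hour window and the 380 uploads/hour figure quoted in this proposal. The function name `plan_batch` and its parameters are hypothetical and are not part of the LinguaLibre codebase; the actual upload call and the Toolforge job scheduling are left out.

```python
from collections import deque
from typing import Deque, List, Sequence

UPLOAD_LIMIT = 380      # uploads allowed per window (figure quoted in this proposal)
WINDOW_SECONDS = 3600   # assumed window length: one hour


def plan_batch(pending: Sequence[str], recent: Deque[float], now: float,
               limit: int = UPLOAD_LIMIT, window: int = WINDOW_SECONDS) -> List[str]:
    """Return the pending items that may be uploaded right now.

    `recent` holds the timestamps of earlier uploads, oldest first.
    It is updated in place so the next run sees this batch.
    """
    # Forget uploads that have left the sliding window.
    while recent and now - recent[0] >= window:
        recent.popleft()
    # Whatever is left of the limit is the budget for this run.
    budget = max(0, limit - len(recent))
    batch = list(pending[:budget])
    # Record the uploads we are about to make.
    recent.extend([now] * len(batch))
    return batch
```

A periodic background job could call `plan_batch` on each run, upload the returned items, and leave the remainder queued, so the user submits everything once and never has to wait for the limit to refresh.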
Total estimated time: 242 hours. The project can be completed by dedicating approximately 20 hours per week from the proposal’s start to end date. I will personally execute all the work within this timeframe.
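As a quick cross-check of the estimate, the short script below sums the task hours listed above and compares the weeks needed at 20 hours per week with the weeks available between the proposed start and end dates. The task labels are abbreviated for illustration; the figures come from this proposal.

```python
from datetime import date

# Hour estimates copied from the roadmap above (labels abbreviated).
TASK_HOURS = {
    "Resilient upload using Toolforge jobs": 70,
    "Recreate Create Locutor step": 30,
    "Error handling and user notifications": 60,
    "Fix publish step": 35,
    "Fix text encoding errors": 4,
    "Save recording settings locally": 3,
    "Refactor backend": 35,
    "DRY principles for word generators": 5,
}
HOURS_PER_WEEK = 20
START, END = date(2025, 12, 13), date(2026, 3, 15)

total_hours = sum(TASK_HOURS.values())
weeks_needed = total_hours / HOURS_PER_WEEK
weeks_available = (END - START).days / 7

print(f"total: {total_hours} h")
print(f"needed: {weeks_needed:.1f} weeks, available: {weeks_available:.1f} weeks")
# prints "total: 242 h" and "needed: 12.1 weeks, available: 13.1 weeks"
```

The roughly one week of slack leaves limited room for review cycles and community testing.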
- 5. What is the expected impact of your project, and how will you measure success?
Include the following points in your answer:
- Milestones and progress tracking
- Project impact and success metrics
Expected Impact and Measurement of Success
LinguaLibre empowers speakers worldwide to record and share pronunciations in hundreds of languages, supporting linguistic diversity and Wikimedia’s mission to make knowledge freely accessible. By simplifying recording and improving platform reliability, the project will help scale contributions, especially from communities documenting minority and endangered languages.
Milestones and Tracking:
Progress will be tracked through Phabricator tasks, GitLab commits, and milestone reviews after each feature release, validated by community testing and feedback.
Success Metrics:
- Increased participation: Growth in number of active recorders and total recordings made per month.
- Linguistic diversity: Rise in the number of new or minority languages represented on the platform.
- Platform stability: Reduction in upload failures, error reports, and system downtime.
- Community satisfaction: Positive editor feedback through talk pages, mailing lists, or surveys.
These indicators will demonstrate tangible progress toward a more inclusive and accessible audiovisual knowledge base across Wikimedia projects.
- 6. Who is your target audience, and how have you confirmed there is demand for this project? How did you engage with the Wikimedia community?
Include the following points in your answer:
- Project demand and target audience description
- Links to interaction(s) with Wikimedia community
- Evidence from community consultation such as the [Community Wishlist]
Target Audience & Demand for the Project
The primary target audience of LinguaLibre includes:
- Speakers of under-represented, regional, minority and oral languages who currently have limited digital presence;
- Wikimedia editors and contributors who want to enrich Wikimedia projects and other MediaWiki-based wikis with pronunciation recordings or other audio/visual data;
- Language-learning communities, linguists and educators who rely on freely-licensed pronunciation corpora and audiovisual linguistic resources.
Evidence of demand / community consultation:
- The LinguaLibre Meta page states that the platform allows editors to maintain an audiovisual corpus to support the development of poorly endowed, minority, regional, oral or signed languages. It addresses this gap: of the 7,000 languages in existence today, it is estimated that only 2,500 will survive to the next century and only 250 (less than 5%) will make their digital ascent.
- The project has generated a large number of recordings (over 1.4 million recordings in 315+ languages), which reflects its community uptake.
- There is a specific proposal submitted to the Community Wishlist Survey 2017 on Meta (“LinguaLibre’s audio learning mobile app”) which indicates community interest and demand for additional media-/app-based uses of the LinguaLibre dataset. Link
Engagement with the Wikimedia community:
- The LinguaLibre Meta-Wiki page lists a “Community” section including volunteer contributors from many languages, showing that the project is embedded in the Wikimedia contributor ecosystem. meta.wikimedia.org
- LinguaLibre is listed as a Wikimedia community tool in the Wikimedia Foundation article “Many faces of Wikibase: Lingua Libre makes ‛laengwə̌ez-audible’” which highlights its reuse on Wikimedia platforms. wikimediafoundation.org
- The LinguaLibre GitLab repository receives many contributions from new and experienced Wikimedia volunteers, showing community demand and engagement.
- 7. How will your team predict and manage potential user security and privacy risks, and what risks do you currently see?
Include the following points in your answer:
- The level of in-house or consulted security and privacy expertise you will have available to you during delivery of this project
- How your development, testing, and deployment processes mitigate the introduction of unnecessary security or privacy risks
Security and Privacy Risk Management
LinguaLibre operates within the Wikimedia ecosystem, where user privacy and data integrity are fundamental. The platform only handles public recordings and minimal user metadata, ensuring no sensitive personal data is collected.
Security Expertise:
The project will leverage Wikimedia France’s technical guidance and the Wikimedia Cloud Services (Toolforge) team’s infrastructure support. I currently serve as a volunteer lead developer for LinguaLibre and have worked closely with Wikimedia France since my Google Summer of Code project, gaining familiarity with Wikimedia’s privacy, deployment, and security standards. Security-sensitive features will be reviewed in consultation with experienced Wikimedia developers and mentors.
Risk Management & Mitigation:
- Data handling: All uploads comply with Wikimedia’s privacy and licensing policies.
- Development process: Code changes are reviewed through GitLab merge requests, and user safety and security are always prioritized during development.
- Dependencies: Only trusted, actively maintained open-source libraries are used, regularly updated to patch known security issues.
- Infrastructure: Toolforge provides secure, access-controlled hosting managed by Wikimedia Cloud Services.
Current Risks: Potential misuse of upload APIs or input-validation gaps will be mitigated through backend validation, rate limiting, and better error handling.
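As an illustration of the kind of backend validation meant here, the following is a minimal sketch that normalizes a word or recording name and rejects unsafe input. The function name `clean_recording_name`, the forbidden-character set, and the 240-byte limit are assumptions chosen for this example, not the project's actual rules.

```python
import unicodedata

# Assumption: characters that MediaWiki rejects in page titles, plus path
# separators and colons, which this sketch also rejects to be cautious.
FORBIDDEN_CHARS = set("#<>[]|{}") | {"/", "\\", ":"}
MAX_NAME_BYTES = 240  # assumed upper bound for a file name, in UTF-8 bytes


def clean_recording_name(raw: str) -> str:
    """Return a normalized name, or raise ValueError if it is not acceptable."""
    # NFC normalization makes "e" + combining accent equal to the single "é".
    name = unicodedata.normalize("NFC", raw).strip()
    if not name:
        raise ValueError("name is empty")
    if any(unicodedata.category(ch) == "Cc" for ch in name):
        raise ValueError("name contains control characters")
    bad = sorted(FORBIDDEN_CHARS.intersection(name))
    if bad:
        raise ValueError(f"name contains forbidden characters: {''.join(bad)}")
    if len(name.encode("utf-8")) > MAX_NAME_BYTES:
        raise ValueError("name is too long")
    return name
```

Raising `ValueError` with a specific message lets the API layer turn the failure into a clear 4xx response instead of an unhandled 500 error.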
- 8. Who is on your team, and what is your experience?
Include the following points in your answer:
- Your experience as a developer, relevant past projects
- Wikimedia SUL (developer), Gerrit, Github, Gitlab or other relevant public account handles
- Other team members, their roles and expertise
Team and Experience
I am the sole developer responsible for implementing and delivering this project. User:Yug – Meta-Wiki, a senior volunteer and long-time contributor to LinguaLibre, collaborates with me by bringing community perspectives and reviewing final outcomes. He does not hold implementation responsibilities but ensures alignment with the project’s mission and community expectations.
I serve as the volunteer lead developer for LinguaLibre and have contributed for over 19 months through both volunteer missions and funded work supported by Wikimedia France and the Google Summer of Code program. I also work full-time as a software developer experienced in Django, Vue.js, and Docker, the technologies powering LinguaLibre. I work closely with Wikimedia France’s technical and community teams to align improvements with user needs.
Developer Profiles:
- GitLab: Pushkar707 · GitLab
- Wikimedia Account: User:Pushkar7077 - Meta-Wiki
- Phabricator: ♟ Pushkar7077
- 9. How will the project be maintained long-term?
Include the long-term maintenance plan with maintainer(s) in your answer. If you expect the long-term maintenance to incur expenses, please list those and the plan for long-term expense coverage.
The project will be maintained under the LinguaLibre technical infrastructure, with continued support from Wikimedia France and volunteer developers, including myself.
All project code will be hosted publicly on GitLab and mirrored on Wikimedia Gerrit/Phabricator to ensure transparency and collaborative maintenance. Documentation and onboarding guides have been/will be prepared to help future contributors easily continue development.
Since the project builds on existing Wikimedia infrastructure, no major new expenses are anticipated beyond standard maintenance efforts by community volunteers.
- 10. Under what license will your code be released, and how will you ensure the product is well documented?
Include the following points in your answer:
- Code license and compatibility with Wikimedia projects
- Documentation plan
License: MIT, see LICENSE.md
Documentation: The project already has detailed documentation, which will be updated as the work is done.
- 11. Will your project depend on or contribute to third-party tools or services?
The project mainly depends on existing Wikimedia infrastructure and doesn't require any third-party tools or services other than reputable open-source libraries.
- 12. Is there anything else you’d like to share about your project? (optional)
LinguaLibre directly supports the Wikimedia mission of making knowledge free and accessible for all by preserving and sharing spoken content across languages, especially underrepresented ones. This project will significantly improve the platform’s reliability, accessibility, and contributor experience, helping more people record and share knowledge in their native languages.
The project is in great need of technical funding, and rapid grants can help us take it to new heights. However, I am not relying solely on grant funding. I’m committed to contributing volunteer time to ensure the project’s success. While the normal market rate for my work is around $30/hour, I have proposed only $20/hour to stay aligned with Wikimedia’s mission and to make the best use of the limited funds available.
Budget
- 13. Upload your budget for this proposal or indicate the link to it. (required)
- 14. and 15. What is the amount you are requesting for this proposal? Please provide the amount in your local currency. (required)
441600 INR
- 16. Convert the amount requested into USD using the Oanda converter. This is done only to help you assess the USD equivalent of the requested amount. Your request should be between 500 - 5,000 USD.
4984.44 USD
- We/I have read the Application Privacy Statement, WMF Friendly Space Policy and Universal Code of Conduct.
Yes
This is an automatically generated Meta-Wiki page. The page was copied from Fluxx, the web service of Wikimedia Foundation Funds, where the user has submitted their application. Please do not make any changes to this page because all changes will be removed after the next update. Use the discussion page for your feedback. The page was created by CR-FluxxBot.
