Grants:Programs/Wikimedia Community Fund/Rapid Fund/QA infrastructure and tools to fix problems on Wiktionary (ID: 22678255)/Final Report
Report Status: Accepted
Due date: 21 November 2024
Funding program: Rapid Fund
Report type: Final
This is an automatically generated Meta-Wiki page. The page was copied from Fluxx, the web service of Wikimedia Foundation Funds where the user has submitted their midpoint report. Please do not make any changes to this page because all changes will be removed after the next update. Use the discussion page for your feedback. The page was created by CR-FluxxBot.
General information
- Applicant username: Tbm
- Organization name: N/A
- Amount awarded: 5000
- Amount spent: 5000 USD, 5000
Part 1: Project and impact
1. Describe the implemented activities and results achieved. Additionally, share which approaches were most effective in supporting you to achieve the results. (required)
I implemented a prototype of an abstraction layer for Wiktionary that makes various Wiktionary data accessible from Python. I have done this for English Wiktionary and Swahili Wiktionary.
I have also implemented a way to store Wiktionary changes to disk for review in bulk; these changes can be applied with a script later. This is for changes that can't be fully automated in a bot and need manual review.
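As an illustration only (this is not the code from the project repository), the store-then-apply workflow could be sketched as follows. The `PendingChange` class, the JSON Lines file layout, and the `current_text` and `save_page` callables are all assumptions made for this example; a real script would plug in whatever wiki client it uses for reading and saving pages.

```python
import json
from dataclasses import asdict, dataclass
from pathlib import Path


@dataclass
class PendingChange:
    """One proposed edit, kept on disk until a human has reviewed it (hypothetical)."""
    title: str      # page title
    old_text: str   # wikitext the change was computed against
    new_text: str   # proposed wikitext
    summary: str    # edit summary to use when the change is applied
    approved: bool = False


def save_changes(changes, path):
    """Write one JSON object per line so the file is easy to read and diff."""
    with Path(path).open("w", encoding="utf-8") as fh:
        for change in changes:
            fh.write(json.dumps(asdict(change), ensure_ascii=False) + "\n")


def load_changes(path):
    """Read the changes back, skipping blank lines."""
    with Path(path).open(encoding="utf-8") as fh:
        return [PendingChange(**json.loads(line)) for line in fh if line.strip()]


def apply_approved(changes, current_text, save_page):
    """Apply approved changes whose page is unchanged since the proposal.

    `current_text(title)` and `save_page(title, text, summary)` are supplied
    by the caller; they stand in for a real wiki client.
    """
    applied = []
    for change in changes:
        if change.approved and current_text(change.title) == change.old_text:
            save_page(change.title, change.new_text, change.summary)
            applied.append(change.title)
    return applied
```

Comparing the live page text against `old_text` before saving means a change is skipped if someone else edited the page while the batch was waiting for review.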
In addition, I implemented a number of fixes for various issues found on English Wiktionary and Swahili Wiktionary.
Specifically, I implemented a number of fixes to the translation boxes on English Wiktionary. This includes fixing syntax errors, correcting wrong language names, and making some cosmetic changes. This has resulted in several hundred fixes, which all make the automatic parsing of translation entries easier.
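One of these checks, finding a translation box that is missing its opening or closing template, could look roughly like the sketch below. It is a simplified illustration rather than the project's implementation: the function name and the regular expression are assumptions, and it only reports problems instead of fixing them.

```python
import re

# Matches {{trans-top}}, {{trans-top|gloss}} and {{trans-bottom}}.
# Related templates such as {{trans-top-also|...}} are deliberately not matched.
TRANS_MARKER = re.compile(r"\{\{\s*(trans-top|trans-bottom)\s*(?:\|[^{}]*)?\}\}")


def unbalanced_translation_boxes(wikitext):
    """Return (line_number, problem) pairs for unpaired translation markers."""
    problems = []
    open_line = None  # line number of the trans-top that is still open, if any
    for lineno, line in enumerate(wikitext.splitlines(), start=1):
        for match in TRANS_MARKER.finditer(line):
            if match.group(1) == "trans-top":
                if open_line is not None:
                    problems.append((open_line, "trans-top without trans-bottom"))
                open_line = lineno
            else:
                if open_line is None:
                    problems.append((lineno, "trans-bottom without trans-top"))
                open_line = None
    if open_line is not None:
        problems.append((open_line, "trans-top without trans-bottom"))
    return problems
```

A report of line numbers like this is what a reviewer would look at before deciding where the missing template belongs, since the correct position cannot always be inferred automatically.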
The Swahili Wiktionary has a lot of issues. I started by cleaning up incorrect language headers and other syntax errors. I then also cleaned up translation information in a number of ways, such as converting plain text to wiki markup, fixing various syntax errors, and correcting some cosmetic issues.
Finally, I implemented several other QA tools to fix common issues. One is to fix language codes in entries (i.e. where the language code of a template does not match the language of the entry). I also wrote some tools to correct common mistakes, such as a number of common typos I identified as well as duplicated words.
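The two checks described above, mismatched language codes and duplicated words, can be sketched as follows. This is a hedged example, not the project's code: the set of template names in `LANG_FIRST_TEMPLATES` is an assumption chosen for illustration, and both functions only report candidates, because some hits (for example a legitimately repeated word) still need manual review.

```python
import re

# Assumption: templates whose first positional parameter is a language code.
LANG_FIRST_TEMPLATES = {"lb", "ux", "syn", "ant"}

# Captures the template name and its first positional parameter.
TEMPLATE = re.compile(r"\{\{\s*([^|{}]+?)\s*\|\s*([^|{}=]+?)\s*(?=[|}])")

# A word followed by whitespace and the same word again (case-insensitive).
DUPLICATED_WORD = re.compile(r"\b(\w+)\s+\1\b", re.IGNORECASE)


def mismatched_language_codes(section_text, expected_code):
    """Return (template, code) pairs whose code differs from the entry's language."""
    found = []
    for match in TEMPLATE.finditer(section_text):
        name, code = match.group(1), match.group(2)
        if name in LANG_FIRST_TEMPLATES and code != expected_code:
            found.append((name, code))
    return found


def duplicated_words(text):
    """Return the first word of each immediately repeated pair."""
    return [match.group(1) for match in DUPLICATED_WORD.finditer(text)]
```

For a German entry, `mismatched_language_codes(text, "de")` would flag a usage example that was copied from an English entry and still carries the `en` code.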
In summary, I have implemented QA infrastructure for Wiktionary and implemented a number of QA fixes. This code has already resulted in several hundred fixes and can be used as the basis for further QA fixes.
2. Documentation of your impact. Please use space below to share links that help tell your story, impact, and evaluation. (required)
Share links to:
- Project page on Meta-Wiki or any other Wikimedia project
- Dashboards and tools that you used to track contributions
- Some photos or videos from your event. Remember to share access.
You can also share links to:
- Important social media posts
- Surveys and their results
- Infographics and sound files
- Examples of content edited on Wikimedia projects
- English Wiktionary
  - Translations: 945
    - Fix language name: 80
    - Fix separation of translations: 40
    - Add missing trans-top or trans-bottom: 90
    - Remove empty lines in translations: 25
    - Fix syntax errors: 115
    - Fix spacing between definitions: 250
    - Fix spacing between language name and definitions: 200
    - Fix cosmetic issues: 145
  - Other
    - Language code: 8
    - Remove duplicate words: 5
    - Fix typo: 16
- Swahili Wiktionary: 800
  - related to language headers: 150
  - unbalanced headers: 30
  - misc syntax error: 12
  - rest: clean-up of translations: fix syntax errors, convert language names to code, cosmetic cleanups
A list of all changes is available here: https://en.wiktionary.org/wiki/User:Tbm/Reports/QA_infrastructure_and_tools_to_fix_problems_on_Wiktionary#Metrics
Additionally, share the materials and resources that you used in the implementation of your project. (required)
For example:
- Training materials and guides
- Presentations and slides
- Work processes and plans
- Any other materials your team has created or adapted and can be shared with others
The Python source code is available on GitHub: https://github.com/tbm/wiktionary-tools
3. To what extent do you agree with the following statements regarding the work carried out with this Rapid Fund? You can choose “not applicable” if your work does not relate to these goals. Required. Select one option per question. (required)
A. Bring in participants from underrepresented groups | Agree |
B. Create a more inclusive and connected culture in our community | Agree |
C. Develop content about underrepresented topics/groups | Agree |
D. Develop content from underrepresented perspectives | Agree |
E. Encourage the retention of editors | Agree |
F. Encourage the retention of organizers | Not applicable |
G. Increased participants' feelings of belonging and connection to the movement | Agree |
H. Other (optional) |
Part 2: Learning
4. In your application, you outlined some learning questions. What did you learn from these learning questions when you implemented your project? How do you hope to use these learnings in the future? You can recall these learning questions below. (required)
You can recall these learning questions below:
While there are a number of QA tools for Wiktionary, a lot of work is needed in this area. I'm curious if the creation of these tools will prompt the community to build more tooling. This aligns well with a similar effort: https://en.wiktionary.org/wiki/Wiktionary:Todo/Lists
Furthermore, I'd like to see if these tools will lead to more cooperation among the different Wiktionary communities.
Finally, we will see if this will prompt a discussion about moving some Wiktionary data to Wikidata in order to remove duplication among the different Wiktionary communities.
I believe it's too early to answer all three of these questions, although they should be revisited in six months or a year. Personally, working on this project has once again confirmed my belief that there needs to be more collaboration, that common data should be moved to Wikidata, and that this would in fact allow more collaboration between the different Wiktionary communities.
5. Did anything unexpected or surprising happen when implementing your activities? This can include both positive and negative situations. What did you learn from those experiences? (required)
I think the positive insight is that tooling can make a huge difference to the quality of Wiktionary.
There were two negative observations, both in terms of underestimating the effort.
I proposed to work on an abstraction layer and on QA fixes (with the emphasis definitely on the latter, as per the title). However, I quickly realized that an abstraction layer is a very elaborate effort that is best handled as a separate project (in fact, several separate projects given the size of Rapid Grants). While I have created a prototype, this needs more work.
Similarly, there are many QA fixes that could be created for Wiktionary. While this project has created many fixes and had an important impact, there's a lot left to do. I think I was slightly too optimistic about how much can be achieved within one project. In any case, I hope to apply for another grant to continue this work.
6. What is your plan to share your project learnings and results with other community members? If you have already done it, describe how. (required)
I have documented the impact (e.g. metrics) of this work and published the source code. I intend to work with other community members to further refine this work.
Part 3: Metrics
7. Wikimedia Metrics results. (required)
In your application, you set some Wikimedia targets in numbers (Wikimedia metrics). In this section, you will describe the achieved results and provide links to the tools used.
Metric | Target | Results | Comments and tools used |
---|---|---|---|
Number of participants | 10 | 2 | |
Number of editors | 10 | 2 | |
Number of organizers | 1 | 1 | |
Wikimedia project | Target | Result - Number of created pages | Result - Number of improved pages |
---|---|---|---|
Wikipedia | |||
Wikimedia Commons | |||
Wikidata | |||
Wiktionary | 2000 | 0 | 1775 |
Wikisource | |||
Wikimedia Incubator | |||
Translatewiki | |||
MediaWiki | |||
Wikiquote | |||
Wikivoyage | |||
Wikibooks | |||
Wikiversity | |||
Wikinews | |||
Wikispecies | |||
Wikifunctions or Abstract Wikipedia |
8. Other Metrics results.
In your proposal, you could also set Other Metrics targets. Please describe the achieved results and provide links to the tools used if you set Other Metrics in your application.
Other Metrics name | Metrics Description | Target | Result | Tools and comments |
---|---|---|---|---|
9. Did you have any difficulties collecting data to measure your results? (required)
No
9.1. Please state what difficulties you had. How do you hope to overcome these challenges in the future? Do you have any recommendations for the Foundation to support you in addressing these challenges? (required)
Part 4: Financial reporting
10. Please state the total amount spent in your local currency. (required)
5000
11. Please state the total amount spent in US dollars. (required)
5000
12. Report the funds spent in the currency of your fund. (required)
Provide the link to the financial report https://docs.google.com/spreadsheets/d/1Nl0_yLlUIMcbD3CM83YIpnw9Aq-qaIeEKaiRWp79VRc/edit?gid=0#gid=0
12.2. If you have not already done so in your financial spending report, please provide information on changes in the budget in relation to your original proposal. (optional)
13. Do you have any unspent funds from the Fund?
No
13.1. Please list the amount and currency you did not use and explain why.
N/A
13.2. What are you planning to do with the underspent funds?
N/A
13.3. Please provide details of how you hope to spend these funds.
N/A
14.1. Are you in compliance with the terms outlined in the fund agreement?
Yes
14.2. Are you in compliance with all applicable laws and regulations as outlined in the grant agreement?
Yes
14.3. Are you in compliance with provisions of the United States Internal Revenue Code (“Code”), and with relevant tax laws and regulations restricting the use of the Funds as outlined in the grant agreement? In summary, this is to confirm that the funds were used in alignment with the WMF mission and for charitable/nonprofit/educational purposes.
Yes
15. If you have additional recommendations or reflections that don’t fit into the above sections, please write them here. (optional)
Review notes
Review notes from Program Officer:
N/A
Applicant's response to the review feedback.
N/A