Grants:Programs/Wikimedia Community Fund/Rapid Fund/Localization and Internationalization of Wikidata Taiwan Community (ID: 22972239)/Final Report
Application type: Standard application
Part 1: Project and impact
1. Describe the implemented activities and results achieved. Additionally, share which approaches were most effective in supporting you to achieve the results. (required)
Our project has two parts: translation, which took place in the first half of 2025, and presentation, which mainly took place in the second half of the year. We identified that our biggest challenge as a community right now is the lack of documentation to support our arguments and decision making, so the main goal of this project is to bring more such authoritative resources into the context of Taiwan's communities. Since we are in close contact with local small-language communities, we expanded our scope beyond the Chinese-speaking audience to include audiences from those languages, who face similar issues, though their situations are far more complex than ours.
Turning to the project itself, the first part is translation. We identified the OCLC white paper on Wikibase as a high-interest document because it is a respected source and reflects many aspects of what we have learned from years of community activities. We therefore started the translation process, first from English into Chinese (the result is open to the public, hosted on Wikidata Taiwan's HackMD notes). Early in the translation process, we understood that the document was far too much for our partners to translate into their respective languages, as it is both too long and too "technical"; we expand on this topic in the learning section. To address this issue, after translating the full report we composed a shorter, easier-to-understand community summary as the material for the community partners to translate into their own languages. In the end, we curated 4 more versions of the translation: Taiwanese plus three Seediq dialects.
End note for the translation part: the full translation (EN-CH) took a little over 3 full months, and the community versions took an estimated 1.5 months. The EN-CH translation had few issues that stood out beyond the common struggles of translation, but the community translation yielded some noteworthy observations (though these are not the central scope of this project, we explore them later in the learning section).
The second part of the project is presentation. We participated in or held 4 talks: the first was at COSCUP, mainly addressing a general public who may or may not have experience with Wikidata; the second was in Nantou, deep in the community center of the Seediq people; the third was in Chiayi, co-hosted with g0v, as a small expedition to new communities; and the last was in Taipei with the Light Box Library. Although the main subject was the same across all presentations, the special context of each event made the presentations vastly different. The COSCUP talk, given its audience, was the most general, covering only the most basic ideas of Wikidata and what OCLC's report pointed out. The second presentation was aimed mainly at our indigenous partners, so it was the most distinctive of the four. The third presentation was also on the general side, yet it included a more concentrated discussion of the technologies and the issues surrounding them. In the last presentation, with the Light Box Library, we focused on image resources, which were not discussed in any previous presentation. We go deeper into what we learned from these events in the learning section.
End note for the presentation part: we recorded each talk, and at the end of each event we posted a brief report on Wikimedia Diff to serve both as a record of the event and as one of our main sharing channels. The highlights of the presentations were concentrated mostly in the discussion session at the end of each event. One set of questions that kept coming up concerned AI technology: Is it a threat or a blessing? How can we cooperate with it? How should we adapt to this emerging opportunity? Answering these questions is outside the scope of this project, but they are important to consider nonetheless.
To sum up, our project has a simple methodology, yet with the right targets and the right audiences, we generated much more than we expected. I believe the key to our success lies in the intricate network accumulated through our years of effort. It is through in-person connections that we generated many of our learnings in this project.
2. Documentation of your impact. Please use space below to share links that help tell your story, impact, and evaluation. (required)
Share links to:
- Project page on Meta-Wiki or any other Wikimedia project
- Dashboards and tools that you used to track contributions
- Some photos or videos from your event. Remember to share access.
You can also share links to:
- Important social media posts
- Surveys and their results
- Infographics and sound files
- Examples of content edited on Wikimedia projects
- Main project page - https://hackmd.io/@wikidata-tw/oclc2019report
- 1/4 talks event page - https://meta.wikimedia.org/wiki/Wikimedia_Taiwan/Wikidata_Taiwan/COSCUP_2025/%E9%8F%88%E7%B5%90%E9%96%8B%E6%94%BE%E8%B3%87%E6%96%99%E7%9A%84%E6%95%88%E7%9B%8A%E8%88%87%E6%8C%91%E6%88%B0_-_%E6%AA%A2%E8%A6%96_OCLC_%E5%B0%8D_Wikibase_%E7%9A%84%E5%AF%A6%E9%A9%97%E8%88%87%E6%88%90%E6%9E%9C#%E8%8F%AF%E8%AA%9E%28zh%29
- 2/4 talks event page - https://hackmd.io/@wikidata-tw/S14QHFf9ge
- 3/4 talks event page - https://g0v.hackmd.io/@jothon/20251207?_gl=1*1w3yw2t*_ga*Mzg0MDY2MjIxLjE3NTYxODE0MjM.*_ga_NGVZMM6DR6*czE3NzE1NjU1MDUkbzYxJGcxJHQxNzcxNTY1NTA1JGo2MCRsMSRoNzYyMjkwNjc4
- 4/4 talks event page - https://hackmd.io/@wikidata-tw/B1fayc4--x
Additionally, share the materials and resources that you used in the implementation of your project. (required)
For example:
- Training materials and guides
- Presentations and slides
- Work processes and plans
- Any other materials your team has created or adapted and can be shared with others
- Presentation Slides:
- https://docs.google.com/presentation/d/1jF0Mnig7Ft825pCz-5kYxBHONqg4_puw/edit?usp=drive_link&ouid=104904623846889968485&rtpof=true&sd=true
- https://docs.google.com/presentation/d/119ETjSa2GdzPWLJvWjXjIg-bS4fCA8Zn/edit?usp=drive_link&ouid=104904623846889968485&rtpof=true&sd=true
- Community Summary
3. To what extent do you agree with the following statements regarding the work carried out with this Rapid Fund? You can choose “not applicable” if your work does not relate to these goals. Required. Select one option per question. (required)
| A. Bring in participants from underrepresented groups | Strongly agree |
| B. Create a more inclusive and connected culture in our community | Strongly agree |
| C. Develop content about underrepresented topics/groups | Strongly agree |
| D. Develop content from underrepresented perspectives | Agree |
| E. Encourage the retention of editors | Neither agree nor disagree |
| F. Encourage the retention of organizers | Neither agree nor disagree |
| G. Increased participants' feelings of belonging and connection to the movement | Agree |
| F. Other (optional) |
Part 2: Learning
4. In your application, you outlined some learning questions. What did you learn from these learning questions when you implemented your project? How do you hope to use these learnings in the future? You can recall these learning questions below. (required)
You can recall these learning questions below: As this project stems from a practical need for more documentation and knowledge to support our work, those would be the most straightforward learnings we expected to achieve. Besides that, the main learning we expected to acquire is how to work with small language communities: What do they need? What would be useful? How can we help to facilitate a smoother experience to encourage further participation in the movement?
The learning, echoing the project, is two-fold. First, the OCLC white paper is valuable documentation for our effort: it is not only an authoritative source but also reflects many findings we have gathered through our years of community activities. The main goal of this translation project was achieved without much incident, though it does feed our desire for more quality resources to further expand our capabilities. The workflow is straightforward and requires little extra alteration. However, we believe that in the future we need to shift our focus to establishing a scalable workflow, since the biggest speed bump in this project was the limited translation power we have access to. In order to expand further and increase our capacity, establishing a replicable workflow becomes an important strategic goal. The result might also prove beneficial to our small language partners, since they have an even weaker translation capacity.
Speaking of which, another important learning is that during the translation period, we soon realized that the original text would be too much for the communities to handle, due to the sheer volume of text and the depth of many concepts. As expected, when the community partners finally received the community summary we distilled from the source, they still gave feedback that the text was too difficult to understand properly, and thus to translate. Though the translation was successful in the end, this exposes one of the core challenges we face in our movement to popularize technologies. The gap in technology literacy is big; it could be an enormous challenge for newcomers if they are not properly supported, and it could lead to early abandonment. This finding echoes our original argument calling for more documented resources. The situation is even more complex for indigenous and other small language communities, for they not only have limited resources to work with but also constantly need to create new words to accommodate the new ideas they encounter.
Small language communities, no matter where they come from, are often disadvantaged in acquiring science and other higher-education resources for various historical, cultural, and economic reasons, and the result of this challenge is clearly demonstrated in our project. The community translators all faced the same challenge of lexical gaps: missing words to properly reflect the original ideas. Though this is a dilemma common to every translator, it is more significant for small language communities, as their speakers are far less numerous in professional sectors than those of major languages such as English. Direct examples of this effect can be seen in words as simple as "data", "framework", or even "library", which often place an extra burden on the translator's shoulders.
This leads to the second part of our learning, gathered from the presentations. During the second talk, which we held in Nantou, we invited the translators to give a short presentation on their translations. This event was particularly valuable because we had a chance to see how multiple translators tackled the same material with different logic and different tactics to overcome the lexical gaps, reflecting the different personalities of each language group. From this experience, we can deduce many important pieces of information about the communities and how we should cooperate in the future. First, this kind of exercise is at the edge of our current capabilities for introducing new knowledge into small language communities; anything more professional or technical is almost certain to fail. This doesn't mean we cannot bring more knowledge into the communities, but we need to pre-process the information into more general language before they can properly digest it. Only after years of foundational work can we expect this issue to improve. Second, even though translation is a straightforward process, taking the meaning of a word from one language and reflecting it in another, it would be best to first establish a workflow guide for this kind of cooperation project. This is not to limit the creativity of the translators but to provide a supportive common ground for them to share ideas and support each other's exploration.
Overall, we learned that small language communities (particularly the indigenous communities in Taiwan) are facing complex issues to which there is no simple solution other than general infrastructural support, as they have yet to develop enough resources to sustain themselves. One of the biggest challenges we can clearly identify from this project is a big gap in documented resources and consensus between community members, resulting in very fundamental misalignment even at the personal level. One potential solution we propose is that, in future projects, we not only provide more intense support at the individual level but also establish community-level guidelines for the project, aiming to effectively shift the load from the participants to the community. Most importantly, through this process, we hope to slowly accumulate documents and build up knowledge within the community for future projects to thrive upon.
5. Did anything unexpected or surprising happen when implementing your activities? This can include both positive and negative situations. What did you learn from those experiences? (required)
As for the unexpected, though not totally outside our expectations, AI technology came up again and again in the discussions at our events. Whether among the general public, indigenous communities, or professional workers, everyone was worried about or interested in this newly emerging technology. The discussion can be boiled down to a few topics: 1. How would Wikidata/Wikipedia's efforts work with AI tools such as LLMs? 2. Are there experiments in the Wikimedia movement to integrate AI tools with Wikidata/Wikipedia data? 3. How would AI tools impact our work? Are they a threat or helping hands?
As these topics are clearly outside our expertise, we cannot offer answers beyond the general idea that AI tools are now an inevitable part of the future; we need to learn them whether we like it or not. Experimentation is necessary, yet implementation needs more research to prove its reliability. From a purely technical point of view, Wikidata should be one of the most important infrastructures, if not the most important, in the era of generative AI, for it can serve as the bridge between abstract LLM models and human-backed information. However, this inevitably leads to the issue that Wikidata as a database has yet to reach its full potential: many gaps remain unfilled, and many experiments remain to be explored. We need more people to start sharing and improving the dataset, and most importantly to start finding innovative and practical uses for it.
Besides this, many more minor points of interest came up throughout our events; we recommend watching our event recordings to explore each event in full. If we have more time, we will also gradually publish our learnings on Wikimedia Diff for everyone to read.
6. What is your plan to share your project learnings and results with other community members? If you have already done it, describe how. (required)
Besides common channels such as our Meta page, Facebook, and Twitter, we document and share our learnings mainly through Diff posts in both Chinese and English. We hope these records can fuel our future talks with other communities, or simply act as the digital assets of our community and keep bringing exposure to our work.
The following are the posts about the events we held in this project:
- https://diff.wikimedia.org/2025/09/09/coscup-20th-anniversary/
- https://diff.wikimedia.org/2025/11/05/wikidata-taiwan-13th-birthday-taiwan-special-event-recap-part-2/
- https://diff.wikimedia.org/2026/01/08/open-chiayi-civic-tech-community-gathering/
- https://diff.wikimedia.org/2026/02/13/the-spark-of-linked-data-and-libraries-oclc-passage-project-white-paper-translation-report/
Part 3: Metrics
7. Wikimedia Metrics results. (required)
In your application, you set some Wikimedia targets in numbers (Wikimedia metrics). In this section, you will describe the achieved results and provide links to the tools used.
| Metric | Target | Results | Comments and tools used |
|---|---|---|---|
| Number of participants | 45 | 64 | We translated a document into Chinese, 3 dialects of Seediq, and Taiwanese. Each language has 1 main translator. Also, after the translation, we hosted 4 talks, which reached an audience of at least 60 more people. |
| Number of editors | 0 | 0 | Though we didn't introduce new editors into the movement, we did introduce the general editorial process of Wikidata at each talk. |
| Number of organizers | 2 | 2 | |
| Wikimedia project | Target | Result - Number of created pages | Result - Number of improved pages |
|---|---|---|---|
| Wikipedia | | | |
| Wikimedia Commons | 40 | 40 | |
| Wikidata | 0 | | |
| Wiktionary | | | |
| Wikisource | | | |
| Wikimedia Incubator | | | |
| Translatewiki | | | |
| MediaWiki | | | |
| Wikiquote | | | |
| Wikivoyage | | | |
| Wikibooks | | | |
| Wikiversity | | | |
| Wikinews | | | |
| Wikispecies | | | |
| Wikifunctions or Abstract Wikipedia | | | |
8. Other Metrics results.
In your proposal, you could also set Other Metrics targets. Please describe the achieved results and provide links to the tools used if you set Other Metrics in your application.
| Other Metrics name | Metrics Description | Target | Result | Tools and comments |
|---|---|---|---|---|
9. Did you have any difficulties collecting data to measure your results? (required)
Yes
9.1. Please state what difficulties you had. How do you hope to overcome these challenges in the future? Do you have any recommendations for the Foundation to support you in addressing these challenges? (required)
The main contribution of our project is the translated document and the subsequent talks and reports generated directly from it. The exact impact of such a project is highly qualitative and difficult to estimate with simple counts of participants or edits. Since we mainly post our reports on Wikimedia Diff, it would be very helpful to have access to more metrics for understanding the impact of our posts on the site, such as traffic counts or other data (which, as far as I'm aware, are quite limited).
Part 4: Financial reporting
10. Please state the total amount spent in your local currency. (required)
159194.13
11. Please state the total amount spent in US dollars. (required)
5136.63
12. Report the funds spent in the currency of your fund. (required)
Provide the link to the financial report https://docs.google.com/spreadsheets/d/1yw9abJQ585eMZjKU6FBiyO4xwDBvhx6Did7WN7Taonc/edit?usp=sharing
12.2. If you have not already done so in your financial spending report, please provide information on changes in the budget in relation to your original proposal. (optional)
There's no change in the budget. However, due to exchange rate fluctuation across the project timeline, the budget was "inflated" over the originally applied amount, and the exact number has some fuzziness. Overall, the exchange loss is estimated at around 7k NTD (about US$200), which basically eats up all the leftover budget.
13. Do you have any unspent funds from the Fund?
No
13.1. Please list the amount and currency you did not use and explain why.
N/A
13.2. What are you planning to do with the underspent funds?
N/A
13.3. Please provide details of how you hope to spend these funds.
N/A
14.1. Are you in compliance with the terms outlined in the fund agreement?
Yes
14.2. Are you in compliance with all applicable laws and regulations as outlined in the grant agreement?
Yes
14.3. Are you in compliance with provisions of the United States Internal Revenue Code (“Code”), and with relevant tax laws and regulations restricting the use of the Funds as outlined in the grant agreement? In summary, this is to confirm that the funds were used in alignment with the WMF mission and for charitable/nonprofit/educational purposes.
Yes
15. If you have additional recommendations or reflections that don’t fit into the above sections, please write them here. (optional)
Since the main learning section focused mainly on the indigenous communities, I hope to use this section to talk about the other three events and the learnings generated in them.
We start with COSCUP, which was the least specialized event and whose audience was not particularly interested in Wikidata. The presentation focused mainly on the broad strokes of Wikidata as an infrastructure for recording information in a more efficient manner. Building on this, in Chiayi, where we had a more focused audience, we advanced our presentation to dive deeper into the technical aspects of Wikidata and open data as a whole. In both cases, we successfully sparked more engagement among the audience, and we have established a first connection with one of the local communities, which has a similar vision and precious resources that we have yet to acquire. How far this cooperation can go remains uncertain. Finally, at the last event, we worked with the Light Box Library, a community focused on curating Taiwan's photography resources, to explore the OCLC white paper even more deeply and examine how image documents can be recorded in Wikidata. The discussion sparked an even deeper conversation on the current situation between cultural workers and generative AI tools. Whether they came from interest, worry, or curiosity, these questions all deserve a project of their own to explore.
To sum up, this project is our first attempt to push our movement into the next stage. Previously we mainly focused on practical, hands-on projects that were highly task-oriented. This isn't bad in any sense; only first-hand experience can generate the most authentic learning and memories. However, this method has a very obvious downside, which is that only the participants of a project can learn from it. Therefore, we expect to close this gap by increasing our effort on documentation and focusing on transforming our accumulated learnings and experiences into documents for more people to see, read, and understand. Our strategy is to restore our outreach capabilities to the pre-COVID level by increasing our exposure to the public one post at a time. This is also foundational work that must be done if we ever wish to strengthen our onboarding process for the community. After all, conversion rate is the most important index, and motivation is the key.