Grants talk:IEG/Semi-automatically generate Categories for Vietnamese Wikipedia/Final

Final Report Approved

Dear Alphama,

Thank you for submitting this Final Report! Thank you for taking the time to write up the outcomes of your project. I am approving your report with the following comments:

  • Congratulations on completing your project. It's wonderful that your project resulted in 4000 contributions to Vietnamese Wikipedia.
  • Sometimes projects take longer than expected, for a variety of reasons. We understand that things don't always go as planned, and we are so glad that you found a way to finish your project anyway. Since this project took longer than you originally planned, I wonder if you have any feedback about what kind of challenges you encountered, and whether there is any support you might have needed that would have helped your project go more smoothly?
  • Thanks for your efforts to engage with the editor community around this project. We think this is especially important with semi-automatically generated content. It's very interesting that you emailed more than 1000 editors to invite them to give input but just over 27 members responded with their opinions. Do you have any ideas about why more people didn't respond? Do you have any insight into how participants in the Vietnamese Wikipedia community generally communicate with each other? Of the 27 people who did respond, what kind of feedback did you get back? Did the guidelines you summarized for category creation mostly come from the professor, or did community members have opinions?
  • Do you need support learning how to use pywikibot?
  • What are the maintenance needs you foresee for the Alphama Category tool going forward? How do you anticipate maintenance will be handled?
  • Have you created documentation of the tool that would allow other software developers to be able to understand, replicate or modify the tool in the future if needed? If so, have you linked your documentation to this report and made sure it is discoverable to other software developers? (You may have already provided a link in your report and if so, I apologise for asking this question. I'm not technically trained myself and don't always recognize what is being presented).
  • Thank you for your insights about how to handle translation conflicts. Can you say a little more about how you handled it when a category name contained terms/words that did not appear in the dictionary?

Congratulations again on finishing this project. We appreciate your contributions to improve Vietnamese Wikipedia.

Warm regards,

--Marti (WMF) (talk) 18:58, 15 October 2020 (UTC)

Do you have any ideas about why more people didn't respond? Do you have any insight into how participants in the Vietnamese Wikipedia community generally communicate with each other? Of the 27 people who did respond, what kind of feedback did you get back? Did the guidelines you summarized for category creation mostly come from the professor, or did community members have opinions?
There is not much attention paid to the category taxonomy in our community. I have observed that most editors care more about article content, and some think that category taxonomy is a task for bots. The 27 editors discussed the naming conventions for categories in different cases; you can see the discussions here: [1], [2]
Do you need support learning how to use pywikibot?
Actually, I don't need this. My major is NLP, so pywikibot is relatively easy for me.
What are the maintenance needs you foresee for the Alphama Category tool going forward? How do you anticipate maintenance will be handled?
I will upload the code to GitHub, where any editor can follow the project. I managed to switch my platform from .NET to Python, which makes the tool easier for developers to develop and maintain, because Python is easy to set up and run.
Have you created documentation of the tool that would allow other software developers to be able to understand, replicate or modify the tool in the future if needed? If so, have you linked your documentation to this report and made sure it is discoverable to other software developers? (You may have already provided a link in your report and if so, I apologize for asking this question. I'm not technically trained myself and don't always recognize what is being presented).
The tool itself contains guidelines in its Help section, so anybody can understand how to use it. I have not created development documentation.
Thank you for your insights about how to handle translation conflicts. Can you say a little more about how you handled it when a category name contained terms/words that did not appear in the dictionary?
For translation conflicts, my tool skips these cases, because they may affect the category taxonomy and could draw a lot of complaints from the community (if any). Terms/words that are not in the dictionary are a difficult case. I rely on how the terms appear on the Internet (Google or elsewhere), and then I either discuss them with the community or set a label such as "this category name may not have the correct name, please discuss or set delete label if you find this a mistake..."
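The handling described above could be sketched roughly as follows. This is only an illustration and not the tool's actual code: the function `translate_category`, the `Result` type, the status names, and the toy dictionary are all hypothetical, and the sketch ignores word order and multi-word terms, which a real translation step would have to handle.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional

# Label text taken from the reply above; where it is attached is an assumption.
REVIEW_LABEL = "this category name may not have the correct name, please discuss"


@dataclass
class Result:
    status: str            # "translated", "skipped", or "needs_review"
    name: Optional[str]    # proposed category name, if one was produced
    note: str = ""


def translate_category(name: str, dictionary: Dict[str, List[str]]) -> Result:
    """Translate a category name word by word (hypothetical helper).

    `dictionary` maps a lowercase source term to its candidate translations.
    """
    parts: List[str] = []
    unknown: List[str] = []
    for term in name.split():
        candidates = dictionary.get(term.lower(), [])
        if len(candidates) > 1:
            # Translation conflict: skip the whole category, create nothing.
            return Result("skipped", None, f"conflict on '{term}'")
        if not candidates:
            # Term missing from the dictionary: keep it untranslated for now.
            unknown.append(term)
            parts.append(term)
        else:
            parts.append(candidates[0])
    proposed = " ".join(parts)
    if unknown:
        # Missing terms: propose a name but flag it for community discussion.
        return Result("needs_review", proposed, REVIEW_LABEL)
    return Result("translated", proposed)


# Toy dictionary for illustration only.
toy = {"rivers": ["sông"], "bank": ["ngân hàng", "bờ"]}
print(translate_category("Rivers", toy).status)
print(translate_category("Bank", toy).status)
print(translate_category("Rivers Foo", toy).status)
```

In this sketch a conflict anywhere in the name takes priority over missing terms, so a category with both problems is skipped rather than labeled.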

I will continue to develop this tool to be more automatic and easier to use for editors. Thank you! Alphama (talk) 09:56, 16 October 2020 (UTC)