Grants:IEG/Web Application to automate frequent tasks for Tamil Wikisource and Tamil Wiktionary

From Meta, a Wikimedia project coordination wiki
status: not selected
Web Application to automate frequent tasks for Tamil Wikisource and Tamil Wiktionary
summary: A web application that lets users make repetitive changes to, or bulk-edit, multiple wiki pages and carry out other frequent tasks.
target: Tamil Wiktionary and Tamil Wikisource
strategic priority: increase participation
amount: 8500 USD
contact: tshrinivasan@gmail.com
this project needs: volunteer
created on: 19:31, 11 April 2016 (UTC)

Project idea

What is the problem you're trying to solve?

I recently developed OCR4wikisource (https://github.com/tshrinivasan/OCR4wikisource), a tool that integrates Google OCR with the Indic Wikisource projects, in response to the request at https://phabricator.wikimedia.org/T120788. Using this tool, the Indic Wikisource projects have added around 400,000 pages.

Because these pages come from OCR, the text needs repeated replacements, cleanup, markup additions, and similar fixes.

In the Tamil Wiktionary and Tamil Wikisource projects, these changes have to be made across multiple pages, or across all pages in a given category. Windows users rely on AutoWikiBrowser (AWB) for this purpose. GNU/Linux users have Pywikibot-based command-line tools, but many beginners find command-line tools hard to work with, and Pywikibot has a steep learning curve for automation.

What is your solution?

I plan to write a web application in Python to help the Tamil Wikisource and Tamil Wiktionary communities.

Because it is a web application, users on Windows, Mac OS X, GNU/Linux, or any other operating system can use it easily.

Project goals

To provide a tool that automates repetitive edit tasks, so that contributors can spend their time on work that genuinely needs human attention.

The following are frequent requests from the Tamil Wikisource and Tamil Wiktionary communities. Most of these tasks are currently done by hand and consume a large number of volunteer hours.

  1. Add text at the top or bottom of pages
  2. Find and replace sets of words supplied as CSV files
  3. Create Wiktionary pages from CSV files
  4. Change the proofread quality status for a specific set of pages
  5. Find and report broken links on pages
  6. Add a message or template to lonely (orphaned) pages and short pages
  7. Automatically welcome new users
  8. Daily/weekly/monthly reports on contributor statistics
  9. User statistics for contributions during special events such as edit-a-thons
  10. Send bulk messages to notify users of events or announcements

Pages can be specified individually, taken from given categories, or selected by URL pattern matching.

The tool will provide a web interface through which users can design their own solutions for automating repetitive tasks.

The application can be hosted on Tool Labs so that anyone can access it online.
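As an illustration of the pattern-based page selection described above, here is a minimal sketch. It assumes shell-style wildcard patterns matched against page titles; the function name `select_pages` and the sample titles are hypothetical, and the proposal does not say which pattern syntax the tool would use.

```python
from fnmatch import fnmatchcase


def select_pages(titles, patterns):
    """Return the titles that match at least one wildcard pattern, in input order."""
    return [t for t in titles if any(fnmatchcase(t, p) for p in patterns)]


# Hypothetical page titles, for illustration only.
titles = [
    "Page:Book1.pdf/1",
    "Page:Book1.pdf/2",
    "Page:Book2.pdf/1",
    "Index:Book1.pdf",
]

# Select every scanned page of one book.
selected = select_pages(titles, ["Page:Book1.pdf/*"])
```

Category-based selection would need a call to the wiki itself and is left out of this sketch.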

Project plan

Activities

Phase 1: Build this web application to do frequent wiki edit tasks.

Add the following features.

  • Add text at the top or bottom of pages
  • Find and replace sets of words supplied as CSV files
  • Change the proofread quality status for a specific set of pages
  • Create Wiktionary pages from CSV files
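
The first two features in the list above operate on plain page text, so their core can be sketched without any wiki access. The following is a minimal sketch, not the proposal's implementation: the function names, the two-column `find,replace` CSV layout, and the sample template and category names are all assumptions made for illustration. Fetching and saving pages (for example through Pywikibot, which the proposal mentions as the existing command-line option) is left out.

```python
import csv
import io


def load_replacements(csv_text):
    """Parse rows of the assumed form 'find,replace' into a list of pairs."""
    reader = csv.reader(io.StringIO(csv_text))
    # Skip blank rows and rows with an empty search string.
    return [(row[0], row[1]) for row in reader if len(row) >= 2 and row[0]]


def apply_replacements(text, pairs):
    """Apply each literal find/replace pair to the text, in order."""
    for find, replace in pairs:
        text = text.replace(find, replace)
    return text


def add_text(text, top="", bottom=""):
    """Place `top` before and `bottom` after the page text, one per line."""
    return "\n".join(part for part in (top, text, bottom) if part)


# Hypothetical OCR corrections and page text.
csv_text = "teh,the\nrecieve,receive\n"
pairs = load_replacements(csv_text)
cleaned = apply_replacements("teh reader will recieve teh book", pairs)

# Hypothetical header template and category.
final = add_text(cleaned, top="{{header}}", bottom="[[Category:Proofread]]")
```

Literal replacement also changes matches inside longer words, so a real tool would probably want a preview step before saving.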

Test and Deploy.

Phase 2:

Add the following features.

  • Find and report broken links on pages
  • Add a message or template to lonely (orphaned) pages and short pages
  • Automatically welcome new users
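
The broken-link report in the list above could, under one reading, cover internal wikilinks that point to pages that do not exist. The sketch below assumes that reading and assumes the set of existing titles has already been obtained from the wiki; the function names and sample text are hypothetical. It ignores title normalization, namespaces such as File: or Category:, and external links.

```python
import re

# Capture the target of [[Target]], [[Target|label]] or [[Target#section]].
WIKILINK = re.compile(r"\[\[([^\[\]|#]+)")


def linked_titles(wikitext):
    """Return link targets without any |label or #section part."""
    return [target.strip() for target in WIKILINK.findall(wikitext)]


def broken_links(wikitext, existing_titles):
    """Return link targets that are absent from the known set of titles."""
    return [t for t in linked_titles(wikitext) if t not in existing_titles]


# Hypothetical page text and title set.
wikitext = "See [[Chennai]], [[Madurai|the temple city]] and [[Missing page#History]]."
existing = {"Chennai", "Madurai"}
missing = broken_links(wikitext, existing)
```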

Test and Deploy.

Phase 3:

Add the following features.

  • Daily/weekly/monthly reports on contributor statistics
  • User statistics for contributions during special events such as edit-a-thons
  • Send bulk messages to notify users of events or announcements
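
The contributor reports in the list above reduce to counting edits per user within a date range. The sketch below assumes revision records are already available as dictionaries with `user` and `day` keys; that record shape, the function name, and the sample data are assumptions, not something the proposal specifies.

```python
from collections import Counter
from datetime import date


def edits_per_user(revisions, start, end):
    """Count edits per user for revisions dated from start to end, inclusive."""
    counts = Counter()
    for rev in revisions:
        if start <= rev["day"] <= end:
            counts[rev["user"]] += 1
    return counts


# Hypothetical revision records.
revisions = [
    {"user": "UserA", "day": date(2016, 4, 11)},
    {"user": "UserB", "day": date(2016, 4, 12)},
    {"user": "UserA", "day": date(2016, 4, 13)},
    {"user": "UserA", "day": date(2016, 5, 1)},
]

# Weekly report for 11 to 17 April 2016, most active contributors first.
report = edits_per_user(revisions, date(2016, 4, 11), date(2016, 4, 17))
ranking = report.most_common()
```

The same function covers daily, monthly, or event-specific reports by changing the start and end dates.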

Test and Deploy.

Budget

  • Project Manager : 600 USD/Month
  • Developer : 400 USD/Month
  • Tester : 400 USD/Month
  • Packager : 300 USD/Month
  • Total : 1700 USD/Month

The project will be completed in 5 months.

Total budget: 1700 USD/month x 5 months = 8500 USD


Community engagement

Before developing the application, I will gather input from the Tamil Wiktionary and Tamil Wikisource communities on requirements, UI design, and usage patterns.

I will engage community members as testers to provide feedback. Improvements will be communicated to them through the village pumps every week, and new features and bug fixes will be added based on their feedback.

Sustainability

The tool will be available for the community to develop further. We will provide support and maintenance for one more year, and will try to bring in more developers by training new contributors.

Measures of success

Increase the number of automation tool users in the Tamil Wiktionary and Tamil Wikisource communities by 50%, since most contributors currently make repetitive changes manually.

Get involved

Participants

tshrinivasan - A Python developer for 3 years and the developer of OCR4wikisource, a tool that integrates Google OCR with the Indic Wikisource projects. https://github.com/tshrinivasan/OCR4wikisource

Community notification

Please paste links below to where relevant communities have been notified of your proposal, and to any other relevant community discussions.

https://ta.wikipedia.org/wiki/விக்கிப்பீடியா:ஆலமரத்தடி_%28தொழினுட்பம்%29#.E0.AE.A4.E0.AE.BE.E0.AE.A9.E0.AE.BF.E0.AE.AF.E0.AE.99.E0.AF.8D.E0.AE.95.E0.AE.BF_.E0.AE.B5.E0.AE.BF.E0.AE.95.E0.AF.8D.E0.AE.95.E0.AE.BF_.E0.AE.95.E0.AE.B0.E0.AF.81.E0.AE.B5.E0.AE.BF_-_IEG

Endorsements

Do you think this project should be selected for an Individual Engagement Grant? Please add your name and rationale for endorsing this project below! (Other constructive feedback is welcome on the discussion page).

  • Shrinivasan has demonstrated his commitment and skills for Wikimedia projects time and again. I am very hopeful that an AWB-like tool with more features for GNU/Linux users will be very useful. Ravi (talk) 08:44, 16 April 2016 (UTC)