Grants:Project/Rapid/Chlod/Contributor copyright investigation tool

status: funded
Chlod/Contributor copyright investigation tool
Develop a userscript/gadget to improve the contributor copyright investigations case handling workflow.
target: English Wikipedia, possibly all Wikimedia wikis
start date: March 20 April 30
end date: May 20 August 31
budget (local currency): PHP ~102,000
budget (USD): 2000
grant type: individual
grantee: Chlod
contact(s): wiki(_AT_)chlod.net



Project Goal

Briefly explain what are you trying to accomplish with this project, or what do you expect will change as a result of this grant. Example goals include, "recruit new editors", "add high quality content", or "train existing editors on a specific skill".

Copyright enforcement is a policy on the English Wikipedia, with the standard instructions being to remove any copyright violations found and request revision deletion. The Copyright Cleanup WikiProject has been dealing with English Wikipedia Contributor copyright investigation (CCI) cases since late 2009. With over 192 active cases, 181,885 pages to be checked (218,621 in total), numerous images uploaded locally and on Wikimedia Commons, and a continuously growing list of cases, CCI is one of the English Wikipedia's largest backlogs. Case requests and openings are still on a yearly upward trend (with the exception of last year's net -3), with even more cases (particularly those of accounts which have been blocked for persistent copyright violations) waiting to be started and filed.
Anyone (who does not have a history of copyright issues) can work on CCI cases. This process is tedious, however, with much of the work requiring scanning through many diffs, identifying infringing text, and manually editing CCI casepages. Many tools and userscripts aid editors in removing copyright violations, but none of them provides a decent workflow experience. Such a script was even requested on Wikipedia:User scripts/Requests some time ago. Although it is currently infeasible to automatically check every page for copyright violations, a specialized userscript or gadget can improve the workflow of CCI editors by making certain processes efficient and semi-automated, much like how userscripts and gadgets currently exist to speed up the process of dealing with vandalism.
I've been part of the Copyright Cleanup project since last year and I've worked on a couple of cases myself, although most of my recent copyright work has been patrolling WikiProject Tropical cyclones (which has an active CCI case). I've also already developed two userscripts (1, 2) that make dealing with cases (and copyright violations in general) easier. With the proper investment of time and effort, I will be able to create a centralized tool that integrates the features of both and adds new features that make CCI editing more like ticking off a checklist of tasks.
With the developed userscript/gadget, I hope to encourage more editors to take on the responsibility of cleaning up copyright violations and to assist not only the English Wikipedia but also other wikis which suffer from severe copyright problems. Inviting other wikis to use the tool, however, would likely be passive rather than proactive, as it is not possible to spearhead and supervise the CCI operations of every Wikimedia wiki.

Project Plan

Activities

Tell us how you'll carry out your project. What will you and other organizers spend your time doing?

During the first month, I'll be rapidly prototyping and developing workflow improvements for CCI.
  • Improve and develop my existing Parsoid document handling library for browsers
  • Create the initial CCI case handling components of the tool (see phab:M324 for mockups)
  • Add the ability to deal with uploaded images (both on local wikis and on Wikimedia Commons)
  • Integrate my existing CopiedTemplateEditor (for modifying {{copied}} on Wikipedia) and InfringementAssistant (for quickly filing entries to Wikipedia:Copyright problems) scripts into the new tool
  • Expand the CopiedTemplateEditor module to graphically edit other attribution templates (e.g., {{translated page}}, {{merged-to}}, {{merged-from}})
  • Expand the InfringementAssistant module to allow for more fine-tuned control over which sections need to be hidden with the {{copyvio}} template
  • Improve the state of copyright workflow templates on Wikipedia
  • Add the ability to search a user's talk page for any copyright-related warnings and determine when and who gave the warning
  • Add the ability to fine-tune the script's features by adding customization options (as I think editors best perform when given a tool that suits them, not when suiting themselves for a tool)
  • Develop an interactive guide to handling copyright problems, akin to the enwiki ACC Wizard, albeit integrated with the script and accessible as a popup or dialog on all pages (to be expanded upon after the grant end date)
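As a rough illustration of the talk page warning search listed above, the following is a minimal sketch and not the tool's actual implementation. It assumes the talk page wikitext has already been fetched, that warnings can be spotted by keywords or by the HTML comment that substituted warning templates are assumed to leave behind, and that signatures follow the usual user link plus UTC timestamp pattern. The names `CopyrightWarning`, `findCopyrightWarnings`, and `sampleTalkPage` are hypothetical.

```typescript
// Hypothetical shape for one detected warning; not part of any existing script.
interface CopyrightWarning {
  section: string;          // heading of the talk page section
  warnedBy: string | null;  // username taken from the first signature in the section
  timestamp: string | null; // raw signature timestamp, e.g. "12:34, 5 March 2021 (UTC)"
}

// Assumed markers of a copyright-related warning (keywords or a template comment).
const WARNING_PATTERN = /copyright|copyvio|<!--\s*Template:uw-copy/i;

// A user (or user talk) link followed later on the same line by a signature timestamp.
const SIGNATURE_PATTERN =
  /\[\[User(?: talk)?:([^|\]\/#]+)[^\]]*\]\][^\n]*?(\d{2}:\d{2}, \d{1,2} [A-Z][a-z]+ \d{4} \(UTC\))/i;

function findCopyrightWarnings(wikitext: string): CopyrightWarning[] {
  const warnings: CopyrightWarning[] = [];
  // Splitting on level-2 headings with a capture group yields
  // [preamble, heading1, body1, heading2, body2, ...].
  const parts = wikitext.split(/^==[ \t]*([^=\n].*?)[ \t]*==[ \t]*$/m);
  for (let i = 1; i < parts.length; i += 2) {
    const heading = parts[i];
    const body = parts[i + 1] ?? "";
    if (!WARNING_PATTERN.test(heading + "\n" + body)) continue;
    const sig = SIGNATURE_PATTERN.exec(body);
    warnings.push({
      section: heading,
      warnedBy: sig ? sig[1].trim() : null,
      timestamp: sig ? sig[2] : null,
    });
  }
  return warnings;
}

// Made-up talk page content used only for demonstration.
const sampleTalkPage = [
  "== Welcome ==",
  "Hello! [[User:Alice|Alice]] 10:00, 1 January 2021 (UTC)",
  "== Copyright problem on Some article ==",
  "<!-- Template:uw-copyright -->Please do not add copyrighted text. " +
    "[[User:Example|Example]] ([[User talk:Example|talk]]) 12:34, 5 March 2021 (UTC)",
].join("\n");

console.log(findCopyrightWarnings(sampleTalkPage));
```

Keyword matching of this kind would likely produce false positives and miss unsigned or custom-worded warnings, so a real implementation would probably need a curated list of warning templates per wiki.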
During the second month, I'll be continuing work on the tool from the past month, along with the following:
  • Expand the tool further based on requested features from tool users
  • Prepare image copyright modules for standalone use on Wikimedia Commons
  • Consult different communities and identify ways to expand the script and support other wikis with copyright problems
  • Provide localization and internationalization options for the tool
  • Gather feedback from users and identify potential points of improvement, including parts that can be made faster, more efficient, and built upon
Everything to be developed will be released under permissive software licenses, namely the MIT/Expat License and the Apache License 2.0. Like my other scripts, I will be using the OOUI library for developing the user interface, so that no extra dependencies are required. The time required is based on my experience working with OOUI in userscripts and in drafting solutions for certain parts of the CCI workflow (which have taken half a month combined for both CopiedTemplateEditor and InfringementAssistant). Given the possible scope of the project, my ongoing academics, and the fact that I'm part of other volunteer projects in active development, I expect to require more than a month to implement all features of the tool, and at least 2 to 3 weeks to allow for community consultation and development of requested features.
Rest assured, even after the grant's end date, I will continue supporting this project as a maintainer and as a CCI editor. Userscript development (and open-source software development in general) is a strong passion of mine, and one that I intend to keep for the foreseeable future.

How will you let others in your community know about your project (please provide links to where relevant communities have been notified of your proposal, and to any other relevant community discussions)? Why are you targeting a specific audience?

WikiProject Copyright Cleanup and other interested editors are targeted as they are going to be the primary beneficiaries of the created tools, and because I require their feedback as part of developing the tool.

What will you have done at the end of your project? How will you follow-up with people that are involved with your project?

  • Have made a long-lasting tool that aids editors in dealing with copyright violations on-wiki
  • Have made a modular tool which can allow users to only load specific parts of the tool if they so wish (to optimize loading times)
  • Have invited more editors to contribute in the copyright cleanup field
  • Have helped in decreasing the number of open cases and bringing the backlog to a continuous net negative in case count
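The modular loading goal above could be approached in many ways. The following is only a sketch under stated assumptions: the `ToolModule` contract, the `ModuleRegistry` class, and the module names are all hypothetical, and a real userscript would more likely fetch each module's code asynchronously rather than construct it from an in-memory factory.

```typescript
// Hypothetical module contract; the real tool may structure this differently.
interface ToolModule {
  name: string;
  start(): void;
}

type ModuleFactory = () => ToolModule;

class ModuleRegistry {
  private factories = new Map<string, ModuleFactory>();
  private loaded = new Map<string, ToolModule>();

  // Registering a module only stores its factory; nothing is constructed yet.
  register(name: string, factory: ModuleFactory): void {
    this.factories.set(name, factory);
  }

  // Construct and start only the modules the user enabled.
  // Unknown or already-loaded names are skipped. Returns the names started.
  loadEnabled(enabled: string[]): string[] {
    const started: string[] = [];
    for (const name of enabled) {
      const factory = this.factories.get(name);
      if (!factory || this.loaded.has(name)) continue;
      const mod = factory();
      mod.start();
      this.loaded.set(name, mod);
      started.push(name);
    }
    return started;
  }

  isLoaded(name: string): boolean {
    return this.loaded.has(name);
  }
}

// Example wiring with made-up module names and a log of start() calls.
const startLog: string[] = [];
const registry = new ModuleRegistry();
for (const name of ["case-handler", "copied-template-editor", "infringement-assistant"]) {
  registry.register(name, () => ({ name, start: () => { startLog.push(name); } }));
}

// A user who only wants two of the three modules.
const startedModules = registry.loadEnabled(["case-handler", "infringement-assistant"]);
console.log(startedModules, startLog);
```

Keeping construction behind a factory means a disabled module costs only a map entry, which is the property the loading-time goal depends on.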

Are you running any in-person events or activities?

No in-person events and activities will be done.

Impact

How will you know if the project is successful and you've met your goals?

  • Increased number of WikiProject Copyright Cleanup participants
  • Increased efficiency of CCI case handling and cleanup
  • Reduction of extant copyright violations on the English Wikipedia
  • Have more CCI cases closed and finished in 2022 than requested and opened
  • Encourage other wikis to deal with copyright-infringing text in a systematic manner

Resources

What resources do you have? Include information on who is organizing the project, what they will do, and if you will receive support from anywhere else (in-kind donations or additional funding).

I have myself, my personal computer, and the skills I've gained from developing Wikipedia userscripts (particularly mastery of the OOUI library) and RedWarn, a counter-vandalism script for the English Wikipedia (particularly mastery of UI design and userscript optimization). Aside from unofficial comments and advice given by other volunteer editors, I will not receive support (particularly in the form of financial support) from anywhere else for this project.

What resources do you need? For your funding request, list bullet points for each expense:

  • USD 2000 for 2 months of software development time and effort, akin to previous tool-related Rapid grants (1, 2, 3).

Endorsements

  • The English Wikipedia desperately needs more tools (and volunteers!) for combating copyright violations—this tool is being designed and built with the CCI team in mind, and should absolutely be funded to allow Chlod the time and space to develop this -- TNT (talk • she/her) 21:25, 6 February 2022 (UTC)
    Hi, TNT! I've made some substantial changes to the grant proposal since you've endorsed. Mind taking a look again and seeing if all the changes were alright with you? Chlod (say hi!) 02:24, 10 February 2022 (UTC)
    I reaffirm my endorsement :) -- TNT (talk • she/her) 02:53, 10 February 2022 (UTC)
  • * Pppery * it has begun 02:29, 10 February 2022 (UTC)
  • I endorse this 100%. I've been wanting a CCI script for a while and this is perfect. It will be great for us editors which work on CCIs. Thanks Chlod for creating this! The4lines (talk) 02:30, 10 February 2022 (UTC)
  • Thank you and stay sane. If the code isn't a jungle, the copyvio is. Sennecaster (talk) 02:35, 10 February 2022 (UTC)
  • Endorse While I personally don't work in CCI backlogs a lot myself, I have seen enough cases of Copyvio to know what a big problem it can be. I've also seen similar tools in action that massively ease user functionality, making processing backlogs much much easier. The details of the proposal look reasonable, and I see enough details in here to be convinced of the submittee's technical capabilities. Such a tool would encourage me to check out CCI backlog as well. Wholeheartedly endorse. Soni (talk) 02:39, 10 February 2022 (UTC)
  • Endorse I like to think this will help process the CCI for the WikiProject I am involved in. Nova Crystallis (Talk) 04:00, 10 February 2022 (UTC)
  • Very helpful for the backlog Qwerfjkl (talk) 07:15, 10 February 2022 (UTC)
  • Endorse, Chlod has already been incredibly helpful in the CCI community with InfringementAssistant. His record of creating countless other scripts shows he has the skills required for such a project, which would greatly help the CCI community (and good luck!) Berrely • TC 07:20, 10 February 2022 (UTC)
  • Endorse. Everything to help with copyright problems is welcome. I can't support this critical work by myself. MER-C 20:40, 10 February 2022 (UTC)
  • Endorse. A tool like this has been overdue for a long time, and Chlod has much previous experience. Canvassing disclosure: I was linked to this page on Discord, however comments are my own and I was not encouraged to endorse or make any comments in one way or another. EpicPupper (talk) 00:12, 11 February 2022 (UTC)
  • Endorse, I do think that this tool will be vital for copyright investigations, especially at smaller wikis where it is almost nonexistent. (I just wander around and found this project, so no canvassing.) CactiStaccingCrane (talk) 09:28, 13 February 2022 (UTC)
  • Endorse - anything that makes CCI easier is something we want. Chlod is an experienced userscript dev, and I have full confidence in their ability to see this project through. firefly ( t · c ) 10:54, 13 February 2022 (UTC)
    • Forgot to say - if I can be useful in any way (assisting with development or beta testing), please ping me :) firefly ( t · c ) 10:55, 13 February 2022 (UTC)
      • Will do, of course. ;) Chlod (say hi!) 04:10, 16 February 2022 (UTC)
  • Endorse - Copyvios are serious issues that affect all of us. There is no denying that Chlod is an extremely skilled open source developer who will utilise the funds well. Ed6767 (talk) 01:14, 17 February 2022 (UTC)
  • I could have sworn I already did this, but absolutely endorse this Asartea Talk (Enwiki Talk (preferred)) 17:18, 24 February 2022 (UTC)