Jump to content

Community Wishlist Survey 2020/Wikisource/Improve workflow for uploading books to Wikisource

From Meta, a Wikimedia project coordination wiki

Improve workflow for uploading books to Wikisource

  • Problem:
Uploading books to Wikisource is difficult.
In the current workflow you need to upload the file on Commons, then go to Wikisource and create the Index page (and you need to know the exact URL). :The files need to be DJVU, which has different layers for the scan and the text. This is important for tools like Match & Split (if the file is a PDF, this tool doesn't work).
More importantly, the current workflow (especially for library uploads) includes Internet Archive, and the famous IA-Upload tool. This tool is now fundamental for many libraries and uploaders, but it has several issues.
As Internet Archive stopped creating the DJVU files from his scans, the international community has struggled solving the issue of creating automatically a DJVU for uploading on Commons and then Wikisource.
This has created a situation where libraries love Internet Archive, want to use it, but then get stuck because they don't know how to create a DJVU for Wikisource, and the IA-Upload is bugged and fails often.
    • IA-Upload tool is bugged and fails often when creating DJVU files.
    • M&S doesn't work with PDF files.
    • Users do not expect to upload to Commons when transferring files from Internet Archive to Wikisource.
    • Upload to Internet Archive is an important feature expecially for GLAMs (ie. libraries).
  • Who would benefit:
    • all Wikisource communities, especially new users
    • new GLAMs (libraries and archives) who at the moment have an hard time coping with the Wiki ecosystem.
  • Proposed solution:
Improve the IA-Upload tool: https://tools.wmflabs.org/ia-upload/commons/init
The tool should be able to create good-quality DJVU from Archive files, and do not fail as often as it does now.
it should also hide, for the end-user, the uploading to Commons phase. The user should be able to upload a file on Internet Archive, and then use the ID of the file to directly create the Index page on Wikisource. We could have an "Advanced mode" that shows all the passages for experienced user, and a "Standard" one that makes things more simple.