Grants:IdeaLab/Djvu text layer editor

From Meta, a Wikimedia project coordination wiki
status: idea
Idea Lab
idea creator:
project contact:
Use some VisualEditor (VE) features to edit the DjVu text layer
created on: 12:10, 13 March 2014

Project idea

What is the problem you're trying to solve?

Wikisource makes heavy use of the OCR text layer, but it exploits only a small part of it (the bare text). The DjVu text layer contains much more information (words, lines, paragraphs, regions, columns, and the page coordinates of the text). Unfortunately, this information is more readily exported in a Lisp-like format or as XML than as hOCR.
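To make the difference between the bare text and the full text layer concrete, here is a minimal Python sketch. It assumes that the XML export nests HIDDENTEXT, PAGECOLUMN, REGION, PARAGRAPH, LINE and WORD elements and that each WORD carries a comma-separated coords attribute; the sample page and the function names are invented for illustration and should be checked against real djvutoxml output.

```python
import xml.etree.ElementTree as ET

# Hypothetical, hand-written sample imitating the hidden text of one page.
# Element names and the "coords" attribute are assumptions (see the lead-in).
SAMPLE_PAGE_XML = """
<HIDDENTEXT>
  <PAGECOLUMN>
    <REGION>
      <PARAGRAPH>
        <LINE>
          <WORD coords="100,220,180,200">Djvu</WORD>
          <WORD coords="190,220,250,200">text</WORD>
        </LINE>
        <LINE>
          <WORD coords="100,250,170,230">layer</WORD>
        </LINE>
      </PARAGRAPH>
    </REGION>
  </PAGECOLUMN>
</HIDDENTEXT>
"""


def words_with_coords(xml_text):
    """Return (line_number, word, coords) for every WORD element."""
    root = ET.fromstring(xml_text.strip())
    result = []
    for line_number, line in enumerate(root.iter("LINE"), start=1):
        for word in line.iter("WORD"):
            # Keep the numbers as given; their order is not interpreted here.
            coords = tuple(
                int(v) for v in word.get("coords", "").split(",") if v
            )
            result.append((line_number, word.text or "", coords))
    return result


def plain_text(xml_text):
    """The bare text: words joined by spaces, one output line per LINE."""
    root = ET.fromstring(xml_text.strip())
    return "\n".join(
        " ".join(w.text or "" for w in line.iter("WORD"))
        for line in root.iter("LINE")
    )


if __name__ == "__main__":
    for entry in words_with_coords(SAMPLE_PAGE_XML):
        print(entry)
    print(plain_text(SAMPLE_PAGE_XML))
```

The second function returns what Wikisource uses today; the first shows the per-word position data that is discarded along the way.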

What is your solution?

  • To test VE (VisualEditor) or other, simpler WYSIWYG HTML/XML editors for editing the text only, while keeping the information wrapped in the XML tags;
  • to test extraction, conversion, and upload of the text layer from and into DjVu files using a simple web interface.
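The first point, editing the text while keeping the surrounding markup, can be sketched as follows. This is an illustration only, not a description of how VE or any other editor works: it assumes the words live in WORD elements with a coords attribute, and the function name replace_word_texts is hypothetical.

```python
import xml.etree.ElementTree as ET


def replace_word_texts(xml_text, new_words):
    """Replace the text of WORD elements in document order.

    Tags and attributes (including coords) are left untouched, so the
    corrected words keep their original positions on the page.
    """
    root = ET.fromstring(xml_text.strip())
    words = list(root.iter("WORD"))
    if len(words) != len(new_words):
        # A changed word count would break the word-to-coordinates mapping.
        raise ValueError("word count changed; coordinates would not match")
    for element, text in zip(words, new_words):
        element.text = text
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # Hypothetical line with one OCR error ("tcxt" instead of "text").
    source = (
        '<LINE><WORD coords="1,2,3,4">tcxt</WORD>'
        '<WORD coords="5,6,7,8">layer</WORD></LINE>'
    )
    print(replace_word_texts(source, ["text", "layer"]))
```

Refusing edits that change the number of words is a deliberate simplification; a real editor would need a rule for splitting or merging words and their boxes.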

Ideas for a test tool

A test could be done with existing tools:

  • DjVuLibre (running on Tool Labs), and in particular:
    • djvutoxml, which extracts the mapped text of DjVu pages as an XML file;
    • djvuxmlparser, which loads the modified mapped text back into the DjVu file;
  • tinyEditor, to edit the XML text through a comfortable WYSIWYG interface (XML tags are hidden; only the editable text is shown in an HTML textarea);
  • a little CGI on Tool Labs to manage such a web editing interface.
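The extract, edit, and upload cycle built from these tools might look like the Python sketch below. The command lines follow the usage shown in the DjVuLibre manual pages as far as I recall them (djvutoxml [options] inputdjvufile [outputxmlfile] and djvuxmlparser [-o djvufile] inputxmlfile); treat the flags as assumptions to verify against the installed version. The function names and the edit callback are hypothetical.

```python
import subprocess


def extract_command(djvu_path, xml_path, page):
    """Command that exports the text layer of one page as XML (assumed flags)."""
    return [
        "djvutoxml", "--page", str(page), "--with-text", djvu_path, xml_path
    ]


def upload_command(djvu_path, xml_path):
    """Command that loads the edited XML back into the DjVu file (assumed flags)."""
    return ["djvuxmlparser", "-o", djvu_path, xml_path]


def round_trip(djvu_path, xml_path, page, edit):
    """Extract one page, let `edit` modify the XML file in place, upload it.

    `edit` is a hypothetical callback taking the XML file path; in the web
    tool it would be replaced by the editing session in the browser.
    """
    subprocess.run(extract_command(djvu_path, xml_path, page), check=True)
    edit(xml_path)
    subprocess.run(upload_command(djvu_path, xml_path), check=True)


if __name__ == "__main__":
    # Only prints the commands; nothing is executed here.
    print(extract_command("book.djvu", "page5.xml", 5))
    print(upload_command("book.djvu", "page5.xml"))
```

Building the argument lists separately from running them keeps the commands easy to inspect and avoids passing file names through a shell.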

Project goals

  • To split proofreading into two steps:
    • DjVu text editing (saving the result into the DjVu text layer);
    • text formatting.

Get involved

Welcome, brainstormers! Your feedback on this idea is welcome. Please click the "discussion" link at the top of the page to start the conversation and share your thoughts.

See also
