Grants:IdeaLab/Djvu text layer editor

status: idea


project:

please add a title

idea creator:

Alex brollo

project contact:

alex.brollogmail.com

participants:

User:Alex brollo

summary:

Use some VisualEditor (VE) features to edit the DjVu text layer

created on: 12:10, 13 March 2014

Project idea

What is the problem you're trying to solve?

Wikisource makes heavy use of the OCR text layer, but it effectively uses only a small part of it (the plain text). The DjVu text layer contains much more information (words, lines, paragraphs, regions, columns, and page text coordinates); unfortunately, this information is more easily exported in a Lisp-like format or as XML than as hOCR.
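To make the contrast concrete, here is a minimal sketch of the difference between the plain text and the mapped text. The sample XML is hand-written for illustration; its element names (HIDDENTEXT, PAGECOLUMN, REGION, PARAGRAPH, LINE, WORD) and the coords attribute are assumptions modelled on the XML export of DjVuLibre, and the function names are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hand-written sample imitating the nesting described above
# (column > region > paragraph > line > word). Element and attribute
# names are assumptions modelled on DjVuLibre's XML export.
SAMPLE = """<HIDDENTEXT>
 <PAGECOLUMN><REGION><PARAGRAPH>
  <LINE>
   <WORD coords="100,220,180,200">Nel</WORD>
   <WORD coords="190,220,300,200">mezzo</WORD>
  </LINE>
  <LINE>
   <WORD coords="100,250,160,230">del</WORD>
   <WORD coords="170,250,290,230">cammin</WORD>
  </LINE>
 </PARAGRAPH></REGION></PAGECOLUMN>
</HIDDENTEXT>"""


def plain_text(root):
    """The part used today: words joined line by line, no geometry."""
    return "\n".join(
        " ".join(word.text or "" for word in line.iter("WORD"))
        for line in root.iter("LINE")
    )


def words_with_coords(root):
    """The part left unused: every word together with its four coordinates."""
    return [
        (word.text, tuple(int(n) for n in word.get("coords").split(",")))
        for word in root.iter("WORD")
    ]


root = ET.fromstring(SAMPLE)
print(plain_text(root))
print(words_with_coords(root))
```

The first function throws away everything except the characters; the second keeps the position of each word on the page, which is the information a text layer editor would need to preserve.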

What is your solution?

To test VE or other, simpler WYSIWYG HTML/XML editors for editing the text only, while saving the information wrapped in XML tags;
to test the extraction of the text layer from DjVu files and its upload back into them, using a simple web interface.
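The "text only" idea can be sketched as follows: the editor changes the characters inside each word element and never touches the tags or their attributes. This is only an illustration under stated assumptions; the WORD element, the coords attribute, and the function name correct_words are hypothetical stand-ins, not the behaviour of VE or of any existing tool.

```python
import xml.etree.ElementTree as ET


def correct_words(xml_text, corrections):
    """Apply {word_index: new_text} to WORD elements.

    Only the text of each word changes; tags, attributes and the
    whitespace between elements are written back unchanged.
    """
    root = ET.fromstring(xml_text)
    words = list(root.iter("WORD"))
    for index, new_text in corrections.items():
        words[index].text = new_text
    return ET.tostring(root, encoding="unicode")


# Hypothetical one-line fragment with an OCR error ("Tbe" for "The").
page = (
    '<LINE><WORD coords="10,40,60,20">Tbe</WORD> '
    '<WORD coords="70,40,130,20">quick</WORD></LINE>'
)
fixed = correct_words(page, {0: "The"})
print(fixed)
```

Because the coordinates travel with the word, a corrected word stays mapped to the same place on the scanned page.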

Ideas for a test tool

A test could be done with existing tools:

DjVuLibre (running on Tool Labs), and in particular:
- djvutoxml, which extracts the internal mapped text of DjVu pages as an XML file;
- djvuxmlparser, which loads the modified mapped text back into the DjVu file;
tinyEditor, to edit the XML text through a comfortable WYSIWYG interface (XML tags are hidden and only the editable text is shown, inside any HTML textarea);
a little bit of CGI on Tool Labs to manage such a web editing interface.
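The extraction and upload steps listed above could be driven from a small script along these lines. It is a sketch, not a tested tool: the helper names and file names are hypothetical, and the command-line options (--page for djvutoxml, -o for djvuxmlparser) are given as I understand them from the DjVuLibre manual pages, so they should be checked against the installed version.

```python
import shutil
import subprocess


def extract_cmd(djvu_path, xml_path, page=None):
    """Command that dumps the mapped text of a DjVu file to XML."""
    cmd = ["djvutoxml"]
    if page is not None:
        # Assumption: --page restricts the dump to the given page(s).
        cmd += ["--page", str(page)]
    return cmd + [djvu_path, xml_path]


def upload_cmd(djvu_path, xml_path):
    """Command that loads an edited XML file back into the DjVu file."""
    # Assumption: -o names the DjVu file to be modified.
    return ["djvuxmlparser", "-o", djvu_path, xml_path]


def tools_available():
    """True only if both DjVuLibre programs are on the PATH."""
    return all(shutil.which(tool) for tool in ("djvutoxml", "djvuxmlparser"))


def round_trip(djvu_path, xml_path, edit, page=None):
    """Extract, let the caller-supplied edit(xml_path) change the file, upload."""
    if not tools_available():
        raise RuntimeError("DjVuLibre tools not found on PATH")
    subprocess.run(extract_cmd(djvu_path, xml_path, page), check=True)
    edit(xml_path)
    subprocess.run(upload_cmd(djvu_path, xml_path), check=True)


if __name__ == "__main__":
    # Only print the commands; nothing is executed here.
    print(" ".join(extract_cmd("book.djvu", "page3.xml", page=3)))
    print(" ".join(upload_cmd("book.djvu", "page3.xml")))
```

In a web setting, the edit callback would be replaced by the editing interface: the CGI script would serve the extracted XML to the editor and run the upload step when the user saves.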

Project goals

To split proofreading into two steps:
- DjVu text editing (saving the result into the DjVu text layer)
- text formatting

Get involved

Welcome, brainstormers! Your feedback on this idea is welcome. Please click the "discussion" link at the top of the page to start the conversation and share your thoughts.