Talk:Wikisource across projects

From Meta, a Wikimedia project coordination wiki

Managing the book

Hey, great proposal! I'm really excited about these projects, and I think linking them together from the start is wise. Looking through the proposal, I saw a few components of my project that I hadn't included in the proposal or current plan, and just wanted to comment on them.

  • The main page of the book should also represent the metadata as schema.org/book styled html.
Sounds good! I think building in support for the schema.org style is a great idea, and it should be relatively simple to add the HTML parameters to the generated HTML.
Great!--Micru (talk) 14:05, 25 June 2013 (UTC)
  • ...plus specifying a label in case there are section transclusions present on the group of pages. Ideally the range of pages should be queried to find out which labels users have applied during the proofreading, so that the one(s) that apply can be selected.
Can you elaborate a little bit on what you mean by this? I understand that section transclusions are an issue we'll need to support, but I don't really know what you mean by "label", or what we would do with this marking.
For instance this book section has "<pages index="Dictionary of National Biography. Sup. Vol II (1901).djvu" from=351 to=353 fromsection="Grant, James Augustus" tosection="Grant, James Augustus" />". The idea is that when the user enters a page range (in this case from=351 to=353), these pages are scanned for sub-section labels and the user can select which one to apply. In this case on page 351 there are the section labels ## "Grant, Albert" ## and ## "Grant, James Augustus" ##. On page 352 there are no section labels, and on page 353 there are the section labels ## "Grant, James Augustus" ## and ## "Grant, John Peter (1807-1893)" ##. So after entering the range from=351 to=353, the user would have a choice between "Grant, Albert", "Grant, James Augustus", and "Grant, John Peter (1807-1893)". I think it might be more a job for the proofread extension to offer a method that returns the section labels used in a page. Maybe you can leave it as an "if time permits".--Micru (talk) 14:05, 25 June 2013 (UTC)
  • generate automatically the sections from the labels that the user has added to each page
I've already marked this as "future". It's definitely a good idea, but I think it's out of scope of the current plan.
Ok!
  • View options
I've marked all of the view options aside from "standard view" as future, as these are also out of scope of my project.
Book2scroll is done, it can be linked.--Micru (talk) 14:05, 25 June 2013 (UTC)
  • Export and book options
These are in my "if time permits" deliverables. They are two things I very much want to deliver, and am very hopeful I'll have the time to complete them.
Sure, no problem. There are priorities and not enough time to do everything. If the basics work I will already be pleased :) --Micru (talk) 14:05, 25 June 2013 (UTC)
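As a rough illustration of the schema.org point at the top of this list, here is a minimal sketch of rendering book metadata as schema.org/Book microdata. The function name `book_microdata` and the flat property-to-text mapping are assumptions made for the example, not part of the proposal; a real implementation would probably emit nested items (for instance a Person for the author) rather than plain text values.

```python
from html import escape


def book_microdata(metadata: dict[str, str]) -> str:
    """Render a flat mapping of schema.org property names to text values
    as an HTML fragment marked up with schema.org/Book microdata.

    Hypothetical helper: both keys and values are HTML-escaped, and every
    property is emitted as a plain <span>.
    """
    rows = "\n".join(
        f'  <span itemprop="{escape(prop)}">{escape(value)}</span>'
        for prop, value in metadata.items()
    )
    return f'<div itemscope itemtype="https://schema.org/Book">\n{rows}\n</div>'


if __name__ == "__main__":
    print(book_microdata({
        "name": "Dictionary of National Biography. Sup. Vol II",
        "datePublished": "1901",
        "inLanguage": "en",
    }))
```

The property names used here (`name`, `datePublished`, `inLanguage`) are standard schema.org properties; which ones the metadata form would actually expose is left open.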

Anyway, these aren't set in stone at all – if you think some of these things are absolutely pivotal to the success of the extension or the combination of projects, I can restructure my plans. GorillaWarfare talk 21:42, 24 June 2013 (UTC)

The only change that seems more important is to have the metadata form generated dynamically from a template. Nazmul is working on that, so maybe you can reuse his work. Thanks! :) --Micru (talk) 14:05, 25 June 2013 (UTC)
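The label-selection idea from the section-transclusion item above (enter a page range, get back the section labels found on those pages) can be sketched roughly as follows. This assumes the page texts are already available as a mapping from page number to wikitext; the function name `labels_in_range` and that input shape are hypothetical, and how the texts would actually be fetched is left out. It recognises both the `## label ##` shorthand quoted above and `<section begin="label" />` tags.

```python
import re

# "## label ##" on its own line, with or without surrounding quotes.
HASH_LABEL = re.compile(r'^##\s*"?(.+?)"?\s*##\s*$', re.MULTILINE)
# <section begin="label" /> tags, assumed to be the stored form of the same marker.
TAG_LABEL = re.compile(r'<section\s+begin\s*=\s*"([^"]+)"\s*/?>', re.IGNORECASE)


def labels_in_range(pages: dict[int, str], start: int, end: int) -> list[str]:
    """Return the distinct section labels found on pages start..end (inclusive),
    in order of first appearance. Pages missing from the mapping are skipped."""
    seen: list[str] = []
    for number in range(start, end + 1):
        text = pages.get(number, "")
        # Collect matches from both syntaxes, then order them by position on the page.
        found = [
            (match.start(), match.group(1).strip())
            for pattern in (HASH_LABEL, TAG_LABEL)
            for match in pattern.finditer(text)
        ]
        for _, label in sorted(found):
            if label not in seen:
                seen.append(label)
    return seen


if __name__ == "__main__":
    # Page texts invented for illustration; only the labels follow the example above.
    pages = {
        351: '## "Grant, Albert" ##\nfirst entry\n## "Grant, James Augustus" ##\nsecond entry',
        352: "continuation of the second entry",
        353: '## "Grant, James Augustus" ##\nend of entry\n## "Grant, John Peter (1807-1893)" ##\nnext entry',
    }
    for label in labels_in_range(pages, 351, 353):
        print(label)
```

With the sample pages, the user would be offered the same three choices listed in the discussion. Whether this belongs in the form itself or in the proofread extension is the open question raised above.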

Proofreading screen and spelling checker

I can't wait for Aarti WS VisualEditor, which would be the best tool to split proofreading from formatting, the former being focused on "naked text" only. Experienced users can manage more or less happily any horrible mixture of text and code, but such a mixture is very discouraging for inexperienced users and for users who don't like programming or programming languages at all. IMHO the "proofreading interface" should be completely different from the "formatting interface", just displaying the text, the source image and tools devoted to proofreading; nothing more than this.

Tool buttons/links should be fixed on the screen, while only the text and source image should scroll; a simple and customizable procedure to link any tool with shortcut keys should be implemented (I personally can't work any more without Chrome Shortcut Manager, but there's a serious sharing drawback: more and more often I'm using "personal unshared tools" :-(, and Shortcut Manager runs on Chrome only).

An absolutely needed tool is a WS spelling checker. Built-in spelling checkers are excellent for modern languages, but they are almost useless for ancient texts, where OCR mistakes are particularly heavy and frequent; OCR adds specific mistakes when fixing scan results by comparison with a modern language dictionary, since it sees ancient variants of words as "mistakes" (e.g. ancient variants of spelling or, in Italian, of accents: perchè (why) is wrong in modern Italian, the right spelling being perché, but is right in 19th century Italian; in 16th century Italian the right spelling is often simply perche). Dealing with difficult, old texts, the ideal dictionary should be "book-specific", i.e. it should be dynamically built from scratch while carefully proofreading a specific book or a specific "class" of books. I encourage any of you to think about such a tool. --Alex brollo (talk) 04:41, 27 June 2013 (UTC)

Regarding this I would recommend taking a look at LanguageTool. Maybe it can also be configured to use an old-language dictionary, and a VE plugin could be created to use it.--Micru (talk) 15:38, 24 July 2013 (UTC)
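To make the "book-specific dictionary" idea above a bit more concrete, here is a minimal sketch that assumes the word list is simply grown from pages that have already been proofread and is then used to flag unknown words on later pages. The class name `BookDictionary` and its methods are invented for the example; this is not how LanguageTool or any existing extension works.

```python
import re
import unicodedata

# A word is a run of Unicode letters, optionally joined by apostrophes (l'amore).
WORD = re.compile(r"[^\W\d_]+(?:'[^\W\d_]+)*")


def _words(text: str) -> list[str]:
    # NFC normalisation so that accented letters are single code points.
    return WORD.findall(unicodedata.normalize("NFC", text))


class BookDictionary:
    """Hypothetical per-book word list built up while proofreading."""

    def __init__(self) -> None:
        self.accepted: set[str] = set()

    def learn(self, validated_text: str) -> None:
        """Accept every word of a page that a proofreader has validated."""
        self.accepted.update(word.casefold() for word in _words(validated_text))

    def suspects(self, text: str) -> list[str]:
        """Return words not yet accepted for this book, in order of appearance."""
        unknown: list[str] = []
        for word in _words(text):
            if word.casefold() not in self.accepted and word not in unknown:
                unknown.append(word)
        return unknown


if __name__ == "__main__":
    book = BookDictionary()
    book.learn("perchè la vita")  # spelling validated for this 19th century book
    print(book.suspects("perché la vlta perchè"))
```

In this sketch the modern spelling perché is flagged because only perchè has been validated for this book, which is the reverse of what a modern dictionary would do. A real tool would also need a way to share the list across a "class" of books and to undo wrongly learned words.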