Talk:Wikisource Community User Group/Wikisource Conference 2015/Program

From Meta, a Wikimedia project coordination wiki

Program proposals

I suggest that the program include a one-hour panel; you just need to choose the topic. For example, at WikiArabia the topic was "Wikipedia in the era of social networking". --Touzrimounir (talk) 13:59, 13 August 2015 (UTC)

May I suggest including a reflection on mass uploads of source documents by GLAMs? This would cover digitization and the automatic generation of documents through OCR techniques. --DerekvG (talk) 18:05, 5 September 2015 (UTC)

Wikisource: Identity, Purpose, Added Value

I would like to encourage discussion of Wikisource's identity, especially in terms of its specific purpose and added value vis-à-vis other (often larger) existing digital libraries such as Project Gutenberg, the Internet Archive, various national digitization efforts, etc.

It seems to me the question has repeatedly come up over the years, but never received a compelling and shared answer. This lack, it seems to me, makes it hard to see the value of investing specifically in Wikisource (technologically, for instance). To be clear, I would very much like to see such investment in Wikisource. I think discussing these questions and coming up with good answers would go a long way toward making the case for more attention to and more investment in Wikisource.

Some proposed questions for debate:

  1. What is the purpose of Wikisource? (note: beyond what seems like the original purpose of "be a place that isn't Wikipedia for Wikipedians to stick primary sources in")
  2. What is the added value of Wikisource over other digital libraries, in general?
  3. Is there added value to developing a Wikisource edition of a text already freely available in a stable non-commercial digital library (e.g. Project Gutenberg, Internet Archive)? If so, what is it?
  4. What would it take to make Wikisource stand out even more from these other digital libraries?
  5. Should Wikisource develop in the direction of citizen archivism and processing of archival (as distinct from library) materials?

I hope some significant time can be made to discuss these topics. I will attend the conference. Ijon (talk) 21:45, 24 September 2015 (UTC)

Thank you, Ijon. I think the proposal is perfectly in line with the Saturday track: discussing Wikisource's identity, writing a "manifesto", and laying the foundations for a future movement-wide conversation about Wikisource are the core of the conference. --Aubrey (talk) 08:43, 25 September 2015 (UTC)

Blind people and Wikisource

Dear all, this year Wikimedia CH has a project in its annual plan to check the compatibility of Wikisource with the tools used by blind people (generally DAISY tools) and by people with vision loss. The idea came up after a talk by an engineer with vision loss, who presented several tools and showed how blind people work and what kinds of barriers they encounter on websites (CAPTCHA is one example of a blocking barrier). We have therefore started working with Swiss associations for blind people to analyze the gap that separates Wikisource from being a "tested" digital library for them. The main suggestions so far are to have "well formatted" texts that follow specific guidelines, and not to integrate any assistive tool into the web pages (users already have their own). It would be good to have feedback from the Wikisource community, both to gauge interest in this project and to collect suggestions. --Ilario (talk) 14:04, 12 October 2015 (UTC)

Indic language OCR

Hi. Until now, the Indic-language Wikisource projects have depended mainly on fully manual proofreading, which costs a great deal of both time and energy. I am sorry to say that editors burn out easily and leave the projects early, which is why progress across the Indic-language Wikisources has not been encouraging. As you all know, Google recently released OCR software for more than 20 Indic languages. Although it has many limitations, it is far better and more accurate than previous OCR tools, and it has given new hope to the Indic-language community. I would like to ask the organizers of this conference to bring together OCR developers or other technical people who can help put Google's OCR to use in the Indic-language Wikisource projects. Thanks. -- Bodhisattwa (talk) 20:39, 28 September 2015 (UTC)

It would be good to get a better grasp of what exactly is needed. Presumably, it is already possible for one person to spend an hour feeding pages to Google's OCR and generate proofreading work for a few days of volunteer proofreading. That sounds like enough opportunity to engage a few more contributors and develop the Indic Wikisource projects.
Are you asking for some automated solution that would produce thousands of OCRed pages? That may well be a worthwhile ability to develop, but perhaps isn't an urgent need given the current community size.
(Ideally, we could get the Internet Archive to benefit from Google's OCR technology, and then re-use existing Wikisource pipelines to batch-feed texts to IA and get OCRed text to proofread. I don't know how easy or hard that would be, yet.) Ijon (talk) 23:13, 28 September 2015 (UTC)
Hi Asaf, thanks for your reply. Uploading the same large file twice (once to Google's OCR and once to Commons) is not an easy solution for most contributors, as Internet connections in India are very slow. What I suggest is this: would it be possible to develop a tool that feeds PDF or DjVu files already uploaded to Commons directly to Google's OCR, so that uploading them twice can be avoided? (Something like the Flickr2commons or Geograph2commons tools, but with some more development.) -- Bodhisattwa (talk) 12:32, 29 September 2015 (UTC)
Thanks, that's a clear description. It can certainly be done!
By the way, see this other tool (thanks, Ravi!), which is also related to automating Google's OCR. Ijon (talk) 05:38, 30 September 2015 (UTC)
Thanks, Asaf and Ravi, for the link to this tool. Certainly a step forward. -- Bodhisattwa (talk) 19:09, 1 October 2015 (UTC)

How to recruit new editors

It could be interesting to talk about initiatives undertaken in this area. In France, we have tried several projects: a partnership with the National Library, partnerships with national and local archives, a presentation of Wikisource at the Paris Book Fair, a monthly scanning party in Paris... I would be interested in hearing success stories (and failures) from other countries. Pyb (talk) 17:08, 30 September 2015 (UTC)

Community tech listening/discussion session

I'm one of the developers on the WMF Community Tech team. We're interested in hearing about the technical needs of Wikisource and other non-Wikipedia projects. I'd like to attend (or facilitate, whichever makes more sense) a discussion on the current technical status of Wikisource and on what the community needs and wants to be able to do its work more smoothly. --Fhocutt (WMF) (talk) 22:23, 6 October 2015 (UTC)