CIS-A2K/IRC meeting 2018-02-25

From Meta, a Wikimedia project coordination wiki

CIS-A2K (Centre for Internet and Society - Access to Knowledge) is a campaign to promote the fundamental principles of justice, freedom, and economic development. It deals with issues like copyrights, patents and trademarks, which are an important part of the digital landscape.
If you have a general proposal/suggestion for Access to Knowledge team you can write on the discussion page. If you have appreciations or feedback on our work, please share it on feedback page.


[19:14] == Ananth has joined #cis-a2k
[19:16] <Ananth> Hello
[19:16] == enwnbot has joined #cis-a2k
[19:16] <enwnbot> No chat_id set! Add me to a Telegram group and say hi so I can find your group's chat_id!
[19:16] <enwnbot> <acagastya> @enwnbot ping
[19:17] == acagastya [~acagastya@wikinews/acagastya] has joined #cis-a2k
[19:17] <Ananth> IRC starts in 40mins
[19:18] <acagastya> Who is the chan op?
[19:19] == tuxnani has joined #cis-a2k
[19:23] == Anoop-Rao [~Anoop@wikia/vstf/Minato826] has quit [Quit: 00]
[19:25] == Anoop-Rao [~Anoop@wikia/vstf/Minato826] has joined #cis-a2k
[19:26] <enwnbot> Saileshpat was removed by: Saileshpat
[19:33] == Titodutta has joined #cis-a2k
[19:33] == mode/#cis-a2k [+o Titodutta] by ChanServ
[19:34] == Titodutta changed the topic of #cis-a2k to: Wikisource discussion
[19:41] == Titodutta has quit [Ping timeout: 260 seconds]
[19:41] == Anjali has joined #cis-a2k
[19:42] == Raghu has joined #cis-a2k
[19:43] == Anjali has quit [Client Quit]
[19:46] == axagastya [uid243544@wikinews/acagastya] has joined #cis-a2k
[19:46] == acagastya [~acagastya@wikinews/acagastya] has left #cis-a2k []
[19:47] == Gapu has joined #cis-a2k
[19:47] <Tulsi> Hi everyone ! :)
[19:47] <Anoop-Rao> hello
[19:47] <Gapu> Hello Tulsi
[19:47] == shrini has joined #cis-a2k
[19:48] == enwnbot has quit [Ping timeout: 240 seconds]
[19:48] <Tulsi> Hello Gapu ji
[19:48] == shrini has quit [Client Quit]
[19:49] == shrini [uid38773@gateway/web/irccloud.com/x-olzqajcftyqvoqvi] has joined #cis-a2k
[19:49] == enwnbot has joined #cis-a2k
[19:49] <tuxnani> Hi Everyone. We will start at 8 pm IST. In the mean time, we can have any other discussions that may not be about Wikisource.
[19:51] <tuxnani> I am Rahmanuddin from Telugu Wiki Community. Sysop on Telugu Wikisource.
[19:51] <Gapu> Hello..
[19:51] <enwnbot> <AnanthSubray> I am ananth from Kannada wiki community and I work with CIS-A2K
[19:52] <Anoop-Rao> I'm Anoop  from kannada wiki community
[19:53] <Gapu> I'm Sangram from Odia Wikipedia, working for both Wikipedia and Wikisource.
[19:54] == Gram has joined #cis-a2k
[19:55] == Satpal has joined #cis-a2k
[19:55] == Subas has joined #cis-a2k
[19:55] <tuxnani> Ananth, why are you chatting through the bot? Don't you have an IRC account?
[19:56] == Gurlal has joined #cis-a2k
[19:58] <enwnbot> <lahariya> Hello,
[19:58] <Gurlal> Hi
[19:58] == Satdeep has joined #cis-a2k
[19:59] == Manav has joined #cis-a2k
[19:59] == Ravidreams has joined #cis-a2k
[19:59] <Manav> Hi everyone :)
[19:59] == Titodutta [dfe36465@gateway/web/freenode/ip.223.227.100.101] has joined #cis-a2k
[20:00] == mode/#cis-a2k [+o Titodutta] by ChanServ
[20:00] <tuxnani> Thanks Titodutta for setting up the topic!
[20:00] <shrini> hello all
[20:01] <shrini> happy to meet you all here
[20:01] <Satdeep> hello everyone
[20:01] <@Titodutta> Thanks and hello.
[20:01] == Aliva has joined #cis-a2k
[20:02] == KCVelaga has joined #cis-a2k
[20:02] <Aliva> Hello
[20:02] == Sudhanwa has joined #cis-a2k
[20:02] == Vinay has joined #cis-a2k
[20:02] <Vinay> Hello all
[20:02] <tuxnani> I am using mobile client and hence may be on and off.
[20:02] <enwnbot> <Pavan89> Hi everyone, … I'm pavan santhosh
[20:02] == sangappa has joined #cis-a2k
[20:03] <Aliva> Now am using mobile
[20:03] == Krishna has joined #cis-a2k
[20:03] <Aliva> All are here?
[20:03] <sangappa> hello every one
[20:03] <@Titodutta> Yes we are here
[20:03] <Krishna> Hi
[20:04] == Gram has quit [Ping timeout: 260 seconds]
[20:04] <Raghu> Hello all
[20:04] <Sudhanwa> I am also on mobile. May get disconnected frequently.
[20:04] <@Titodutta> Me too (on mobile) :)
[20:04] <Aliva> Then start
[20:04] <tuxnani> We can start with short introductions
[20:04] <@Titodutta> Yes
[20:04] == Bodhisattwa has joined #cis-a2k
[20:05] <Aliva> Actually am not in home . So I will chat some where
[20:05] <tuxnani> I am Rahmanuddin from Telugu Wiki Community.
[20:05] <Gapu> I'm Sangram from Odia Wiki community.
[20:05] <Anoop-Rao> Anoop from kannada wiki community
[20:05] <Manav> Manav from Punjabi Wiki UG
[20:05] <Vinay> Vinay from Kannada wiki community.
[20:06] <shrini> I am Shrini from Tamil Community
[20:06] <@Titodutta> I am Tito. English Wikipedian
[20:06] <Subas> Subas, Odia community
[20:06] == mallikarjuna has joined #cis-a2k
[20:06] <Satpal> Satpal from Punjabi wiki
[20:06] <mallikarjuna> Hello from Mallikarjunasj
[20:06] <sangappa> sangappadyamani from kannada wiki community
[20:06] == Gram has joined #cis-a2k
[20:06] <Aliva> Hello am Aliva From odia wiki
[20:06] <Satdeep> Satdeep from Punjabi UG
[20:06] <Gurlal> Gurlal from Punjabi Wiki Community
[20:06] <Ravidreams> Hi, I am Ravi :)
[20:06] <KCVelaga> I am Krishna Chaitanya Velaga, from English Wikipedia
[20:06] == Sush0809 has joined #cis-a2k
[20:06] <Satdeep> haha @Ravi
[20:07] <tuxnani>
[20:07] == vishwa has joined #cis-a2k
[20:07] <Tulsi> Hi ! This is Tulsi from Nepal.
[20:07] <Manav> Ravi :)
[20:08] <Gram> I am Guntupalli Rameswaram from telugu Wikisource
[20:08] <Tulsi> Hi Manav :)
[20:08] <Manav> hi Tulsi :)
[20:08] <enwnbot> <Pavan89>
[20:08] <Sush0809> Hey ! I'm Sushma from Hindi Wikipedia
[20:08] <Ravidreams> :)
[20:09] <Sush0809> Hi tulsi, guntupati :)
[20:09] <tuxnani> I would request Pavan Santhosh in disguise of @enwnbot to give insights into Telugu Wikisource progress.
[20:09] == Rajwinder has joined #cis-a2k
[20:10] <Aliva> Yes start now
[20:10] <Tulsi> @ Sush0809: Hi Sushma ji,
[20:10] == balajj has joined #cis-a2k
[20:10] <balajj> hi
[20:11] <Aliva> What is our main topic
[20:11] <Anoop-Rao> WikiSource
[20:11] == Gram_ has joined #cis-a2k
[20:12] <KCVelaga> @Ananth What is the agenda?
[20:12] <Aliva> Yes I know . But what the topic on Wikisource
[20:12] <tuxnani> 1. Self Introduction
[20:12] <tuxnani> 2. Few stats on Indic wiki source project ( share relevant links )
[20:12] <tuxnani> 3. Major issues we face
[20:12] <tuxnani> 4. Ideas to overcome
[20:12] <tuxnani> 5. Common tools used
[20:12] <tuxnani> 6. Tools we need
[20:12] <tuxnani> 7. Potential organizations to collaborate
[20:12] <tuxnani> 8. Conducting Mini-one day events
[20:12] <Ananth> 1. Self Introduction 2. Few stats on Indic wiki source project ( share relevant links ) 3. Major issues we face 4. Ideas to overcome 5. Common tools used 6. Tools we need 7. Potential organizations to collaborate 8. Conducting Mini-one day events
[20:12] <Subas> Shoe stats
[20:12] == Gram has quit [Ping timeout: 260 seconds]
[20:12] == Vinay has quit [Ping timeout: 260 seconds]
[20:12] <Subas> Show
[20:13] == jayanta has joined #cis-a2k
[20:13] == vishwa has quit []
[20:13] <Gram_> Hello
[20:13] == Vinay has joined #cis-a2k
[20:13] <Vinay> Lost network.
[20:14] == Rajwinder has quit [Ping timeout: 260 seconds]
[20:14] == KCVelaga has quit []
[20:15] <Manav> Ananth plz start
[20:15] == KCVelaga has joined #cis-a2k
[20:16] <Satdeep> Jayanta Da can share stats of various Wikisource projects
[20:16] <balajj> yes.. Ananth pls initiate the conversation
[20:16] <Ananth> https://tools.wmflabs.org/phetools/statistics.php?diff=0
[20:17] <Ananth> the above link will help you understand growth of Wikisource communities
[20:17] <enwnbot> kc_velaga was added by: acagastya
[20:17] <Ananth> and the work
[20:17] <Aliva> Oho .
[20:17] <mallikarjuna> Proposal: This report, if sortable, would be helpful.
[20:18] == info-farmer has joined #cis-a2k
[20:18] == Sush0809 has quit [Ping timeout: 260 seconds]
[20:18] == Yann_ [~Yann@47.247.188.121] has joined #cis-a2k
[20:19] <Ananth> I request one from each community to share their work related to Wikisource.
[20:19] == info-farmer has quit [Client Quit]
[20:19] <jayanta> https://wikisource.org/wiki/Wikisource:Indic_Wikisource_Stats
[20:19] == smjalageri has joined #cis-a2k
[20:19] <Gapu> In these days, we have only a few active members for Odia Wikisource.
[20:19] <smjalageri> namaste  - Siddappa, kannada wikisource
[20:19] == info-farmer has joined #cis-a2k
[20:20] <Gapu> But, we are trying to give more time to wikisource for proof reading.
[20:20] <jayanta> https://wikisource.org/wiki/Wikisource:Indic_Wikisource_Stats this page is better for understanding the Indic Wikisource stats
[20:20] <Gurlal> @ananth punjabi is not in the statistics
[20:20] <Gapu> Also we are inspiring people to work for Wikisource, because at Wikisource they could learn more and the articles could be edited easily.
[20:20] <jayanta> paws will be added soon
[20:20] <Satdeep> Thank you for sharing the stats @Jayanta
[20:21] == yannf [~Yann@2405:204:e40e:d441:7504:e463:9f4a:4f85] has quit [Ping timeout: 245 seconds]
[20:21] <sangappa> namaskara @smjalageri
[20:21] <Satdeep> Thanks Jayanta da
[20:21] <balajj> indic stat is useful
[20:22] <Gram_> Useful stats. More readable books inTelugu
[20:22] == Krishna has quit [Ping timeout: 260 seconds]
[20:22] <Gapu> And if the OCR could work properly for our scripts, then we could add more books. We need more time for proofreading than for typing them.
[20:23] == Yann_ has quit [Changing host]
[20:23] == Yann_ has joined #cis-a2k
[20:23] <Satdeep> Same in Punjabi Gapu
[20:23] == Vinay has quit [Ping timeout: 260 seconds]
[20:23] == Yann_ has changed nick to yannf
[20:23] <jayanta> as of now our main issue is OCR script not working
[20:23] <Satdeep> So, since we have the stats
[20:23] <Satdeep> Let's move forward
[20:23] <Ananth> Some languages are missing in stats, I think we should add those also.
[20:23] <tuxnani> Its about activity
[20:23] <yannf> hi, sorry, I am late ;)
[20:23] <jayanta> which language??
[20:24] <Ananth> Punjabi
[20:24] <enwnbot> <Pavan89> Participation: Telugu Wikisource during Jan to June 2017 saw 3-4 editors contributing 100+ edits every month (very active editors) … And after a hike in mid 2017 to 14-15, it finally settled at 5 very active editors
[20:24] <tuxnani> More activity - stats
[20:24] <Subas> OCR is very defective.
[20:24] <tuxnani> No activity - no stats
[20:24] <Gapu> We could inspire people to read the books without any cost, at the time of proof reading. So more people could join.
[20:24] <yannf> I am currently at Gwalior
[20:25] <shrini> For few PDF files, the google OCR is not working
[20:25] <shrini> we may have to try tesseract 4
[20:25] <yannf> what about Hindi wikisource?
[20:25] <tuxnani> @Ravi can throw some light on OCR in Tamil
[20:25] <shrini> tesseract 4 it is still in development, but gives nice results for tamil
[20:25] <Aliva> In odia I also started 100 days
[20:26] <Satdeep> There is not much activity in Hindi currently
[20:26] <enwnbot> <Pavan89> Telugu Wikisource Workshop was held in July 2017, which aimed to help existing Wikimedians learn the complete book digitization process, and the event got 2 new Wikisource editors who have been contributing consistently since then.
[20:26] <jayanta> welcome Yann
[20:26] <Sudhanwa> Can we come to the point please. No point in discussing known issues. If there are solutions, please share.
[20:26] <balajj> ocr in tamil with recent fonts. bad with few of the old tamil scripts
[20:26] <enwnbot> <Pavan89> @Pavan89 [Telugu Wikisource Workshop was held in July 2017 which aimed to help existing Wi …], One of those new Wikisourcers, Ramesham is currently participating in this ITC
[20:27] <enwnbot> <Pavan89> *IRC
[20:28] <yannf> any idea about doing workshop in Hindi?
[20:28] <Gapu> Yes, in Odia, we are only two members who write every day as a 100wikidays challenge, and 2-3 members are also contributing regularly, but we need at least 10-15 members who could contribute daily.
[20:28] == KCVelaga has quit []
[20:28] <enwnbot> <Pavan89> @Sudhanwa, … Many Wikipedians who can potentially contribute to Wikisource can contribute if we can hold some workshop to them
[20:28] <yannf> that would be a good way to increase participation
[20:28] == tuxnani has quit [Read error: Connection reset by peer]
[20:28] <Aliva> Yes but try more
[20:29] <sangappa> number of visitors to Wikisource is low compared to Wikipedia
[20:29] <enwnbot> <Pavan89> @Pavan89 [@Sudhanwa, … Many Wikipedians who can potentially contribute to Wikisource can con …], As Wikipedians already care for free knowledge, Wikisource Workshop is a little push for them to contribute to project
[20:29] <enwnbot> <kc_velaga> Yes, Creating an editor base for Wikisource is important
[20:29] <@Titodutta> Yes. That will help
[20:30] <Gapu> The number of visitors would increase, if we could get some new books for Wikisource.
[20:30] <balajj> wsexport is a wonderful tool for users to download books from Wikisource in a variety of formats such as pdf, epub, text, mobi etc. This makes reading easy on many devices. But only bn, ta and te have integrated the wsexport tool
[20:30] <Ananth> So can we have a small event in the upcoming days in all the languages?
[20:30] <Gram_> But in our case we first started to work in wikisource
[20:31] <Gapu> Only a few people are interested on these old books.
[20:31] <balajj> pls other language also integrate wsexport tool..
[20:31] <Manav> we can have one school level event
[20:31] <balajj> if any help required i can do..
[20:31] <Ananth> @balajj thank you for sure we will use it
[20:31] <Manav> and can involve students in this project
[20:31] <enwnbot> <Pavan89> @balajj [<balajj> if any help required i can do..], Thank you
[20:31] <Aliva> Can we start a workshop
[20:31] <Gram_> Old people are in general interested in old books
[20:32] <Gapu> Yes, I agree with Manav mam.. we could inspire school children, but first we have to convince their parents.
[20:32] <Manav> no... we need to talk to the school management
[20:32] <jayanta> for any technical help with tool integration, I am ready to help
[20:33] <balajj> https://tools.wmflabs.org/wsexport/tool/stat.php
[20:33] == Krishna has joined #cis-a2k
[20:33] <Gram_> What is ideal age group for students
[20:33] <Ananth> CIS-A2K can help you work with school etc
[20:33] <Aliva> We can start in library.  Start with oldage people
[20:33] <yannf> I may be able to organise a workshop in my place, a new computer room is going to be installed soon
[20:33] <balajj> 1000s of books are downloaded through ws export tool in ta language
[20:34] == Bharath [2be08299@gateway/web/freenode/ip.43.224.130.153] has joined #cis-a2k
[20:34] == tuxnani has joined #cis-a2k
[20:34] <balajj> in old books for religious text lot of requirement from public
[20:34] <enwnbot> <Pavan89> And for, Wiki Education Program, Wikisource can be one of the best projects to start
[20:34] <enwnbot> <acagastya> Not all parents would allow — but yes, like what Manav said, if the school promotes it, probability of parents allowing their children would increase.
[20:35] <jayanta> wikisource will be the reference backbone of Wikipedia in future
[20:35] <Aliva> <Pavan89> Agree
[20:35] <Ananth> agreed
[20:35] <@Titodutta> Yes
[20:35] <Anoop-Rao> Instead of school start with pu colleges
[20:35] <Gram_> Yes
[20:35] <balajj> yes
[20:36] <Gurlal> yes
[20:36] <Gapu> The students from English medium might get the chance to know about the local languages easily, if they contribute for wikisource.
[20:36] <tuxnani> Can we organise these issues, one language after other.
[20:36] <enwnbot> <Pavan89> One of our Wikisource contributers Murthy, Wrote little introductions to chapters of book he is working on, in fb and it can be good idea to get people there
[20:36] <Raghu> In christ degree students are working with wikisource
[20:36] <tuxnani> Can we list down if there are any issues for English Wikisource from English community?
[20:37] <enwnbot> <kc_velaga> May be @Yann can comment on this
[20:38] <@Titodutta> I don"t know about anything major (although I don't know much about English Wikisource,  so Tanned or someone might help)
[20:38] <tuxnani> yannf: any insights?
[20:38] <Subas> I think English community has no problems
[20:39] == Vidyu44 has joined #cis-a2k
[20:39] <@Titodutta> Yann* uh autocorrect
[20:39] == Satpal_ has joined #cis-a2k
[20:39] == Ramesam has joined #cis-a2k
[20:39] <tuxnani> Can we move onto Hindi wikisource then? Hindi wikisource is a very recent addition.
[20:40] <yannf> I have not edited en.ws recently, but no issue as far as I can remember
[20:40] == Satpal has quit [Ping timeout: 260 seconds]
[20:40] == Gram_ has quit [Ping timeout: 260 seconds]
[20:40] <tuxnani> yannf: Thanks
[20:40] <yannf> tuxnani, Hindi WS still doesn't exist independently
[20:41] <tuxnani> https://wikisource.org/wiki/Main_Page/%E0%A4%B9%E0%A4%BF%E0%A4%A8%E0%A5%8D%E0%A4%A6%E0%A5%80
[20:41] <yannf> that's the issue
[20:41] <enwnbot> <kc_velaga> One thing is clear: awareness of English Wikisource and the number of contributors to en Wikisource from India are very low. … Even though I am not an active contributor myself, whenever I browse around I can see a lot of Indian government and Indian records, and English books by Indian authors, that are yet to be proofread. … This again brings us back to the need to do more Wikisource workshops.
[20:42] == tuxnani_ has joined #cis-a2k
[20:42] <balajj> manual work is enormous..
[20:42] <balajj> we have to find ways to automate
[20:43] == Raghu has quit [Ping timeout: 260 seconds]
[20:43] <balajj> identify common ocr errors. built a public list. use bots to correct them
[20:44] == Sumita [6ad79442@gateway/web/freenode/ip.106.215.148.66] has joined #cis-a2k
[20:44] == Raghu has joined #cis-a2k
[20:44] <tuxnani_> balajj: that's a good idea, we implemented it in Telugu Wikisource while working on placing the Quran translation on Wikisource.
[20:44] == Aliva has quit [Ping timeout: 260 seconds]
[20:45] <balajj> commendable job
[20:45] == tuxnani has quit [Ping timeout: 260 seconds]
[20:45] <info-farmer> This small tool helps me a lot to do small edits instead of opening the full page
[20:45] <info-farmer> https://commons.wikimedia.org/wiki/File:Tutorial-tamil-firefox-addon-QuickWikiEditor-usage.webm
[20:46] <tuxnani_> Krishna: My request to you is to acquire works of Baba Saheb Ambedkar as they have been copyright lapsed now.
[20:46] <info-farmer> but it is to be updated
[20:46] <@Titodutta> Yes it is a fine addon
[20:46] <Sumita> Sorry I am late. can we get a small detail of important discussion
[20:46] <tuxnani_> find out what's copyright status of English translations
[20:46] <Bodhisattwa> Balaji, share the script to fix OCR typos
[20:46] <tuxnani_> That will give enough boost for Wikisource to be in media.
[20:47] == Krishna has quit [Ping timeout: 260 seconds]
[20:47] == Nirajan has joined #cis-a2k
[20:47] <enwnbot> <Pavan89>
[20:47] == Aliva has joined #cis-a2k
[20:47] <tuxnani_> Any issues from Bangla, specific?
[20:47] == Gram has joined #cis-a2k
[20:47] == Bharath [2be08299@gateway/web/freenode/ip.43.224.130.153] has quit [Ping timeout: 260 seconds]
[20:47] <Bodhisattwa> Balaji, Infofarmer, share the script to fix OCR typos
[20:47] <Sudhanwa> Awareness of copyright is very important. If schools are to be targeted for trainings, clear understanding of copyright must be given.
[20:48] <balajj> i am using find and replace option in autowikibrowser. so not using scripts
[20:48] == Subas has quit [Ping timeout: 260 seconds]
[20:48] <balajj> can be done with pywikibot too i guess
[20:48] <Sumita> Is there any scripts for OCR typos
[20:48] <Sudhanwa> There are instances where some books that are still under copyright are used.
[20:48] <shrini> https://github.com/tshrinivasan/tools-for-wiki/tree/master/fix_spellerrors_tawikisource
[20:48] == Ramesam has quit [Ping timeout: 260 seconds]
[20:49] <tuxnani_> Sumita: and others, we need to list down common OCR errors and their replacements and use AWB or Pywikibot to bulk replace.
[20:49] <Bodhisattwa> Thanks shrini
[20:49] <tuxnani_> Sudhanwa: Can you indicate?
[20:49] <shrini> here is the script to fix OCR errors from a list in a wikipage
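Note: the approach discussed above (keep a shared list of common OCR errors and apply it in bulk) can be sketched roughly as follows. This is only an illustration and not the script linked above. The sample list is hypothetical, and saving the corrected text back to the wiki (through AWB or Pywikibot, as suggested in the chat) is deliberately left out. Plain substring replacement is used instead of word-boundary regular expressions, on the assumption that regex word boundaries can behave unexpectedly around combining vowel signs in Indic scripts.

```python
# Hypothetical sample list: each key is a frequent OCR misreading,
# each value is its correction. A real list would be maintained
# per language by the community.
SAMPLE_FIXES = {
    "tbe": "the",
    "Wikisourcc": "Wikisource",
}


def apply_fixes(text, fixes):
    """Apply every known correction to text.

    Returns the corrected text and the number of replacements made.
    """
    total = 0
    # Longest patterns first, so a short pattern never pre-empts a longer one.
    for wrong in sorted(fixes, key=len, reverse=True):
        count = text.count(wrong)
        if count:
            text = text.replace(wrong, fixes[wrong])
            total += count
    return text, total


if __name__ == "__main__":
    fixed, changed = apply_fixes("tbe Wikisourcc page and tbe book", SAMPLE_FIXES)
    print(fixed, changed)
```

Returning the replacement count makes it easy to skip saving pages where nothing changed.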
[20:49] <Bodhisattwa> Thanks Shrini
[20:49] <info-farmer> Is it possible for CIS,  to update that tool ? or  where i can get help?
[20:50] <Sudhanwa> And a good ocr for indic languages is a must for wikisource.
[20:50] <Bodhisattwa> Shrini, please solve the issue for empty text
[20:50] <Bodhisattwa> All work has been stopped
[20:51] <@Titodutta> We can of course help and connect with the people who might solve the issue. From mobile it is difficult to try all links. But yes we'll put our effort
[20:51] <tuxnani_> jayanta: any specific issues with Bangla Wikisource? Bodhisattwa
[20:51] <Bodhisattwa> Nothing, we are good
[20:51] == Aliva has quit [Ping timeout: 260 seconds]
[20:51] <enwnbot> <lahariya> @tuxnani_ [<tuxnani_> Krishna: My request to you is to acquire works of Baba Saheb Ambedkar …], MEA which holds copyright of Dr. B.R. Ambedkar's writing officially has not responded to out requests. Maharashtra Govt has outright refused to release it under cc licence. We can approach Navayana Publications...
[20:52] <tuxnani_> enwnbot: @Tanveer, a book thats in public domain already, why request to release in CC license?
[20:52] <Bodhisattwa> Infofarmer, if you are free, you can resume your bot work in bnws
[20:53] <tuxnani_> I can get this done.
[20:53] <info-farmer> You pls focus on technical issues on indic WS.
[20:53] == Aliva has joined #cis-a2k
[20:53] <Bodhisattwa> After the issue has been fixed
[20:53] <tuxnani_> Requesting for release of a public domain work into CC is making things complicated.
[20:53] <yannf> the stats are incomplete, if anyone can do coding, it would be great to fix the stats
[20:53] <tuxnani_> Can we hear from Punjabi Wikisource, any specific issues?
[20:54] <enwnbot> <acagastya> tuxnani, you do not need to prefix "enwnbot" in your mentions; however, if you are responding to someone whose message was sent via enwnbot, consider using "@".
[20:54] == Balajijagadesh has joined #cis-a2k
[20:54] == Gram has quit [Ping timeout: 260 seconds]
[20:54] <tuxnani_> @acagastya, that would not ping, would it?
[20:54] <yannf> only, bn, te, and sa graphs are shown
[20:55] <enwnbot> <acagastya> It will.
[20:55] <tuxnani_> yannf: check https://stats.wikimedia.org/wikisource/EN/Sitemap.htm
[20:55] <enwnbot> <acagastya> enwnbot is just a bridge bot.
[20:55] == Bodhisattwa_ has joined #cis-a2k
[20:56] <yannf> for proofreading, http://tools.wmflabs.org/phetools/statistics.php is much more interesting
[20:56] <Gurlal> OCR is major issue of punjabi wikisource
[20:56] <Balajijagadesh> @yannf the stat link is not updated after nov 2017
[20:56] == dp has joined #cis-a2k
[20:56] <@Titodutta> Is it not working properly,  or explain the issue Gurlal
[20:57] == Sudhanwa has quit [Ping timeout: 260 seconds]
[20:57] == balajj has quit [Ping timeout: 260 seconds]
[20:57] <Nirajan> @Gurlal Which OCR do you use?
[20:57] == Bodhisattwa__ has joined #cis-a2k
[20:58] == Manav has quit [Ping timeout: 260 seconds]
[20:58] == Gram has joined #cis-a2k
[20:58] <jayanta> https://wikisource.org/wiki/Wikisource:Indic_Wikisource_Stats/stats.py can anyone run this script to update the Indic stats
[20:58] <Satdeep> We use the python bot OCR4wikisource
[20:58] <yannf> Balajijagadesh, yes, Phe's stats are up to date, but not for all languages
[20:58] <Bodhisattwa__> https://github.com/tshrinivasan/OCR4wikisource/issues/99
[20:58] <info-farmer> Bodhi! I already discussed the server need for bnWS with Srini. Please mail him. I think we need only a small amount for a server for bnWS.
[20:58] <Gurlal> there's only 10% accuracy
[20:59] <Gurlal> while using OCR4wikisource
[20:59] <@Titodutta> Oh
[21:00] <Bodhisattwa__> @Infofarmer, what server, please explain
[21:00] <enwnbot> <Pavan89> @Gurlal [<Gurlal> while using OCR4wikisource], Is it even on good quality, recently published work?
[21:00] == Bodhisattwa_ has quit [Ping timeout: 260 seconds]
[21:00] <enwnbot> <Pavan89> @Pavan89 [Is it even on good quality, recently published work?], I mean good scan quality and clear font
[21:00] <Nirajan> Oh
[21:00] <Ananth> @pavan, yes
[21:01] <Ananth> with new books also the quality is bad
[21:01] <info-farmer> Srini! will u pls give a brief about the server
[21:01] <Ananth> In punjabi
[21:01] <enwnbot> <Pavan89> Oh
[21:01] <Balajijagadesh> @jayanta how to run this script https://wikisource.org/wiki/Wikisource:Indic_Wikisource_Stats/stats.py
[21:01] <Ananth> so we shoould focus on it
[21:01] <Gurlal> yes pavan ...same problem with new books
[21:01] <tuxnani_> jayanta: can you provide the two.txt file?
[21:01] <enwnbot> <Pavan89> Okay
[21:02] <tuxnani_> whats there in the txt file? jayanta
[21:02] == dp has quit [Quit: Page closed]
[21:03] <info-farmer> @jayanta If u make a screencast that will be helpful for all of us
[21:03] <tuxnani_> Gurlal: can we move on? We need to hear from Telugu, Kannada, Malayalam, Odia and Tamil communities
[21:03] <jayanta> two.txt will be generated
[21:04] <tuxnani_> jayanta: send me a sample one
[21:04] == Gapu has quit [Quit: Page closed]
[21:04] <Gurlal> okay
[21:04] <tuxnani_> Gurlal: I have checked Google OCR, https://docs.google.com/document/d/1OoAtzddZHE0o1ftIAnOa6344YoGGz2O5DJ-4b2lwj0Y/edit?usp=sharing
[21:05] <tuxnani_> Can we hear from Kannada community, what their issues are?
[21:05] <Satdeep> it's good when we do it like this, one page at a time
[21:06] <Satdeep> but OCR4wikisource does not work
[21:06] <Nirajan> I am interested in OCR. I will check it. Please send me some sample images.
[21:06] == Gram_ has joined #cis-a2k
[21:06] <Nirajan> Hope other OCRs perform better.
[21:06] <Ananth> @Satdeep we will work on it.
[21:07] <jayanta> anyone from the sa and Assamese communities?
[21:07] == Gram has quit [Ping timeout: 260 seconds]
[21:07] == Aliva has quit [Ping timeout: 260 seconds]
[21:07] <Ananth> In Kannada we don't have many active community members, that's the main problem
[21:08] <Anoop-Rao> @Ananth @smjalageri any issues with kannada wiki source
[21:08] <tuxnani_> In Sanskrit Wikisource, the convention of using numerals needs to be fixed
[21:08] <tuxnani_> some places hindi numerals are used, some places indo-arabic
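Note: once a numeral convention is agreed for Sanskrit Wikisource, the conversion itself is mechanical. The following is a minimal sketch, assuming the two conventions meant above are Devanagari digits (U+0966 to U+096F) and the digits 0-9; the function and variable names are hypothetical. Which convention to adopt remains a community decision, and a real run would need to skip digits inside URLs, templates and file names.

```python
# Devanagari digits zero to nine, built from their code points (U+0966..U+096F).
DEVANAGARI_DIGITS = "".join(chr(0x0966 + i) for i in range(10))
ASCII_DIGITS = "0123456789"

# Translation tables for both directions.
TO_DEVANAGARI = str.maketrans(ASCII_DIGITS, DEVANAGARI_DIGITS)
TO_ASCII = str.maketrans(DEVANAGARI_DIGITS, ASCII_DIGITS)


def normalize_digits(text, target="devanagari"):
    """Rewrite every digit in text using a single numeral convention."""
    table = TO_DEVANAGARI if target == "devanagari" else TO_ASCII
    return text.translate(table)


if __name__ == "__main__":
    mixed = "12 " + DEVANAGARI_DIGITS[3]
    print(normalize_digits(mixed, "devanagari"))
    print(normalize_digits(mixed, "ascii"))
```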
[21:09] <Ananth> and if we get help in bulk replacement of common errors it will be good
[21:09] == Vidyu44 has quit [Ping timeout: 260 seconds]
[21:09] == Bodhisattwa has quit [Ping timeout: 260 seconds]
[21:09] <tuxnani_> Anyone from Malayalam here?
[21:09] <Anoop-Rao> would it be possible for the Christ college WEP program to focus more on Kannada Wikisource
[21:09] <Ananth> We are doing it
[21:09] == Bodhisattwa__ has quit [Ping timeout: 260 seconds]
[21:10] <tuxnani_> Anyone from Odia?
[21:10] <Ananth> and from now we will increase work on Wikisource.
[21:10] == HoloIRCUser2 [~holoirc@27.59.162.33] has joined #cis-a2k
[21:11] == Manav has joined #cis-a2k
[21:11] <tuxnani_> Interesting fact is, Odia is the most viewed Wikisource among Indic Wikisources in the last 30 days.
[21:11] <Raghu> In christ 1st year students are working on wikisource
[21:11] == HoloIRCUser2 has changed nick to Pavan89
[21:11] <@Titodutta> It's interesting
[21:12] <Raghu> From now we will make sure that they spend more time on Wikisource
[21:12] <tuxnani_> Any issues of Telugu community?
[21:12] <smjalageri> Can I raise a query from Kannada?
[21:12] <Pavan89> Personally, automating the publishing of Unicode text to Wikisource pages based on page numbers can reduce repetitive work
[21:12] <Ananth> yes you can
[21:12] <tuxnani_> smjalageri: yes please
[21:12] <Gram_> Like the Christ college, we need find one in Hyderabad
[21:12] == Satpal has joined #cis-a2k
[21:12] <tuxnani_> Pavan89: Can you explain?
[21:12] == tuxnani_ has changed nick to tuxnani
[21:13] <Pavan89> I mean if we have a text file in Unicode given by someone, and if we can publish its content to the relevant Wikisource pages based on page numbers
[21:13] <tuxnani> Any issues from Tamil Wikisource community?
[21:13] <smjalageri> As knowledge-sharing community, Wikisource can have one Self-Service Page, .. explaining in lucid terms on Creative Commons, Copyright & Free books
[21:13] <tuxnani> Pavan89: thats doable through pywikibot
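Note: the first half of the idea above (cutting a Unicode text file into per-page chunks keyed by page number) can be sketched as below. The marker format `##PAGE n##` and all names are hypothetical, since the chat does not say how page numbers appear in such files. The `Page:` prefix is the English name of the page namespace and is localized on many wikis. The actual saving of each chunk, which tuxnani suggests doing through Pywikibot, is not shown.

```python
import re

# Hypothetical marker: a line such as "##PAGE 12##" opens each scanned page.
PAGE_MARKER = re.compile(r"^##PAGE (\d+)##$", re.MULTILINE)


def split_by_page(text):
    """Return a dict mapping page number to the text of that page."""
    # With one capture group, re.split yields:
    # [preamble, number1, body1, number2, body2, ...]
    parts = PAGE_MARKER.split(text)
    pages = {}
    for i in range(1, len(parts), 2):
        pages[int(parts[i])] = parts[i + 1].strip()
    return pages


def page_title(index_file, number):
    """Build the title of the wiki page holding one scanned page."""
    # "Page:" is the English namespace name; other wikis translate it.
    return f"Page:{index_file}/{number}"


if __name__ == "__main__":
    sample = "##PAGE 1##\nfirst page\n##PAGE 2##\nsecond page\n"
    for number, body in split_by_page(sample).items():
        print(page_title("Book.pdf", number), "->", body)
```

A bot would then loop over the resulting dict and save each chunk to its page title, ideally skipping pages that already have proofread content.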
[21:14] == mallikarjuna has quit [Ping timeout: 260 seconds]
[21:14] <Balajijagadesh> yes..
[21:14] == Satpal_ has quit [Ping timeout: 260 seconds]
[21:14] <Balajijagadesh> there is a problem with a few old Tamil fonts not being recognised by ocr4wikisource
[21:14] <Balajijagadesh> since many of the books are old there are a lot of OCR errors..
[21:15] <tuxnani> smjalageri: https://commons.wikimedia.org/wiki/Commons:Copyright_rules_by_territory/India This page would help
[21:15] <smjalageri> Many gems of books are lying with doubts on copyright
[21:15] <Tulsi> Guys, What do you say for Maithili Wikisource? Should we start this on incubator?
[21:15] <tuxnani> I request CIS-A2K staff to come up with taking requests on checking copyright status of books.
[21:15] <tuxnani> thanks Balajijagadesh
[21:15] <smjalageri> Hi Tuxnani, I've seen that page, It follows the legal terms kind of language, not clear
[21:16] <Ananth> Sure
[21:16] <Balajijagadesh> @shrini can you help old tamil fonts ocr correction
[21:16] <@Titodutta> Yes @tuxnani
[21:16] == sangappa has quit [Ping timeout: 260 seconds]
[21:16] == Bodhisattwa has joined #cis-a2k
[21:16] <tuxnani> smjalageri: CIS-A2K staff will support you with this.
[21:16] <@Titodutta> Yes
[21:16] <shrini> I am exploring tesseract 4
[21:16] <smjalageri> Oh, Ok, I'll take that offline.  Thank you.
[21:16] <shrini> it is good
[21:16] <jayanta> @info-farmer, please explain which server is needed and why
[21:17] <shrini> compiled it for my laptop. Trying to package it so that anyone can use it
[21:17] == Krishna has joined #cis-a2k
[21:17] <shrini> @jayanta : We need a cloud server to run the OCR4Wikisource script
[21:17] == Satpal has quit [Ping timeout: 260 seconds]
[21:17] == info-farmer has quit [Ping timeout: 260 seconds]
[21:17] <shrini> thats what he mentioned
[21:18] == Gurlal_ has joined #cis-a2k
[21:18] == Tulsi [uid192784@wikimedia/Tulsi-Bhagat] has quit [Quit: Bye]
[21:19] <Bodhisattwa> @Shrini, can you revive the Book uploader bot
[21:19] <Balajijagadesh> @shrini can we do it paws?
[21:19] == Ravidreams has quit [Ping timeout: 260 seconds]
[21:19] <Anoop-Rao> @smjalageri , you can request support at aralikatte ,i will forward it to cis-a2k for review
[21:19] <tuxnani> shrini: any possibility for using paws
[21:19] <shrini> we can not do in paws
[21:19] <shrini> the tools for PDF splitting are not available in paws
[21:20] == enwnbot has quit [Ping timeout: 276 seconds]
[21:20] <Balajijagadesh> or we can build something like paws in jupyter like yuvi did for paws
[21:20] <Balajijagadesh> ??
[21:20] <shrini> anyhow, it is a good idea.Will explore for any workarounds to use in paws
[21:20] == Satdeep has quit [Ping timeout: 260 seconds]
[21:20] == jsahu has joined #cis-a2k
[21:20] <tuxnani> sure. Thanks shrini. You are our super star!
[21:20] == sangappa has joined #cis-a2k
[21:20] <smjalageri> Ok, I will do that Anoop
[21:20] <shrini> or we can ask wiki tech team to install the required tools on paws
[21:21] <Ananth> thats good
[21:21] <tuxnani> Since we are done asking for issues, can we move onto tools?
[21:21] <Balajijagadesh> nice
[21:21] <shrini> @Bodhisattwa : what is the issue with book uploader script? file a bug on relevant github repo
[21:21] == jayanta has quit [Ping timeout: 260 seconds]
[21:22] == Gram_ has quit [Ping timeout: 260 seconds]
[21:22] <tuxnani> There was a query from Tulsi, who seems to have left.
[21:22] <tuxnani> its about starting Maithili wikisource.
[21:23] == enwnbot has joined #cis-a2k
[21:23] <Bodhisattwa> Book uploader Bot or BUB is a tool created by Rohit Dua and Nemo to upload books from different urls like Google , Hathi trust to IA. For last 2 years it's totally defunct
[21:23] <Ananth> yes we need to work with Maithili community
[21:23] <tuxnani> I request CIS-A2K again to list down literary works available in print or online as scans in all possible languages. That will help us work around different possibilities, including above query.
[21:23] <Bodhisattwa> It's built upon python,
[21:23] <Ananth> and get it done
[21:24] <shrini> @Bodhisattwa Nice. Share the link and issues for that
[21:25] <Bodhisattwa> Ok, I will mail you
[21:25] <tuxnani> Bodhisattwa:  tried that, the library files are stale.
[21:25] <tuxnani> Can we now move onto discuss any potential organisations to collaborate with?
[21:27] <Bodhisattwa> Yes, National Library,
[21:27] <tuxnani> I personally see a lot of potential with gazettes
[21:28] == Gurlal_ has quit [Ping timeout: 260 seconds]
[21:28] <Bodhisattwa> It would be great if CIS-A2K collaborated with National Library to release their digital collection
[21:28] <Bodhisattwa> ;-)
[21:28] <tuxnani> Also, all the parliament speeches if translated to our languages will hold good.
[21:29] == Nirajan_ has joined #cis-a2k
[21:29] == enwnbot has quit [Ping timeout: 256 seconds]
[21:29] <Ananth> sure
[21:29] <shrini> we need to ask government to release more contents in CC
[21:29] <Ananth> we will try our best to get it done
[21:30] <tuxnani> state level assembly speeches and GOs if in local languages, can be used at wikisource too
[21:30] <Bodhisattwa> @Ananth, National Library?
[21:30] == enwnbot has joined #cis-a2k
[21:30] <tuxnani> Next, we will move onto our last point of discussion - mini one day events
[21:31] <Pavan89> For participation, we can look for literature appreciation groups active on social media
[21:31] <tuxnani> Titodutta: any inputs? @lahariya Pavan89
[21:31] <Ananth> @bodhi i will talk with my team about it and let you know
[21:31] <Pavan89> They will have digital skills, interest and can enjoy proof reading
[21:32] <tuxnani> Ananth: What do you plan to do when you said mini one day events?
[21:32] <@Titodutta> About tuxnani? National library or mini event?
[21:32] <tuxnani> Titodutta: mini events
[21:32] <Ananth> i am thinking to have a small event in each language
[21:32] == Nirajan has quit [Ping timeout: 260 seconds]
[21:33] == Bodhisattwa_ has joined #cis-a2k
[21:33] <Ananth> at least one event per language
[21:33] <tuxnani> In Telugu Pavan Santhosh does two days of just introducing.
[21:33] <tuxnani> that will not yield any thing.
[21:33] <shrini> Face to face events always inspire me to do more
[21:33] <tuxnani> We need to plan properly.
[21:33] <tuxnani> shrini: (y)
[21:33] == smjalageri has quit [Ping timeout: 260 seconds]
[21:34] <shrini> We need more mini events and workshops
[21:34] <Pavan89> @tuxnani it does yield retention of new participants
[21:34] <Ananth> @shrini we can have one tech work shop for Wikisource
[21:34] <@Titodutta> Yes sure. I am not the best person to plan or comment on it. Perhaps we can plan something keeping the workshop related discussions and requisition above and after more discussion
[21:34] <enwnbot> <acagastya> If there would be for every language, what about English? (and where do you plan to conduct the workshop for enws?)
[21:34] <Pavan89> When we carefully find participants who already have some inclination towards language and literature
[21:35] == Bodhisattwa has quit [Ping timeout: 260 seconds]
[21:35] <Pavan89> And invite them to event, retention will be high
[21:35] <Anoop-Rao> @Ananth any update on https://kn.wikipedia.org/wiki/ವಿಕಿಪೀಡಿಯ:ಅರಳಿ_ಕಟ್ಟೆ#ಕನ್ನಡ_ಸಾಹಿತ್ಯ_ಪರಿಷತ್ತಿನ_ಪುಸ್ತಕಗಳನ್ನು_ಸೇರಿಸುವುದು
[21:36] <tuxnani> Pavan89: Whats update on the Veeresalingam books you were supposed to dissect and upload? its been 4 months or so.
[21:36] <Krishna> No. @Anoop
[21:37] <Ananth> we are working on it and currently getting all the required tools to finish that project
[21:37] <Pavan89> I am doing it and started uploading to Commons and updated on Wikisource, will complete in this week
[21:37] <tuxnani> Thanks Pavan89
[21:37] <Pavan89> On another note, my understanding is that finding the right participants will help in retention
[21:38] <tuxnani> Pavan89: your efforts in that case are not focused per community but per person.
[21:38] <tuxnani> that means, you have to put way more efforts per person. Thats too expensive.
[21:39] <Pavan89> I'm just sharing a pattern, wherein we can find some interest groups and get them to Wikisource
[21:39] <tuxnani> I did not see much happening in that pattern, at least not what was promised for.
[21:40] <tuxnani> If there are no more discussion points, we can close this here.
[21:40] <tuxnani> Titodutta: Ananth @lahariya, Krishna Pavan89 anything else?
[21:41] <Krishna> Nothing from my side.
[21:41] <Ananth> nothing for my side
[21:41] <Pavan89> Nothing more from my side too
[21:42] <tuxnani> Thank you all for being part of this discussion.
[21:42] <@Titodutta> Nothing from agenda. Perhaps followup or next irc idea? Perhaps next month or so?
[21:42] <Ananth> Thanks for everyone
[21:42] <Ananth> we should have one per month
[21:43] <tuxnani> By next month, i request Pavan89 to create list of works for Telugu and show as model
[21:43] <Bodhisattwa_> One thing from my side, @Ananth, when are you going to talk about National Library with each other in CIS
[21:43] <@Titodutta> Thanks.
[21:44] <Ananth> Monday
[21:44] <Ananth> and i will give you update soon
[21:44] <Pavan89> @tuxnani sure, will start creating list
[21:44] <Bodhisattwa_> Ok, looking forward for your positive update
[21:44] <Bodhisattwa_> :-)
[21:44] <Anoop-Rao> nice discussions. For the next one, please post IRC meetup updates on the respective community village pumps
[21:45] <Ananth> Sure
[21:45] <@Titodutta> Alright
[21:46] <tuxnani> Take care all. Bye.
[21:46] <Pavan89> Good Night everyone
[21:46] == Pavan89 has quit [Quit: Pavan89]
[21:47] <Gurlal> Good Night
[21:48] <Anoop-Rao> good night all , thanks for nice IRC meetup @CIS-A2K
[21:48] <shrini> Thanks all
[21:48] <yannf> thanks, good night all
[21:48] == tuxnani has quit [Quit: Page closed]
[21:49] == Bodhisattwa_ has quit [Ping timeout: 260 seconds]
[21:49] <Balajijagadesh> good night
