Community Wishlist Survey 2020/Archive/hOCR should work for all wikisource
hOCR should work for all wikisource
Merged into Community Wishlist Survey 2020/Wikisource/New OCR tool.
- Problem: hOCR does not work for non-Latin-script Wikisources. At present, Phetools hOCR creates a Tesseract OCR text layer for all Latin-script Wikisources. For the Indic Wikisources, we rely on the proprietary Google OCR as a temporary workaround. I therefore propose that Phetools be made to work for all non-Latin Wikisources, including the 12 Indic Wikisources.
- Who would benefit: All contributors to non-Latin Wikisources.
- Proposed solution: Implement the same approach already used on enws and frws, and create the OCR text layer by updating the langdata.
- More comments: This proposal was merged into Community Wishlist Survey 2020/Wikisource/New OCR tool.
- Phabricator tickets: phab:T228594
- Proposer: Jayantanth (talk) 16:56, 26 October 2019 (UTC)
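As a rough illustration of the proposed solution above, the sketch below shows how a Wikisource language code might be mapped to a Tesseract language pack and turned into a command that writes hOCR output. It is not Phetools code: the mapping table, the function name `hocr_command`, and the file names are assumptions made for this example, and the table covers only a small subset of languages. It also assumes that the corresponding Tesseract traineddata files are installed.

```python
# Hypothetical mapping from Wikisource subdomain codes to Tesseract
# language pack names. Illustrative subset only; not taken from Phetools.
WIKISOURCE_TO_TESSERACT = {
    "en": "eng",
    "fr": "fra",
    "bn": "ben",
    "hi": "hin",
    "ta": "tam",
    "te": "tel",
}


def hocr_command(image_path, output_base, wiki_lang):
    """Build a tesseract command line that writes <output_base>.hocr."""
    try:
        tess_lang = WIKISOURCE_TO_TESSERACT[wiki_lang]
    except KeyError:
        raise ValueError(f"no Tesseract language configured for {wiki_lang!r}")
    # "-l" selects the language pack; the trailing "hocr" config asks
    # Tesseract for hOCR output instead of plain text.
    return ["tesseract", image_path, output_base, "-l", tess_lang, "hocr"]


if __name__ == "__main__":
    print(" ".join(hocr_command("page.png", "page", "bn")))
```

Under these assumptions, supporting another Wikisource would mean adding one entry to the table and making sure the matching language data is available to Tesseract.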
Discussion
Phe's tool has suffered some serious difficulties recently and nobody seems to be able to solve them; see phab:T228594. That is why I have suggested replacing this external tool with a brand new tool that would be an integral part of MediaWiki and would not depend on the availability of a single, unreachable volunteer (see the proposal New OCR tool). I suggest merging our proposals. --Jan.Kamenicek (talk) 20:24, 26 October 2019 (UTC)
- @Jan.Kamenicek, thanks for your reply. Both proposals can be merged into one, if you have no objection. Jayantanth (talk) 07:23, 27 October 2019 (UTC)
- @Jayantanth: I have merged it. Could you please check whether I worded it properly there? Thanks! --Jan.Kamenicek (talk) 11:59, 27 October 2019 (UTC)