Community Wishlist Survey 2020/Archive/hOCR should work for all wikisource
hOCR should work for all wikisource
- Problem: hOCR does not work for non-Latin-language Wikisources. At present, Phe's hOCR tool (Phetools) creates a Tesseract OCR text layer for all Latin-language Wikisources. For the Indic Wikisources, we have a temporary, proprietary Google OCR to do this. I therefore propose that Phetools work for all non-Latin Wikisources, including the 12 Indic Wikisources.
- Who would benefit: All non-Latin Wikisource contributors.
- Proposed solution: Implement the same approach used on enws and frws, and create the OCR text layer by updating langdata.
- More comments: This proposal was merged into Community Wishlist Survey 2020/Wikisource/New OCR tool.
- Phabricator tickets: phab:T228594
- Proposer: Jayantanth (talk) 16:56, 26 October 2019 (UTC)
Phe's tool has suffered some serious difficulties recently and nobody seems to be able to solve them; see phab:T228594. That is why I have suggested replacing this external tool with a brand-new tool that would be an integral part of MediaWiki and would not depend on the availability of a single, unreachable volunteer (see the proposal New OCR tool). I suggest merging our proposals. --Jan.Kamenicek (talk) 20:24, 26 October 2019 (UTC)