Grants:PEG/WM IL/Software Development for Wikipedia on OLPC/Report
We have performed the following activities toward a usable search experience for the offline Hebrew Wikipedia:
- commissioned paid work by HebMorph chief developer Itamar Syn-Hershko, to:
  - port HebMorph to C++
  - support chief Kiwix developer Emmanuel Engelhart in integrating HebMorph-powered search into Kiwix. This required adding support for CLucene-powered searches to Kiwix, which had hitherto relied only on Xapian.
- significant volunteer software development work, by Asaf Bartov and Tomer Ashur, to investigate and attempt to solve blocking bugs in the way Unicode Hebrew is handled in the Kiwix code responsible for using the search engine.
Project goal and measures of success
The goal has been "to provide a high-quality subset of the Hebrew Wikipedia (text, categories, and some images) with offline browsing software, in a Hebrew interface useful for the disadvantaged children and youths without an Internet connection, who are the intended beneficiaries of the Israeli OCPC project."
We have not been able to achieve this goal: such an offering is still not available in Hebrew, and the OCPC project is not distributing Wikipedia with its computers yet.
However, we have come close, and have had some second-order impact by supporting the improvement of two key free software projects. This progress remains available, pending resolution of the final blocking bug, after which the original goal of this grant may yet be achieved without additional expenditure. What was achieved:
- HebMorph itself was improved in several ways per our requests, preparing for integration with Kiwix. This is already useful in searching the Hebrew Wikipedia corpus through HebMorph online, as well as potentially useful for other FOSS projects involving searchable Hebrew corpora.
- Kiwix now supports multiple search-engine back-ends, notably CLucene. This currently works well for most languages.
- Additional volunteer work helped pinpoint the remaining technical problem, and we are writing it up, for anyone interested in tackling this in the future.
Measures of success
- Done: An improved HebMorph is available on GitHub (link above) and as an online service (also above)
- Done: The search engine improvements on Kiwix are in the mainline and part of current and future Kiwix releases
- Not done: The Israeli OCPC does not yet incorporate the Hebrew Wikipedia in computers it distributes
What lessons were learned that may help others succeed in similar projects?
- the final stumbling block was quite unexpected, and would have been hard to anticipate even with more planning. A lesson that can be drawn is simply that "no foreseeable risks" is indeed not the same as "no risks".
- a wider pool of interested volunteer developers would probably have enabled the full success of this project. This is something WMIL can work on cultivating.
What impact did the project have on WMF mission goals of Increased Reach, Increased Quality, Increased Credibility, Increased and Diversified Participation?
- the improved software already enables better searches in general (Kiwix) and in Hebrew in particular (HebMorph), thus improving the reach of our content.
- if we manage to resolve the remaining technical issue, offline distribution of the Hebrew Wikipedia will significantly increase our reach, extending it to underprivileged children as well as to some religious communities that use computers but shun the Internet.
Reporting and documentation of expenditures
Did you send WMF documentation of all expenses paid for with grant funds? No
Details of expenditures:
- Sep. 08, 2010 - Payment of 3,788 ILS (1,000 USD) to Itamar
- Feb. 27, 2011 - Payment of 7,361 ILS (2,000 USD) to Itamar
- Total grant amount: 5,000 USD; total spent: 3,000 USD; surplus: 2,000 USD
Will you be requesting re-allocation of remaining grant funding? No
Will you be returning unused funds to the Wikimedia Foundation? Yes.
Will you be requesting an extension or were you granted an extension? We should have requested an extension, but failed to do so.
Please link to related grant proposals here: