Grants talk:PEG/Anderson/Script encoding proposal for Nepal

From Meta, a Wikimedia project coordination wiki

Evaluation by the GAC

GAC members who approve this request

  1. In-principle strong support, pending budget tweak and more specific information on stakeholder benefits (see my comments below). There seems to be significant opportunity benefit from having the right people in the right space right at the moment, and I get the feeling this should be a priority project, even though at a preliminary stage. Tony (talk) 02:22, 17 July 2014 (UTC)
  2. Support. Satisfied with the importance of the matter. NLIGuy (talk) 03:23, 17 July 2014 (UTC)
  3. --Ilario (talk) 08:46, 24 July 2014 (UTC)

GAC members who oppose this request

GAC members who abstain from voting/comment

GAC comments

Applicant

I am a bit confused about the applicant's details: Is that a Unicode consortium itself? An independent working group? Any academic group in Berkeley? Could you please clarify (or did I miss any description in the application?). Thank you.
Danny B. 07:42, 10 July 2014 (UTC)

Hello (author of this application here).

[Re: "Applicant"] I opted to have the Unicode Consortium, a 501c3, be the formal "applicant" since it would be handling the funds as the fiscal sponsor. I am not applying with UC Berkeley as the fiscal sponsor due to overhead requirements, which are quite high; the Unicode Consortium, on the other hand, will waive overhead costs, thereby helping funnel funds directly to the project without taking out any overhead. In other grant applications I have made (such as to NEH, Luce Foundation, etc.), "UC Berkeley" or "Unicode Consortium" is deemed the formal applicant, though I run the project. If there is another way to better reflect the fiscal sponsor (vs. who is the project leader) in the WMF application, let me know and I'd be happy to make changes. (I was myself uncertain.)

[My relationship to Unicode Consortium] I have a longstanding relationship with the Unicode Consortium: I am the UC Berkeley representative to the Unicode Consortium, am on the Unicode Editorial Committee, and am also a Unicode Technical Director (all unpaid positions). My (paid) position is running a project I established at UC Berkeley, which works to get various scripts into the Unicode Standard. Over the past 12 years I have established a close enough relationship with the Unicode Consortium that they feel comfortable being the fiscal sponsor for funding applications.

In short:

  • Fiscal sponsor: Unicode Consortium (501c3)
  • Work to be overseen by: Deborah Anderson, who is a member of the Unicode Consortium and runs a project at UC Berkeley that works on getting scripts into Unicode

Let me know if I haven't answered your questions. Dwanders14 (talk) 03:13, 14 July 2014 (UTC)

Tony1 comments

Hi, I think this might be related, ultimately, to a very worthy goal, although the essential background information doesn't seem to be there. I'm just totally confused, and was wishing the pieces of the jigsaw puzzle were laid out so we could more easily enter what is a complicated technical and cultural interface.

Just how to make it easy for applicants to do so in the application is an important challenge for us.  :-)

Here are my immediate questions, answers to which would be great if inserted on the application page itself for GAC members who subsequently arrive here to review, rather than separated into this location (but if you're more comfortable interleaving responses under each query here, please do):

  1. Briefly, what is unicode, and how are non-roman scripts rendered generally? (I have only a very foggy notion of this whole issue, and it smells like something extremely important to the movement.) What, briefly, are the advantages of this unicode advance over current methods of rendering these scripts? What is involved technically, and who would do it? Isn't there some expert in the Netherlands who specialises in doing this? In other words, the big picture for those who aren't coders or developers, like me. Is it to do with pixels?
  2. Are there any precedents for the development of other unicode scripts—whether in or outside the WM movement?
  3. What is the relationship between Berkeley (which I've detected within a URL) and Wikimedians in Nepal? How did this come about, very briefly? Who's from where, and is anyone an employee of an organisation (if so, position)? Who has what skills or knowledge relevant to this undertaking?
  4. What proportion of people in Nepal read/write in each of these targeted scripts? I guess I could go look them up in en.WP (if they have more than stub-articles on them).
  5. Most importantly: It all seems to be focused just on meetings. I'm uneasy about expensive physical meetups that are not the culmination of extensive online discussions over at least a few weeks, and that probably should have been the basis of the plan overleaf. Rather than a strong agenda and structure for a meeting, and a set of questions at the opening, there's just a vague mention of a few things much further down. Are you sure you have to meet up with people internationally?
Agree with you on that last point, Tony. NLIGuy (talk) 04:12, 14 July 2014 (UTC)

Procedural note to grantmaking staff (not part of my review of the application):

I find the "Please link to any relevant documents, including your website if you have one" question towards the top not very reader-friendly. Two reasons: First, I think I'd rather have the organisation's website further up as a pipe or a link in the "Please provide your name, or the name of the group or organization requesting this grant" question, like this:

Please provide your name, or the name of the group or organization requesting this grant.
Unicode Consortium

And second, encouraging other relevant websites to be dumped into one general area makes it really hard to follow the case (unless you have time to wander through them all before you understand what's going on). It might be much easier for GAC members to see the links/pipes embedded in the activities and/or goals sections. Tony (talk) 14:42, 10 July 2014 (UTC)

Thanks for the recommendations Tony. We'll be sharing the new application template next week. Alex Wang (WMF) (talk) 16:45, 10 July 2014 (UTC)

Response to Tony1's questions

Tony, Thanks for your questions. I will try to insert answers to portions of your questions - wherever possible - in the application itself, but will respond to all your questions in this space within the next day or so. Dwanders14 (talk) 04:09, 14 July 2014 (UTC)

Below are full answers to your questions. I will try to insert only select portions in the application itself, since the detail I provide below might be overwhelming.

Q: Briefly, what is unicode, and how are non-roman scripts rendered generally? (I have only a very foggy notion of this whole issue, and it smells like something extremely important to the movement.)

A: The Unicode Standard is the international character encoding standard that is used on all modern computers and mobile devices for sending text electronically. In Unicode, each character is assigned a unique number, which remains the same no matter what platform or software is used. For example, Latin small letter “a” is assigned the number (or “code point”) 0061 in hexadecimal notation. Non-Roman scripts and various symbols are included in the Unicode Standard, too.
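As a rough illustration of the code point idea, here is a minimal sketch using Python's built-in `ord` and the standard `unicodedata` module. The helper name `describe` is hypothetical and chosen only for this example; the Greek and Devanagari characters are simply samples of scripts already in Unicode.

```python
import unicodedata

def describe(ch):
    """Return (code point as 4-digit hex, official Unicode name) for one character."""
    return f"{ord(ch):04X}", unicodedata.name(ch)

# The number assigned to a character is the same on every platform.
print(describe("a"))        # ('0061', 'LATIN SMALL LETTER A')
print(describe("\u03b1"))   # ('03B1', 'GREEK SMALL LETTER ALPHA')
print(describe("\u0915"))   # ('0915', 'DEVANAGARI LETTER KA')
```

A script that is not yet in Unicode has no such numbers or names to look up, which is the gap a script proposal is meant to close.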

However, if a script is not in Unicode, its users must rely on "font hacks" or other non-standard solutions, which means that text in that script cannot be sent or received reliably. If Greek were not in Unicode, for example, and I sent you an email (or a text message, a word-processing document, or a webpage) containing the Greek letter alpha, you might see a “$” or a “p” or some other character. This makes text interchange very problematic, including for Wikipedia articles. (I believe that Wikipedia pages in a non-Roman script require the non-Roman script already be in Unicode.)
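The "font hack" problem can be sketched roughly as follows. This is a toy model, not a description of any real font: the mapping `HACK_FONT_GLYPHS` and the function `render` are hypothetical names invented for the illustration. The assumption is a non-standard font that draws Greek shapes in the slots normally used for Latin letters.

```python
# Hypothetical "font hack": a non-standard font draws Greek glyphs in Latin slots.
HACK_FONT_GLYPHS = {"a": "α", "b": "β", "g": "γ"}

def render(stored_text, font_glyphs=None):
    """Show what a reader sees: shapes come from the installed font, if any."""
    glyphs = font_glyphs or {}
    return "".join(glyphs.get(ch, ch) for ch in stored_text)

stored = "abg"  # what is actually stored and transmitted

print(render(stored, HACK_FONT_GLYPHS))  # sender, with the hack font installed
print(render(stored))                    # receiver, without that font
print(render("\u03b1\u03b2\u03b3"))      # Unicode text keeps its identity
```

In this model the sender sees Greek letters, but the receiver sees plain Latin letters, because the stored text never said "Greek" in the first place. With real Unicode code points the identity of each character travels with the text.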

Q: What, briefly, are the advantages of this unicode advance over current methods of representing these scripts?

A: The Unicode Standard is the standard for character encoding in the world today; it is supported by national standards bodies, computer companies and font foundries. There are no other international standards in competition. The only ways to represent scripts that are not in Unicode are (a) to use images for the letters or (b) to use a non-standard font (that is, one not based on Unicode), which will prevent reliable interchange of text.

(I changed out the word “rendering” you originally used: rendering typically refers to drawing the glyphs on a screen, and it comes at a later stage of getting a letter and script supported on computers and mobile devices. A very rough description of how a script gets into Unicode and becomes supported on devices is: [a] get the script into Unicode so the characters have unique numbers, or code points, assigned to them; [b] once approved, develop fonts that use the assigned code points; [c] ensure that rendering (drawing the glyphs on the screen) is supported in current software and fonts, a process typically done by implementers.)

Q: What is involved technically, and who would do it? Isn't there some expert in the Netherlands who specialises in doing this? In other words, the big picture for those who aren't coders or developers, like me. Is it to do with pixels?

A: To get a script into Unicode, a proposal must be written and submitted to two standards committees. I sit on both committees and hence am very familiar with what needs to be included in the script proposal. There are a few experts who write script proposals, one being Anshuman Pandey (who is the Unicode proposal author on this WMF application). Another one is Michael Everson of Ireland. Both have worked for my project.

(“Pixels” are the small elements of a picture on a screen. My project does not focus on the rendering of pixels.)

Q: Are there any precedents for the development of other unicode scripts—whether in or outside the WM movement?

A: Yes. The main driver of new script proposals in the Unicode Standard has been my project at UC Berkeley. Of the 23 new scripts in Unicode 7.0 (which was released a few weeks ago), all but 2 came from my project.

WM has not yet been involved in getting scripts into Unicode. In discussions with Alolita Sharma, we identified “Prachalit Nepal” as one script that is an urgent need for WM, because the users are very active. However, users are currently writing their entries in Wikipedia in Devanagari, which is not the native script. This interest in submitting entries in the Wikipedia for Nepal Bhasa coincides with strong interest from the user community in getting the two scripts used for this language into Unicode. Since the two scripts share similar technical issues and the same people, it made sense to combine the two scripts into a single project.

I had been in touch with users since 2006 regarding getting both the Prachalit and Ranjana scripts into Unicode. (Regarding WM’s participation on script encoding, see below.)

Q: What is the relationship between Berkeley (which I've detected within a URL) and Wikimedians in Nepal? How did this come about, very briefly? Who's from where, and is anyone an employee of an organisation (if so, position)? Who has what skills or knowledge relevant to this undertaking?

A: I am based at UC Berkeley, running the Script Encoding Initiative, and I will be overseeing this project. (As explained above, the Unicode Consortium will be the fiscal sponsor.) I have been in touch with those in Nepal interested in getting their scripts into Unicode since 2006. Only recently have I been in contact with Wikimedians in Nepal, via Alex Wang and others, when the idea of working with Wikimedia was identified as a possible way to help move this project forward. I have discussed the project with Alolita Sharma of WM, who encouraged me to apply to WMF.

Participants from WM (including Wikipedia contributors) involved in this project and what they will do:

  • Eukesh Ranjit (admin at Nepalbhasa wikipedia): will use the Prachalit Nepal and Ranjana scripts in Wikisource and Wikipedia
  • Saroj Dhakal (active Wikipedian/Wikimedian in Nepal): is connected with those at Google who work on fonts and implementing new scripts; will help put meeting organizers in touch with the Nepal Bhasa Academy and Nepal Academy, Local Languages Department.
  • Chris Fynn (administrator on Dzongkha Wikipedia; contributor of images to Wikimedia Commons since June 2007): has worked on a Ranjana font and been involved in other Unicode proposals and font projects; has expertise on the process of getting scripts into fonts; will submit Ranjana materials into Wikisource

Other WM personnel consulted:

  • Ganesh Paudel (Wikimedia Nepal): moral supporter of the use of Prachalit Lipi in Wikimedia Projects.
  • Prof. Bhimdhoj Shrestha, Tribhuwan University (Advisor Wikimedia Nepal): supports holding a meeting in Nepal to finalize and initiate the use of the Ranjana script in Wikipedia

See also onwiki page: (link)

Others involved (outside WM):

  • Deborah Anderson (project leader of this project and other script encoding projects at UC Berkeley): will help plan the meeting and moderate discussion in Kathmandu
  • Allen Tuladhar (Microsoft Nepal): will act as key organizer in Kathmandu for meetings and invite participants
  • Anshuman Pandey (author of 15 successful script proposals): will meet with users and take part in discussion and work on proposals
  • Peter Constable (Unicode representative and Senior Program Manager at Microsoft): will help moderate discussion in Kathmandu and can explain technical issues from implementers’ viewpoint
  • Suwarn Vajracharya (International coordinator for encoding Nepallipi, and Chair, Nepal Study Center, Japan [NSCJ]): has worked on fonts for Nepal Prachalit and will submit content to Wikipedia in Prachalit Nepal (once the script is in Unicode and fonts are available)
  • Devdass Manandhar: author of script proposals for Prachalit Nepal and Ranjana

Q: What proportion of people in Nepal read/write in each of these targeted scripts? I guess I could go look them up in en.WP (if they have more than stub-articles on them).

A: Both scripts, Prachalit and Ranjana, are used in Nepal to write the Nepal Bhasa language, which is spoken by approximately 847,000 people in Nepal (2011 census, http://www.ethnologue.com). (The Ranjana script is also used to write languages in other countries.)

Today the Nepal Bhasa language is most often written in the Devanagari script, but there are passionate users who want to use Prachalit to write their language. The Prachalit script and Nepal Bhasa language were banned in 1905, but the ban was lifted in 1951. The Prachalit script is in some use today and remains an important symbol of identity for speakers of Nepal Bhasa. Ranjana is not used as a primary vehicle for everyday use, though it does appear on signs in Nepal, in ritual documents, and in some titles of publications. It is an important script because it is used in Buddhist scripture materials and literary texts, and would be a valuable source for Wikisource.

Regarding the proportion of people in Nepal who can read/write in these scripts today, this is difficult to gauge. The general figure of literacy in Nepal is 57.4%, with 3.2% of the population identified as having Nepal Bhasa as mother tongue (CIA Factbook https://www.cia.gov/library/publications/the-world-factbook/geos/np.html). The Nepal Bhasa language is taught as a second language in 13 schools (Sept. 18 2013 issue of Voice of Sikkim http://voiceofsikkim.com/newari-community-thanked-cm-for-newari-bhawan-and-language-recognition/), so the language is being taught in at least some schools. Of the two scripts, Prachalit is far more likely to be taught, since Ranjana is used primarily in historical materials.

Q: Most importantly: It all seems to be focused just on meetings. I'm uneasy about expensive physical meetups that are not the culmination of extensive online discussions over at least a few weeks, and that probably should have been the basis of the plan overleaf. Rather than a strong agenda and structure for a meeting, and a set of questions at the opening, there's just a vague mention of a few things much further down. Are you sure you have to meet up with people internationally?

A: In this case, yes.

In general, work on encoding scripts can be done via email between the person writing the script proposal, those familiar with Unicode requirements, and native users/experts. And indeed, the work on Prachalit and Ranjana has been discussed with users via email and documents since 1998 (and especially since 2012) – see list of documents below. The proposals for the Prachalit Nepal script (= “Nepaalalipi” and “Newar”) and the Ranjana script come from either Devdass Manandhar from Nepal or Anshuman Pandey (and Michael Everson) from my project.

The issue is that the proposals coming from Nepal (from Manandhar) vary in technical aspects from those proposals coming from my project (from Pandey and Everson). Unfortunately, email communication on various issues is not working: the proposals continue to have incompatibilities that need to be resolved.

Without a face-to-face meeting, the script proposal will not progress. What is needed are proposals that address all the technical requirements from Unicode and from the user community, and that have the support of the user community.

From my perspective, one main problem is that representatives from Nepal are not able to participate in the Unicode meetings and thereby understand the technical requirements of Unicode. Similarly, the user community requirements need to be voiced. One option would be to bring a group of users to a Unicode meeting, but this would be expensive (more expensive than sending one person to Nepal, as is proposed in this application). I believe that meeting users in their home country, where the group would be able to include Wikimedians from Nepal, native script experts and font developers, would be far preferable. (We had a similar situation with an historic script from China. Email was simply not working. A face-to-face meeting with experts in Beijing in December 2013 was able to resolve the issues. Another script that was also stalled was Pahawh Hmong. A face-to-face meeting in Minnesota was necessary in order to resolve the issues, which it did.)

In short: Every effort is made to try to progress script proposals via email. However, if no progress is made after several years, then a face-to-face meeting is the only way to proceed effectively, which is the situation with the Prachalit Nepal and Ranjana script proposals.

In this case, I will be attending an ISO standards meeting in Sri Lanka at the end of September, as will a colleague, Peter Constable. We are both on the two standards committees that must approve scripts. Both Peter and I are willing to pay personally to make the trip to Nepal to discuss the script proposals with the user community. In this funding request, I am asking WMF to bring the proposal author Pandey over to Nepal to work with the user communities on progressing the proposals. The meeting would then encompass those with Unicode/general script encoding knowledge (myself, Constable), script proposal authors (Manandhar from Nepal and Pandey from my project), Wikimedians who are likely to be using the scripts in WM projects, and others with expertise in the script or implementation.


Rough timetable chronicling work on encoding these scripts:

1998

  • Nepal representative Tuladhar goes to ISO meeting to discuss encoding of scripts from Nepal

2006

  • Contact made between Anderson and Tuladhar regarding encoding of scripts in Nepal. Tuladhar had a code chart of the characters needed for both Ranjana and Prachalit Nepal, but no funding was available from Nepal to bring a Unicode proposal author to discuss the script proposals

2009 (“L2/09-XXX” documents accessible from http://www.unicode.org/L2/L2009/Register-2009.html )

  • Preliminary Ranjana proposal (L2/09-192) – Everson [funded through my project at UC Berkeley]
  • Document describing overview of various scripts in Nepal (L2/09-325) – Everson [funded through my project at UC Berkeley]

2011 (“L2/11-XXX” documents accessible from http://www.unicode.org/L2/L2011/Register-2011.html )

  • Preliminary “Prachalit Nepal” proposal (L2/11-152) – Pandey [funded through my project at UC Berkeley]

2012 (“L2/12-XXX” documents accessible from http://www.unicode.org/L2/L2012/Register-2012.html )

  • Proposal for the “Newar” script (L2/12-003) – Pandey [funded through my project at UC Berkeley]
  • Proposal for “Nepaalalipi” script (L2/12-120) – Manandhar [Anderson in communication with Manandhar on his proposal]
  • Response to proposals (L2/12-200) – Ken Whistler and Unicode Technical Committee
  • Response to L2/12-200 (L2/12-244) - Manandhar
  • Ancillary materials on breathy consonants in Nepaalalipi (L2/12-245) - Manandhar
  • Proposal on “Nepaalalipi” (L2/12-349) - Manandhar
  • Comparison between Newar and Nepaalalipi proposals (L2/12-390) – Anderson [funded through my project at UC Berkeley]

2013 (“L2/13-XXX” documents accessible from http://www.unicode.org/L2/L2013/Register-2013.html)

  • Document with email on “Nepaalalipi” (L2/13-029) - Manandhar
  • Proposal for Ranjana script (L2/13-243) – Manandhar

2014 (“L2/14-XXX” documents accessible from http://www.unicode.org/L2/L-curdoc.htm)

  • Proposal for “Nepaalalipi” script (L2/14-086) - Manandhar

Dwanders14 (talk) 02:09, 15 July 2014 (UTC)

Second opinion

Hey, I read the application and realize that I need, at the least, a second opinion. Would it be OK for me to solicit an opinion from Amir Aharoni for this? NLIGuy (talk) 04:16, 14 July 2014 (UTC)

I don't think it's a good precedent to ask permission of an applicant to seek a second opinion. Anyone in the world can comment on this page, so go ahead. Tony (talk)
Looks good to me. --Amir E. Aharoni (talk) 05:56, 19 July 2014 (UTC)

Rejoinder to response to my questions

So much text is now on this page that this will be lost unless I put it in a new section.

Thank you Dwanders for this excellent, detailed information. It's certainly a project I think I support in principle. I believe the gist of it, in summary form with minimal detail, should have been in an "Introduction and background" section at the start of the application, with the greater level of detail placed further down. The application itself functions as the official documentation of the proposal—the benchmark, as it were—for later perusal by those reading the reports, and to later determine lessons learned.

So since readers now have to go to two places to read the application, overleaf and here above, perhaps you might insert a few section-links to your talk page explanations at the end of some of those paragraphs. That might encourage a few more GAC members to approach this, even though it's still a clunky experience going backwards and forwards.

Dwanders:

  • "I believe that Wikipedia pages in a non-Roman script require the non-Roman script already be in Unicode."—that wasn't my understanding, at least a few years ago. I thought that explained why some non-roman WPs take a lot more memory per word than others.
  • In terms of reader experience, what will the benefit be? This, I believe, should have been stated at the top of the application. Perhaps then in terms of editor experience. And then any other benefits in terms of technical administration of the sites.
  • I'm not yet convinced about this sticking point that requires face to face with a non-tech community: "Every effort is made to try to progress script proposals via email. However, if, after several years if no progress is made, then a face-to-face meeting is the only way to proceed effectively". Is it political? It doesn't seem to be a technical issue. Why is it so tough? Has there been discourse on the wikis themselves? How big are these editorial communities? Somehow the central piece of the puzzle is missing, since meetings are the core of the proposal.
  • The spin-off in conveying to the movement how this is done, lessons learned, how to proceed with other unicode script developments: that would be valuable, but is hard to find in the application.
  • Budget: I've been to Kathmandu. The cost structure is very very low in dollar/euro terms. Why does a hotel cost $166 a night (you could get a low-grade hotel room for that in London)? I'm guessing that $30 to $40 a night would buy a decent hotel room (not five-star and not international; pensione style should be sufficient, shouldn't it?). And $68 a day sounds like dining in international hotels. A good meal should cost $5–$7, shouldn't it?

Memo to staff: Some GAC members probably spend a couple of minutes reading the application page and recoil with horror at the task of understanding it. What kind of guidelines/examples will properly convey to applicants the amount of background and other details required. Should there be a note about how the application page itself should be a unitary record of the proposal? And another note about how technical stuff should be pitched at intelligent non-experts in the opening summary background? Perhaps state that where a complex project has been explained to and discussed with staff, the detail still has to be in writing in the application. Here, the application text seems to start 3/4 of the way through the presentation.

Tony (talk) 02:05, 16 July 2014 (UTC)

Response to Tony1's rejoinder

Thank you for your response, Tony.

I was not sure where to put the background info, so, yes, I'd be happy to insert Introduction and Background sections. (I do think having something in the template for Background would be helpful for people like myself, who are trying to succinctly explain something that is fairly technical.)

As I feel very strongly that this is an important project, any assistance you can give me in how to make the proposal clearer is very helpful. In turn, I can relay problems I had in filling out the online form or, if you welcome suggestions, I'd be happy to share thoughts I have on how to make it easier for applicants.

Q: What is the "overleaf"? Q2: As I'm new to this, can you help explain what "section-links to your talk page explanations" refers to - links to my (lengthy) explanations above or ? Thanks!

Sorry, I shouldn't have assumed. I was merely suggesting links from the application ("overleaf"—see "Grants" tab top-left of this page) back here to specific explanatory paragraphs. But don't worry at this stage! In terms of your time and effort, I think it could be better spent on other aspects. I'm now assuming that GAC members who visit will be prepared to read about it in two separate places. Tony (talk) 16:22, 17 July 2014 (UTC)

Responses to your comments:

  • "I believe that Wikipedia pages in a non-Roman script require the non-Roman script already be in Unicode."—that wasn't my understanding, at least a few years ago. I thought that explained why some non-roman WPs take a lot more memory per word than others.

I checked with Alolita Sharma on this. She replied, "Historically, there have been languages which had Wikipedias before Unicode had code points defined and accepted for some scripts. Nowadays, languages for live Wikipedias do have to be Unicode supported (ISO 639) in production." From this, it sounds like the current policy is that the script must be in Unicode.

  • In terms of reader experience, what will the benefit be? This, I believe, should have been stated at the top of the application. Perhaps then in terms of editor experience. And then any other benefits in terms of technical administration of the sites.

Q: Do you mean readers/editors/technical administrators of Wikipedia? (I'll adjust text accordingly, depending upon your answer.)

Ah, I meant the readers we serve: readers of the Nepali-language Wikipedia(s): they are our ultimate concern. Other stakeholders are the editorial community of those Wikipedia(s), and those who run offline activities in support of those sites. Tony (talk) 16:22, 17 July 2014 (UTC)
  • I'm not yet convinced about this sticking point that requires face to face with a non-tech community: "Every effort is made to try to progress script proposals via email. However, if, after several years if no progress is made, then a face-to-face meeting is the only way to proceed effectively". Is it political? It doesn't seem to be a technical issue. Why is it so tough? Has there been discourse on the wikis themselves? How big are these editorial communities? Somehow the central piece of the puzzle is missing, since meetings are the core of the proposal.

There are technical and non-technical issues. One problem is the name of the Prachalit Nepal script. The ISO rules don't permit "script" or a word translated as "script" in the name, but the proposal from Nepal uses "lipi" ('script'). Acceptable alternatives that will pass ISO standards committee need to be discussed and, hopefully, agreed upon. No agreement on email has been possible, after much back and forth.

OK. Tony (talk) 16:22, 17 July 2014 (UTC)

On the technical side: the proposal from Nepal has characters that do not conform to current practices and implementations. The proposal from Pandey does conform to Unicode practices, but needs to have the user community buy-in. It has not been possible to try to get the two sides to converge and produce a single proposal that can be acted upon by the standards committees. I do not see continuing to email back and forth as being productive. (The model for the proposed meeting is the same as that used successfully for Tangut. Tangut was similarly stalled, and email communication was not working.)

OK. There are exceptional reasons to support a physical meet up, I now believe. And yes, building personal trust and understanding is going to be important. Tony (talk) 16:22, 17 July 2014 (UTC)

Q: By "editorial committees", do you mean "standards committees"? If the latter, the Unicode standards committee has about 15 or so regular participants, and the ISO standards committee has 20-35 people.

  • The spin-off in conveying to the movement how this is done, lessons learned, how to proceed with other unicode script developments: that would be valuable, but is hard to find in the application.

By lessons learned to the movement, do you mean to the Wikipedia/WM movement or to other communities whose script is not in Unicode? (Should I move up some of the text currently in the section under "Benefits: If successful, will the project have the potential to be replicated successfully by other individuals, groups, or organizations? Please explain how in 1–2 sentences" ?)

The Wikimedia movement. Benefiting the movement more widely through "lessons learned" is an important (although possibly not essential) aspect of grant making. I think this comes from the WMF board, although I can't pin-point the documentation. It is one of the reasons for writing a report on your project in retrospect, so that your experiences might possibly help other, similar projects in the future. Tony (talk) 16:22, 17 July 2014 (UTC)
  • Budget: I've been to Kathmandu. The cost structure is very very low in dollar/euro terms. Why does a hotel cost $166 a night (you could get a low-grade hotel room for that in London)? I'm guessing that $30 to $40 a night would buy a decent hotel room (not five-star and not international; pensione style should be sufficient, shouldn't it?). And $68 a day sounds like dining in international hotels. A good meal should cost $5–$7, shouldn't it?

A: I was surprised at the rates myself, but I used the per diem rates posted by the US Dept. of State (http://aoprals.state.gov/web920/per_diem.asp), which seem to be a common reference point for grant applications. Should I adjust this to be only a certain percentage of the Dept. of State figures?

Dwanders14 (talk)

Dwanders, a quick Google search of Kathmandu hotel websites should give a pretty good idea. Nothing flash. Not five-star. Just decent and no big trek from meeting places. I suspect US Dept. of State is assuming international hotel and international hotel restaurants to avoid any risk of political flak from ... how do I put this politely? ... "forcing" their employees/contractors to engage with the local economy.

Memo to grantmaking staff: this is an example of the need for either or both general and location-specific guidelines and benchmarks for the costs of accommodation, travel allowance (food), and travel. So for volunteers normally three-star hotel/pensione standard is appropriate? Perhaps more if their specific technical or professional skills are very important and given pro bono, and a brief case is made? I don't know, but deftly linked and succinct guidance on Meta's grant making pages should eventually be our aim. Compare: the WMF's "rules" on car hire (or transport? I can't find it now) show an early intent that was admirable; but they're now looking elderly and in need of freshening up. I recall that they'd have been better if much shorter and in simple, plain English, and not so US-centric. Tony (talk) 02:15, 17 July 2014 (UTC)

Dear Tony, I will make adjustments to the budget and revise the proposal today, based on your comments. Thanks for the feedback. Dwanders14 66.234.217.89
Thanks, Dw. Tony (talk) 16:22, 17 July 2014 (UTC)
Tony: I have changed the lodging costs so they are now $40/night, based on feedback just received from one of the Wikipedians I am in contact with in Nepal ($35/night + 13% tax). I left the incidentals at $20, just so as not to come up short and to cover any unexpected costs. I have moved the rationale to the top of the application, but will re-read it and try to strengthen that section based on your comments tomorrow morning [Friday]. I appreciate your comments, as it helps me to understand the GAC and WMF viewpoints more clearly. Dwanders14 (talk) 02:45, 18 July 2014 (UTC)

Comments MADe

Hey, it's a refreshing proposal in a new country, so I was really interested. However, I'm currently considering voting against, as I see no connection with our community.

Nepal is a growing wiki community. There is a local user group that seems to do great stuff - look at their running programmes. They are currently establishing a Wikipedia in a new language.

So for me it's twice as sad that there's no link at all with the local community. I would have enjoyed the experts coming from the other side of the world to support this local user group, e.g. by giving talks, supporting initiatives... The only advantage for the Wikipedia movement, as it appears at the moment, is the Unicode font, while we could do so much more.... MADe (talk) 18:51, 17 July 2014 (UTC)

Dwanders' recent edits to the proposal

These look good. Diff for the benefit of GAC members and staff. Tony (talk) 08:48, 18 July 2014 (UTC)

Approved

Thank you, Dwanders14, for your work on this proposal. A technical project such as this takes significantly more time to explain and discuss. We appreciate your and the GAC's engagement on this request. The project is approved. Alex Wang (WMF) (talk) 19:20, 18 July 2014 (UTC)

Request for budget change

I would like to ask for a budget change as shown below (with rationale). The bold section identifies the key differences between the proposed change and the current budget.

Updated proposed budget:

Number | Category | Item description | Unit | Number of units | Cost per unit | Total cost | Currency | Notes
1 | airfare | Seattle-Delhi-Kathmandu | roundtrip | 1 | $1130.73 (SEA-DEL return); $407.02 [24657 INR] (DEL-KTM return) | $1537.75 | USD (and INR; OANDA conversion used for USD amount) | For Pandey to attend scripts meeting in Kathmandu: depart Seattle 22 Sept.; travel Delhi-Kathmandu 2 Oct.; travel Kathmandu-Delhi 11 Oct.; depart Delhi-Seattle 22 Oct.
2 | lodging | (in Kathmandu) | night | 9 nights | $40 | $360 | USD | $40 * 9 nights
3 | meals and incidentals | (in Kathmandu) | M&I rate per day | 9 days | $20 | $180 | USD | $20 * 9

Rationale for requested change
There are two changes to the original budget and schedule:

1. Anshuman Pandey will break up the trip so he can stop in Delhi on the way to and from Kathmandu. His time in Delhi on his way to Kathmandu will be devoted to research into the scripts which will be discussed at the meeting. On his way back from Kathmandu, he will again meet with experts to relay the outcome of the meeting and try to gather support for the final recommendations of the Kathmandu meeting.

Details of the stop in Delhi, provided by Anshuman Pandey:
There has been some significant activity regarding the study and usage of the "Prachalit Nepal" and Ranjana scripts in India. In 2009, the National Mission for Manuscripts and the Indira Gandhi National Centre for the Arts hosted a sixteen-day workshop on the "Prachalit Nepal" and Sharada scripts in Varanasi. This suggests that there is some effort in India to study Nepalese manuscripts. Moreover, the Newar language is an official language of Sikkim and government bulletins are published in the "Prachalit Nepal" and Ranjana scripts. I am in the process of contacting and scheduling meetings with experts of these scripts in India. Moreover, communication with these experts will provide me with information regarding manuscripts and other materials in these scripts housed in Indian archives. Such meetings will also give me broader insight into the requirements of the global community of scholars of Nepalese palaeography, which is important for developing a comprehensive encoding for the "Prachalit Nepal" and Ranjana scripts in Unicode.

The net difference in cost for airfare is $289.75 over the projected cost (new airfare cost $1537.75 - original projected airfare cost $1248)

2. The meeting has been shortened from 10 days (4 Oct - 14 Oct) to 7 days (4 Oct - 10 Oct) so that Anderson can more fully participate in the discussions (as she departs 8 Oct). The revised schedule removes the "BREAK" days that appeared in the tentative schedule in the original request. As a result, Anshuman Pandey will only be staying in Kathmandu 9 days (2 Oct. - 11 Oct). (Note: Pandey is not requesting lodging/meals and incidentals in Delhi.)

The net difference for lodging is $160 under budget (original budget $520 [13 * $40] - new projected cost $360 [9 * $40])

The net difference for meals/incidentals is $80 under budget (original budget $260 [13 * $20] - new projected cost $180 [9 * $20])

Overall difference:
Increased airfare cost $289.75
Decreased lodging/M&I cost $240
Net increase $49.75
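
For anyone who wants to re-check the arithmetic above, here is a minimal sketch in Python. The figures are taken from the budget lines in this request; the variable and function names are illustrative only and not part of the grant paperwork.

```python
from decimal import Decimal

def budget_delta(original, revised):
    """Per-category change (revised minus original); positive means over budget."""
    return {category: revised[category] - original[category] for category in original}

# Original approved budget: projected airfare plus 13 nights/days in Kathmandu.
original = {
    "airfare": Decimal("1248.00"),
    "lodging": Decimal("40") * 13,
    "meals_incidentals": Decimal("20") * 13,
}

# Revised budget: two airfare legs plus 9 nights/days in Kathmandu.
revised = {
    "airfare": Decimal("1130.73") + Decimal("407.02"),
    "lodging": Decimal("40") * 9,
    "meals_incidentals": Decimal("20") * 9,
}

delta = budget_delta(original, revised)
net_change = sum(delta.values(), Decimal("0"))
revised_total = sum(revised.values(), Decimal("0"))

print(delta["airfare"], delta["lodging"], delta["meals_incidentals"])
print(net_change, revised_total)
```

Using Decimal rather than float keeps the cents exact, which matters when the totals are compared against receipts.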

This budget change request is approved. Alex Wang (WMF) (talk) 20:55, 22 September 2014 (UTC)

Post-Nepal Meeting Update and Request for Budget Change

Due to unforeseen circumstances, Anshuman Pandey was not able to make the trip to Kathmandu. However, I made the trip, as did my colleague Peter Constable. (Unfortunately, Chris Fynn was also not able to attend the meeting.) Peter C. and I both met with the user community in Kathmandu on Sat., 4 Oct., and I met again with the group of users on Sunday 5 Oct., Monday 6 Oct., and Tuesday 7 October. (Peter was only able to attend one day.) During the discussions in Kathmandu, I corresponded via email with Anshuman Pandey and other members of the Unicode Technical Committee when needed.

Outcome of the Kathmandu meeting
The Kathmandu meeting produced a list of recommendations for the Unicode Technical Committee. The document (http://www.unicode.org/L2/L2014/14253-rec.pdf) included recommendations for the "Prachalit Nepal" (/"Nepaalalipi") script, along with a similar set of recommendations for the Ranjana script and a note about the Bhujinmola script. It was noted that many of the recommendations for Ranjana hinged on decisions for Prachalit Nepal (/Nepaalalipi), so the main focus of the meeting was on Prachalit Nepal (/Nepaalalipi). (Ranjana also is used in Tibet and other countries, so any proposal would need to get the buy-in from users in those countries.)

Unicode Technical Committee results
At the Unicode Technical Committee (UTC) meeting that was held from October 27-30 in Sunnyvale, CA, a consensus proposal was created and discussed. The consensus document took into consideration the Kathmandu recommendations as well as comments from UTC members. (The UTC's comments involved a few modifications of the Kathmandu recommendations, many of which will help to speed implementation of the script.) The consensus proposal for the "Newa" script (new name, in place of Prachalit Nepal or Nepaalalipi) was approved on October 30 by the Unicode Technical Committee (http://www.unicode.org/L2/L2014/14285r-newa.pdf). The consensus proposal includes specific information for developers on the modern use of the script (as opposed to the historic usage), including identifying the modern characters and their preferred names. (Note: A Nepal Bhasa speaker was able to attend one day of discussion at the UTC.)

Next steps and Ranjana script
A copy of the Newa consensus proposal was sent to the lead members of the Kathmandu meeting for distribution and comments on 10 November 2014. I am awaiting feedback from the attendees from the Kathmandu meeting. If there are no major issues from the users in Nepal, the script is expected to go onto the next ballot that will include new scripts (CD for the 5th edition of ISO/IEC 10646). This ballot will be circulated to ISO/IEC JTC1/SC2 WG2 members, probably by mid-year 2015. If no major stumbling blocks arise, it is expected that the script will continue through the balloting process, and can be ready for publication in Unicode 9.0 in the summer of 2016.

Since Newa is in everyday use more often than Ranjana, the focus of the Kathmandu meeting was on Newa. It was not possible to make more progress on Ranjana in Kathmandu, since many of the details for a Ranjana proposal depended on how the Unicode Technical Committee received the recommendations for the Newa script.

If the Newa proposal proceeds smoothly, the next step is to work on (a) symbols used in Newa, where there was no agreement between various parties, and (b) Ranjana, which should include stakeholders from Nepal and other countries. In this context, it would be useful to apply for an IEG grant for Anshuman to travel to Nepal and meet with users there on Newa symbols and Ranjana. Such a meeting would build on connections established at the Kathmandu meeting; the goal would be to discuss Newa symbols and Ranjana, and to forge a concrete path towards consensus documents.

Rationale for change in budget
The original proposal budgeted travel and expenses for Anshuman Pandey to go to Kathmandu (total amount approved: $2077.75). Since he was not able to attend the meeting but I was, and the goal of the project was achieved - to meet with the user community in Kathmandu and lay out a plan to move the scripts forward - I am asking the Wikimedia Foundation for reimbursement of my travel and lodging to Kathmandu in lieu of Anshuman's costs. Over half of the funds will be returned to WMF.

Updated budget (based on actual costs; receipts for all charges are available):

Number | Category | Item description | Unit | Number of units | Cost per unit | Total cost | Currency | Notes
1 | airfare | Colombo-Delhi-Kathmandu | one-way | 1 | $377 + $132.03 | $509.03 | USD | For Anderson to attend scripts meeting in Kathmandu: $377 is for travel on Air India from Colombo-Delhi 3 Oct. and Delhi-Kathmandu 4 Oct. (Note: No direct flights available from Colombo-Kathmandu); $132.03 were change fees incurred in modifying Singapore Airlines original ticket from SFO-Colombo (return) to SFO-Colombo and Kathmandu-SFO.
2 | lodging | (Delhi) | night | 1 night | $137.45 | $137.45 | USD | One night in Delhi necessitated as there is no direct flight from Colombo to Kathmandu
2 | lodging | (Kathmandu) | night | 4 nights | 3929.97 NPR ($39.09 USD) | 15719.88 NPR ($156.36 USD) | NPR (USD) | 3929.97 NPR * 4 nights = $39.09/day * 4 = Total $156.36 USD (OANDA conversion rate)

Total: $802.84

Earlier funding approved for the project was $2077.75, so the amount to be returned to WMF would be $1274.91.


Notes:

1. M&I were minimal since the hosts generously provided lunch and several dinners.

Hi Debbie. Thank you for this detailed progress report. We appreciate you notifying us when Anshuman was not able to attend the meetings and providing updates throughout the grant period. We are happy that despite his inability to travel, you were able to make significant progress moving the Newa script forward. This budget change request is approved. We look forward to the report (for which I am sure you can copy a lot of the above!) and discussions around future plans for the Ranjana script. Alex Wang (WMF) (talk) 22:37, 19 November 2014 (UTC)
Hi Alex, I made a small error in typing in the Kathmandu lodging. The cost was 3913.97 NPR/night instead of 3929 NPR (* 4 = 15679.88 NPR total). However, the total for 4 days still converts to $156 (today, on OANDA). - Debbie
Hi Alex, A correction to the comment above ("Earlier funding approved for the project was $2077.75, so the amount to be returned to WMF would be $1274.91"). The WMF funding was $2077.75, but WMF also paid $30 to cover the wiring fees, so the total received was actually $2107.75. I will be repaid $802 (instead of $802.84, just to make accounting a bit easier), so the money returned to WMF should be $2107.75 - $802 = $1305.75. - Debbie
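
As a cross-check on the refund figures in this thread, the calculation can be sketched as follows. This is only an illustration: the amounts are the USD figures quoted above, the names are hypothetical, and rounding the reimbursement down to whole dollars reflects the "$802 instead of $802.84" choice described in the correction.

```python
from decimal import Decimal, ROUND_DOWN

# Actual costs in USD as listed in the updated budget.
actual_costs = {
    "airfare_colombo_delhi_kathmandu": Decimal("377.00"),
    "ticket_change_fees": Decimal("132.03"),
    "lodging_delhi": Decimal("137.45"),
    "lodging_kathmandu": Decimal("156.36"),
}

total_actual = sum(actual_costs.values(), Decimal("0"))

# Funds received: the approved grant plus the wiring fee WMF covered.
total_received = Decimal("2077.75") + Decimal("30.00")

# Reimbursement rounded down to whole dollars to simplify accounting.
reimbursed = total_actual.quantize(Decimal("1"), rounding=ROUND_DOWN)

returned_to_wmf = total_received - reimbursed

print(total_actual, total_received, reimbursed, returned_to_wmf)
```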