Language committee/Archives/2009-02

For a summary of discussions, see the archives index.

Spanned discussions

The following discussions span multiple months and are archived in the first applicable archive:

"Sabine Cretella resigns, abolition of Chair position" (January 2009)
"Antony D. Green joins the committee" (January 2009)
"Polytonic format on Wikipedia Greek" (January 2009)

Wikipedia South Azerbaijani

No decision was taken on the request for a South Azerbaijani Wikipedia.

Milos Rancic (Millosh)
01 February 2009 04:28

There is a proposal for South Azerbaijani Wikipedia [1] and I am talking now with Mohsen Salek (User:Mardetanha), an Azerbaijani from Iran, who is a native speaker of South Azerbaijani.
The situation is as follows:
- Standard Azerbaijani is written in Latin, while South Azerbaijani is written in Arabic script.
- Yes, there are some linguistic differences, but they are not yet so strong. It is possible to work on a dictionary for a conversion engine.
- The community is using both scripts on the same project and they are happy with that.
- Latin script is preferred even in Iran because it is alphabetic, not consonantal.
- There are people who prefer to treat South Azerbaijani as a separate language, but they are rare.
- Literacy in Iran is not so high (around 80%).
My conclusion here is that we should reject the project conditionally: the situation is not urgent; they may use South Azerbaijani at az.wp; we would add to the confusion if we approved a South Azerbaijani project, because the level of literacy is not high; and it is possible to make a conversion engine in a couple of years, and even 10-20 years is not an issue in such situations.
But if there are many more people interested in making a South Azerbaijani Wikipedia (let's say, 10-15) and in the meantime we don't make a conversion engine, they should get the project.
[1] - http://meta.wikimedia.org/wiki/Requests_for_new_languages/Wikipedia_South_Azerbaijani
Michael Everson (Evertype)
01 February 2009 04:41

Milos Rancic wrote:

> * Latin script is preferred even in Iran because it is alphabetic,
> not consonantal.

I was in Iranian Azerbaijan two summers ago. Not a shred of Latin anywhere to be seen. I have a major, modern Azeri dictionary published in Arabic in Iran.
Milos Rancic (Millosh)
01 February 2009 04:46

Yes, I wasn't clear enough on this issue... Latin is known only to people who went to school. Usually, Azerbaijanis from Iran have problems reading Latin. However, literate persons prefer to write Azerbaijani in Latin because of the contrast between alphabetic and consonantal scripts.
The most important issue related to this is that they don't have problems cooperating (see the project [1]). AFAIK, the most prominent Wikipedians on az.wp are from Iran, not from Azerbaijan.
Of course, there is a lot of work to be done there, but nothing is urgent.
[1] - http://az.wikipedia.org/

Wrong language code used by Aramaic Wikipedia

No committee action was taken to change the language code used by the Aramaic Wikipedia (see linked discussion), since existing wikis are not within the committee's scope.

Gerard Meijssen (GerardM)
01 February 2009 20:17

<this user has not agreed to public archival.>
— (Karen)
02 February 2009 20:34

<this user has not agreed to public archival.>
Milos Rancic (Millosh)
03 February 2009 13:25

Yes, it is straightforward, and it is possible to add mod_rewrite instructions for those codes for some time (in the case of the Aramaic Wikipedia, it could last forever because there won't be an Official Aramaic Wikipedia).

Wikipedia Sorani

No decision was taken on the request for a Sorani Wikipedia.

Gerard Meijssen (GerardM)
14 February 2009 11:13

<this user has not agreed to public archival.>
Milos Rancic (Millosh)
14 February 2009 13:37

Yes, they should be eligible. On the other hand, I am quite sure that we would be able to make conversion engines in such cases. And, generally, it makes me sad that we are not doing that: cultures like the Kurdish one need a lot of effort to build one encyclopedia, but they will have to build at least two.
Gerard Meijssen (GerardM)
14 February 2009 15:26

<this user has not agreed to public archival.>
Antony D. Green (Antony D. Green)
15 February 2009 10:16

Hmm... Gerard says they're more different than German and Dutch, http://en.wikipedia.org/wiki/Kurdish_language#Kurmanji_and_Sorani says (with source!) that they're as different as German and English, but http://www.cal.org/co/kurds/klang.html says they're mutually intelligible. Nevertheless, I think a separate Sorani (Central Kurdish) Wikipedia (code ckb) in the Arabic alphabet would be a good idea. It seems that Kurmanji can be written in Latin, Arabic, or Cyrillic, while Sorani is pretty much written only in Arabic. So even if Sorani does get its own Wikipedia, the issue of script conversion engines for Kurmanji Wikipedia could still be relevant.
Would Kurmanji Wikipedia stay at ku, or would it be moved to kmr?
Gerard Meijssen (GerardM)
15 February 2009 10:21

<this user has not agreed to public archival.>
Milos Rancic (Millosh)
15 February 2009 16:20

Gerard Meijssen wrote:
<this text is quoted from a user who has not agreed to public archival.>

This is a Betawiki problem: all of the code is there and conversion may be done automatically (sr-ec <-> sr-el). Why it is not done is a question not for translators, but for programmers.
Gerard Meijssen (GerardM)
15 February 2009 16:29

<this user has not agreed to public archival.>
Milos Rancic (Millosh)
15 February 2009 16:50

Gerard Meijssen wrote:
<this text is quoted from a user who has not agreed to public archival.>

A number of languages need conversion engines, most notably Chinese. This is not just a Serbian-language problem. Also, this is not a general MediaWiki issue, but an issue specific to Betawiki: how to integrate *existing* conversion engines into the Betawiki extension.
Gerard Meijssen (GerardM)
15 February 2009 16:54

<this user has not agreed to public archival.>
Milos Rancic (Millosh)
15 February 2009 17:34

Gerard Meijssen wrote:
<this text is quoted from a user who has not agreed to public archival.>

There are 56 files at [1]; not all of them are languages. This is the full list of localizations implemented in the MediaWiki that runs on Wikipedia.
Among those 56 files there are three conversion implementations: Kazakh, Serbian and Chinese. All of the methods used for Kazakh and Serbian are subsets of the Chinese implementation with different parameters. If something works for Chinese, it will work for the other two.
Also, I don't know how the Chinese solved that. By converting characters by hand? Note that Taiwan alone has twice as many inhabitants as there are speakers of Serbian in the world.
Besides that, a number of other languages, including Kurmanji, Azerbaijani and so on, need conversion engines, but there is a lack of will to implement them.
I know that having characters in Unicode is a more basic need, but saying that one community is responsible for the lack of systematic will -- is not nice.
[1] - http://svn.wikimedia.org/viewvc/mediawiki/trunk/phase3/languages/classes/
Gerard Meijssen (GerardM)
15 February 2009 17:54

<this user has not agreed to public archival.>
Gerard Meijssen (GerardM)
15 February 2009 18:01

<this user has not agreed to public archival.>
Michael Everson (Evertype)
15 February 2009 18:26

Do I have to say AGAIN that reversible transliteration conversion is ONLY possible from certain scripts to other scripts, and CANNOT be applied generally?
Gerard Meijssen (GerardM)
15 February 2009 18:45

<this user has not agreed to public archival.>
Milos Rancic (Millosh)
15 February 2009 20:13

Gerard Meijssen wrote:
<this text is quoted from a user who has not agreed to public archival.>

Gerard, thanks for your expertise. I would appreciate some examples, especially Wikimedian ones.
Michael, there are different levels of usefulness of conversion engines:
- In the case of Cyrillic<->Latin for Serbian and Simplified<->Traditional for Chinese, engines are fully useful for production.
- In the case of very similar standards (Serbian-Croatian-Bosnian), it is possible to make a fully useful conversion engine for production.
- In the case of similar standards (Bokmål-Nynorsk-Danish), it is possible to make a useful conversion engine for semi-automatic import and export of articles.
- In the case of controlled languages, it is, again, possible to make a fully useful conversion engine. (See below.)
- And, all of us know how translation engines between distant languages are working now (cf. Google Translate).
BUT, we'll never do anything if we don't start to do it. At the present stage, I just see obstruction of that idea. Again, yes, I know that characters for some languages are more important, but those two issues are not mutually exclusive. And I know that I am raising this issue around 5 years too early, but if I don't talk about it now, we'll need 10 or 20 years instead of 5.
About controlled languages, I finally suggested to the Simple English Wikipedia community to change their path [1]. Feel free to discuss there.
[1] - http://simple.wikipedia.org/wiki/Wikipedia:Simple_talk#An_idea_for_the_new_path_of_Simple
Gerard Meijssen (GerardM)
16 February 2009 02:54

<this user has not agreed to public archival.>
Milos Rancic (Millosh)
17 February 2009 15:22

Gerard Meijssen wrote:
<this text is quoted from a user who has not agreed to public archival.>

Thanks for raising this issue with Siebrand.
And about credibility... I hope that you understand that scripts are usually part of cultural/ethnic/national identity. Because of that, nobody cares about making conversion engines between scripts. A person identified with one cultural identity usually doesn't like the competing one. I had to go through a traumatic experience while advocating a conversion engine for Serbian: while script usage is around 50-50, the majority of both populations have a strong aversion to the competing script. But things became much better when the conversion engine became reality: the majority isn't frustrated by script-related issues anymore.
While protecting cultural identities is a good task, my priority is spreading free knowledge and making information more available. Conversion and translation engines help in that task because more people are able to share their knowledge.
As I said, conversion/translation engines may be better or worse. In some cases it is possible to make a conversion engine able to fulfill production needs; we have a couple of them (Chinese, Serbian), and many more, not just script-related, are possible. In other cases it helps people to understand text written in their language but in a different script. In the third set of cases, they may produce useful material for semi-automatic knowledge sharing. And, again, if we don't start to do that today, we'll wait longer until we get really useful solutions.

Simple-English Wikipedia and other simple-language wikis

A previous committee decision was upheld against approving simple-language wikis, and the second request for a simple-English Wikinews was rejected.

Gerard Meijssen (GerardM)
17 February 2009 05:23

<this user has not agreed to public archival.>
Michael Everson (Evertype)
17 February 2009 05:48

You're against Simple English? Or you're against this project?
Gerard Meijssen (GerardM)
17 February 2009 06:13

<this user has not agreed to public archival.>
Milos Rancic (Millosh)
17 February 2009 13:31

Gerard Meijssen wrote:
<this text is quoted from a user who has not agreed to public archival.>

Besides the policy, there are two issues related to the Simple English Wikinews:
- The particular one is related to Wikinews itself. A Wikinews needs at least a couple of devoted people to maintain it, and I don't think that the Simple community is strong enough to do that.
- I would really like to see work toward a more useful type of project. (As mentioned in the other thread: I would like to see Simple English as a controlled language for machine translation purposes.)
Antony D. Green (Antony D. Green)
17 February 2009 15:11

I'm opposed to Simple English Wikinews too. I don't see that anything has changed since the proposal was rejected last year; see http://meta.wikimedia.org/wiki/Requests_for_new_languages/Wikinews_Simple_English
Jesse Plamondon-Willard (Pathoschild)
17 February 2009 15:56

I also object, per all the reasons we've discussed before. "Simple English" is not a language, but a vague, undefined subset of English with no published orthography, literature, or any indication of culture or usage distinct from English. It is completely mutually intelligible with English, so there is no benefit to having a separate wiki. Whether English-wiki articles should be written more simply is a content issue, not a language issue (and there is no obstacle to having separate simple articles on the English Wikinews). There is no ISO 639 code distinct from English.
At the very least, languages should exist before they have a wiki. It exists only as English, and we already have English wikis.
Jon Harald Søby (Jon Harald Søby)
17 February 2009 16:03

I also object to the creation of more Simple English projects; instead of creating separate projects, the simple versions should be sub-projects of the individual language. So just like Simple English is a subset of English, Simple English Wikinews should be a sub-project of English Wikinews. That will be of more benefit to the readers of both projects.

Lower Saxon message file code

Siebrand asked for the committee's opinion about renaming the Lower Saxon message file code to nds-de, to match the existing nds-nl for a potential future unified Dutch wiki; the response was negative.

Gerard Meijssen (GerardM)
21 February 2009 19:35

<this user has not agreed to public archival.>
Milos Rancic (Millosh)
22 February 2009 13:16

<s>If it wouldn't break anything, I prefer nds-de/nds-nl distinction.</s>
Ah, I see now... It would break consistency between Wikimedia codes and localization codes. So, it shouldn't be done.

Wikiversity Galician

The request for a Galician Wikiversity was marked eligible.

Gerard Meijssen (GerardM)
24 February 2009 22:29

<this user has not agreed to public archival.>

Board email address changed

The email address used to contact the Board of Trustees was updated.

Gerard Meijssen (GerardM)
TO Sue Gardner (Executive Director)
25 February 2009 09:01

<this user has not agreed to public archival.>
Gerard Meijssen (GerardM)
TO Sue Gardner (Executive Director)
25 February 2009 14:57

<this user has not agreed to public archival.>

SPQRobin joins the subcommittee

Robin P. (SPQRobin) was inducted into the committee.

Gerard Meijssen (GerardM)
CC Robin P.
22 February 2009 10:52

<this user has not agreed to public archival.>
Milos Rancic (Millosh)
CC Robin P.
22 February 2009 13:12

Of course, SPQRobin is around, and as an admin of Incubator his experience may be very useful.
Jesse Plamondon-Willard (Pathoschild)
CC Robin P.
22 February 2009 18:06

Gerard Meijssen wrote:
<this text is quoted from a user who has not agreed to public archival.>

Agreed. I can find nothing worrying in any of his discussions and edits, and find several reasons to support.
Antony D. Green (Antony D. Green)
CC Robin P.
23 February 2009 03:54

You mention "tender years" but you don't say how old he is. Is he at least 18?
Gerard Meijssen (GerardM)
23 February 2009 09:15

<this user has not agreed to public archival.>
Antony D. Green (Antony D. Green)
23 February 2009 15:05

Well, if the Foundation doesn't have a problem with a <age censored>-year-old langcom member, neither do I. (I know there are some Foundation positions that require you to be at least 18.)
Milos Rancic (Millosh)
24 February 2009 05:51

Antony Green wrote:

> Well, if the Foundation doesn't have a problem with a <age censored>-year-old langcom
> member, neither do I. (I know there are some Foundation positions that
> require you to be at least 18.)

That is needed only in cases that involve dealing with private data. Up to now, I think that only CheckUsers, oversighters, stewards and OTRS volunteers belong to that category.
Gerard Meijssen (GerardM)
TO Board of Trustees
CC Robin P.
25 February 2009 08:52

<this user has not agreed to public archival.>
Shanel Kalicharan (Shanel)
25 February 2009 14:58

I don't see any problem with him being on Langcom. Timichal was a member of this committee, and he was even younger, if I remember correctly. Welcome SPQRobin :).
Jesse Plamondon-Willard (Pathoschild)
09 March 2009 05:46

Hello Robin,
Welcome to the language committee. You can catch up on past discussion by looking at the private archives, the public archive index, or the current policy (see links below); or just jump right in, and we'll poke you as needed.
Discussion with the committee is copied to a public archive for transparency, but this is opt-in only. Do you agree to the public archival of your emails? You can mark any discussion or comment as private and the text won't be archived. If you don't agree, your messages will be replaced with the note "<this user has not agreed to public archival>".
- Private archives: https://lists.wikimedia.org/mailman/listinfo/langcom-l
- Public archives: http://meta.wikimedia.org/wiki/Language_committee/Archives
- Current policy: http://meta.wikimedia.org/wiki/Meta:Language_proposal_policy
Robin P. (SPQRobin)
09 March 2009 14:29

Hello,
Thank you, and I hope I will fulfil my job as a langcom member well :-) Yes, you may archive my emails publicly.
Milos Rancic (Millosh)
09 March 2009 13:46

Welcome, SPQRobin :)