Wikipedia and linguistic varieties

From Meta, a Wikimedia project coordination wiki
Fabrics in different colours/colors, symbolizing the linguistic varieties within one language. Seen in a shop in Kathmandu.

Wikipedia exists in many different languages. Usually, you would expect to have one Wikipedia language version (one wiki website) per language. But sometimes, the situation is complicated. This page collects best practices and ideas: how to deal with linguistic varieties within one Wikipedia language version?

You are kindly invited to add general ideas or examples from a specific wiki or a specific case.

For example, English Wikipedia tolerates several varieties of the English language: British English, American English, Australian English and so on. English Wikipedia has developed some rules that explain in which situation you are allowed to use one variety or another one. It is quite likely that you, the reader of this page, speak a language with similar challenges.

Background

What is a language? What is a dialect? When do you need two different wikis? Can you have one wiki for two (or more) 'languages' that are very similar to each other?

There is Dutch Wikipedia, and there is German Wikipedia. Nobody would think of arranging one Wikipedia for both languages: Dutch and German are separate languages, although they are very closely related to each other. Both stem from the same dialect continuum.

  • But Dutch and German are standard languages based on different dialects.
  • The speaker communities of both languages are different: Dutch is spoken mainly in the Netherlands, Belgium and Suriname, German in Germany, Austria and Switzerland.
  • And the two languages are not mutually intelligible: a Dutch person does not understand German without having learned it, and vice versa (even if she does recognize many words).
  • Both languages develop independently of each other: Dutch evolves without regard to what is happening in the German-speaking region, even if Dutch uses some German words or words borrowed from German. (And, to a smaller degree, vice versa.)

But how about the differences within the Dutch linguistic community? In Dutch, a lorry can be called vrachtauto, vrachtwagen or camion. The Wikipedia article has the most commonly understood word nl:vrachtauto as title, and explains: 'A vrachtauto or vrachtwagen (in Belgian Dutch also camion) is a motor vehicle made for transporting goods...'

This is one solution, but is it the only one? In the different Wikipedia versions, there are many ways to deal with linguistic varieties:

  • Technical dimension: sometimes, you can solve the problem by technology. Serbian can be written in Latin or Cyrillic script. Therefore, the reader can choose which alphabet she prefers, with a simple switch. This is relatively easy to implement, as usually one Cyrillic letter corresponds to one Latin letter. In other cases, the relationship between two alphabets is more complicated, e.g. between the Latin and Arabic alphabets used for Kurdish.
  • Cultural/linguistic dimension: Writers can try to choose terms that are neutral with regard to varieties. For example, the English Wikipedia article en:Fixed wing aircraft is meant to satisfy both readers of British English (aeroplane) and American English (airplane).
  • Social dimension: The collaborators of the Wikipedia language version can agree to tolerate linguistic differences. They agree that an article should not be changed only to replace one acceptable term with another, equally acceptable one.
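
To illustrate the technical dimension, here is a minimal sketch in Python of a letter-by-letter conversion from Serbian Cyrillic to Latin script. It is not the converter that Serbian Wikipedia actually uses; the names CYR_TO_LAT and cyrillic_to_latin are made up for this example. It also shows why the mapping is only 'usually' one letter to one letter: three Cyrillic letters (љ, њ, џ) become two Latin letters each.

```python
# Hypothetical sketch, not the converter used by any wiki.
# Maps the 30 lowercase letters of the Serbian Cyrillic alphabet to Latin.
CYR_TO_LAT = {
    "а": "a", "б": "b", "в": "v", "г": "g", "д": "d", "ђ": "đ",
    "е": "e", "ж": "ž", "з": "z", "и": "i", "ј": "j", "к": "k",
    "л": "l", "љ": "lj", "м": "m", "н": "n", "њ": "nj", "о": "o",
    "п": "p", "р": "r", "с": "s", "т": "t", "ћ": "ć", "у": "u",
    "ф": "f", "х": "h", "ц": "c", "ч": "č", "џ": "dž", "ш": "š",
}


def cyrillic_to_latin(text):
    """Convert Serbian Cyrillic text to Latin script, letter by letter."""
    result = []
    for ch in text:
        lower = ch.lower()
        if lower in CYR_TO_LAT:
            latin = CYR_TO_LAT[lower]
            # Keep capitalisation: an uppercase Љ becomes "Lj", not "lj".
            if ch != lower:
                latin = latin.capitalize()
            result.append(latin)
        else:
            # Anything that is not a Serbian Cyrillic letter stays as it is.
            result.append(ch)
    return "".join(result)


print(cyrillic_to_latin("Београд"))     # Beograd
print(cyrillic_to_latin("Википедија"))  # Vikipedija
```

The opposite direction (Latin to Cyrillic) is harder, because the converter has to decide whether 'nj' stands for the single letter њ or for н followed by ј. A real converter also has to leave foreign names, URLs and similar text untouched.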

By the way: there are cases in which one can argue whether you are dealing with one language or several. Are Urdu and Hindi the same language? Are Croatian, Bosnian, Serbian and Montenegrin four different languages? How about the shared origin and the commonalities between Bahasa Indonesia and Bahasa Malaysia?

Some Wikipedia language versions have found creative solutions: some Scandinavian Wikipedias link to featured articles in other Scandinavian Wikipedias.

Practices and ideas

Link the variety to the topic

In German Wikipedia (de.WP), the variety of en:Germany is de facto the standard. But articles with a thematic connection to a different German-speaking country can and should be written in that country's variety. For example, the article de:Vaduz (about the capital of Liechtenstein) is written in the spelling of Liechtenstein. Ziko (talk) 11:16, 14 June 2022 (UTC)

The de.WP rule is part of the 'How to write good articles' page, in the section about standardized language. There are also pages with specific advice for Austria and Switzerland. Both have the same paragraph about how to deal with the variety: 'Proofreaders are asked for restraint and tact, especially in ambiguous cases. In case of doubt, making knowledge available to all German-speaking people is more important than a decision about linguistic policy; the author(s) of the text should be respected, and the style within an article should be consistent. / Before a change is made, you should ask yourself whether the variant used by the author is actually generally incomprehensible or misleading, or whether it is more a matter of taste. If not: Is there an expression in Common German that might convey the message just as well, rather than replacing one high-level idiosyncrasy with another?' Ziko (talk) 11:16, 14 June 2022 (UTC)

Explain the terminology at the beginning of an article

Offer different varieties

Are the readers of English Wikipedia more familiar with the metric system or with 'imperial units'? English Wikipedia often offers the information both ways. For example, the article en:Elephant informs: 'African bush elephants are the largest species, with males being 304–336 cm (10 ft 0 in – 11 ft 0 in) tall at the shoulder...' Ziko (talk) 11:16, 14 June 2022 (UTC)
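
The arithmetic behind such a double statement can be sketched in a few lines of Python. This is only an illustration, not how English Wikipedia produces its figures; the function names cm_to_feet_inches and format_both are invented here, and rounding to the nearest whole inch is an assumption. With that rounding, the two values from the elephant example come out as quoted.

```python
CM_PER_INCH = 2.54      # exact by definition of the inch
INCHES_PER_FOOT = 12


def cm_to_feet_inches(cm):
    """Convert centimetres to (feet, inches), rounded to the nearest inch."""
    total_inches = round(cm / CM_PER_INCH)
    feet, inches = divmod(total_inches, INCHES_PER_FOOT)
    return feet, inches


def format_both(cm):
    """Return the metric value followed by the imperial one in brackets."""
    feet, inches = cm_to_feet_inches(cm)
    return f"{cm} cm ({feet} ft {inches} in)"


print(format_both(304))  # 304 cm (10 ft 0 in)
print(format_both(336))  # 336 cm (11 ft 0 in)
```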

Identify the variety of the article

Indicate the variety the article is written in. Collaborators are expected to respect the variety. For example, the Bavarian article (bar.WP) bar:Niamberg is written in Upper Bavarian, as a template explains. Ziko (talk) 11:16, 14 June 2022 (UTC)


Secession

In some cases, a minority has left a Wikipedia language version and started a new one. For example, Kurdish exists in two major varieties. Originally there was only one Kurdish Wikipedia. In 2009, those Kurdish speakers who prefer Arabic script left and created a new Wikipedia. Ziko (talk) 11:32, 14 June 2022 (UTC)