Lexical alternates

From Meta, a Wikimedia project coordination wiki

This page proposes a mechanism for identifying spelling differences between two languages or dialects. It could be used to automatically change the spelling of a Wikipedia article depending on a user preference or geographical location. This might not be very useful for English (US/UK), but it could be useful for languages that are similar, yet different enough to use separate Wikipedias.

Motivation

Some languages and dialects have substantial spelling or vocabulary differences. English is a very mild, though widespread, example: you can say that the colour of the sky is grey or that the color of the sky is gray. So far, users of British and American English have had a relatively easy time co-existing on the same Wikipedia, largely because the differences are very minor. For other languages, however, the spelling and vocabulary differences are large and bothersome enough that users would rather split their linguistic community and manpower across two Wikipedias. When the differences are limited to spelling and vocabulary, a technological solution may be possible.

Syntax

The basic idea is to tag each page with a list of the words it uses that differ between variants. Depending on the user's preference or geographical location, or on the Wikipedia URL used (ms vs id), the wiki software can choose which alternative to display.

__LEXICAL_ALTERNATES__
us:color|uk:colour
us:gray|uk:grey
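
As a rough illustration of how wiki software might act on such a tag, here is a minimal Python sketch. It is not part of the proposal: the function names parse_alternates and render are hypothetical, and it assumes that the __LEXICAL_ALTERNATES__ block sits at the end of the page, that each line lists variant:spelling pairs separated by "|", and that whole-word, case-insensitive replacement is acceptable.

```python
import re

MARKER = "__LEXICAL_ALTERNATES__"


def parse_alternates(page_text):
    """Split a page into its body and a list of {variant: spelling} dicts.

    Assumption: everything after the marker is the alternates list.
    """
    body, _, tag_block = page_text.partition(MARKER)
    groups = []
    for line in tag_block.splitlines():
        group = {}
        for entry in line.strip().split("|"):
            variant, _, word = entry.partition(":")
            variant, word = variant.strip(), word.strip()
            if variant and word:  # skip blank or malformed entries
                group[variant] = word
        if group:
            groups.append(group)
    return body, groups


def render(page_text, target):
    """Return the page body with every listed word in the target variant."""
    body, groups = parse_alternates(page_text)

    # Map each non-target spelling (lowercased) to the target spelling.
    mapping = {}
    for group in groups:
        if target not in group:
            continue
        for variant, word in group.items():
            if variant != target:
                mapping[word.lower()] = group[target]
    if not mapping:
        return body

    # Longest alternatives first, so longer words win over their prefixes.
    words = sorted(mapping, key=len, reverse=True)
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(w) for w in words) + r")\b",
        re.IGNORECASE,
    )

    def swap(match):
        found = match.group(0)
        replacement = mapping[found.lower()]
        # Preserve a leading capital, e.g. at the start of a sentence.
        if found[0].isupper():
            replacement = replacement[0].upper() + replacement[1:]
        return replacement

    return pattern.sub(swap, body)


page = (
    "The colour of the sky is grey. Grey skies.\n"
    "__LEXICAL_ALTERNATES__\n"
    "us:color|uk:colour\n"
    "us:gray|uk:grey\n"
)
print(render(page, "us"))
```

Because only words listed on the page itself are replaced, the cost of rendering grows with the size of that page's list rather than with a global dictionary. The sketch does not handle the exclusion syntax listed under Todo.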

Note: while it might be useful to have a central page that holds a list of alternatives, this central page must remain very small and be restricted to words that appear very commonly and vary 100% consistently. This keeps the behaviour as predictable as possible and prevents the substitution from becoming computationally expensive.

Todo

  • Exclusion syntax
    • How would the exclusion syntax interact with other syntax?
  • User preferences?
  • MediaWiki messages?
  • Example from ms and id, perhaps