Proposal on language integration

From Meta, a Wikimedia project coordination wiki

This is a proposal to further integrate multiple languages in MediaWiki. It draws largely on thoughts on language integration, but tries to distill those thoughts into a coherent proposal on how to change the software to get there. Discuss on the Talk page.

Current shortcomings

  • Dictionary entries cannot be easily moved to Wiktionary because:
    • Wiktionary is only available in English
    • The edit history is not preserved
  • Duplication of images across all Wikipedias
  • Every Wikipedia requires users to register separately; there is no single sign-on
  • The interlanguage links in Wikipedia constitute a rudimentary multilingual dictionary. The links point to (somewhat) equivalent definitions in different languages. The problems with this lie in the redirects and in the scope of the articles in the different languages. Redirects cannot have interlanguage links, which makes it impossible to make a good match between articles. Because the scope of articles can differ vastly between languages, an article in one language could point to three or more articles in a different language.
  • Meta information is scattered across different places: it can be found in the Meta Wikipedia and in the Wikipedia namespace of every Wikipedia.
  • The interface to Wikipedia is single-language only. This is particularly troublesome for the Meta Wikipedia, which is in English only (content and interface), making it unsuited to users not fluent in English.

Background on the current situation

Every Wikipedia has multiple namespaces:

  • Special, for statistics, recent changes, page history, page moving, etc.;
  • Main, the core content of the Wikipedia or Wiktionary, depending on the site;
  • Main_talk, for discussions of the content of the corresponding article in Main;
  • Wikipedia, information pages about this Wikipedia (in Wiktionary, this namespace is called Wiktionary);
  • Wikipedia_talk, for discussions about the corresponding information page in Wikipedia;
  • Image, information about uploaded images;
  • Image_talk, for discussions about the corresponding image page in Image;
  • User, information about users;
  • User_talk, for discussions with the user.

Only one of these namespaces, Main, has a default start page.

What could we improve

Make user language interface configurable

The user must be able to choose the language of the interface. For registered users this can be a preference setting. The software can also determine a language from the preferences the browser sends with its request (the Accept-Language header). For anonymous visitors, the interface language should probably match the language of the page they are viewing.
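The selection rules above can be sketched as a small function. This is a minimal sketch, not part of the proposal: the names `SUPPORTED`, `parse_accept_language` and `choose_interface_language` are hypothetical, and the order of precedence (stored preference, then page language, then browser header, then a default) is an assumption, since the proposal does not fix one.

```python
from typing import List, Optional

# Hypothetical set of interface languages the software ships with.
SUPPORTED = ("en", "de", "fr")


def parse_accept_language(header: str) -> List[str]:
    """Return the language tags of an Accept-Language header, best first."""
    weighted = []
    for position, part in enumerate(header.split(",")):
        piece = part.strip()
        if not piece:
            continue
        tag, _, params = piece.partition(";")
        quality = 1.0  # a tag without an explicit q-value counts as q=1
        params = params.strip()
        if params.startswith("q="):
            try:
                quality = float(params[2:])
            except ValueError:
                quality = 0.0
        if quality > 0:  # q=0 means "not acceptable"
            weighted.append((-quality, position, tag.strip().lower()))
    # Sort by descending quality, keeping header order for ties.
    return [tag for _, _, tag in sorted(weighted)]


def choose_interface_language(
    user_pref: Optional[str],
    accept_language: Optional[str],
    page_language: Optional[str],
    default: str = "en",
) -> str:
    """Pick an interface language (assumed order: preference, page, browser)."""
    # 1. A registered user's stored preference wins.
    if user_pref in SUPPORTED:
        return user_pref
    # 2. Otherwise match the language of the page being viewed.
    if page_language in SUPPORTED:
        return page_language
    # 3. Otherwise fall back to what the browser asked for.
    for tag in parse_accept_language(accept_language or ""):
        primary = tag.split("-")[0]
        if primary in SUPPORTED:
            return primary
    return default
```

Passing `user_pref=None` covers both anonymous visitors and registered users who have not set a preference; a real implementation might want to treat those two cases differently.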

Merge language and Wiktionary databases into one

We can distinguish three separate dimensions:

  • language: undecided, en, de, fr, ...
  • namespace: encyclopedia, dictionary, user, special, quotes, books, meta, media, sept11.
  • relevance: article and talk

Together they form a cube in which each intersection can be identified as a wiki that is already in use or has been suggested in thoughts on language integration.
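One way to picture a cell of this cube is as a small record with one field per dimension. The sketch below is illustrative only: the class name `PageAddress`, the function `parse_address` and the `language:namespace:relevance:title` string form are assumptions, and the language set is limited to the examples listed above. It also ignores the fact that special pages have no talk counterpart.

```python
from dataclasses import dataclass

# Values taken from the three dimensions listed above; the language set
# is only the sample given there, not a complete list.
LANGUAGES = {"undecided", "en", "de", "fr"}
NAMESPACES = {
    "encyclopedia", "dictionary", "user", "special",
    "quotes", "books", "meta", "media", "sept11",
}
RELEVANCES = {"article", "talk"}


@dataclass(frozen=True)
class PageAddress:
    """One page located in the language/namespace/relevance cube."""
    language: str
    namespace: str
    relevance: str
    title: str

    def __post_init__(self):
        # Reject coordinates that fall outside the cube.
        if self.language not in LANGUAGES:
            raise ValueError(f"unknown language: {self.language!r}")
        if self.namespace not in NAMESPACES:
            raise ValueError(f"unknown namespace: {self.namespace!r}")
        if self.relevance not in RELEVANCES:
            raise ValueError(f"unknown relevance: {self.relevance!r}")

    def __str__(self):
        return f"{self.language}:{self.namespace}:{self.relevance}:{self.title}"


def parse_address(text: str) -> PageAddress:
    """Parse 'language:namespace:relevance:title'; the title may contain colons."""
    language, namespace, relevance, title = text.split(":", 3)
    return PageAddress(language, namespace, relevance, title)
```

Because every page carries all three coordinates, moving a page between languages or projects becomes a change of one field rather than a copy between databases.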

By using these three dimensions, we can merge the different databases and rearrange the current content:

  • undecided:meta:article+talk: the current Meta Wikipedia
  • en:encyclopedia:article+talk: the article+talk namespace of the current English Wikipedia
  • en:dictionary:article+talk: the Wiktionary article+talk namespace
  • en:meta:article+talk: the Wikipedia namespace of the current English Wikipedia and Wiktionary projects
  • undecided:media:article+talk: the Image namespace for language neutral images
  • en:media:article+talk: the Image namespace for the current English-language images
  • en:user:article+talk: user pages for English language wikipedians.
  • undecided:special: special pages for the entire project (statistics, moving pages across namespaces and languages, etc.)
  • en:special: special pages with English language preference
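The rearrangement listed above amounts to a lookup from a page's current location to its new coordinates. A minimal sketch follows, assuming hypothetical site labels (`"meta"`, `"en.wikipedia"`, `"wiktionary"`) and a hypothetical function name `relocate`. Special pages are left out because they have no talk counterpart, and language-neutral images are left out because deciding which images are language neutral needs human judgment.

```python
# (current site, current namespace) -> (language, new namespace).
# The site labels are hypothetical; the targets follow the list above.
LEGACY_TO_CUBE = {
    ("meta", "Main"): ("undecided", "meta"),
    ("en.wikipedia", "Main"): ("en", "encyclopedia"),
    ("wiktionary", "Main"): ("en", "dictionary"),
    ("en.wikipedia", "Wikipedia"): ("en", "meta"),
    ("wiktionary", "Wiktionary"): ("en", "meta"),
    ("en.wikipedia", "Image"): ("en", "media"),
    ("en.wikipedia", "User"): ("en", "user"),
}


def relocate(site, namespace):
    """Map a current (site, namespace) pair to (language, namespace, relevance)."""
    is_talk = namespace.endswith("_talk")
    # Main_talk, Wikipedia_talk, ... share the mapping of their base namespace.
    base = namespace[: -len("_talk")] if is_talk else namespace
    language, new_namespace = LEGACY_TO_CUBE[(site, base)]
    return language, new_namespace, "talk" if is_talk else "article"
```

Note that the Wikipedia and Wiktionary project namespaces both land on `en:meta`, so their pages can collide by title.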

This enables the following new categories:

  • undecided:encyclopedia: the new interlanguage portal to the encyclopedia
  • undecided:dictionary: the new interlanguage portal to the dictionary
  • undecided:user:article+talk: a single sign-on for Wikipedians

Move interlanguage links to Wiktionary

As remarked above, the interlanguage links make up a rudimentary foreign language dictionary. That information should go into a dictionary and Wiktionary is our dictionary project. Instead of the multitude of language links at the top (and bottom) of the page, the software should show the availability of articles in the other projects.

Let's illustrate this with an example. Above the Wikipedia article on w:Alexander the Great there could be links to the dictionary (how to write his name in different languages), quotes (interesting things he said), images (all available images depicting him), textbooks (books about his life and accomplishments), and users (maybe there is a user with this handle).
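As a rough sketch of how such a link bar could be computed: given an article title, look for a page with the same title in each of the other projects and link only to those that exist. The function name `related_links`, the `existing_pages` set and the link format are hypothetical; the namespace names reuse the ones proposed earlier on this page.

```python
# Projects to check for a page with the same title as the article.
RELATED_NAMESPACES = ("dictionary", "quotes", "media", "books", "user")


def related_links(language, title, existing_pages):
    """Return links to same-titled pages that exist in the other projects.

    existing_pages is assumed to be a set of (language, namespace, title)
    tuples standing in for a database lookup.
    """
    return [
        f"{language}:{namespace}:{title}"
        for namespace in RELATED_NAMESPACES
        if (language, namespace, title) in existing_pages
    ]
```

For the example above, a call with the title "Alexander the Great" would return only the projects that actually have a page under that title, in the fixed order of `RELATED_NAMESPACES`.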


Changes to the database

All relevant tables (cur, old, archive, recentchanges and user) need a language field. In the first four tables the field specifies the language of the article, while in the user table the field user_lang specifies the language of the user's interface.
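The schema change could look roughly like the sketch below, which uses Python's built-in sqlite3 module only so that it is self-contained; the real database engine may differ. Only `user_lang` is named in this proposal. The other column names (`cur_lang`, `old_lang`, `ar_lang`, `rc_lang`) and the function name are assumptions.

```python
import sqlite3

# Table -> name of the new language column. Only user_lang comes from the
# proposal; the other column names are assumed for illustration.
LANGUAGE_COLUMNS = {
    "cur": "cur_lang",
    "old": "old_lang",
    "archive": "ar_lang",
    "recentchanges": "rc_lang",
    "user": "user_lang",
}


def add_language_columns(conn: sqlite3.Connection, default_language: str) -> None:
    """Add a language column to every relevant table.

    Existing rows receive default_language, i.e. the language of the wiki
    the database belonged to before the merge.
    """
    # DDL statements cannot take bound parameters, so validate the value
    # before placing it in the statement text.
    if not (default_language.isascii() and default_language.isalpha()):
        raise ValueError(f"unexpected language code: {default_language!r}")
    for table, column in LANGUAGE_COLUMNS.items():
        conn.execute(
            f'ALTER TABLE "{table}" ADD COLUMN {column} '
            f"TEXT NOT NULL DEFAULT '{default_language}'"
        )
    conn.commit()
```

Running this once per existing language database, with that database's language as the default, tags every existing row before the databases are combined.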

To ensure consistency when all languages are put into one common database, the IDs should be prefixed with a language-specific number to avoid clashes. Apart from that, all languages can be copied into the database as they are, adding the language field along the way.
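One possible reading of "prefixed with a number" is to reserve the leading digits of each ID for the language. The sketch below assumes a hypothetical prefix table and a fixed width for the original IDs; neither is specified in this proposal.

```python
# Hypothetical numeric prefix per language.
LANGUAGE_PREFIX = {"en": 1, "de": 2, "fr": 3}
PREFIX_LANGUAGE = {prefix: lang for lang, prefix in LANGUAGE_PREFIX.items()}

# Assumes no single-language database has IDs of 10**9 or more.
ID_WIDTH = 9


def merged_id(language: str, local_id: int) -> int:
    """Combine a language prefix and a per-language ID into one unique ID."""
    if not 0 <= local_id < 10 ** ID_WIDTH:
        raise ValueError(f"local ID out of range: {local_id}")
    return LANGUAGE_PREFIX[language] * 10 ** ID_WIDTH + local_id


def split_id(merged: int):
    """Recover (language, local_id) from a merged ID."""
    prefix, local_id = divmod(merged, 10 ** ID_WIDTH)
    return PREFIX_LANGUAGE[prefix], local_id
```

With this scheme, article 42 of the English database and article 42 of the German database become 1000000042 and 2000000042 and no longer clash.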

The only problem is the merging of the Wikipedia:Wikipedia and Wiktionary:Wiktionary namespaces, as they will be folded into a common namespace for the English language.


Current situation

Ap, Kowey, BV and others have expressed interest in the past. This page was last significantly edited in December 2003.