User:OrenBochman/I18n

From Meta, a Wikimedia project coordination wiki

I18n

A Lead paragraph motivation, outline and basis in policy. Remember:

  • what is internationalization
  • where is the code
  • [1]


2 concepts of language:

  • user's $wgLang
  • content - $wgContLang

Both will be removed as part of the removal of globals.

Language::factory('en')

languages/ folder



General use (for developers)

Language objects

There are two ways to get a language object. You can use the globals $wgLang and $wgContLang for the user interface language and the content language respectively. For an arbitrary language you can construct an object with Language::factory( 'en' ), replacing en with the code of the language you need. The list of codes is in languages/Names.php.

Language objects are needed for language-specific operations, most often number, time and date formatting, but also for constructing lists and other things. There are multiple layers of caching and merging with fallback languages, but the details are irrelevant in normal use.
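The fallback behaviour mentioned above can be pictured with a small JavaScript sketch. It is not MediaWiki's implementation: the fallbacks table, the messages store and the lookupMessage helper are hypothetical names and data invented for illustration, and the real code adds caching on top.

```javascript
// Hypothetical fallback table: each language lists the languages to try next.
const fallbacks = {
    'de-at': [ 'de', 'en' ],
    de: [ 'en' ],
    en: []
};

// Hypothetical message store, keyed by language code and then by message key.
const messages = {
    en: { greeting: 'Hello', farewell: 'Goodbye' },
    de: { greeting: 'Hallo' },
    'de-at': {}
};

// Return the first translation found in the language or in its fallbacks.
function lookupMessage( lang, key ) {
    const chain = [ lang ].concat( fallbacks[ lang ] || [] );
    for ( const code of chain ) {
        const table = messages[ code ];
        if ( table && Object.prototype.hasOwnProperty.call( table, key ) ) {
            return table[ key ];
        }
    }
    return null; // no language in the chain defines the key
}

console.log( lookupMessage( 'de-at', 'greeting' ) ); // 'Hallo' (found in de)
console.log( lookupMessage( 'de-at', 'farewell' ) ); // 'Goodbye' (found in en)
```

A translator who has only translated part of a language still gets a usable interface, because untranslated keys are filled in from the next language in the chain.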

Using messages

MediaWiki uses a central repository of messages which are referenced by keys in the code. This is different from, for example, Gettext, which just extracts the translatable strings from the source files. The key-based system makes some things easier, like refining the original texts and tracking changes to messages. The drawback is of course that the list of used messages and the list of source texts for those keys can get out of sync. In practice this isn't a big problem: the usual symptom is that some messages which are no longer used still stay up for translation.

To make message keys more manageable and easy to find, always write them completely and don't rely too much on creating them dynamically. You may concatenate key parts if you feel that it gives your code better structure, but put a comment nearby with a list of the possible resulting keys. For example:

// Messages that can be used here:
// * myextension-connection-success
// * myextension-connection-warning
// * myextension-connection-error
$text = wfMessage( 'myextension-connection-' . $status )->parse();
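An alternative to concatenation plus a comment is to spell every key out in a lookup table, so that a plain text search for a key always finds the code that uses it. The following JavaScript sketch assumes this approach; CONNECTION_MESSAGES and connectionMessageKey are hypothetical names, not part of MediaWiki.

```javascript
// Every key is written out in full so that searching for it finds this code.
const CONNECTION_MESSAGES = {
    success: 'myextension-connection-success',
    warning: 'myextension-connection-warning',
    error: 'myextension-connection-error'
};

// Map a status string to its message key; reject anything unexpected.
function connectionMessageKey( status ) {
    if ( !Object.prototype.hasOwnProperty.call( CONNECTION_MESSAGES, status ) ) {
        throw new Error( 'Unknown connection status: ' + status );
    }
    return CONNECTION_MESSAGES[ status ];
}

console.log( connectionMessageKey( 'warning' ) ); // 'myextension-connection-warning'
```

The table also rejects unexpected status values early instead of silently producing a key that has no message.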

Message processing introduction

The message system in MediaWiki is quite complex, a bit too complex. One of the reasons for this is that MediaWiki is a web application. Messages can go through all kinds of processing. The three major ones covering almost all cases are:

  1. as-is, no processing at all
  2. light wiki-parsing, where parser function references starting with {{ are replaced with their results
  3. full wiki-parsing

Case 1 is for internal processing, not really for user-visible messages. Light wiki-parsing should always be combined with HTML escaping.

Recommended ways

Longer messages that are not used hundreds of times on a page:

  • OutputPage::addWikiMsg
  • OutputPage::wrapWikiMsg
  • wfMessage()

OutputPage methods parse messages and add them directly to the output buffer. wfMessage can be used when a message should not be added to the output buffer. ->parse() removes the enclosing HTML tags from the parsed result, usually <p>..</p>, but it can generate invalid markup when the parsed result has no single root tag, for example <p>..</p><p>..</p>. Usage examples:

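// Number formatting belongs to the Language object, not to the User object.
$out->addWikiMsg( 'foobar', $wgLang->formatNum( count( $items ) ) );
// Double quotes so that \n is a real newline; $1 is not a PHP variable and stays literal.
$out->wrapWikiMsg( "<div class='baz'>\n$1\n</div>", array( 'foobar', $user->getName() ) );
$text = wfMessage( 'foobar', $language->date( $ts ) )->parse();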

Other messages with light wiki-parsing can use wfMsg, or wfMessage with ->text(). wfMessage should always be used if the message has parts that depend on linguistic information, like {{PLURAL:$1}}. Do not use wfMsg or wfMsgHtml for those kinds of messages! They seem to work but are broken.

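$button = Xml::submitButton( wfMsg( 'foobar' ) ); # no linguistic information
# Xml::label() also needs the id of the labelled element; 'foobar-input' is a placeholder
$label = Xml::label( wfMessage( 'foobar', $wgLang->formatNum( $count ) )->text(), 'foobar-input' ); # uses plural on $count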

Some messages mix escaping and parsing, most commonly when raw links that should not be escaped are passed as parameters. The preferred way is to use wfMessage with ->rawParams() for the affected parameters. Be especially wary of wfMsgHtml: it escapes only the message, not the parameters. This has caused at least one XSS vulnerability in MediaWiki.
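The idea behind raw parameters can be sketched in a few lines of JavaScript: everything is escaped by default, and only values explicitly marked as trusted HTML are inserted untouched. This is an illustration of the principle, not the MediaWiki API; escapeHtml, raw and formatEscaped are hypothetical helpers.

```javascript
// Escape the characters that are special in HTML text and attribute values.
function escapeHtml( text ) {
    return String( text )
        .replace( /&/g, '&amp;' )
        .replace( /</g, '&lt;' )
        .replace( />/g, '&gt;' )
        .replace( /"/g, '&quot;' )
        .replace( /'/g, '&#039;' );
}

// Wrap a value to mark it as trusted HTML that must not be escaped.
function raw( html ) {
    return { rawHtml: html };
}

// Substitute $1, $2, ... into the template, escaping everything except raw values.
function formatEscaped( template, params ) {
    // Splitting on a capturing group keeps the placeholder numbers at odd indexes.
    const parts = template.split( /\$(\d+)/ );
    return parts.map( function ( part, i ) {
        if ( i % 2 === 0 ) {
            return escapeHtml( part ); // literal message text
        }
        const value = params[ Number( part ) - 1 ];
        if ( value === undefined ) {
            return '$' + part; // leave unknown placeholders untouched
        }
        return value && value.rawHtml !== undefined ?
            value.rawHtml : escapeHtml( value );
    } ).join( '' );
}

const html = formatEscaped( 'See $1, added by $2', [
    raw( '<a href="/wiki/Help">help</a>' ),
    '<script>alert(1)</script>'
] );
console.log( html );
// See <a href="/wiki/Help">help</a>, added by &lt;script&gt;alert(1)&lt;/script&gt;
```

The link survives as markup because it was marked raw by the developer, while the user-supplied second parameter is neutralised.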

Short list of functions to avoid:

  • wfMsgHtml (don't use unless you really want unescaped parameters)
  • wfMsgWikiHtml (breaks up linguistic functions, as does wfMsg)
  • OutputPage::parse and parseInline, addWikiText (if you know the message, use addWikiMsg or wrapWikiMsg)

Remember that almost all Xml:: and Html:: functions escape everything fed into them, so do not pass already escaped or parsed text to them, or it will be double-escaped.

Using messages in JavaScript

To use messages on the client side, ResourceLoader must first make them available there. To do this, list the messages to export in your ResourceLoader module definition.

Example:

$wgResourceModules['ext.foobar.core'] = array(
    'scripts' => array( 'resources/ext.foobar.js'),
    'styles' => 'resources/ext.extension.css',
    'localBasePath' => $dir,
    'remoteExtPath' => 'FooBar',
    'messages' => array(
        'message-key-foo',
        'message-key-bar',
    ),
);

The messages listed in the above example, message-key-foo and message-key-bar, will be available on the client side and can be accessed with mw.msg( 'message-key-foo' ). See the example below:

$( '<a>' ).prop( 'href', '#' ).text( mw.msg( 'message-key-foo') );

We can also pass dynamic parameters to the message (i.e. the values for $1, $2, etc.) as shown below.

$( '<a>' ).prop( 'href', '#' ).text( mw.msg( 'message-key-foo',  value1, value2 ) );

In the above examples, note that the message must be defined in an i18n.php file. If the message key is not found in any i18n.php file, the result of mw.msg will be the message key in angle brackets, like <message-key-foo>.
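The parameter substitution and the missing-key behaviour described above can be mimicked with a small stand-alone sketch. The store, its message text and the msg function below are hypothetical stand-ins for illustration; real code should call mw.msg.

```javascript
// Hypothetical client-side message store, as if exported by a module.
const store = new Map( [
    [ 'message-key-foo', 'Hello, $1! You have $2 new notifications.' ]
] );

// Look up a key and substitute $1, $2, ...; unknown keys come back in angle brackets.
function msg( key, ...params ) {
    if ( !store.has( key ) ) {
        return '<' + key + '>';
    }
    return store.get( key ).replace( /\$(\d+)/g, function ( match, n ) {
        const value = params[ Number( n ) - 1 ];
        // Keep the placeholder visible when no value was supplied for it.
        return value === undefined ? match : String( value );
    } );
}

console.log( msg( 'message-key-foo', 'Alice', 3 ) );
// Hello, Alice! You have 3 new notifications.
console.log( msg( 'message-key-missing' ) );
// <message-key-missing>
```

Seeing a key in angle brackets on a page is therefore a sign that the key was not exported to the client or is not defined at all.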

When using localization messages, always make sure they are properly escaped, both to prevent HTML injection and to avoid malformed markup caused by special characters.

  • If using jQuery, prefer .text( mw.msg( ... ) ) over .html( mw.msg( ... ) ). jQuery then sets the element's text content instead of raw HTML. This is the best option and also the fastest, because .text() goes almost straight into the browser and no escaping is needed at all.
  • If using jQuery's .append, escape manually: .append( '<li>' + mw.message( 'example' ).escaped() + '</li>' );
  • If manually building an html string, escape manually by creating a message object and calling .escaped() (instead of the mw.msg shortcut, which does mw.message(key).plain() ):
    '<foo>' + mw.message( 'example' ).escaped() + '</foo>';

PLURAL and GENDER support in JavaScript

From MediaWiki 1.19 onwards, messages for JavaScript can contain PLURAL and GENDER directives. This feature is optional; extensions that require it should declare an additional dependency on mediawiki.jqueryMsg in the ResourceLoader module definition.

If you have a message, say 'message-key-plural-foo' => 'There {{PLURAL:$1|is|are}} $1 {{PLURAL:$1|item|items}}', you can use it in JavaScript as shown below:

mw.msg( 'message-key-plural-foo', count );
// returns 'There is 1 item' if count = 1
// returns 'There are 6 items' if count = 6
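To make the expansion concrete, here is a minimal stand-alone sketch of how a PLURAL directive could be resolved. It assumes the English rule only (first form for exactly 1, last form otherwise); other languages have more forms and different rules, and expandPluralEn is a hypothetical helper, not the mediawiki.jqueryMsg implementation.

```javascript
// Expand {{PLURAL:$n|singular|plural}} using the English rule only:
// the first form when the count is exactly 1, otherwise the last form given.
function expandPluralEn( template, params ) {
    const withForms = template.replace(
        /\{\{PLURAL:\$(\d+)\|([^}]*)\}\}/gi,
        function ( match, n, formList ) {
            const forms = formList.split( '|' );
            const count = Number( params[ Number( n ) - 1 ] );
            return count === 1 ? forms[ 0 ] : forms[ forms.length - 1 ];
        }
    );
    // Substitute the remaining plain $1, $2, ... placeholders.
    return withForms.replace( /\$(\d+)/g, function ( match, n ) {
        const value = params[ Number( n ) - 1 ];
        return value === undefined ? match : String( value );
    } );
}

const template = 'There {{PLURAL:$1|is|are}} $1 {{PLURAL:$1|item|items}}';
console.log( expandPluralEn( template, [ 1 ] ) ); // There is 1 item
console.log( expandPluralEn( template, [ 6 ] ) ); // There are 6 items
```

The sketch shows why the count has to be passed as a parameter rather than concatenated into the text: the directive needs the number to choose a form.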

If you have a message, say 'message-key-gender-foo' => '{{GENDER:$1|he|she}} created an article', you can use it in JavaScript as shown below:

mw.msg( 'message-key-gender-foo', 'male' ); // returns 'he created an article'
mw.msg( 'message-key-gender-foo', 'female' ); // returns 'she created an article'

Instead of passing the gender directly, we can pass a user object, i.e. a mw.User object with a gender attribute, to mw.msg. For example, the current user object:

var user = mw.user; // current user
mw.msg( 'message-key-gender-foo', user ); // The message returned will be based on the gender of the current user.

If the gender passed to mw.msg is invalid or unknown, the gender-neutral form will be used, as defined for each language.

The keywords GENDER and PLURAL are case insensitive.
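The GENDER handling described above, including the user-object shortcut and the fallback for unknown genders, can be sketched as follows. The fallback rule used here (third form if present, otherwise the first) is an assumption made for this sketch, and expandGender is a hypothetical helper rather than the library's code.

```javascript
// Expand {{GENDER:$n|masculine|feminine|neutral}}.
// The parameter may be a gender string or an object with a `gender` property.
function expandGender( template, params ) {
    return template.replace(
        /\{\{GENDER:\$(\d+)\|([^}]*)\}\}/gi,
        function ( match, n, formList ) {
            const forms = formList.split( '|' );
            const param = params[ Number( n ) - 1 ];
            const gender = param && typeof param === 'object' ? param.gender : param;
            if ( gender === 'male' ) {
                return forms[ 0 ];
            }
            if ( gender === 'female' && forms.length > 1 ) {
                return forms[ 1 ];
            }
            // Assumed fallback: the third form if given, otherwise the first.
            return forms.length > 2 ? forms[ 2 ] : forms[ 0 ];
        }
    );
}

const genderTemplate = '{{GENDER:$1|he|she|they}} created an article';
console.log( expandGender( genderTemplate, [ 'male' ] ) ); // he created an article
console.log( expandGender( genderTemplate, [ { gender: 'female' } ] ) ); // she created an article
console.log( expandGender( genderTemplate, [ 'unknown' ] ) ); // they created an article
```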

GRAMMAR in JavaScript

From MediaWiki 1.20 onwards, messages for JavaScript can contain the GRAMMAR directive. This feature is optional; extensions that require it should declare an additional dependency on mediawiki.language.data in the ResourceLoader module definition.

The static grammar form rules can be defined in the $wgGrammarForms global. The dynamic language-specific grammar rules in PHP have been ported to JavaScript. Once the dependency mediawiki.language.data is added, the mw.msg method can be used as usual to parse messages containing {{GRAMMAR:N|word}}, where N is the name of the grammatical form needed and word is the word being operated on. More information about Grammar is available here

Adding new messages

  1. Decide a name (key) for the message. Try to follow global or local conventions for naming. For extensions, use a standard prefix, preferably the extension name in lower case, followed by a hyphen ("-"). Try to stick to lower case letters, numbers and dashes in message names; most other characters range from less practical to not working at all. See also Manual:Coding conventions#Messages.
  2. Make sure that you are using suitable handling for the message (parsing, {{-replacement, escaping for HTML, etc.)
  3. Add it to languages/messages/MessagesEn.php (core) or your extension's i18n file under 'en'.
  4. Take a pause and consider the wording of the message. Is it as clear as possible? Can it be misunderstood? Ask for comments from other developers or from localizers if possible. Follow the #internationalization hints.
  5. Add documentation to MessagesQqq.php or your extension's i18n file under 'qqq'. Read more about #message documentation.
  6. If you added a message to core, add the message key also to maintenance/language/messages.inc (also add the section if you created a new one). This file will define the order and formatting of messages in all message files.
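The naming convention from step 1 is easy to check mechanically. The sketch below is one possible check, under the assumption that keys consist of hyphen-separated runs of lower case letters and digits, which is slightly stricter than the convention itself; isConventionalKey is a hypothetical helper.

```javascript
// Check a message key against the naming convention from step 1:
// the extension name in lower case, a hyphen, then lower case letters,
// digits and hyphens. Assumption: no leading, trailing or doubled hyphens.
function isConventionalKey( key, extensionName ) {
    const prefix = extensionName.toLowerCase() + '-';
    if ( !key.startsWith( prefix ) || key.length === prefix.length ) {
        return false; // wrong prefix, or nothing after the prefix
    }
    return /^[a-z0-9]+(-[a-z0-9]+)*$/.test( key );
}

console.log( isConventionalKey( 'myextension-connection-error', 'MyExtension' ) ); // true
console.log( isConventionalKey( 'MyExtension_ConnectionError', 'MyExtension' ) ); // false
```

A check like this could run over the keys of the 'en' array before a new message is committed.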

Removing existing messages

  1. Remove it from MessagesEn.php. Don't bother with other languages - updates from translatewiki.net will handle those automatically.
  2. Remove it from maintenance/language/messages.inc

Step 2 is not needed for extensions, so you only have to remove your English language messages from ExtensionName.i18n.php.

Changing existing messages

  1. Consider updating the message documentation (see Adding new messages).
  2. Change the message key if old translations are not suitable for the new meaning. This also includes changes in message handling (parsing, escaping). If in doubt, ask in #mediawiki-i18n or in the Support page at translatewiki.net.
  3. If the extension is supported by translatewiki, please only change the English source message and/or key. If needed, the internationalisation and localisation team will take care of updating the translations, marking them as outdated, cleaning up the file or renaming keys where possible. This also applies when you're only changing things like HTML tags that you could change in other languages without speaking those languages. Most of these actions will take place in translatewiki.net and will reach Git or Subversion with about one day of delay.

Localizing namespaces and special page aliases

Namespaces and special page names (e.g. RecentChanges in Special:RecentChanges) are also translatable.

Namespaces

To allow custom namespaces introduced by your extension to be translated, create a MyExtension.namespaces.php file that looks like this:

<?php
/**
 * Translations of the namespaces introduced by MyExtension.
 *
 * @file
 */

$namespaceNames = array();

// For wikis where the MyExtension extension is not installed.
if( !defined( 'NS_MYEXTENSION' ) ) {
    define( 'NS_MYEXTENSION', 2510 );
}

if( !defined( 'NS_MYEXTENSION_TALK' ) ) {
    define( 'NS_MYEXTENSION_TALK', 2511 );
}

/** English */
$namespaceNames['en'] = array(
    NS_MYEXTENSION => 'MyNamespace',
    NS_MYEXTENSION_TALK => 'MyNamespace_talk',
);

/** Finnish (Suomi) */
$namespaceNames['fi'] = array(
    NS_MYEXTENSION => 'Nimiavaruuteni',
    NS_MYEXTENSION_TALK => 'Keskustelu_nimiavaruudestani',
);

Then load the namespace translation file in MyExtension.php via $wgExtensionMessagesFiles['MyExtensionNamespaces'] = dirname( __FILE__ ) . '/MyExtension.namespaces.php';

When a user installs MyExtension on their Finnish (fi) wiki, the custom namespace will be translated into Finnish magically, and the user doesn't need to do a thing!

Special page aliases

Create a new file for the special page aliases in this format:

<?php
/**
 * Aliases for the MyExtension extension.
 *
 * @file
 * @ingroup Extensions
 */

$aliases = array();

/** English */
$aliases['en'] = array(
    'MyExtension' => array( 'MyExtension' )
);

/** Finnish (Suomi) */
$aliases['fi'] = array(
    'MyExtension' => array( 'Lisäosani' )
);

Then load it in the extension's setup file like this: $wgExtensionAliasesFiles['MyExtension'] = dirname( __FILE__ ) . '/MyExtension.alias.php';

When your special page code uses either SpecialPage::getTitleFor( 'MyExtension' ) or $this->getTitle() (in the class that provides Special:MyExtension), the localized alias will be used, if it's available.


Simulated article

The Sun is the star at the centre of our solar system. The Earth and other matter (including other planets, asteroids, meteoroids, comets and dust) orbit the Sun, which by itself accounts for more than 99% of the solar system’s mass. Energy from the Sun, in the form of sunlight, supports almost all life on Earth via photosynthesis and, via heating from insolation, drives the Earth’s climate and weather. About 74% of the Sun’s mass is hydrogen, 25% is helium, and the rest is made up of trace quantities of heavier elements. The Sun is about 4.6 billion years old and is about halfway through its main-sequence evolution, during which nuclear fusion reactions in its core fuse hydrogen into helium. Each second, more than four million tonnes of matter are converted into energy within the Sun’s core, producing neutrinos and solar radiation. In about five billion years, the Sun will evolve into a red giant and then a white dwarf, creating a planetary nebula in the process. The Sun is a magnetically active star; it supports a strong, changing magnetic field that varies from year to year and reverses direction about every 11 years. The Sun’s magnetic field gives rise to many effects that are collectively called solar activity, including sunspots on the surface of the Sun, solar flares, and variations in the solar wind that carry material through the solar system. The effects of solar activity on Earth include auroras at moderate to high latitudes, and the disruption of radio communications and electric power. Solar activity is thought to have played a large role in the formation and evolution of the solar system, and strongly affects the structure of Earth’s outer atmosphere. Although it is the nearest star to Earth and has been intensively studied by scientists, many questions about the Sun remain unanswered; these include why its outer atmosphere has a temperature of over a million degrees K when its visible surface (the photosphere) has a temperature of just 6000 K.
Current topics of scientific enquiry include the Sun’s regular cycle of sunspot activity, the physics and origin of solar flares and prominences, the magnetic interaction between the chromosphere and the corona, and the origin of the solar wind.

Test yourself

Discussion

Any questions or would you like to take the test?