Translating content on Meta
Translating content is a high priority for both the Grantmaking and Communication teams at WMF, since we work with a global audience that crosses many project, language, and geographic boundaries. Our goal is to make all of the information resources we create and maintain accessible to users in their own language. Doing that requires new solutions that help us support translators and work with translated pages more effectively at a larger scale.
- 1 Overview
- 2 General challenges
- 2.1 1. Translate tags and translation units add a lot of noise to wikimarkup
- 2.2 2. Translate tags and translation unit markers are brittle
- 2.3 3. Moving translated pages is very difficult and error-prone
- 2.4 4. Edits to translated pages often fail to show up (replag?)
- 2.5 5. Translate admins may mark pages for translation prematurely
- 2.6 6. Can't edit templates in translation
- 2.7 7. Bots can break translated pages
- 2.8 8. Communication
- 3 Templates and transcluded pages
- 4 Ideas for improving translation-related processes
- 5 Links
Overview
To expand our translated content, we need to understand how the translation extension works, including its current limitations. We also need to improve how we document what we do and how we communicate with one another, so that we can work productively even when we're working independently.
General challenges
Translate admins, template developers, and content contributors currently face many challenges related to the way the translation extension works. A lot of the information one needs to work successfully with the translation extension is currently documented only on Bugzilla, in the form of known issues and hacky workarounds. Let's document some of the general issues, as well as tips and tricks for addressing them. We may also be able to come up with a set of requests for the Language Engineering team, if we can help them highlight the most important issues to fix.
Please add any issues you face to the list below, as well as potential solutions to those problems.
1. Translate tags and translation units add a lot of noise to wikimarkup
Take this table, for example. Embedded translate tags and translation unit tags create headaches when you want to change content in the source language, or re-use content elsewhere. Can we avoid putting translate tags in places that will be problematic in future revisions?
- need to separate the markup cruft (<translate> <!--T:1-->) from the page text.
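One way to look at the page text without the markup is to strip it out programmatically. The following is a minimal Python sketch, not a feature of the translation extension. The function name `strip_translation_markup` is hypothetical, and the sketch assumes that unit markers always have the numeric form `<!--T:1-->` and that `<translate>` tags carry no attributes. It does not handle `<tvar>` variables.

```python
import re

# Markup as it appears on this page: <translate>, </translate>, <!--T:1-->.
MARKUP = r"(?:</?translate>|<!--T:\d+-->)"

# A tag or marker that sits alone on its line: remove the whole line.
LINE_ONLY = re.compile(rf"^[ \t]*{MARKUP}[ \t]*\n", re.MULTILINE)

# A marker at the end of a heading, or a tag inside running text.
INLINE = re.compile(r"[ \t]*<!--T:\d+-->|</?translate>")


def strip_translation_markup(wikitext: str) -> str:
    """Return the wikitext with translate tags and unit markers removed."""
    without_marker_lines = LINE_ONLY.sub("", wikitext)
    return INLINE.sub("", without_marker_lines)


sample = (
    "<translate>\n"
    "== Overview == <!--T:1-->\n"
    "\n"
    "<!--T:2-->\n"
    "Hello world.\n"
    "</translate>\n"
)
print(strip_translation_markup(sample))
```

The output is meant for reading or for re-using the text elsewhere. Saving it back over a page that is marked for translation would remove the markers the extension relies on.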
2. Translate tags and translation unit markers are brittle
Accidentally deleting or re-ordering translate tags or translation unit markers can truly mess up the page. How can we best deal with these on pages that we have to re-order or re-write?
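Before saving a re-ordered or re-written page, it may help to compare the unit markers in the old and new text. The following Python sketch only reports differences; it does not decide whether a change is harmful. The function names are hypothetical, and the sketch assumes numeric markers of the form `<!--T:24-->`.

```python
import re

UNIT_MARKER = re.compile(r"<!--T:(\d+)-->")


def unit_ids(wikitext):
    """Return the unit marker IDs in the order they appear."""
    return [int(number) for number in UNIT_MARKER.findall(wikitext)]


def compare_markers(old_text, new_text):
    """Report markers that were removed, added, duplicated, or re-ordered."""
    old_ids, new_ids = unit_ids(old_text), unit_ids(new_text)
    old_set, new_set = set(old_ids), set(new_ids)
    # Compare the relative order of markers present in both versions.
    kept_old = [i for i in old_ids if i in new_set]
    kept_new = list(dict.fromkeys(i for i in new_ids if i in old_set))
    return {
        "removed": [i for i in old_ids if i not in new_set],
        "added": [i for i in dict.fromkeys(new_ids) if i not in old_set],
        "duplicated": sorted({i for i in new_ids if new_ids.count(i) > 1}),
        "reordered": kept_old != kept_new,
    }


old = "<!--T:1-->\nA\n\n<!--T:2-->\nB\n\n<!--T:3-->\nC\n"
new = "<!--T:3-->\nC\n\n<!--T:1-->\nA\n"
print(compare_markers(old, new))
```

In this example, marker 2 is reported as removed and the remaining markers are reported as re-ordered.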
3. Moving translated pages is very difficult and error-prone
- not like a normal page move; translatable pages can't be moved by non-admins.
- does not leave a redirect; you need to create one manually.
- if you move from A to B and a mistake happens, you have to manually delete all the redirects. You can't just copy/paste the pages, because you lose all the translations.
- DOES move the translations, as long as all target pages are clear (don't exist). You get warnings telling you that you can't move ANY of the pages in your batch if a single target subpage exists.
- batch moves take a long time to process.
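Because a single existing target subpage blocks the whole batch, it can save time to list the planned target titles and check them before requesting the move. The following Python sketch illustrates that check under simplifying assumptions: the function name, the page titles, and the `existing_titles` input are hypothetical, the list of existing titles has to be gathered separately, and only language subpages are considered (not any other pages the extension moves along with them).

```python
def plan_batch_move(source, target, language_codes, existing_titles):
    """Map each page to its new title and list targets that already exist."""
    moves = {source: target}
    for code in language_codes:
        moves[f"{source}/{code}"] = f"{target}/{code}"
    # Any existing target title would block the entire batch.
    blockers = sorted(t for t in moves.values() if t in existing_titles)
    return moves, blockers


moves, blockers = plan_batch_move(
    source="Grants:Start",          # hypothetical titles
    target="Grants:Home",
    language_codes=["de", "fr"],
    existing_titles={"Grants:Home/fr"},
)
print(blockers)  # ['Grants:Home/fr']
```

If `blockers` is not empty, those pages need to be dealt with before the move is attempted.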
4. Edits to translated pages often fail to show up (replag?)
- pages don't update after an edit; translations don't save
5. Translate admins may mark pages for translation prematurely
Even if you subsequently "discourage" translation quickly (to avoid wasting translators' time), you still have to deal with extra markup cruft and with difficulty moving, editing, etc. How can we make it easier to communicate that a page is NOT ready for translation?
6. Can't edit templates in translation
Changes to the source-language text of templates called through a translatable navigation template will not display unless a sysop manually edits or deletes the translation units, which is an unreliable process.
- this may have changed, or may work differently with different templates?
- there's a lack of documentation on how translation templates work, what their effects may be, who's in charge, and who to contact.
7. Bots can break translated pages
- Code for bots (esp. GrantsBot) needs to be updated so that they do not overwrite translation markup.
- Bot documentation needs to be kept up to date so that translate admins and other content curators are able to find out how their edits may interact with the bot workflow.
- Certain pages that bots frequently update with new content need to be added to a do not translate list or hidden category.
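One possible safeguard for the first point is for a bot to compare the translation markup in the current page text with the markup in the text it is about to save, and to skip the edit if they differ. The following Python sketch shows the idea only. It is not taken from GrantsBot or any other bot, the function names are hypothetical, and it assumes markup in the forms `<translate>`, `</translate>`, and `<!--T:1-->`.

```python
import re

# Translate tags and numeric unit markers, in the forms shown on this page.
TRANSLATION_MARKUP = re.compile(r"</?translate>|<!--T:\d+-->")


def markup_tokens(wikitext):
    """Return every translate tag and unit marker, in page order."""
    return TRANSLATION_MARKUP.findall(wikitext)


def is_safe_bot_edit(current_text, proposed_text):
    """True if the proposed text keeps the same markup in the same order."""
    return markup_tokens(current_text) == markup_tokens(proposed_text)


current = "<translate>\n<!--T:1-->\nOpen requests: 3\n</translate>\n"
keeps_markup = "<translate>\n<!--T:1-->\nOpen requests: 4\n</translate>\n"
drops_markup = "Open requests: 4\n"

print(is_safe_bot_edit(current, keeps_markup))  # True
print(is_safe_bot_edit(current, drops_markup))  # False
```

A bot that fails this check could log the page title for a translate admin to review instead of saving.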
8. Communication
- Grantmaking staff and translate admins who work with them on Grantmaking content need to keep each other informed of their activities and plans, to avoid conflict and wasted effort.
- There needs to be a central, easy-to-find forum for these discussions. They shouldn't be taking place on users' talk pages.
Templates and transcluded pages
- Templates present an especially challenging use case for translation. Templates often contain complex markup (like tables!), inline HTML and CSS, parser functions, magic words, and calls to other templates. This makes templates difficult to read to begin with. When you add <translate> tags and translation unit markers (like <!--T:24-->) to the mix, this creates additional noise, which makes it much more difficult to read and understand what the template is doing.
- Updating the non-translatable content of a translated template is only possible through a series of undocumented hacks, and requires translation admin privileges. What if I want to make the background green, or remove some text entirely? If I just make the change on the template page, it won't show up when the template is transcluded.
Best practices for template design
- Design simpler templates
- not always possible or desirable, but the simpler your template is, the easier it will be for translate admins to mark it for translation, and for people to inspect your template through the haze of extra markup.
- Separate content from style and structure
- as much as possible, put words (stuff to be translated) in a separate template from wikitext, HTML, and CSS. Transclude these different pieces into the main template, so they can be edited independently.
- Call out unstable templates
- if a template is likely to change in the future, or is still being tweaked, add a banner to the top of the page, inside <noinclude> tags, asking that it not be marked for translation.
Ideas for improving translation-related processes
Please add ideas for improvements to the list below.
- batch insert/delete of translation markup
- simple rule-based addition of translation tags (for example: one set per page section). Ability to "clean" a page by deleting ALL translation and unit tags. Useful for re-purposing page contents elsewhere, or removing a page from translation.
- batch translatable page move improvements
- option to leave a redirect behind and move the talk page; make the process less error-prone. Related bugs: #42239
- better on-wiki documentation of the known issues and available work-arounds
- The current docs don't always give complete/correct info.
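To make the "one set per page section" rule from the first idea concrete, here is a minimal Python sketch that wraps each level-2 section of a page in its own pair of translate tags. This is an illustration of the proposal, not an existing tool: the function name `wrap_sections` is hypothetical, only `== Heading ==` lines are treated as section boundaries, and the sketch does not check for markup that is already present or for content that should stay untranslated.

```python
import re

# A level-2 heading on its own line, e.g. "== Overview ==".
HEADING = re.compile(r"^==[^=].*==[ \t]*$", re.MULTILINE)


def wrap_sections(wikitext):
    """Wrap the lead text and each level-2 section in translate tags."""
    starts = [match.start() for match in HEADING.finditer(wikitext)]
    if not starts or starts[0] != 0:
        starts.insert(0, 0)  # text before the first heading is its own chunk
    bounds = starts + [len(wikitext)]
    chunks = [wikitext[a:b] for a, b in zip(bounds, bounds[1:])]
    return "\n".join(
        f"<translate>\n{chunk.rstrip()}\n</translate>\n"
        for chunk in chunks
        if chunk.strip()
    )


page = "Intro text.\n\n== A ==\nFirst.\n\n== B ==\nSecond.\n"
print(wrap_sections(page))
```

For the sample page this produces three tagged blocks: one for the introduction and one for each section.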
Links
Unresolved bugs around editing translated pages. A larger list is available on this pinboard.
- #50973 Preview gives fatal exception on Meta page with translate extension and a table
- #55803 "Fatal exception of type MWException" when trying to transclude a translatable page
- #44329 Translate extension should warn users when they change translation IDs
- #46716 Translation page does not contain the latest translations/last translation
- #39415 Deleting a translation unit page doesn't remove the corresponding content from the translation page
- #53960 Can't perform normal undo action on a translation page
- #54579 FuzzyBot fails to generate /en subpage
- #51731 After re-marking an updated page for translation, FuzzyBot does not react, or only ports over the previous update
- #58760 marking a new version does not refresh the status/statistics of translated pages
- #56518 "This page is a translated version of ..." in English /en page
- #58946 "translate" in front of section header confuses section editing
- #43042 Deleting a translation page doesn't update the language bar
- #48891 "translate" and "tvar" do not work with Parsoid because they are not normal parser tags
- #35489 The source language of a page should be arbitrary
- #45096 Add way to transclude template or other page in the correct language, e.g. with Special:MyLanguage
- #53343 Translated page should be reset when translation unit subpage is deleted
- #46925 translate code copied to /en subpage occasionally when page edited (requires null edit and remark to fix)
- #53932 "Mark of translation pages screws with tvar's in /en translation" (James' favorite bug)
- #60920 Moving a translatable page doesn't create a redirect to the original page (or give the option to create one)