Talk:Interwiki sorting order

From Meta, a Wikimedia project coordination wiki

Proposal: Storing interwiki sorting at local system message

As some of you know, I am writing an alternative implementation of the interwiki bot in Java. Most interwiki bots use the pywikipediabot framework, and AWB also needs the interwiki sorting order. I don't know whether there are more interwiki frameworks, but they all need to know the interwiki sorting order to add interwikis to pages. At the moment each framework has its own config file storing the interwiki sorting order for each wiki. When a new wiki is created, all config files must be updated manually, and it gets more complicated if a wiki wants to change the order it uses. This takes some time even in the pywikipediabot framework. Sometimes it is even difficult for human developers who don't know an alphabet or the correct transliteration to identify the correct order.

I am talking about the interwiki sorting order in the source code. I know that there are already some MediaWiki extensions that change the interwiki order on a rendered page, but this does not help tools working on the source code. The content of Interwiki sorting order cannot really be processed by a parser. That's why e.g. AWB uses en:Wikipedia:AutoWikiBrowser/IW. My suggestion is to store the interwiki sorting order in the MediaWiki namespace of each wiki, with any duplicated information moved out to metawiki.

At the moment there are six general sorting orders:

  • A By order of (Latin) alphabet, based on language code
  • B By order of (fy) alphabet, based on language code (with i=y)
  • C By order of alphabet, based on local language
  • D By order of alphabet, based on local language by first word
  • E By order of Latin alphabet, based on local language (by first word)
  • F By order of Roman alphabet, based on local language (by first word)

A and B can easily be calculated by the tools themselves and need not be stored anywhere, I think. E and F are each used only once, so this information could be stored completely on the wiki that uses it. I would like to propose the following procedure:

On each wiki there is a system message MediaWiki:Interwiki config/sorting order containing the sorting order for this wiki with one interwiki code per line. On srwiki sr:MediaWiki:Interwiki config/sorting order would look like:

ace
af
ak
als
am
ang
ab
ar
an
arc
roa-rup
frp
arz
…

To build the sorting order, codes are read line by line and duplicate codes are skipped (so the first position of a code is always the one used).

  • If this system message does not exist, sorting order A is used (e.g. dewiki or most other wikis).
  • If expected codes are missing, they are appended at the end of the list using sorting order A.
    This makes it possible for hewiki and huwiki to define e.g. he:MediaWiki:Interwiki config/sorting order containing only
    en
    
    So wikis can move some languages to the top without taking care of the rest.
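The reading rules above could be sketched roughly as follows. This is only an illustration in Python: the function name `read_sorting_order` and the `all_codes` input are hypothetical, and sorting order A is assumed here to be plain string comparison of the language codes.

```python
def read_sorting_order(config_text, all_codes):
    """Return the full sorting order for one wiki.

    config_text: content of the local system message (one code per line),
                 or None if the message does not exist.
    all_codes:   every interwiki code the tool expects (hypothetical input).
    """
    order = []
    seen = set()
    if config_text is not None:
        for line in config_text.splitlines():
            code = line.strip()
            if code and code not in seen:  # first position of a code wins
                seen.add(code)
                order.append(code)
    # Missing codes are appended at the end using sorting order A,
    # assumed here to be plain comparison of the language codes.
    order.extend(sorted(c for c in all_codes if c not in seen))
    return order


known = ["de", "en", "fr", "he", "hu"]
print(read_sorting_order("en\n", known))  # hewiki-style config: only "en" on top
print(read_sorting_order(None, known))    # no system message: pure order A
```

With a config containing only en, the result starts with en and continues with the remaining codes in order A; without any config the result is simply order A.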

Sorting orders C and D could be stored on metawiki, with a placeholder keyword starting with meta- used in the local system message to reduce redundant information. E.g. if the keyword for C is meta-aphabet-local, the sorting order could be found at MediaWiki:Interwiki config/sorting order/aphabet-local (with meta-* = meta:MediaWiki:Interwiki config/sorting order/*)

ace
af
ak
als
am
ang
ab
ar
an
arc
roa-rup
frp
as
ast
gn
av
…
meta-aphabet-local
  • For urwiki, which would like to have ar, fa and en on top, it would be

ar
fa
en
meta-aphabet-local
Because duplicate codes are skipped (see the rule above), you can simply replace the meta- keyword with the list read from metawiki, which gives ar fa en af ak als am ang ab ar an arc roa-rup frp as ast gn av …. The second ar is then simply skipped.
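A minimal sketch of this keyword replacement, assuming the lists stored on metawiki have already been fetched into a dictionary. The names `expand_config` and `meta_lists` are hypothetical, and the meta list below is only a shortened excerpt of the example list shown above.

```python
def expand_config(local_lines, meta_lists):
    """Replace meta-* keywords by the list read from metawiki, then skip duplicates."""
    expanded = []
    for line in local_lines:
        code = line.strip()
        if not code:
            continue
        if code.startswith("meta-"):
            # meta_lists maps the part after "meta-" to the list stored on metawiki
            expanded.extend(meta_lists[code[len("meta-"):]])
        else:
            expanded.append(code)
    # dict.fromkeys keeps only the first occurrence of each code, in order
    return list(dict.fromkeys(expanded))


# Shortened excerpt of the example list, for illustration only
meta_lists = {"aphabet-local": ["ace", "af", "ak", "als", "am", "ang", "ab", "ar", "an"]}
urwiki = ["ar", "fa", "en", "meta-aphabet-local"]
print(expand_config(urwiki, meta_lists))
```

Because ar already appears at the top, its second occurrence inside the expanded meta list is dropped and the rest of the list keeps its order.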

For the calculated orders A and B the keyword starts with general- (e.g. general-alphabet-code for A and general-alphabet-code-iy for B), to distinguish them from lists stored on meta (meta-) and from actual language codes.
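How a tool might calculate the two general- orders itself could look like the sketch below. The reading of "i=y" as "treat y like i when comparing codes" is an assumption, as are the function names and the sample codes; only the two keywords come from the text above.

```python
def order_a(codes):
    # Order A: by alphabet, based on the language code
    return sorted(codes)


def order_b(codes):
    # Order B: as A, but "y" is compared as if it were "i" (assumed meaning of i=y);
    # the original code is used as a tie-breaker to keep the result deterministic
    return sorted(codes, key=lambda c: (c.replace("y", "i"), c))


# Keywords as named above, mapped to the functions that calculate them
GENERAL_ORDERS = {
    "general-alphabet-code": order_a,
    "general-alphabet-code-iy": order_b,
}

codes = ["ja", "it", "fy", "hu", "yi", "is"]
print(GENERAL_ORDERS["general-alphabet-code"](codes))
print(GENERAL_ORDERS["general-alphabet-code-iy"](codes))
```

In order A the code yi ends up last, while in order B it is placed among the codes starting with i.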


The above description is very technical because it is important that developers can read it without being left with questions about special cases. So it sounds much more complicated than it really is. For the local community it becomes easier to define their own sorting order, and tool developers no longer all need to take care of each new wiki. In most cases only a meta admin has to add a new wiki to the system messages.

The initial work for setting up this system can be done by a global admin or somebody with the editprotected right. Because most wikis use the default ordering, this new system message needs to be created on only a few wikis.

Quick migration for all existing tools is quite easy, because they can write a script which fetches the information from the wikis and automatically creates the config file. The pywikipediabot project already uses such a script for auto-creating the namespace names in family files, and AWB simply has to copy the sorting order from meta to en:Wikipedia:AutoWikiBrowser/IW. Toolserver users could also read the configuration from the replicated db servers. Later they could modify their framework to read the information live from the wiki if they want, but I think having a computer-readable config and then using an automated config file creation script would already be an improvement for all tools. Merlissimo 17:09, 31 March 2011 (UTC)

Discussion

Support --Akkakk 17:41, 31 March 2011 (UTC)
Support excellent idea. Seb az86556 17:59, 31 March 2011 (UTC)
Support GameOn 05:55, 13 June 2011 (UTC)
Support How soon can this be implemented? -- Lavallen 06:01, 13 June 2011 (UTC)


I don't understand all the technical stuff here, but I would love to see something that


Where is the definition of the displayed name of the language stored? At some stage the displayed name of language kbd seems to have been changed from Къэбэрдеибзэ (see Interwiki sorting order/table) to Адыгэбзэ (see top entry in the languages side bar of en:Abaza language, which links to the kbd wikipedia); but the sort orders had not been updated.
  • It would be good if changes to the displayed name of the language also resulted in appropriate changes to the sort orders.
Coroboy (talk) 05:47, 15 August 2011 (UTC)

So this has now been announced for half a year without any opposition (and I informed all bot framework developers). I will request the global editinterface right and create the system messages as suggested above. I think most programmers will need some initial time for implementing this, so we'll keep this Python list for some time (2-3 months?). Merlissimo 11:25, 4 November 2011 (UTC)

I have set up the config on all wikis as provided in the old version, and added lez and shi. I'll add some examples during the next days. I hope the description is complete. Some details that I already described in the proposal above can be pointed out by adding examples.
But first I will implement this in my own bot. Then I can use the script as a validator to show that no langcode is missing on any wiki. Merlissimo 23:42, 7 February 2012 (UTC)
While implementing, I noticed that it is not possible to read system messages containing a "/" using the message module, so such a page can only be read as normal page content. Should we move the config pages to "-" (the local config would be on MediaWiki:Interwiki config-sorting order and the meta config e.g. on MediaWiki:Interwiki config-sorting order-native-languagename), or should I change the description to say that the last revision is read instead? I would prefer the first option, so developers still have both options. I announced the allmessages method because I thought it would be easier, since you don't have to care about existing last revisions. Merlissimo 15:41, 8 February 2012 (UTC)
I have moved all config pages from slash to hyphen. Merlissimo 17:18, 8 February 2012 (UTC)
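For tools that read the config as normal page content, building the request address could look roughly like this. It is only a sketch: the `/w/index.php` path with `action=raw` is the usual way to get raw page text on Wikimedia wikis, but the host name is an assumption and the helper name `raw_config_url` is hypothetical.

```python
from urllib.parse import urlencode

# Hyphenated title of the local config page, as named in the discussion above
CONFIG_TITLE = "MediaWiki:Interwiki config-sorting order"


def raw_config_url(host, title=CONFIG_TITLE):
    """Build the address that returns the raw page text of the config page."""
    query = urlencode({"title": title.replace(" ", "_"), "action": "raw"})
    # The /w/index.php path is an assumption that holds for Wikimedia wikis
    return f"https://{host}/w/index.php?{query}"


print(raw_config_url("sr.wikipedia.org"))
```

The text returned from such an address has one code or keyword per line, so it can be passed directly to whatever line-by-line reader the tool already uses.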
I implemented the new config mechanism in my bot and could verify that all local configs are OK. Later I'll write a script that periodically checks whether all local configs are OK.
I would also suggest adding a rule that a local config can simply be deleted by stewards if it has not been maintained by local admins for some months. This won't be a problem soon, but perhaps in a few years, if a local admin who added the config has become inactive. Then this could be a problem for very small wikis that use their own full sorting order (wikis that only add some top interwikis and use an auto- or meta-config for the rest aren't a problem).
Currently only srwiki and svwiktionary use their own full interwiki sorting rule. I'll write a script that notifies these local communities about missing codes, e.g. after a langcom approval. Merlissimo 01:02, 9 February 2012 (UTC)
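The check for missing codes on a wiki with its own full sorting order could be as simple as the following sketch. The function name and the sample lists are hypothetical and serve only as illustration.

```python
def missing_codes(local_order, all_codes):
    """Return the expected codes that a full local sorting order does not list."""
    present = set(local_order)
    return sorted(code for code in all_codes if code not in present)


# Hypothetical example: a full local order that predates two newer codes
local_order = ["ace", "af", "ak", "als"]
all_codes = ["ace", "af", "ak", "als", "lez", "shi"]
print(missing_codes(local_order, all_codes))
```

An empty result means the local config is complete; otherwise the returned codes are the ones the local community would need to place in its order.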