User:Invadibot/范围/元维基-2

This page is a translated version of the page User:Invadibot/scope/meta-2 and the translation is 60% complete.

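Goals

The goals of this task are:

  • to apply protocol-relative URLs to some HTTP links, so that users can navigate without changing the protocol in use;
  • additionally, to repair some broken interwiki links; this change does not alter the displayed text.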

Fixing all mixed-content warnings will be a long effort by both the MediaWiki core and extension developers and the project communities. A number of templates, CSS, and JavaScript on projects reference resources improperly, and as a result those resources are loaded incorrectly. All resources should now be referenced using protocol-relative URLs (//<resource-url> instead of http://<resource-url>).

[...]

All of the links in our content have changed from being protocol-specific to protocol-relative. This content is cached in our squid layer, and in our parser cache. We don’t wish to clear our entire cache immediately to fix this, as it would cause severe performance issues. Instead we will either clear the cache slowly over time, or we’ll let it clear naturally.
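As an illustration of what "protocol-relative" means in practice, the following minimal Python 3 sketch resolves the same `//` link against an HTTP page and an HTTPS page, roughly as a browser would. The helper name `resolve` and the sample page URLs are assumptions made for this example only; they are not part of the task's code.

```python
from urllib.parse import urljoin


def resolve(page_url, link):
    """Resolve a link relative to the page on which it appears."""
    return urljoin(page_url, link)


# A protocol-relative link: no scheme, only "//host/path".
link = "//en.wikipedia.org/wiki/Main_Page"

# The scheme of the resulting URL follows the scheme of the current page.
print(resolve("http://meta.wikimedia.org/wiki/Meta", link))
print(resolve("https://meta.wikimedia.org/wiki/Meta", link))
```

A reader browsing over HTTPS therefore stays on HTTPS, and a reader browsing over HTTP stays on HTTP.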

Procedure: conditions for changes

Application of protocol-relative URLs

A protocol-relative URL is applied if the link found meets all of the following conditions:

  • has an external link format, which means:
    • it is between single square brackets and
    • it starts with a URL;
  • explicitly uses the HTTP protocol (not HTTPS);
  • points to (wikipedia/wikinews/wikisource/wikibooks/wikiquote/wikiversity/wiktionary/wikivoyage/wikidata/wikimedia/wikimediafoundation/mediawiki).org domain names, case sensitive;
  • is not inside these tags: categorytree, comment, charinsert, dynamicpagelist, gallery, hiero, imagemap, inputbox, invoke, math, nowiki, pagelist, pagequality, pages, poem, pre, property, score, section, source, syntaxhighlight, templatedata, timeline;
  • is not in the exceptions list, and neither is the page that contains it.
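The core of these conditions can be sketched in Python 3 as a single substitution. This is a simplified illustration, not the task's actual rule set: the names `PROJECTS`, `HTTP_LINK` and `to_protocol_relative` are hypothetical, the pattern is condensed into one expression, and the tag and exception-list conditions are deliberately left out.

```python
import re

# Domain names listed in the conditions above (matched case-sensitively).
PROJECTS = (
    "wikipedia|wikinews|wikisource|wikibooks|wikiquote|wikiversity|"
    "wiktionary|wikivoyage|wikidata|wikimedia|wikimediafoundation|mediawiki"
)

# "[http://" followed by an optional subdomain and one of the project domains.
HTTP_LINK = re.compile(r"\[http://((?:[^@:/ \]]+\.)?(?:%s)\.org/)" % PROJECTS)


def to_protocol_relative(wikitext):
    """Drop the 'http:' scheme from bracketed links to the listed domains."""
    return HTTP_LINK.sub(r"[//\1", wikitext)


print(to_protocol_relative("[http://en.wikipedia.org/wiki/Foo Foo]"))
print(to_protocol_relative("[https://en.wikipedia.org/wiki/Foo Foo]"))
```

Links that already use HTTPS, links to other domains, and bare URLs outside square brackets are left untouched by this sketch.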

Application of interwiki format

Interwiki format is also applied if the link found meets all of the following conditions:

  • has an external link format, which means:
    • it is between single square brackets and
    • it starts with a URL;
  • does not explicitly use the HTTPS protocol;
  • points to (wikipedia/wikinews/wikisource/wikibooks/wikiquote/wikiversity/wiktionary/wikivoyage/wikidata/wikimedia/wikimediafoundation/mediawiki).org domain names, case sensitive;
  • is not inside these tags: categorytree, comment, charinsert, dynamicpagelist, gallery, hiero, imagemap, inputbox, invoke, math, nowiki, pagelist, pagequality, pages, poem, pre, property, score, section, source, syntaxhighlight, templatedata, timeline;
  • has a defined text to show, which means it contains some text after the URL, separated by a space;
  • does not use a canonical URL format;
  • points to a specific page after the /wiki/ path;
  • is not in the exceptions list, and neither is the page that contains it.
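A minimal Python 3 sketch of the interwiki conversion, under stated assumptions: the `PREFIXES` table covers only the language-subdomain projects, the names `LINK` and `to_interwiki` are hypothetical, and the tag and exception-list conditions are not modelled. It starts from links that are already protocol-relative.

```python
import re

# Hypothetical prefix table for projects with language subdomains.
PREFIXES = {
    "wikipedia": "w", "wikinews": "n", "wikisource": "s", "wikibooks": "b",
    "wikiquote": "q", "wikiversity": "v", "wiktionary": "wikt",
}

# "[//lang.project.org/wiki/Page label]": the page name must follow /wiki/
# and contain no "?" or "|"; a label after a space is required.
LINK = re.compile(
    r"\[//(?!www\.)(?P<lang>[a-z-]+)\.(?P<project>[a-z]+)\.org/wiki/"
    r"(?P<page>[^\s\]?|]+) (?P<label>[^\]\[]+)\]"
)


def to_interwiki(wikitext):
    """Rewrite labelled /wiki/ links as [[:prefix:lang:Page|label]]."""
    def repl(match):
        prefix = PREFIXES.get(match.group("project"))
        if prefix is None:
            return match.group(0)  # unknown project: leave the link as it is
        return "[[:%s:%s:%s|%s]]" % (
            prefix, match.group("lang"), match.group("page"), match.group("label")
        )
    return LINK.sub(repl, wikitext)


print(to_interwiki("[//es.wikipedia.org/wiki/Madrid the Madrid article]"))
```

Because the label is carried over unchanged, the text shown to readers stays the same; links without a label or with a query string do not match and are not converted.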

Scope

The changes made by this task can be applied to all editable pages of a wiki. They are needed so that users can navigate over either HTTP or HTTPS and keep the same protocol throughout their navigation.

Code

The regular expressions used in this task, ready to run with Pywikipediabot from the user-fixes.py file, are given below:

# -*- coding: utf-8 -*-
# <nowiki>
fixes['wmp-prurls'] = {
     # ----
     # From <https://meta.wikimedia.org/wiki/User:Invadibot/scope/meta-2/user-fixes.py>.
     # By David Abián and Roan Kattouw.
     # ----
     # This program is free software: you can redistribute it and/or modify
     # it under the terms of the GNU General Public License as published by
     # the Free Software Foundation, either version 3 of the License, or
     # (at your option) any later version.
     # 
     # This program is distributed in the hope that it will be useful,
     # but WITHOUT ANY WARRANTY; without even the implied warranty of
     # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
     # GNU General Public License for more details, 
     # <http://www.gnu.org/licenses/>.
     # ----
     # To debug this script, please go to
     # <https://meta.wikimedia.org/wiki/User:Invadibot/scope/meta-2/user-fixes.py>.
     # The goals and procedures are explained in
     # <https://meta.wikimedia.org/wiki/User:Invadibot/scope/meta-2>.
     # ----
     # Thanks for your help!
     # ----
     'nocase': False,
     'recursive': True,
     'regex': True,
     'msg': {
          # Please add an edit summary for your project,
          # if not defined, and update the script in
          # <https://meta.wikimedia.org/wiki/User:Invadibot/scope/meta-2/user-fixes.py>.
          'an':u'[[:m:User:Invadibot/scope/meta-2|Bot]]: Apanyando vinclos enta prochectos Wikipedia y aplicando adrezas URL de protocolo relativo',
          'en':u'[[:m:User:Invadibot/scope/meta-2|Bot]]: Fixing links to Wikimedia projects and applying protocol-relative URLs',
          'es':u'[[:m:User:Invadibot/scope/meta-2|Bot]]: Arreglando enlaces a proyectos Wikimedia y aplicando direcciones URL de protocolo relativo',
          'fa':u'[[:m:User:Invadibot/scope/meta-2|ربات]]: تصحیح پیوند به پروژه‌های خواهر و تبدیل کردن پیوندها به خنثی در برابر پروتکل',
          'foundation':u'[[:m:User:Invadibot/scope/meta-2|Bot]]: Fixing links to Wikimedia projects and applying protocol-relative URLs',
          'gl':u'[[:m:User:Invadibot/scope/meta-2|Bot]]: Arranxando ligazóns a proxectos Wikimedia e aplicando enderezos URL de protocolo relativo',
          'meta':u'[[:m:User:Invadibot/scope/meta-2|Bot]]: Fixing links to Wikimedia projects and applying protocol-relative URLs',
          'test':u'[[:m:User:Invadibot/scope/meta-2|Bot]]: Testing links to Wikimedia projects',
     },
     'replacements': [
          (ur'\[http://([^@:/ ]+\.)wik(ipedia|inews|isource|ibooks|iquote|iversity|tionary|idata|ivoyage|imedia)\.org/', ur'[//\1wik\2.org/'),
          (ur'\[http://wik(ipedia|inews|isource|ibooks|iquote|iversity|tionary|idata|ivoyage|imedia)\.org/', ur'[//wik\1.org/'),
          (ur'\[http://(www\.)?mediawiki\.org/', ur'[//\1mediawiki.org/'),
          (ur'\[http://(www\.)?wikimediafoundation\.org/', ur'[//\1wikimediafoundation.org/'),
          (ur'\[//(www\.)?mail\.wikipedia\.org/', ur'[//lists.wikimedia.org/'),
          (ur'\[//(www\.)?([^@:/ (www)]+)\.wikipedia\.org/wiki/([^\s\]\?\|]+) ([^\]]+)\]', ur'[[:w:\2:\3|\4]]'),
          (ur'\[//(www\.)?([^@:/ (www)]+)\.wikinews\.org/wiki/([^\s\]\?\|]+) ([^\]]+)\]', ur'[[:n:\2:\3|\4]]'),
          (ur'\[//(www\.)?([^@:/ (www)]+)\.wikisource\.org/wiki/([^\s\]\?\|]+) ([^\]]+)\]', ur'[[:s:\2:\3|\4]]'),
          (ur'\[//(www\.)?([^@:/ (www)]+)\.wikibooks\.org/wiki/([^\s\]\?\|]+) ([^\]]+)\]', ur'[[:b:\2:\3|\4]]'),
          (ur'\[//(www\.)?([^@:/ (www)]+)\.wikiquote\.org/wiki/([^\s\]\?\|]+) ([^\]]+)\]', ur'[[:q:\2:\3|\4]]'),
          (ur'\[//(www\.)?([^@:/ (www)]+)\.wikiversity\.org/wiki/([^\s\]\?\|]+) ([^\]]+)\]', ur'[[:v:\2:\3|\4]]'),
          (ur'\[//(www\.)?([^@:/ (www)]+)\.wiktionary\.org/wiki/([^\s\]\?\|]+) ([^\]]+)\]', ur'[[:wikt:\2:\3|\4]]'),
          (ur'\[//(www\.)?([^@:/ (www)]+)\.wikivoyage\.org/wiki/([^\s\]\?\|]+) ([^\]]+)\]', ur'[[:wikivoyage:\2:\3|\4]]'),
          (ur'\[//(www\.)?wikidata\.org/wiki/([^\s\]\?\|]+) ([^\]]+)\]', ur'[[:d:\2|\3]]'),
          (ur'\[//(www\.)?mediawiki\.org/wiki/([^\s\]\?\|]+) ([^\]]+)\]', ur'[[:mw:\2|\3]]'),
          (ur'\[//(www\.)?wikimediafoundation\.org/wiki/([^\s\]\?\|]+) ([^\]]+)\]', ur'[[:wmf:\2|\3]]'),
          (ur'\[//(www\.)?meta\.wikimedia\.org/wiki/([^\s\]\?\|]+) ([^\]]+)\]', ur'[[:m:\2|\3]]'),
          (ur'\[//(www\.)?outreach\.wikimedia\.org/wiki/([^\s\]\?\|]+) ([^\]]+)\]', ur'[[:outreach:\2|\3]]'),
          (ur'\[//(www\.)?wikitech\.wikimedia\.org/wiki/([^\s\]\?\|]+) ([^\]]+)\]', ur'[[:wikitech:\2|\3]]'),
          (ur'\[//(www\.)?commons\.wikimedia\.org/wiki/([^\s\]\?\|]+) ([^\]]+)\]', ur'[[:commons:\2|\3]]'),
          (ur'\[http://toolserver\.org/', ur'[//toolserver.org/'),
          #
          # One of the next lines can be uncommented and adjusted depending
          # on the project in which this script is going to run.
          #
          #(ur'\[\[:?m:([^\]]+)\]\]', ur'[[:\1]]'),        # Meta-Wiki
          #(ur'\[\[:?d:([^\]]+)\]\]', ur'[[:\1]]'),        # Wikidata
          #(ur'\[\[:?mw:([^\]]+)\]\]', ur'[[:\1]]'),       # MediaWiki
          #(ur'\[\[:?outreach:([^\]]+)\]\]', ur'[[:\1]]'), # Outreach
          #(ur'\[\[:?commons:([^\]]+)\]\]', ur'[[:\1]]'),  # Commons
          #(ur'\[\[:?wikitech:([^\]]+)\]\]', ur'[[:\1]]'), # Wikitech
          #(ur'\[\[:?w:en:([^\]]+)\]\]', ur'[[:\1]]'),     # Wikipedia (replace "en" by the language code)
          #(ur'\[\[:?n:en:([^\]]+)\]\]', ur'[[:\1]]'),     # Wikinews (replace "en" by the language code)
          #(ur'\[\[:?s:en:([^\]]+)\]\]', ur'[[:\1]]'),     # Wikisource (replace "en" by the language code)
          #(ur'\[\[:?b:en:([^\]]+)\]\]', ur'[[:\1]]'),     # Wikibooks (replace "en" by the language code)
          #(ur'\[\[:?q:en:([^\]]+)\]\]', ur'[[:\1]]'),     # Wikiquote (replace "en" by the language code)
          #(ur'\[\[:?v:en:([^\]]+)\]\]', ur'[[:\1]]'),     # Wikiversity (replace "en" by the language code)
          #(ur'\[\[:?wikt:en:([^\]]+)\]\]', ur'[[:\1]]'),  # Wiktionary (replace "en" by the language code)
          #(ur'\[\[:?wikivoyage:en:([^\]]+)\]\]', ur'[[:\1]]'), # Wikivoyage (replace "en" by the language code)
          #(ur'\[\[:?(foundation|wikimedia|wmf):([^\]]+)\]\]', ur'[[:\2]]'), # Foundation Wiki
          #
     ],
     'exceptions': {
          'title': [
               '\.(css|js|php|py|sh)',
               '([Bb]lack|[Gg]r[ae]y|[Ww]hite)[ _]?[Ll]ist',
               '([Ss]abliera|[Ss]and[ _]?([Bb]ox|[Pp]ut|[Cc]haschte|[Kk]assen?|[Kk]assinn|[Ll][aå]dan)|([Zz]ona|[Pp][aáà](g|ch)ina)[ _]?de[ _]?([Pp]r(ue[bv]as?|o[bv][ae]s|e[bv]atinas?)|[Tt]estes?))', # You can occasionally comment this line for testing purposes.
               u'(صفحه[ _]تمرین|گودال)', #for Persian, no need to make it very general
          ],
          'inside': [
               (ur'\[//(www\.)?([^@:/ (www)]+)\.[a-z]+\.org/wiki/[^\s\]\?\|]+ (.*?\[\[.*?\]\].*?)+\]'),
               (ur'\[//.{500}.*?\]'),
               (ur'\[http://(www\.)?(apt|bayes|bayle|brewster|commonsprototype\.tesla\.usability|commons\.prototype|cs|cz|dataset2|de\.prototype|download|dumps|ekrem|emery|en\.prototype|ersch|etherpad|fenari|flaggedrevssandbox|flgrevsandbox|gallium|ganglia|ganglia3|harmon|hume|ipv4\.labs|ipv6and4\.labs|jobs|mlqt\.tesla\.usability|mobile\.tesla\.usability|m|nagios|noboard\.chapters|noc|observium|oldusability|project2|prototype|results\.labs|search|sitemap|snapshot3|stafford|stats|status|svn|test\.prototype|torrus|ubuntu|wiki-mail|yongle)\.wikimedia\.org'),
               (ur'\[http://(www\.)?(arbcom\.[a-z]+|download|m|static|wg\.[a-z]+)\.wikipedia\.org'),
               (ur'\[http://(www\.)?[^@:/]+\.m\.wikipedia\.org'),
               (ur'\[//(www\.)?(ten|test|test2)\.wikipedia\.org'), # To prevent: test.wikipedia -> [[w:test:]]
          ],
          'inside-tags': [
               # You can occasionally comment some of these exception tags,
               # under your own risk.
               'categorytree',
               'comment',
               'charinsert',
               'dynamicpagelist',
               'gallery',
               'hiero',
               'imagemap',
               'inputbox',
               'invoke',
               'math',
               'nowiki',
               'pagelist',
               'pagequality',
               'pages',
               'poem',
               'pre',
               'property',
               'score',
               'section',
               'source',
               'syntaxhighlight',
               'templatedata',
               'timeline',
          ]
     }
}
# </nowiki>
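
The 'inside-tags' exceptions above tell the bot to leave links alone when they appear inside certain tags. Pywikipediabot handles this internally; the Python 3 sketch below is only an independent illustration of the idea, with a hypothetical helper `sub_outside_tags` and a shortened tag list chosen for the example.

```python
import re

# Subset of the tags from 'inside-tags', for illustration only.
SKIP_TAGS = ("nowiki", "pre", "source", "syntaxhighlight")


def sub_outside_tags(pattern, repl, text, tags=SKIP_TAGS):
    """Apply re.sub only to the parts of text outside the given tag pairs."""
    protected = re.compile(
        r"<(%s)\b[^>]*>.*?</\1\s*>" % "|".join(tags),
        re.DOTALL | re.IGNORECASE,
    )
    pieces, pos = [], 0
    for match in protected.finditer(text):
        # Text before the protected region is rewritten as usual.
        pieces.append(re.sub(pattern, repl, text[pos:match.start()]))
        # The protected region itself is copied verbatim.
        pieces.append(match.group(0))
        pos = match.end()
    pieces.append(re.sub(pattern, repl, text[pos:]))
    return "".join(pieces)


sample = "[http://en.wikipedia.org/ a] <nowiki>[http://en.wikipedia.org/ b]</nowiki>"
print(sub_outside_tags(r"\[http://en\.wikipedia\.org/", "[//en.wikipedia.org/", sample))
```

Only the first link in the sample is rewritten; the one inside the nowiki tag is preserved, which matters on pages that document link syntax.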