From Meta, a Wikimedia project coordination wiki

Preprocessor fixes Part.2

Hi User:CAnanian (WMF), User:SSastry (WMF), here's what we added to Tech News last week:

  • Markup that looks like code for language variants might need to be fixed. If -{ is used in transclusions or web addresses it has to be escaped appropriately. You can use -<nowiki/>{ for transclusions and %2D{ in web addresses. A transclusion could for example be when you use -{ in a template: {{1x| sad :-{ face }}. This is because of some code fixes to the preprocessor and affects all wikis.[1][2]
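As a rough illustration of the two escapes named in the item above, here is a minimal Python sketch. The function names are hypothetical, and the plain string replacement is an assumption: it does not check whether the text really sits inside a transclusion or a web address, so a human still has to judge each case.

```python
def escape_for_transclusion(text: str) -> str:
    """Split '-{' with an empty nowiki tag, as suggested for transclusions."""
    return text.replace("-{", "-<nowiki/>{")


def escape_for_url(url: str) -> str:
    """Percent-encode the hyphen of '-{', as suggested for web addresses."""
    return url.replace("-{", "%2D{")


if __name__ == "__main__":
    print(escape_for_transclusion("{{1x| sad :-{ face }}"))
    print(escape_for_url("http://example.org/a-{b"))
```

The example address is made up; only the two replacement strings come from the Tech News item.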

What do we want to say this week? Do we need to communicate more urgency? Can we provide a longer list of wikis, and update the existing ones? Asking now because the issue needs to be frozen in less than 48 hours, as you know. --Elitre (WMF) (talk) 15:33, 10 May 2017 (UTC)

Since User:CAnanian (WMF) has to get those wiki dump greps in place, maybe hold off until he finishes that, and then we can issue another Tech News update? Unless you think it is useful to repeat last week's message without any new information, and then have the new update whenever that is ready? SSastry (WMF) (talk) 19:32, 10 May 2017 (UTC)
I should be able to update the set of wikis within 48 hours. So let's repeat the message, perhaps. The new addition would be the "long tail" of small wikis, most of which probably only have a case or two to fix up. Cscott (talk) 19:37, 10 May 2017 (UTC)
If the breaking change happens in the week of the 22nd, we can also skip this week and mention it next week instead. --Elitre (WMF) (talk) 14:22, 11 May 2017 (UTC)
Elitre (WMF), what??? It looks like you are going to make the change in December or something, according to conversations on the fixup talk page (let's check this for medium and small wikis in July, so they can start fix...). IKhitron (talk) 14:25, 11 May 2017 (UTC)
This month is when this patch should be merged, AFAIK. (The Parsing team will make several changes this year.) Where is December mentioned? However, pinging User:CAnanian (WMF) and User:SSastry (WMF) as well. --Elitre (WMF) (talk) 14:35, 11 May 2017 (UTC)
Weird. December isn't mentioned as a month anywhere, but it looks like there is a lot of time. Pinging User:Amire80. IKhitron (talk) 14:38, 11 May 2017 (UTC)
I'm still failing to understand when this will actually happen. May 22nd? December, and of which year?
This is a big, breaking change, so:
  • It really shouldn't be rushed.
  • The list for all wikis, which I requested, and which Cscott promised to provide, is essential. Editor communities must be given time to go over it: at least a couple of weeks, depending on the size.
  • Though it will probably work, <nowiki /> doesn't necessarily sound like the best solution. The <nowiki /> tag has particular functional uses, and shouldn't be used for resolving every exotic wiki syntax issue. A rewrite of the whole problematic construct may be more useful. I'll be better equipped to understand this once I see the list from the previous point. --Amir E. Aharoni (talk) 14:52, 11 May 2017 (UTC)
It's not actually a big breaking change, as far as we can tell. -{ isn't used all that much on wikis without language converter enabled. Even when it is used, it only actually breaks markup when it is used in an unbalanced form inside a template; outside of a template, or inside extension tags etc., it just ends up as literal text. In addition, the preprocessor precedence rules automatically resolve many potential cases of the form -{{ (which are intended to be template invocations, and are still parsed that way). The counts on the fixup page are only "high" (where "high" means "552 uses on English Wikipedia, out of 5,402,833 articles (0.01%)") because it turns out that the infobox for chemical elements tends to hit this particular case often. Once the core patch is merged, we'll be able to use parser warning categories (or Parsoid linter rules) to allow even better ongoing cleanup for the few cases which remain. (Perhaps you are being misled by using the on-wiki wikitext search tools, which provide a huge number of false positives since they don't account for all of the mitigating factors above?) Cscott (talk) 15:40, 11 May 2017 (UTC)
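To make the mitigating factors in the comment above concrete, here is a minimal Python sketch of a page check. It is only a heuristic under stated assumptions, not the preprocessor's actual logic and not the dump grep that was used: the name `suspicious_offsets` is hypothetical, "inside a template" is approximated by counting `{{` and `}}` before the match, and "unbalanced" is approximated by the absence of a later `}-`.

```python
import re
from typing import List

# '-{' that is NOT immediately followed by another '{'. The '-{{' form is
# skipped because, per the comment above, it is still parsed as a template call.
OPEN_VARIANT = re.compile(r"-\{(?!\{)")


def suspicious_offsets(wikitext: str) -> List[int]:
    """Return offsets of '-{' that look unbalanced and sit inside '{{ ... }}'."""
    hits = []
    for match in OPEN_VARIANT.finditer(wikitext):
        before = wikitext[: match.start()]
        # Assumption: more '{{' than '}}' so far means we are inside a template.
        inside_template = before.count("{{") > before.count("}}")
        # Assumption: no closing '}-' later in the text means it is unbalanced.
        unbalanced = "}-" not in wikitext[match.end():]
        if inside_template and unbalanced:
            hits.append(match.start())
    return hits
```

A check like this would still produce false positives and negatives (nested templates, extension tags and comments are ignored), so every hit needs a human look.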
Depends on the point of view. For me, 552 cases on enwiki are 551 cases too many to do this. IKhitron (talk) 15:45, 11 May 2017 (UTC)
FWIW, we're down to 4 cases on enwiki in the latest dump. All credit goes to User:DePiep. Thanks! Cscott (talk) 17:20, 11 May 2017 (UTC)
Cscott, yes, it is quite possible that I'm thinking of completely wrong numbers. That's why the first thing I asked for is the precise numbers in every wiki.
And make no mistake, I absolutely support the proposed bold changes in wiki syntax to make parsing more robust overall. I just don't want them to be too breaking :) --Amir E. Aharoni (talk) 15:48, 11 May 2017 (UTC)
Given that even on enwiki only a few hundred pages were affected, our expectation is that only a small set of pages will be affected by this change, if at all. We were planning to merge the core patch soon (based on what we found with the dump grep) since we figured the page fixes can happen in parallel. But, of course, the dump greps that cscott is doing will tell us more. If we find that there is a large set of pages and/or a lot of breakage, we can push out the merge further. Separately, we are also working to introduce a linter category that will identify this on a more active and ongoing basis; dump grepping is a one-off tool. As for your second comment about nowiki usage, rewriting language variant support is not on our plate. If a better solution emerges for this, we can readily implement it, and linter / dump-grep can readily let us identify this unwanted nowiki usage and fix it then. We really don't want to add that as a blocker. SSastry (WMF) (talk) 15:24, 11 May 2017 (UTC)

Given everything Cscott mentioned above, it could be a good idea to highlight which kinds of articles are most affected, so users of related wikiprojects can intervene. I really like how User:DePiep summarized that at mw:Parsoid/Language_conversion/Preprocessor_fixups/2017-03-20_list:_edits#Edit_Rules: chemical names, species descriptions and module documentation pages. --Elitre (WMF) (talk) 16:19, 11 May 2017 (UTC)

Now that a new list of wikis is available, I'd say we need to point to this in our announcement. --Elitre (WMF) (talk) 16:50, 11 May 2017 (UTC)
I'll work with that list.
If that Tech News announcement includes a request to edit, it would be helpful to ask for volunteers for wikis in non-Latin scripts (and then for the RTL ones: Hebrew, Arabic).
Unrelated, piggyback edit: enwiki people suggested not changing URLs to %2D{ but to %2D%7B, since the curly bracket is reserved in CS1 and CS2 (the CitationStyle modules). I will do so. -DePiep (talk) 17:39, 11 May 2017 (UTC)
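The fully percent-encoded form suggested above can be sketched in Python as follows. The function name is hypothetical and the example address is made up; the only things taken from the comment are the target string `%2D%7B` and the half-encoded form `%2D{` it replaces.

```python
from urllib.parse import unquote


def encode_variant_opener(url: str) -> str:
    """Rewrite '-{' (or the half-encoded '%2D{') in a URL as '%2D%7B'."""
    return url.replace("%2D{", "%2D%7B").replace("-{", "%2D%7B")


if __name__ == "__main__":
    encoded = encode_variant_opener("http://example.org/a-{b")
    print(encoded)
    # Decoding gives back the original characters, so the link target is unchanged.
    print(unquote(encoded))
```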
I should consider mentioning some of the wikis that may be more impacted, although I'm expecting most of the results to actually be false positives, so it's more of an invitation to check. We could say "Please check (and fix if necessary)", or whatever we want really. DePiep, let's define a way to flag what you will be working on, so that we don't duplicate efforts? I see you mark languages as done: would it be too much for you to also mark when you're starting to check one wiki? Not all of those, of course! Maybe only the ones with over 20 results, or something :) --Elitre (WMF) (talk) 17:47, 11 May 2017 (UTC)
The new listings (20170501) take precedence.
My check is: "1. Take page from a list, 2. Find -{ . 3. Judge to edit". If not found, must have been edited = OK too.
Worked on: enwiki (just finished). Will work on: dewiki, itwiki, nlwiki. Today/tomorrow. Not sure yet about others; I prefer bigger lists first.
Not sure about how/where to coordinate 'working on' notifications.
I understand I do not have to keep your plans and target dates in mind (neither the publication nor the roll-out schedule). I'm just an editor, and others can join in by now. I publish my edit knowledge, but cannot instruct others 1:1, I guess.
As of 20170520 there might be a new wiki dump, and so new lists.
-DePiep (talk) 20:38, 11 May 2017 (UTC)
Will add "Working on this, DePiep" to any wiki in the new overview list. -DePiep (talk) 20:41, 11 May 2017 (UTC)
That's what I was suggesting, DePiep :) It will reassure many people. --Elitre (WMF) (talk) 08:31, 12 May 2017 (UTC)
Hewiki done. Amire80 and IKhitron (talk) 21:43, 11 May 2017 (UTC)
... And we managed to do it without <nowiki />. <rant>I don't like using <nowiki /> without a particularly good reason. <nowiki /> doesn't appear even once in the article space in he.wikipedia (and if you do ever see it there, it must have been added very recently, and I haven't fixed it yet).</rant>
What we did do:
  • Replace some formulas with <math> or <chem>.
  • Replace some minus characters with the ־ character, which is unique to the Hebrew alphabet (sorry, other alphabets). It should be used instead of a minus in those contexts anyway.
  • Add the {{Hyphen}} template, already available in English. It inserts a hyphen-minus as &#45;. It's a hack, but not worse than <nowiki />.
Thanks to Cscott for making the list! --Amir E. Aharoni (talk) 21:51, 11 May 2017 (UTC)
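The character-reference approach described in the last bullet above can be illustrated with a short Python sketch. This only mimics the output the comment attributes to {{Hyphen}} (a hyphen-minus written as &#45;); the function name is hypothetical, and on a wiki one would use the template itself rather than a script.

```python
import html


def hyphen_as_entity(text: str) -> str:
    """Write the hyphen of '-{' as the numeric character reference '&#45;'."""
    return text.replace("-{", "&#45;{")


if __name__ == "__main__":
    escaped = hyphen_as_entity("sad :-{ face")
    print(escaped)
    # A reader still sees a hyphen-minus once the reference is decoded.
    print(html.unescape(escaped))
```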
Amir E. Aharoni and IKhitron, Awesome! :) Can you update SSastry (WMF) (talk) 22:24, 11 May 2017 (UTC)
Should <nowiki/> be avoided altogether, and an alternative be used? I'd like to hear. -DePiep (talk) 10:02, 12 May 2017 (UTC)
I've kept the <nowiki/> suggestion in Tech News for now, since the {{Hyphen}} template isn't available on all wikis reached by the newsletter. /Johan (WMF) (talk) 14:02, 12 May 2017 (UTC)

New text

How would you phrase an update for Tech News in a couple of sentences? I'm still a bit confused as to what the new information is that wasn't in the latest issue. /Johan (WMF) (talk) 22:46, 11 May 2017 (UTC)

Sorry, I'm on it. The new info is that there is a new list now covering all the wikis, and we have more suggestions about how to fix based on editors' experience. --Elitre (WMF) (talk) 07:54, 12 May 2017 (UTC)
  • Markup that looks like code for language variants might need to be fixed. If -{ is used in transclusions or web addresses it has to be escaped appropriately. You can use -<nowiki/>{ for transclusions and %2D{ in web addresses. This is because of some code fixes to the preprocessor and affects all wikis. Please help check the full list of wikis and fix if necessary (false positives are possible) to avoid breakage later this month. Some users have provided more detailed guidance about what to fix. [3][4] [5][6]
Johan (WMF), I kept most of the original content so that people can just copy over. How does this look? --Elitre (WMF) (talk) 08:15, 12 May 2017 (UTC)
I will also separately notify those communities where numbers look a bit high, like sister projects, although I think in the Wikisources case those may all be bad OCR cases, so probably not worth fixing. --Elitre (WMF) (talk) 08:34, 12 May 2017 (UTC)
OK, thanks. /Johan (WMF) (talk) 11:09, 12 May 2017 (UTC)
Subbu, we're going with Johan's version. Thanks all! --Elitre (WMF) (talk) 16:14, 12 May 2017 (UTC)


Translators!

It seems the Translate extension isn't working on Meta at the moment. I've set up a copy of the issue at mw:User:Johan (WMF)/Tech News translation 201720, with the same items and translate tags, where the tool is working. Let's assume this is fixed by the time I deliver Tech News on Monday. You can translate there, and then I can copy and paste your translation to Meta, if you haven't had time to fix it yourself. Or solve it some other way. We'll have translations, at least. /Johan (WMF) (talk) 03:52, 13 May 2017 (UTC)

This turned out to be a script error affecting more than the Translate extension; everything should be back to normal now. (: /Johan (WMF) (talk) 05:29, 13 May 2017 (UTC)