Wikimedia Forum/Archives/2020-06

From Meta, a Wikimedia project coordination wiki

New request

I added a new request here. https://meta.wikimedia.org/wiki/Classical_Arabic مروية سمر الهناد (talk) 00:30, 1 June 2020 (UTC)

What you created is a request for an entirely new sister project. What you should have done is add a request for a new language. Ruslik (talk) 20:15, 1 June 2020 (UTC)

Icon template attribution

Are you required to attribute icon templates (e.g. {{Done}}) to their source (on enwiki) when copying them to other WMF wikis? I've checked around, and so far there's not really much of an effort to do this (see wikisource, wikivoyage, ourselves, wikispecies, wikinews, wikidata, wikibooks, wikiquote), but I'm wondering whether it's really required (per a concern at Wikiquote). Dibbydib (talk) 23:28, 5 June 2020 (UTC)

Update request

On Requests for new languages please change “Wikipedia Awadhi“ from approved to created. Thanks!!! --151.49.47.168 18:40, 6 June 2020 (UTC)

bugfix request: problem displaying Meitei script (mni) in Chrome on Windows

The font settings for the Meitei script (mni) need updating. I've noticed a bug with displaying the script, both on en.wikipedia and, as of today, here as well. I have fonts installed that contain the script (Noto Sans Meetei Mayek and the Microsoft font family en:Nirmala UI), but the text is being displayed in some other font that gives me tofu. I get the problem using Chrome on Windows, a very common setup, so it's probably not just me. And if fixing it requires the reader to change settings, perhaps there should be a note explaining how?

the table on https://meta.wikimedia.org/wiki/User_language/mni as shown in different browsers on windows 10:
Example output | Format | Output I see on Chrome | Edge | Firefox
ꯏꯟꯗꯤꯌꯥ | font-family:'Noto Sans Meetei Mayek','Nirmala UI','Noto Sans';
ꯏꯟꯗꯤꯌꯥ | {{lang|mni|ꯏꯟꯗꯤꯌꯥ}}
ꯏꯟꯗꯤꯌꯥ | {{Script|Mtei|ꯏꯟꯗꯤꯌꯥ}} (works on en.wikipedia but not here)
ꯏꯟꯗꯤꯌꯥ | none

Irtapil (talk) 05:44, 9 June 2020 (UTC) updated: Irtapil (talk) 06:20, 9 June 2020 (UTC)

That page displays correctly for me (as far as I can tell), using Eeyek Unicode and Lohit Bengali on Firefox under GNU/Linux. What settings do you have for your mw:ULS webfonts? Nemo 06:11, 9 June 2020 (UTC)
@Nemo bis: I followed the link to Universal Language Selector, but I'm not sure which settings you mean. If my account gives different results in each browser, does that rule out the thing you were talking about with MW:ULS? Irtapil (talk) 14:58, 9 June 2020 (UTC)
Thanks @Nemo bis: (or @Nemo:?) Yeah, the problem is only in Chrome; it looks OK in other browsers (I updated above with more examples). But Chrome is a common browser, so I suspect it's not just me. I'll check that setting. Irtapil (talk) 06:20, 9 June 2020 (UTC)
If you're logged in with some browser but not the others, it might be an issue with your preferences. Just making sure we're comparing apples with apples. Nemo 06:28, 9 June 2020 (UTC)
@Nemo bis: I checked the login effect, and it made no difference. (I logged out on Edge and looked in an incognito window on Chrome and a private window on Firefox, which should have the same effect?) It looks like Edge is using Nirmala and Firefox is using Noto Sans Meetei Mayek. Irtapil (talk) 06:55, 9 June 2020 (UTC)

I have made the above proposal to reform the Requests for comment process, which for right now serves as the only dispute resolution forum for many wikis. --Rschen7754 21:17, 14 June 2020 (UTC)

Edit request: Module:WikidataIB

@Billinghurst: I've a quick edit request for Module:WikidataIB. Could an admin please update it with the code below (from w:Module:WikidataIB). It is necessary for the proper functioning of Template:WiR table row.

-------------------------------------------------------------------------------
-- qualsToTable takes most of the usual parameters.
-- The usual whitelisting, blacklisting, onlysourced, etc. are implemented.
-- A qid may be given, and the first unnamed parameter is the property ID, which is of type wikibase item.
-- It takes a list of qualifier property IDs as |quals=
-- For a given qid and property, it creates the rows of an html table,
-- each row being a value of the property (optionally only if the property matches the value in |pval= )
-- each cell being the first value of the qualifier corresponding to the list in |quals
-------------------------------------------------------------------------------
-- Dependencies: parseParam; setRanks; parseInput; sourced;
-------------------------------------------------------------------------------
p.qualsToTable = function(frame)
	local args = frame.args

	local quals = args.quals or ""
	if quals == "" then return "" end

	args.reqranks = setRanks(args.rank)

	local propertyID = mw.text.trim(args[1] or "")
	local f = {}
	f.args = args
	local entityid, props = parseInput(f, "", propertyID)
	if not entityid then return "" end

	args.langobj = findLang(args.lang)
	args.lang = args.langobj.code

	local pval = args.pval or ""

	local qplist = mw.text.split(quals, "%p") -- split at punctuation and make a sequential table
	for i, v in ipairs(qplist) do
		qplist[i] = mw.text.trim(v):upper() -- remove whitespace and capitalise
	end

	local col1 = args.firstcol or ""
	if col1 ~= "" then
		col1 = col1 .. "</td><td>"
	end

	local emptycell = args.emptycell or " "

	-- construct a 2-D array of qualifier values in qvals
	local qvals = {}
	for i, v in ipairs(props) do
		local skip = false
		if pval ~= "" then
			local pid = v.mainsnak.datavalue and v.mainsnak.datavalue.value.id
			if pid ~= pval then skip = true end
		end
		if not skip then
			local qval = {}
			local vqualifiers = v.qualifiers or {}
			-- go through list of wanted qualifier properties
			for i1, v1 in ipairs(qplist) do
				-- check for that property ID in the statement's qualifiers
				local qv, qtype
				if vqualifiers[v1] then
					qtype = vqualifiers[v1][1].datatype
					if qtype == "time" then
						if vqualifiers[v1][1].snaktype == "value" then
							qv = mw.wikibase.renderSnak(vqualifiers[v1][1])
							qv = frame:expandTemplate{title="dts", args={qv}}
						else
							qv = "?"
						end
					elseif qtype == "url" then
						qv = mw.wikibase.renderSnak(vqualifiers[v1][1])
						local display = mw.ustring.match( mw.uri.decode(qv, "WIKI"), "([%w ]+)$" )
						if display then
							qv = "[" .. qv .. " " .. display .. "]"
						end
					else
						qv = mw.wikibase.formatValue(vqualifiers[v1][1])
					end
				end
				-- record either the value or a placeholder
				qval[i1] = qv or emptycell
			end -- of loop through list of qualifiers
			-- add the list of qualifier values as a "row" in the main list
			qvals[#qvals+1] = qval
		end
	end -- of for each value loop

	local out = {}
	for i, v in ipairs(qvals) do
		out[i] = "<tr><td>" .. col1 .. table.concat(qvals[i], "</td><td>") .. "</td></tr>"
	end
	return table.concat(out, "\n")
end

Thank you! T.Shafee(Evo﹠Evo)talk 09:48, 8 June 2020 (UTC)

@Evolution and evolvability: I have just re-imported the module. I think that this is one script that just needs to be fully aligned with its origin, rather than patched. FWIW usually Meta:RfH would get the most attention for such a request.  — billinghurst sDrewth 06:59, 15 June 2020 (UTC)
@Billinghurst: Good idea - thank you. T.Shafee(Evo﹠Evo)talk 08:00, 15 June 2020 (UTC)

A small request

Can someone delete my user page and talk page on Commons? I do not wish to participate in that project any more. I tried asking the admins there, but they just ignored it. (I would also like my uploads removed, to whatever extent reasonable, but I doubt this is going to happen anyway.) — Keφr 16:08, 17 June 2020 (UTC)

We can't help you here, as Commons has its own admins. Usually talk pages can't be deleted, but it's up to the local policies of each wiki. Stryn (talk) 16:49, 17 June 2020 (UTC)

I proposed a new project which may be a complement of Wikidata. Feel free to discuss it.--GZWDer (talk) 06:35, 19 June 2020 (UTC)

Invitation to Wikimedia Café Sat 20 June 2020

meeting online live in one hour!

Everyone is invited to the monthly Wikimedia Café video chat on Saturday 20 June 2020. See the project page for details on this upcoming meeting and notes on previous meetings.

Wikimedia Café is a modest, one-hour, monthly online meeting which for the past few months has had fewer than 10 attendees. At these meetings anyone can propose to discuss any topic of broad Wikimedia community interest, as if we all were able to meet in person over coffee. The meetings themselves are an experiment in small group Wikimedia community conversation with video chat, phone access options, and online shared notetaking. Thanks. Blue Rasberry (talk) 15:34, 20 June 2020 (UTC)

Syntax recognition doesn't work

Hi!

For my colleague User:Bahnmoeller, for some completely inexplicable reason, ordinary characters that should work as syntax characters are not being recognized (I have fixed two of his posts here accordingly). Colons at the start of a line are not recognized as indentation, and four tildes are not recognized as a signature. I have already raised this with him on his user talk page on deWP, but he has no explanation. Can anyone here explain this somehow? Grüße vom Sänger ♫(Reden) 20:07, 24 June 2020 (UTC)

He did not insert a colon but a "Triangular Colon", which is normally used only for IPA. The tilde was not a regular tilde but a "Combining tilde overlay". Both were apparently inserted via the "Special characters" input tool. --Count Count (talk) 13:32, 25 June 2020 (UTC)

I'm typing something: ̈ that was the asterisk key on the numeric keypad

ː that is the colon on the . key

̃ that was Ctrl Alt tilde

no special characters tool involved

And now the four tildes from the "Insertable wiki markup" Bahnmoeller (talk) 16:16, 25 June 2020 (UTC)

I'll simply show this in large type, copied from you:
̈ that was the asterisk key on the numeric keypad
ː that is the colon on the . key
̃ that was Ctrl Alt tilde
And now the same thing from my keyboard:
* that was the asterisk key on the numeric keypad
: that is the colon on the . key
~ that was Alt Gr tilde
Do you see the difference? Grüße vom Sänger ♫(Reden) 16:44, 25 June 2020 (UTC)
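
The look-alike characters compared in this thread can also be told apart mechanically. The following is a minimal Python sketch, not something posted in the original discussion: it uses the standard `unicodedata` module to print the code point and official Unicode name of the two characters Count Count identified, next to the ASCII characters that wikitext expects. The `SAMPLES` table and the helper names `describe` and `is_wiki_syntax_char` are hypothetical, and the set of "syntax characters" is deliberately limited to the three mentioned here.

```python
import unicodedata

# Characters named in the thread, written as escapes so the look-alikes
# remain distinguishable in source form. (Illustrative selection only.)
SAMPLES = {
    "triangular colon": "\u02d0",
    "plain colon": ":",
    "combining tilde overlay": "\u0334",
    "plain tilde": "~",
}


def describe(ch: str) -> str:
    """Return the code point and official Unicode name of one character."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch, 'UNNAMED')}"


def is_wiki_syntax_char(ch: str) -> bool:
    """True only for the ASCII characters discussed here as wiki markup.

    Assumption: only ':' (indentation), '~' (signature) and '*' (list
    item) are considered; real wikitext has more markup characters.
    """
    return ch in {":", "~", "*"}


for label, ch in SAMPLES.items():
    print(f"{label}: {describe(ch)} -> syntax char: {is_wiki_syntax_char(ch)}")
```

Passing a suspect character copied from a page to `describe` is usually enough to see whether it is the ASCII character or a visually similar one.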

What I do is always the same on de, en, fr, and commons. It only fails on meta. And my razor tells me that something on meta must be different from the other Wikimedia installations. I will probably switch to copying. Bahnmoeller (talk) 10:34, 27 June 2020 (UTC)

I would see it that way too, but it would still be interesting to know which scripts or settings cause this here. Where, if not here in the forum, would be the right place to raise something like this? Grüße vom Sänger ♫(Reden) 10:36, 27 June 2020 (UTC)

Why does global rollback include autopatrol?

Tracked in Phabricator:
Task T256688

Global rollback includes the autopatrol right. This.. doesn't quite make sense to me? A global rollbacker may be able to spot and revert obvious vandalism on any wiki ("COCKS!!"), but if a global rollbacker tries to create or improve an article on a project in a language they are not fluent in, I would imagine that autopatrol would not be desirable. Similarly on Commons, users on Commons are only granted autopatrol if they are believed to have a sufficient understanding of copyright law. I doubt global rollbackers are put to the same test. — Alexis Jazz (ping me) 09:25, 29 June 2020 (UTC)

For the record: autopatrol was added to the group following this discussion. I wonder if we could have an autopatrolrestore right to replace autopatrol, like there is autoreviewrestore for FlaggedRevs. -- CptViraj (talk) 09:47, 29 June 2020 (UTC)
The autopatrol right has its justification. GRs make many error fixes in many projects and remove vandalism; this need not be checked.--𝐖𝐢𝐤𝐢𝐁𝐚𝐲𝐞𝐫 👤💬 09:58, 29 June 2020 (UTC)
@CptViraj: I thought rollbacks were already patrolled automatically, but I'm not sure. autopatrolrestore for mw-undo and mw-rollback could work. @WikiBayer: That's only true for the anti-vandalism work. Unless you expect global rollbackers to create an alternative account for regular use, this is not ideal. — Alexis Jazz (talk or ping me) 10:09, 29 June 2020 (UTC)
Oh yes I forgot, rollbacks are automatically patrolled. Then ya it could work for undo though. -- CptViraj (talk) 10:21, 29 June 2020 (UTC)
Not certain that the additional permissions are available by default, see Special:GlobalGroupPermissions/global-rollbacker for what is available.  — billinghurst sDrewth 15:13, 29 June 2020 (UTC)
In general, these users should be trustworthy enough to not clog up patrol backlogs all over the place - keep in mind that on many projects patrolling applies to every namespace. — xaosflux Talk 16:28, 29 June 2020 (UTC)
@Xaosflux: Patrolling isn't just about trust. If you aren't familiar with the customs of any given project, you shouldn't be autopatrolled there. And nobody is familiar with the customs of all projects. — Alexis Jazz (talk or ping me) 21:13, 29 June 2020 (UTC)
Just for reference, stewards and Jimbo Wales have autopatrol on all projects via their groups as well. — xaosflux Talk 01:51, 2 July 2020 (UTC)
I'd argue that those groups shouldn't have global autopatrol either, though Jimbo ('founder' group) is a special case as nobody else will ever be added to the 'founder' group and if Jimbo never makes edits on any wiki that require patrolling, there would be literally no possible use case. Also, the 'founder' group is in part "a traditional and largely honorary thing". — Alexis Jazz (talk or ping me) 08:59, 2 July 2020 (UTC)
OK, leave aside the founder group as a traditional and largely honorary thing. What next? Global Flow creator, Global interface editors, Global sysops, Staff, Stewards and System administrators: all those groups have autopatrol on all projects. Global rollbackers have adequately demonstrated trust from the community. Regards, ZI Jony (Talk) 10:46, 2 July 2020 (UTC)
It's useful for projects with recent change patrolling enabled, where counter-vandalism may involve manually editing to account for merge conflicts, or when only part of an edit is problematic. DannyS712 (talk) 00:51, 30 June 2020 (UTC)
@DannyS712: The problem is it can't be turned off. Even the most trusted user shouldn't be autopatrolled when editing a wiki they aren't familiar with beyond anti-vandalism. And rollbacks are already autopatrolled, which leaves just mw-undo. And if you are removing only part of an edit or you manually edit to avoid an edit conflict on, say, jawiki, perhaps your edit should be patrolled. You might have accidentally removed something that shouldn't have been removed, overlooked a part of the vandalism or the article may contain unspotted vandalism done prior to the edit you have just reverted. And if you remove so much vandalism on jawiki without error that it bothers the patrollers.. Well, they can still just make you autopatrolled, right? The alternative here is that I start discussions about this on local projects, explain the pros and cons, and some might opt out of global rollbackers. I think that would be worse. — Alexis Jazz (talk or ping me) 05:21, 30 June 2020 (UTC)
For not being able to turn it off, for page creation, see w:Wikipedia talk:New pages patrol/Reviewers#Autopatrol and global rollback. As for rollbacks being autopatrolled, source please DannyS712 (talk) 05:23, 30 June 2020 (UTC)
@DannyS712: Not sure a bot is the ideal solution in this case. Here's what I meant about the patrolling thing: Rollback shouldn't patrol without log entry. — Alexis Jazz (talk or ping me) 08:37, 30 June 2020 (UTC)
Try it the other way around: having a vote on every single wiki to grant autopatrol to these 91 users who are in most cases clueless about local policies and language. For example, it didn't take me very long to find this edit by User:1997kB (global rollbacker) that reduced the number of shops from 25000 (25.000) to 25 (25,000). (and even marked that edit as minor) w:Decimal separator for more info. I'll grant you that in this particular case 1997kB wasn't a global rollbacker yet, but that's nitpicking. For a more recent example, see [1] where you inserted the template "Birth date and age".. which doesn't exist on dewiki. — Alexis Jazz (talk or ping me) 09:36, 30 June 2020 (UTC)
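
The 25.000 versus 25,000 mix-up comes down to which character groups thousands and which marks the decimal point. A minimal Python sketch, assuming a hypothetical helper `parse_grouped_number` that takes both separators explicitly (it is not any wiki's or library's API), shows how the same digits yield different values under the German and English conventions:

```python
def parse_grouped_number(text: str, group_sep: str, decimal_sep: str) -> float:
    """Parse a number written with explicit grouping and decimal separators.

    Hypothetical helper for illustration: it strips the grouping separator
    and then normalises the decimal separator to "." before converting.
    """
    cleaned = text.replace(group_sep, "").replace(decimal_sep, ".")
    return float(cleaned)


# German convention: "." groups thousands, "," marks decimals.
german_reading = parse_grouped_number("25.000", group_sep=".", decimal_sep=",")

# English convention: "," groups thousands, "." marks decimals.
english_reading = parse_grouped_number("25.000", group_sep=",", decimal_sep=".")

print(german_reading, english_reading)
```

Read with the German convention the string is twenty-five thousand; read with the English convention it is twenty-five, which is the kind of change the linked edit made.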
I was editing that wiki because I was reviewing a draft on enwiki ([2]), and it has nothing to do with global rollback. As you said, I was not a GR back then, so this edit (even if it was not right) went through local processes, and citing it here is misleading. [Please ping if anybody needs further clarification]. ‐‐1997kB (talk) 10:04, 30 June 2020 (UTC)
@1997kB: Are you saying that if you would have had global rollback (autopatrol) at that time, you wouldn't have made the edit? This isn't anything against you (frankly if your username had been Z1997kB I probably wouldn't have picked you), it's just about global autopatrol in general. — Alexis Jazz (talk or ping me) 10:11, 30 June 2020 (UTC)
What I am trying to say is a 3-4 month old account (as I was) can't have GR. GR applications go through long established community process and as I have seen people don't support users who are clueless about global rollback guidelines. ‐‐1997kB (talk) 00:15, 1 July 2020 (UTC)
For dewiki, I am already an `autochecked user` and an `editor`, so gr has nothing to do with it. Also, the dewiki edit you linked was imported from enwiki, where I made the edit in question - see w:de:Special:Redirect/logid/120250064 DannyS712 (talk) 11:05, 30 June 2020 (UTC)

Comment: Where is this discussion going? This is a rights allocation that has been in place for twelve years and there is no evidence of abuse nor of community-reported problems. We are talking about 75 users who have passed through a community consultation after demonstrating their value to the global community. The list is a well-credentialed list of contributors to Wikimedia. It seems to have become an argument in search of a problem.  — billinghurst sDrewth 11:31, 30 June 2020 (UTC)

Further, the creation of a phabricator ticket and its linking is quite presumptive with a discussion at this point.  — billinghurst sDrewth 11:45, 30 June 2020 (UTC)
When someone falls back to the argument "it's been like this for (insert amount of time)", you know you've struck a nerve. That argument can be used to defend men-only voting rights, the death sentence, slavery, colonization, apartheid and so on. It's not a useful argument for anything. As for the Phabricator ticket, it's just a feature request. It doesn't ask for enabling it anywhere. It could be used globally, locally or even on non-WMF wikis which are often forgotten to exist. Any local project for example might decide to grant rollback to a relatively new user who is interested in fighting vandalism. The reverts by that user may not require patrolling, but other edits to articles still might as the user may not be familiar yet with all the policies. The whole discussion was triggered by the Tulsi Bhagat case as he lost his global sysop+rollback rights for "abusing global autopatrol", despite there being no way to disable autopatrol and it not being a conscious decision to enable autopatrol for the offending edits. You too seem to think that autopatrol is just about trust, but it isn't. Understanding of local policy and language is required, and since nobody has an understanding of all local policies on Wikimedia.. — Alexis Jazz (talk or ping me) 10:28, 1 July 2020 (UTC)
(wearing my community member hat) I can understand where this concern comes from, but I think that the pros of granting global rollbackers autopatrol outweigh the cons, especially for projects with recent change patrolling. No, GRs don't know local policies everywhere, but they can generally be trusted to contribute and improve rather than harm the wikis. --DannyS712 (talk) 10:38, 1 July 2020 (UTC)
(wearing my developer hat) The work needed to implement such a feature request would likely only be warranted if this right was going to be used on wmf wikis, and had signoff from whatever the relevant WMF team is. While the new EditResult class should make this a bit easier for detecting manual reverts, it would still be a fair amount of work to write the code for, and, having made this mistake myself, I know that developers are less likely to write the code if they don't know if it will be approved and merged. Until there is consensus that such a right is needed, the phabricator ticket is indeed a bit premature. --DannyS712 (talk) 10:38, 1 July 2020 (UTC)

Does autopatrol not simply mean that an edit can be presumed to not be vandalism? Even on dewiki, which uses the "escalated" version of autopatrol, FlaggedRevs, reviewing only means a page is not vandalised, not that the edit is high-quality or whatever. --MF-W 11:58, 1 July 2020 (UTC)

@MF-Warburg: This may differ from project to project, but I know that at least on Commons, being known to not be a vandal is not sufficient to get autopatrol. Edits that may need to be reverted or adjusted need to be patrolled. So on Commons, without any need for malicious intent, your edits need patrolling in case you accidentally upload copyvio or overcategorize stuff. On Commons, files are only categorized in the deepest subcategory, so a picture that is in Category:Titanic shall not be in Category:Boats. But newbies don't know that. I sometimes make small changes to articles in languages I don't understand 100%. Generally altering or adding an image caption for an image I replace or add. These captions often take the form of "Last name in year", which requires me to figure out how to say "in" in that other language. I don't think I've ever gotten it wrong, but such changes should be patrolled, even though I'm not a vandal. — Alexis Jazz (talk or ping me) 13:01, 1 July 2020 (UTC)
commons appears to be fine giving this to those that make a large amount of good edits that do not need to be patrolled, to reduce the clutter on Watchlists. - "edits" are pretty broad. — xaosflux Talk 15:38, 1 July 2020 (UTC)
@Xaosflux: The guideline is 500 useful edits total, but it gets judged on a case-by-case basis. If you have 100 useful anti-vandalism reverts but haven't demonstrated good knowledge of copyright, you likely wouldn't be granted autopatrol. If you have 500 useful anti-vandalism reverts but also recently uploaded some photos of the Burj Khalifa, chances are your request for autopatrol would be denied. Also, 100 edits are a drop in a bucket on Commons. — Alexis Jazz (talk or ping me) 00:34, 2 July 2020 (UTC)
@Alexis Jazz: does commonswiki have a "global rights policy" specifying how global rights may/may not be used? — xaosflux Talk 15:41, 1 July 2020 (UTC)
@Xaosflux: Nothing I'm aware of. — Alexis Jazz (talk or ping me) 00:34, 2 July 2020 (UTC)
  • Global rollback contains autopatrol because, back when it was created, many/some projects still required all edits to be patrolled (not just page creations), and GRs were thought to be trusted. GR is still a very difficult group to be admitted to, the users are highly trusted, there have been no actual issues with the autopatrolled access beyond one incident that was at least 50% misunderstanding on enwiki. There is no need to change the status quo. – Ajraddatz (talk) 16:07, 1 July 2020 (UTC)
  • I agree with Ajraddatz; if there is any consensus on Commons, then you can open a proposal discussion on COM:VP/P. I think the bot DannyS712 proposed for enwiki could also work for Commons, or a global rights policy could be made for commonswiki. As Ajraddatz said, many projects still require all edits to be patrolled, like commons, wikidata, bnwiki and many more where RC patrolling is enabled. Regards, ZI Jony (Talk) 17:10, 1 July 2020 (UTC)
    I'm repeating myself, but autopatrol isn't just about trust. You can have the best intentions in the world and still not get autopatrol until you understand the language and policies of any given project. I still believe autopatrol for edits other than reverts (and things like global file renames and username changes) is a local matter, not a global one. Another method (that doesn't require new MediaWiki features) could be to grant autopatrol to all global rollbackers on all projects on which they have at least 50 edits and distribute a message where each project can look up which users were granted autopatrol. Then autopatrol could be removed from global right sets without suddenly overloading patrol backlogs, and local projects can control how to deal with it. — Alexis Jazz (talk or ping me) 00:34, 2 July 2020 (UTC)
  • Per what Ajraddatz said, I suggest creating a new Global Patroller global user group, which would be very helpful for patrolling wikis and would be easier to obtain than global rollbacker, which takes a long time to get and requires a high level of trust. Global patroller could simply encourage users to care about Wikimedia projects and build cooperation on a small scale. Syman51 19:02, 1 July 2020 (UTC)
    Global Patroller will not be possible, because such a user would need to understand every local patrolling policy. For example, under the Bangla Wikipedia new article patrol policy, no article cleanup tags for conflicts or sources/quotes/references can be added within 48 hours. CSD judgments, such as copyright, spam, advertising, autobiography, etc., are not covered by this policy. Regards, ZI Jony (Talk) 10:46, 2 July 2020 (UTC)

Having that right attached makes no sense, is unrelated, and IMO a bad idea. IMO you should remove it. North8000 (talk) 05:57, 3 July 2020 (UTC)

  • I'm inclined to remove autopatrol from the GR group unless autopatrol could actually be "turned off". Enwiki can unpatrol an article via page curation and has the AFC process, but other projects cannot turn it off. While GRs may know what they are doing, will try to comply with local policies, and sometimes consult before acting on projects where they do not hold local autopatrol, it wouldn't hurt to double-check their article creations/revisions until they get autopatrol. If they are not active members of that community, it wouldn't impose a burden on that wiki's workflow. If they are active, they can request local autopatrol when eligible.--94rain Talk 14:21, 3 July 2020 (UTC)

Proposal: That WMF ask Google to stop indexing certain bot-generated articles

  • Problem: Google is picking up on ceb.wikipedia geography articles that have been created by bot. The articles are more than 95% spot-on, but when they are wrong they are very wrong. The incorrect information may even be in the title. One example is that the bot produced three articles for the same island in the same location, but each article claimed the island was in a different county. This would not be a problem to the English-speaking world if it weren't that the titles of the articles are in English, either completely or in part. Because of this, the bot-generated information gets fed into Google results for searches made in English. Due to the thousands of bot-generated articles, it is not practical for humans to completely curate the ceb.wikipedia articles anytime soon. As many of the bot-generated articles have some value, it does not seem reasonable to demand that ceb.wikipedia change its standards and delete all the articles en masse.
Proposal: Instead, the WMF should ask Google to stop indexing all ceb.wikipedia geography articles in English speaking lands such as North America and Canada that depict local geographies like counties, cities, towns, buildings, islands, reefs, etc. Articles on states, provinces, and countries may still be indexed. There should be an opt-in whitelist mechanism for certain important locations like Manhattan or Oahu, but ceb.wikipedia must indicate that they have curated these articles with a human reviewer.
Additionally, Google needs to not index all wikipedia clones of the blacklisted ceb.wikipedia geography articles. The WMF should ask Google to do this.--Epiphyllumlover (talk) 19:30, 12 June 2020 (UTC)
Now that this is in the right venue, I'll comment on its merits. First, this doesn't require an explicit request to Google, and could be done by noindexing the pages. This could be accomplished by a configuration change, followed by an edit to ceb:Template:paghimo ni bot. Second, why is it the duty of the Wikimedia Foundation to care about individual wikis failing to properly curate their content? * Pppery * it has begun 20:02, 12 June 2020 (UTC)
Noindexing sounds like a great idea. In addition, the WMF should coordinate with Google so that all Wiki-clones of noindexed bot-generated pages do not show up on Google either. Google needs to develop software that tracks which wikipedia pages are noindexed, and then use it as a screen against clones. The WMF should suggest this. As for the duty question, the WMF doesn't have a duty to get information right but rather an incentive if they want people to take the encyclopedia seriously. An analogy could be made with the National Weather Service offices--before releasing their data they coordinate with other regional offices to make sure that their forecasts do not contradict each other. There is nothing inherently wrong with, say, a rural area near the border of two National Weather Service districts having wildly different forecasts on the same day in a border area, but the NWS people are concerned that if the regional offices contradict each other, the people reading the forecasts will think they are probably both unreliable. It is more of a consumer confidence measure to preserve their own status as an authority than a duty to the public.
The problem arises if you live in VBNM County, search for "ASDF Island", and personally know that "ASDF Island" is in your county. If Google comes back with results from ceb.wikipedia titled on the search engine page "ASDF Island (ZXCV County)"--you will think "That is completely wrong; of course it is wrong--it's Wikipedia!" Several months ago I was confused and called up the county courthouse of the neighboring county to double check. The county employee was not pleased with me, from the nearby county, questioning the ownership of his county's island. He explained that it had always been part of his county. (I'm not an irredentist, really!).
If all the page titles were in Cebuano it would not be a problem, but location titles in many languages are frequently kept in their original languages. I would imagine that if the English Wikipedia titled their article "El Paso (Oklahoma)" the people who work on the Spanish Wikipedia would not be pleased that when Spanish speaking people type "El Paso" Google rings up both their wikipedia article on "El Paso (Texas)" and the enwiki article on "El Paso (Oklahoma)"--it would make people not take the es.wikipedia seriously. If the English speaking article was titled "The Pass (Oklahoma)" there would be no trouble to the Spanish speakers because only the English speakers would be led astray. But place names don't work that way. If one language gets it wrong, it confuses people searching Google in all the languages. So if there were thousands of errors like "El Paso (Oklahoma)" and "Mexico City (Honduras)" and the people managing the English speaking wikipedia only managed to fix a few every year, it would be reasonable for the people working on the es.wikipedia to want Google to stop indexing all the non-curated English wikipedia articles.
Since some people have reservations about potentially granting censorship authority to the WMF, as an alternative someone could translate a kindly request to the Cebuano wikipedia and ask them to no-index the geography pages tagged with the bot tag (it appears at the top of all the bot pages). If they are willing to do it on their own, then all the WMF would have to do is coordinate with Google to get the clones no-indexed too.--Epiphyllumlover (talk) 22:15, 12 June 2020 (UTC)
How about "El Paso (Baja Oklahoma)"? :-). Smallbones (talk) 23:35, 27 June 2020 (UTC)
Epiphyllumlover, you need to understand that the WMF does not control Google, and it's very unlikely that Google would take any notice of them over content on their own sites, and even more unlikely that it would over content on mirrors. Google does what it thinks is best for itself, which is to list the results of search queries according to its own algorithms, not some special pleading from the owners of web sites. Phil Bridger (talk) 20:19, 23 June 2020 (UTC)
So you are saying that it couldn't hurt to ask and that the WMF should do so? :) --Guy Macon (talk) 05:55, 29 June 2020 (UTC)
  • What evidence do we have that the ceb.wiki search results are a problem for anyone in the real world? Simply being indexed in Google doesn't mean that your page will ever be shown to a real user. You'd need to find what kinds of keywords, language and geography ever get ceb.wiki onto the first page of results, and whether it displaces some more suitable result. Judging from the unique devices, with 100k "users" per month it's hard to believe there is any rampant excess visibility. The spikes of 500-600k might point to temporary changes in Google traffic. Nemo 06:04, 29 June 2020 (UTC)
That's the right question, I think. Because I check geographical names for Vicipaedia (Latin), I often look on Google for strange word combinations. I typically get a small number of results, with, somewhere in the list, Vicipaedias in unexpected languages -- e.g. Indonesian, because both Indonesian and Latin use the word "universitas". I never yet noticed Cebuano in my results, but, if I did, it would be because of a chance homonymy like that.
And, meanwhile, a Cebuano-speaking user of Google might actually want those pages and might improve one of them. There's no reason to prevent that. Andrew Dalby (talk) 13:03, 30 June 2020 (UTC)
Delete - GeoNames is absolutely not a reliable source of information. Anyone can add anything they want to GeoNames and it is subject to vandalism just like Wikipedia.[3] GeoNames is also a giant vacuum-cleaner of geographic databases and has no standards for inclusion. Even locations that were simply points in a geographic survey a hundred years ago can have entries in GeoNames. Finally, if a community doesn't have the resources to maintain a set of articles, they shouldn't have those articles. Not only does it erode Wikipedia's reputation for reliable information, but it pollutes the internet with endless copies and mirrors of bogus information. If the articles aren't deleted, I also support noindexing. Kaldari (talk) 14:24, 28 July 2020 (UTC)
Hi Kaldari, thanks for joining the discussion! Sorry for pestering you, but if you still have access to the Google Webmaster stuff then maybe you can check whether you have any partial answer to my doubts above. Nemo 15:55, 28 July 2020 (UTC)
@Nemo bis: I'm a real-world user, and it is terrible, and I have spent a few nights sorting out dupli- to quadruplicates, and daydreaming of creative ways to hurt robots, if bots were actual hardware (dropping them off at IBM customer support is the cruellest idea so far).
I work with historical data, specifically I am trying to synchronise about a dozen databases of Shoah victims. That means, for example, finding out which of several Polish villages with the same name is someone's birth place, or if "Bad Freienwalde" and "Freienwalde" are the same (they are: one database uses historical names, others current, and "Bad" is like an honorary title that cities in Germany apparently get and lose depending on air quality or whatever).
I use Wikidata for this, with either custom software or OpenRefine. In principle, and for some countries, this works rather well. Except Wikidata tracks changes from cebwiki and creates items for every page. As a result, small places, especially in Germany, often have five or more identical entities. I am not exaggerating, just look at this atrocity: Kategoriya:Alemanya_paghimo_ni_bot. As but a single example, there are thirteen (update: eighteen, when the variant with double 'a' is included) pages for a place called Ach, which might be one or two, or maybe it's a river crossing through the region:

Aach (kapital sa munisipyo)
Aach (lungsod)
Aach (munisipyo sa Alemanya, Baden-Württemberg Region, Freiburg Region, lat 47,85, long 8,85)
Aach (munisipyo sa Alemanya, Rheinland-Pfalz, lat 49,78, long 6,60)
Aach (suba sa Alemanya, Baden-Württemberg Region, lat 47,73, long 9,23)
Ach (suba sa Alemanya, Baden-Württemberg Region, lat 47,87, long 10,03)
Ach (suba sa Alemanya, Baden-Württemberg Region, lat 47,93, long 9,64)
Ach (suba sa Alemanya, Baden-Württemberg Region, lat 47,95, long 9,82)
Ach (suba sa Alemanya, Baden-Württemberg Region, lat 48,40, long 9,80)
Ach (suba sa Alemanya, Bavaria, lat 47,57, long 10,15)
Ach (suba sa Alemanya, Bavaria, lat 47,58, long 10,68)
Ach (suba sa Alemanya, Bavaria, lat 47,65, long 10,80)
Ach (suba sa Alemanya, Bavaria, lat 47,70, long 11,16)
Ach (suba sa Alemanya, Bavaria, lat 47,78, long 11,11)
Ach (suba sa Alemanya, Bavaria, lat 47,82, long 10,55)
Ach (suba sa Alemanya, Bavaria, lat 48,73, long 11,06)
Ach (suba sa Alemanya, Bavaria, lat 48,76, long 11,57)
Ach (suba sa Alemanya, lat 47,90, long 10,12)

...and that's only the ones they helpfully added to the same category. There are duplicate categories, as well, with more Achs.
And this clearly isn't just a one-time mistake! Their attempts failed because identical titles can't be created, so they started adding "(city)" and "(town)" and so on. When they ran out of synonyms for cities, they just added geo-coordinates. The psychology that would compel someone to keep doing this, even though it is neither helpful nor lucrative nor appreciated, is fascinating.
Delete sounds great. Thankfully, it does not seem to be ongoing right now, and svwiki has done a lot of work to clean up the mess the same user created there before they were banned. In the meantime, not synchronising with that pile of data that truly deserves the term dump is helpful to contain the damage, which Wikidata is actually doing as far as I can tell --Matthias Winkelmann (talk) 10:08, 12 August 2020 (UTC)
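For anyone facing the same clean-up, the titles in the Ach list above follow a regular enough pattern ("Name (kind ..., lat NN,NN, long NN,NN)") that candidates for merging can be grouped mechanically. The following is only a rough sketch under that assumption: the function names `parse_title` and `group_candidates` are made up for illustration, the pattern is inferred solely from the titles quoted in this thread, and a shared name and kind only marks pages for manual review; it does not prove they describe the same place.

```python
import re
from collections import defaultdict

# Assumed title shape, inferred from the titles quoted in this thread:
#   "Name (kind sa Country, Region, lat 47,57, long 10,15)"
TITLE_RE = re.compile(r"^(?P<name>[^()]+?)\s*\((?P<qualifier>.*)\)$")


def parse_title(title):
    """Split a bot-generated title into name, kind and optional coordinates."""
    match = TITLE_RE.match(title.strip())
    if match is None:
        # No parenthesised qualifier at all: keep the bare name.
        return {"name": title.strip(), "kind": None, "lat": None, "lon": None}

    # Qualifier parts are separated by ", " (comma plus space); the decimal
    # comma inside "47,57" has no space after it, so it survives the split.
    parts = match.group("qualifier").split(", ")
    lat = lon = None
    for part in parts:
        if part.startswith("lat "):
            lat = float(part[len("lat "):].replace(",", "."))
        elif part.startswith("long "):
            lon = float(part[len("long "):].replace(",", "."))

    # The first word of the qualifier is treated as the feature kind.
    kind = parts[0].split(" ")[0]
    return {"name": match.group("name"), "kind": kind, "lat": lat, "lon": lon}


def group_candidates(titles):
    """Return groups of titles that share both name and kind (size > 1)."""
    groups = defaultdict(list)
    for title in titles:
        info = parse_title(title)
        groups[(info["name"], info["kind"])].append(title)
    return {key: members for key, members in groups.items() if len(members) > 1}
```

Each group returned by `group_candidates` is a list of titles to check by hand, for example by comparing the parsed coordinates against a map before merging the corresponding Wikidata items.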
Delete --Tibet Nation (talk) 20:57, 20 September 2020 (UTC)
Question: Is there any reason not to just add __NOINDEX__ to ceb:Plantilya:Paghimo ni bot? (Edit: just realized this would require a phab request to change the value of $wgExemptFromUserRobotsControl, never mind.) PiRSquared17 (talk) 05:07, 26 August 2020 (UTC)