Talk:Spam blacklist

From Meta, a Wikimedia project coordination wiki
Revision as of 06:42, 9 February 2020

Shortcuts: WM:SPAM · WM:SBL
The associated page is used by the MediaWiki Spam Blacklist extension, and lists regular expressions matching URLs that cannot be added to any page in Wikimedia Foundation projects (as well as many external wikis). Any Meta administrator can edit the spam blacklist, either manually or with SBHandler. For more information on what the spam blacklist is for, and the processes used here, please see Spam blacklist/About.
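The matching model can be sketched roughly as follows. This is a hedged illustration in Python (the real extension is written in PHP and performs additional URL normalisation); the `BLACKLIST` entries here are hypothetical examples, not actual list entries:

```python
import re

# Hypothetical excerpt of blacklist entries (one regex per line, as on the list page).
BLACKLIST = [
    r"\bexample-pharma\.com\b",
    r"\burl-shortener\.example\b",
]

# The extension combines entries into large alternations for speed;
# a simple per-entry scan behaves the same way for illustration.
compiled = [re.compile(p, re.IGNORECASE) for p in BLACKLIST]

def is_blocked(url: str) -> bool:
    """Return True if any blacklist regex matches somewhere in the URL."""
    return any(rx.search(url) for rx in compiled)

print(is_blocked("http://www.example-pharma.com/buy"))  # True
print(is_blocked("http://example.org/article"))         # False
```

An edit that adds a URL for which `is_blocked` is true is rejected when the page is saved, which is why every entry on this page is written as a regex fragment rather than a plain domain name.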

Proposed additions
Please provide evidence of spamming on several wikis. Spam that only affects a single project should go to that project's local blacklist. Exceptions include malicious domains and URL redirector/shortener services. Please follow this format. Please check back after submitting your report, as there could be questions regarding your request.
Proposed removals
Please check our list of requests which repeatedly get declined. Typically, we do not remove domains from the spam blacklist in response to site-owners' requests. Instead, we de-blacklist sites when trusted, high-volume editors request the use of blacklisted links because of their value in support of our projects. Please consider whether requesting whitelisting on a specific wiki for a specific use is more appropriate - that is very often the case.
Other discussion
Troubleshooting and problems - If there is an error in the blacklist (e.g. a regex error) which is causing problems, please raise the issue here.
Discussion - Meta-discussion concerning the operation of the blacklist and related pages, and communication among the spam blacklist team.
#wikimedia-external-links - Real-time IRC chat for co-ordination of activities related to maintenance of the blacklist.
Whitelists
There is no global whitelist, so if you are seeking to have a URL whitelisted on a particular wiki, please address the matter on that wiki's Mediawiki talk:Spam-whitelist page, and consider using the template {{edit protected}} or its local equivalent to draw attention to your request.

Please sign your posts with ~~~~ after your comment. This leaves a signature and timestamp so conversations are easier to follow.


Completed requests are marked as {{added}}/{{removed}} or {{declined}}, and are generally archived quickly. Additions and removals are logged · current log 2024/07.

SpBot archives all sections tagged with {{Section resolved|1=~~~~}} after 7 days and sections whose most recent comment is older than 15 days.

Proposed additions

This section is for proposing that a website be blacklisted; add new entries at the bottom of the section, using the basic URL so that there is no link (example.com, not http://www.example.com). Provide links demonstrating widespread spamming by multiple users on multiple wikis. Completed requests will be marked as {{added}} or {{declined}} and archived.

Links used for self-promotion by LTA Alex9777777 (aka Aleksej Pechkurov)

  • Regex requested to be blacklisted: \blovifm\.com\b
  • Regex requested to be blacklisted: \bvk\.com/pechkurovaleksej\b
  • Regex requested to be blacklisted: \bgoogle\.com/share/ZazhHI6x\b
  • Regex requested to be blacklisted: \byoutube\.com/channel/UCoKBMvSoQxpWJQvGCms5QdA\b
  • Regex requested to be blacklisted: \bmusicbrainz\.org/artist/0bcffd70-5637-4334-91f6-f5aec5f57bef\b
  • Regex requested to be blacklisted: \btwitter\.com/AGPechkurov\b
  • Regex requested to be blacklisted: \bgenius\.com/artists/Pechkurov-aleksej\b
  • Regex requested to be blacklisted: \bsoundcloud\.com/pechkurov_aleksej_official\b
  • Regex requested to be blacklisted: \blyricstranslate\.com/ru/pechkurov-aleksej # without \b
  • Regex requested to be blacklisted: \bdrive\.google\.com/file/d/1jNY8PuAUkycPLKhRwyCXPX3nwjNyPese\b

Some of them have been recently used on en:Draft:Pechkuroy (Blogger), simple:Pechkuroy (Blogger) and uk:Печкуров (Блогер). He also likes to add links to Google search: https://www.google.com/search?kgmid=/g/11c5ss6vkf&hl=en-BY&kgs=246cd2fd882c6c3e&q=%D0%90%D0%BB%D0%B5%D0%BA%D1%81%D0%B5%D0%B9+%D0%9F%D0%B5%D1%87%D0%BA%D1%83%D1%80%D0%BE%D0%B2&shndl=0&source=sh/x/kp&entrypoint=sh/x/kp, https://www.google.com/search?source=hp&ei=vkhMXbGbMe2orgS6sJ3YDw&q=lovifm+top+songs&oq=lovifm+top+&gs_l=psy-ab.1.0.0.8453.11741..14071...0.0..0.102.1136.13j1......0....1..gws-wiz.....0..0i131j0i30j0i10i30.TtJchDO79wQ. --jdx Re: 12:08, 8 February 2020 (UTC)Reply
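For readers unfamiliar with the syntax, the `\b` markers anchor each pattern at word boundaries, so a rule such as `\blovifm\.com\b` matches the domain itself but not an unrelated superstring. A quick illustration in Python (assuming the blacklist's PCRE-like dialect, which `re` approximates closely enough here):

```python
import re

# One of the requested rules above.
pattern = re.compile(r"\blovifm\.com\b")

# Matches the bare domain and subpaths...
print(bool(pattern.search("https://lovifm.com/track/1")))  # True
print(bool(pattern.search("https://www.lovifm.com")))      # True
# ...but not a longer word that merely contains it.
print(bool(pattern.search("https://lovifm.company")))      # False
```

This is why some requests above carry the note "# without \b": a trailing boundary is deliberately dropped when the blocked URL should also match longer path continuations.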

Links used for self-promotion by globally banned user Projects (aka George Reeves Person)

  • Regex requested to be blacklisted: \btrafford\.com/bookstore/bookdetail\.aspx\?bookid=SKU-000695816\b
  • Regex requested to be blacklisted: \bamazon\.com/Lubeks-Threelogy-Sweet-Science-Blockbuster-ebook\b
  • Regex requested to be blacklisted: \bbarnesandnoble\.com/w/lubeks-threelogy-the-sweet-science-2-jan-lubek\b
  • Regex requested to be blacklisted: \bgoogle\.com/books.*q=.*lubek's%20threelogy
  • Regex requested to be blacklisted: \blibly\.net/book/f4c2458279c81f7e2709dd59e6e54f4c\b
  • Regex requested to be blacklisted: \btwitter\.com/Lubek16\b
  • Regex requested to be blacklisted: \bonevsone\.tripod\.com\b
  • Regex requested to be blacklisted: \bantagonists\.webnode\.com\b
  • Regex requested to be blacklisted: \boocities\.org/georgereevesproject\b
  • Regex requested to be blacklisted: \bencyclopediasupreme\.org/(?:Rocky|0000)\b
  • Regex requested to be blacklisted: \bmywikibiz\.com/(?:0000|2356|Rocky_Marciano|User:(?:Books|Boxstuf))\b
  • Regex requested to be blacklisted: \byoutube\.com/watch\?v=(?:CTZIT2rYgGM|WlxM-BmCSvo)\b
  • Regex requested to be blacklisted: \btheundefeated\.com/features/muhammad-ali-rocky-marciano-super-fight-battle-of-undefeated-heavyweights # without \b
  • Regex requested to be blacklisted: \bgeocities\.ws/cmby2k\b
  • Regex requested to be blacklisted: \baordycz\.blogspot\.com\b
  • Regex requested to be blacklisted: \bpublichealthinsurance\.xyz/booker/pzijagaaqbaj\b

I am too lazy to search for examples but I'm pretty sure that you have already seen many of these links. --jdx Re: 12:08, 8 February 2020 (UTC)Reply
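Several of these requests use non-capturing groups, `(?:a|b)`, to bundle multiple paths on one domain into a single rule. A minimal sketch of how such an alternation behaves, again using Python's `re` as a stand-in for the blacklist's regex dialect:

```python
import re

# One of the requested rules: block two specific paths on the same domain.
rule = re.compile(r"\bencyclopediasupreme\.org/(?:Rocky|0000)\b")

print(bool(rule.search("http://encyclopediasupreme.org/Rocky")))  # True
print(bool(rule.search("http://encyclopediasupreme.org/0000")))   # True
# Other paths on the domain remain usable.
print(bool(rule.search("http://encyclopediasupreme.org/other")))  # False
```

Bundling paths this way keeps the list short while leaving the rest of the domain unaffected, which matters when only specific pages were spammed.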

Proposed additions (Bot reported)

This section is for domains which have been added to multiple wikis as observed by a bot.

These are automated reports; please check the records and the link thoroughly, as the bot may report good links! For some more info, see Spam blacklist/Help#COIBot_reports. Reports will automatically be archived by the bot when they become stale (fewer than 5 links reported, none of which have been edited in the last 7 days, and where the last editor is COIBot).

Sysops
  • If the report contains links to fewer than 5 wikis, then only add it when it is really spam
  • Otherwise just revert the link-additions, and close the report; closed reports will be reopened when spamming continues
  • To close a report, change the LinkStatus template to closed ({{LinkStatus|closed}})
  • Please place any notes in the discussion section below the HTML comment

COIBot

The LinkWatchers report domains meeting the following criteria:

  • When a user mainly adds this link, and the link has not been used too much, and this user adds the link to more than 2 wikis
  • When a user mainly adds links on one server, and links on the server have not been used too much, and this user adds the links to more than 2 wikis
  • If ALL links are added by IPs, and the link is added to more than 1 wiki
  • If a small range of IPs have a preference for this link (but it may also have been added by other users), and the link is added to more than 1 wiki.
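The first criterion above can be sketched as a simple aggregation over observed link additions. This is a hypothetical illustration of the logic only; COIBot's actual implementation and data model differ:

```python
from collections import defaultdict

# (user, link, wiki) tuples as the LinkWatchers might observe them (hypothetical sample).
link_additions = [
    ("Spammer1", "badsite.example", "enwiki"),
    ("Spammer1", "badsite.example", "dewiki"),
    ("Spammer1", "badsite.example", "frwiki"),
    ("Editor2", "goodsite.example", "enwiki"),
]

# Collect the set of wikis each (user, link) pair was added to.
wikis_per_user_link = defaultdict(set)
for user, link, wiki in link_additions:
    wikis_per_user_link[(user, link)].add(wiki)

# Flag a pair when the user added the link to more than 2 wikis.
flagged = [pair for pair, wikis in wikis_per_user_link.items() if len(wikis) > 2]
print(flagged)  # [('Spammer1', 'badsite.example')]
```

The other criteria follow the same shape, aggregating by server or by IP range instead of by user.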
COIBot's currently open XWiki reports
List: vrsystems.ru
Last update: 2023-06-27 15:51:16, by COIBot
Site IP: 195.24.68.17
Related IPs: 192.36.57.94, 193.46.56.178, 194.71.126.227, 93.99.104.93
Last link addition: 2070-01-01 05:00:00
User: 4 · Link: 4

Proposed removals

This section is for proposing that a website be unlisted; please add new entries at the bottom of the section.

Remember to provide the specific domain blacklisted, links to the articles they are used in or useful to, and arguments in favour of unlisting. Completed requests will be marked as {{removed}} or {{declined}} and archived.

See also recurring requests for repeatedly proposed (and refused) removals.

Notes:

  • The addition or removal of a domain from the blacklist is not a vote; please do not bold the first words in statements.
  • This page is for the removal of domains from the global blacklist, not for removal of domains from the blacklists of individual wikis. For those requests, please take your discussion to the pertinent wiki, where such requests would be made at Mediawiki talk:Spam-blacklist on that wiki.

Troubleshooting and problems

This section is for comments related to problems with the blacklist (such as incorrect syntax or entries not being blocked), or problems saving a page because of a blacklisted link. This is not the section to request that an entry be unlisted (see Proposed removals above).

WD and blacklisted links

There are now several threads on this page regarding the removal/whitelisting of links that are needed on WD. Time to write out things in a more general way so we can then discuss.

First: YES, I fully agree that any subject on WD should have a link to its official website, regardless of whether it is blacklisted. The problem is that this will result in many technical problems, as we are talking here about the global blacklist. This blacklist is used by 800+ Wikimedia wikis and thousands of non-Wikimedia wikis. I do not know whether any wikis outside the 800+ Wikimedia wikis use WD for data, so I will keep it to 800+ possible cases of 'disruption' per allowed Wikidata item (noting that some property calls display WD data of one item on multiple pages of one Wikipedia).

  1. Whitelisting the item on WD enables WD to save the item. However, any page that uses the WD item is affected (e.g. if facebook.com were blacklisted on meta and one whitelisted en:Donald Trump's facebook on WD and added it to his properties, editing of that one page could be disrupted on 207 wikis (if all of them use the WD data); for en:Pornhub (which is globally blacklisted) it could disrupt editing of that one page on the 47 wikis that currently connect to the item; for some subjects it may be several hundred wikis). All wikis that use the WD data will have to individually whitelist the same link, which then allows that link to be used on any page on that wiki, and hence negates the global blacklisting (for PornHub that was the problem, as it is for many spammed top domains). (For those with the incentive (which spammers have: it pays their bills) and the technical know-how, this can be (and has been/is) abused to link to anything anywhere on any wiki that followed WD's suit of whitelisting.)
  2. Excluding the top domain here allows both WD and any local wiki to save the data, and it would not disrupt any of the other wikis. However, that allows all wikis to use that link everywhere. Again, that negates the reason for blacklisting the top domain (for those with the incentive (which spammers have: it pays their bills) and the technical know-how, this can be (and has been/is) abused to link to anything on that domain anywhere on any wiki).
  3. Whitelisting or excluding a neutral landing page (an /about page, e.g.) does give a reasonable way to stop random abuse (the random school kid will not add pornhub's /about, as it is a) not as fun, and b) less obvious). Local whitelisting on WD also needs whitelisting everywhere else, a problem that does not arise with the global exclusion of /about (but that requires a large adaptation to the meta spam-blacklist). Of course, linking to the /about is 'not correct' for WD (it is not the homepage), while that is less problematic for the other wikis (it is a representative page of the company; it is standard practice on en.wikipedia).
  4. Excluding WD from using the global spam-blacklist (or overriding the global spam-blacklist with a blanket whitelist) would enable WD to do whatever they want; it would however result in the same disruption as described in 1 and 2. Moreover, WD would also get all the spam they do not want, unless they then blacklist everything that is spammed locally (and yes, some spammers start at WD nowadays, as you might spam multiple wikis with one edit).

These methods are all available, but across all the possible wikis and pages they would likely disable editing on hundreds to thousands of pages per wiki, and disallow using WD data on all those wikis that do not use it yet (though for the latter, they cannot add it locally at the moment either).

I can imagine a solution where data on WD can be set correctly, but be blocked from being used by client wikis. That would however need a separate flag to be defined for each WD item, which needs a phabricator ticket and implementation. But I am open to other solutions, as this does need a proper solution. --Dirk Beetstra T C (en: U, T) 12:03, 14 January 2020 (UTC)Reply

First of all, thank you for summarizing this issue.
A minor note: option (1) is not as disruptive as it may seem, as it is possible to edit a page which actually imports blacklisted URLs from Wikidata. Still, whitelisting on WD is indeed far from being a satisfactory solution.
A technical solution has come into my mind. It does not require a new flag for each item, but still requires much work to implement. The idea is that every WD item has its own whitelist which is editable only by sysops (similarly to editnotices) and affects only that WD item and the pages on other wikis which are associated with that item. --colt_browning (talk) 13:10, 17 January 2020 (UTC)Reply
@Beetstra and Colt browning: Another possible way is to ask sysadmins, e.g. @Reedy (WMF):, to run an SQL *ALTER TABLE* command to force adding a P856-related field, force-set the value(s), and lock (Lock=EXCLUSIVE?) the edit action for that field. But this means that after adding it, we would have to see this malpractice every day: there is a P856 value on an item which cannot be edited normally; we can't add further P856 values, nor add qualifiers for it, nor remove it (or it doesn't even have an edit button, or always has a grey edit button that can't be clicked?). --Liuxinyu970226 (talk) 04:43, 18 January 2020 (UTC)Reply
@Colt browning: I don’t think it works. For en.wikipedia the next edit will try to put the link in the db for that page which is disallowed due to the blacklist. I’ll try to test that.
@Liuxinyu970226: that still gives the problem as described for any wiki that uses it, you are basically doing option 1 in a different way. —Dirk Beetstra T C (en: U, T) 11:42, 18 January 2020 (UTC)Reply
@Beetstra:[citation needed] --Liuxinyu970226 (talk) 10:29, 22 January 2020 (UTC)Reply
@Liuxinyu970226: What needs a citation? --Dirk Beetstra T C (en: U, T) 10:40, 22 January 2020 (UTC)Reply
Do you want a demonstration? Go ahead, get '\bpornhub\.com\b' whitelisted on wikidata (even just for the demonstration, I can even de-list it here for the sake of demonstration) and add it as official website to d:Pornhub. Then follow up on en.wikipedia to try and add '{{Official website}}' (without the nowikis) on en.wikipedia (or call the property transclusion directly) on all those 47 wikis and see what you get. You can even do it now by adding '{{Official website}}' (without the nowikis) to en:Cloud mining (no, you did not try that but I have suggested that earlier). Now do that for every item in WD that has a globally blacklisted official website and see how many pages will face troubles over our 800s of wikis. --Dirk Beetstra T C (en: U, T) 10:56, 22 January 2020 (UTC)Reply
@Beetstra: So what do you think of my proposed technical solution? Should I write it up as a phab ticket? I understand that an idea has very little value, only implementation is valuable, but still. Also, this was a great proposal; if you are going to propose it again in 2020, please let me know, I'll call for votes in my home wiki. --colt_browning (talk) 09:08, 26 January 2020 (UTC)Reply
@Colt browning: I would add it as an option to the phab-ticket I created. It will then be up to the developers to see what is most feasible. --Dirk Beetstra T C (en: U, T) 10:16, 26 January 2020 (UTC)Reply
Comment I wonder whether Wikidata's interface (i.e. Wikibase) could be developed to store SNI information. That way, 1. users don't have to type any kind of domain for such "sensitive properties" as official website; they would instead enter some random checksums; 2. every Wikipedia would have to have a gadget to reverse-populate the domain by checking SNIs; 3. and simply, every website must therefore support https in order to be stored in Wikidata, and should preferably use EV SSL certificates. --Liuxinyu970226 (talk) 10:57, 22 January 2020 (UTC)Reply
That would still require the local wiki to decode the checksum and have the external link in the final product, which still means that it got 'added' to en.wikipedia, which is the part that is prohibited. Blacklisted means blacklisted. Links that are blacklisted cannot be displayed. You can easily test this as well: add 'moc.buhnrop//:sptth' in a template, and write a second template that inverts the text so it displays https://pornhub.com, and you will see that it will not save your inversion template. Do you really think it is this easy, that you can think of tricks to circumvent the blacklist that have not yet been tried by spammers who get paid to have their links spread all over? It is technically made impossible to link to blacklisted links. --Dirk Beetstra T C (en: U, T) 11:27, 22 January 2020 (UTC)Reply
(by the way, with this you get closer to the abuse trick that I allude to above) --Dirk Beetstra T C (en: U, T) 11:47, 22 January 2020 (UTC)Reply
Wikipedias can be spammed just because "it's shown"? Then this discussion can be closed right now. If solutions based on ESNI, TLS v3.0, QUIC, or even post-quantum encryption (which are all de facto Google encryption mechanisms) can't solve what you're concerned about, then you're pointing at a Fermat-like issue, which is the reason I'll cease my efforts on this discussion. --Liuxinyu970226 (talk) 13:14, 22 January 2020 (UTC)Reply
@Liuxinyu970226: I am sorry, but either I don't understand what you are trying to do, or indeed, links that are blacklisted cannot be 'shown' on-wiki. As far as I understand, you want to be able to have (just as an example) d:Q936394 to have https://pornhub.com as the value for the property 'official website', right? --Dirk Beetstra T C (en: U, T) 13:22, 22 January 2020 (UTC) (I tried that: but failed, obviously --Dirk Beetstra T C (en: U, T) 13:24, 22 January 2020 (UTC)Reply
@Beetstra: Yes, but not only this one. I'm asking for a valid solution to bypass this blacklist, which even unfairly restricts the administrators of Wikidata. --Liuxinyu970226 (talk) 02:42, 23 January 2020 (UTC)Reply
@Liuxinyu970226: Yes, I understand that it is for literally all links that are the official website of a subject. They are a serious problem (and not only on WikiData; e.g. en.wikipedia would also like to be able to link to the official website of a subject, and it is a small but constant stream of (both de-blacklist and whitelist) requests on en.wikipedia).
The deeper problem is that the spam-blacklist extension is black-and-white. Things blacklisted are disallowed everywhere, in any namespace, on any page, by anyone. And where it involves this blacklist (the global one), that applies to all 800+ Wikimedia projects and the thousands of projects outside that use this blacklist. Most of the sites on this list are utterly useless material (viagra spam, etc.), but a reasonable number are of some, albeit very limited, use on our projects (again, back to pornhub: only about 50 pages throughout our wikis have that as the official website, while barring a very few exceptions the other tens of millions of pages on all those wikis combined do not need the link).
We can allow holes in that system through local whitelisting, but that needs to be done with care. For some pages one could easily allow only the top domain (whitelist '\bexample\.com$', or exclude the top domain here, '\bexample\.com.', or more complex), but that means that that top domain can be added anywhere (again back to pornhub: up to 10 hits each day occur because that top domain link is used to replace a school website).
The blacklist being this black-and-white means that we do have a problem. That was already recognized 14 (!!!) years ago (task T6459), and I personally have been trying for a couple of years now to get the spam blacklist overhauled (see Community_Wishlist_Survey_2017/Miscellaneous/Overhaul_spam-blacklist and Community Wishlist Survey 2019/Admins and patrollers/Overhaul spam-blacklist). The spam blacklist breaks stuff; it is too crude. I totally agree that we need a solution, but as I currently see it, there is no real workable solution (maybe except for excluding a neutral non-top-domain landing page on the global spam blacklist; it is not really what WD would want, but it is currently as close as we can get, and anything else allows for spam (or, wider: abuse)). It would require a major rewrite of many rules here on the spam blacklist (which can be done more easily if we adapt our script), but anything else needs a serious phab ticket to overhaul the spam blacklist (which I do not see WMF doing ...). --Dirk Beetstra T C (en: U, T) 05:27, 23 January 2020 (UTC)Reply
Also, even under the current blacklist settings, it looks like terrorism things, e.g. [1], can't be prohibited. How do we think that such edits aren't "spam"? --Liuxinyu970226 (talk) 03:43, 23 January 2020 (UTC)Reply
@Liuxinyu970226: The blacklist has nothing to do with terrorism; that is totally out of its scope. The spam blacklist is about links to websites. That may be something for the AbuseFilter, which is better suited for that. --Dirk Beetstra T C (en: U, T) 05:27, 23 January 2020 (UTC)Reply
The problem for WD is that the additions there can be abused at the sister wikis. If you need something that links outwards, can't you craft something within your own system? If we are looking at something that is inward-facing, then implement something that is not an active URL. The drive for WD perfection seems to come at a cost to everyone else. I am already seeing enough abuse of WD by the same publicity spamhauses that are trying to invade enWP, and there are fewer defences at WD; at this stage I see that removal from the blacklists is just going to worsen the situation.  — billinghurst sDrewth 07:45, 25 January 2020 (UTC)Reply

Temporary solution

As I don't see that task T243484 will be solved anywhere soon (it will need technical changes, testing, etc. etc.) there is currently only one workable solution that we could implement but which will need all parties to be willing to deviate from the 'perfect' solution:

  • We open here a special request section where requests can be posted to exclude a neutral landing page from the global spam-blacklist. Those pages are generally /about or /information pages, not the top domain. When requested by established editors, this pretty much defaults to supporting the change to the rule (with the understanding that there will be (rare) exceptions). Please do not request top domains (even though technically possible); such requests will not be granted.

This excludes a working link that can be used on any wiki, and will hence not result in problems when WD data is being re-used on other wikis. Any other solution that I currently see will result in editing problems on all client wikis. It will require quite some work by admins here (adapting rules), and willingness from WD to have non-perfect data in their fields (at least until task T243484 is solved), but the alternative will be the current status quo. --Dirk Beetstra T C (en: U, T) 07:11, 26 January 2020 (UTC)Reply
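One way such an exclusion could be expressed in a single rule (a hedged sketch; the actual list syntax and the exact regex used here may differ, and `example.com` is a placeholder) is a negative lookahead that carves the neutral landing page out of an otherwise blanket domain match:

```python
import re

# Hypothetical rule: block example.com everywhere except its neutral /about page.
rule = re.compile(r"\bexample\.com(?!/about\b)")

print(bool(rule.search("https://example.com/")))       # True: still blocked
print(bool(rule.search("https://example.com/video")))  # True: still blocked
print(bool(rule.search("https://example.com/about")))  # False: the landing page is allowed
```

Note that this sketch also allows anything under /about/, which is one reason such rule changes need case-by-case review by admins rather than a mechanical rewrite.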

Comment
  • What if there is no "neutral landing page" except for the front page (which, however, has no improper content)? --colt_browning (talk) 09:00, 26 January 2020 (UTC)Reply
    • @Colt browning: That is rather exceptional (whitelisting neutral landing pages is common practice on en.wikipedia), but that can then likely be catered for. Note that 'improper content' is not really the reason that we don't allow the frontpage (the frontpage of the organisation is hardly ever 'inappropriate', it is that the frontpage is the abused page). Being 'inappropriate' is also generally not the reason that we blacklist, it is that the page is spammed / abused. --Dirk Beetstra T C (en: U, T) 10:03, 26 January 2020 (UTC)Reply
  • and in the case of no "neutral landing page" (as in the case of Sci-Hub). --colt_browning (talk) 08:45, 27 January 2020 (UTC)Reply
    • @Colt browning: sci-hub.ren/#about would do ... it is really a rare exception. --Dirk Beetstra T C (en: U, T) 10:02, 27 January 2020 (UTC)Reply
      • Why not just sci-hub.si/#? Works for other websites as well. --colt_browning (talk) 10:48, 27 January 2020 (UTC)Reply
        because as a generic solution it is inexact, and doesn't point to about pages at many sites; it is also equally abusable for Beetstra's previous examples.  — billinghurst sDrewth 11:42, 27 January 2020 (UTC)Reply
        I'd agree that it is inexact, and I would prefer it to be specific (and sometimes a neutral landing page is just the better place to send people to; schoolkids are smart, and you can just wait for them to figure out that <porn-site.com>/# works and has the same 'shock' effect as the top domain, while many of the notable (and more 'decent') porn sites do have a non-shock SFW page somewhere). It may be a good one for the odd case where there really is no neutral point of landing. Anyway, I do not disagree with the exclusion in your vote, and we can see that on a case-by-case basis. --Dirk Beetstra T C (en: U, T) 11:56, 27 January 2020 (UTC)Reply
  • This would solve the problem of official website being blocked by the spam blacklist, but what about other properties that links to external URLs such as official blogs, terms of service URL, privacy policy URL or website account on (with a URL qualifier)? --Trade (talk) 22:29, 27 January 2020 (UTC)Reply
    • @Trade: as stated in the header, this is supposed to be a temporary solution; task T243484 is supposed to result in a proper solution. The main problem is with the official websites, as these are regularly re-used on practically all connected wikis (which may in some cases go up to 200 client wikis). I agree that there are other website-related properties, but those (for now) could generally just be whitelisted on WikiData. If that data is (heavily) re-used, then I would make it fall under this same case (there is no reason that we cannot whitelist '\bexample\.com\/(FAQ$|About$)'). --Dirk Beetstra T C (en: U, T) 05:25, 28 January 2020 (UTC)Reply
  • TBH, the reason why I didn't offer an opinion on supporting or not is in general about the SSL KeyIDs: the SSL certificate combinations of one site may differ across its servers (afaik one Japanese example of this is Pixiv, where they use one KeyID for their main domain and other KeyIDs for Pawoo, Pixiv Sketch, Pixiv Comics, Pixiv company IR info, etc.); under such circumstances the entered URLs may not work as-is (it may or may not be a 304 redirect, or it may only show an "under construction" or similar placeholder). --Liuxinyu970226 (talk) 04:46, 3 February 2020 (UTC)Reply
    • @Liuxinyu970226: I am sorry, but I still don't understand what this has to do with KeyIDs. This is about here having a speed procedure to exclude a neutral landing page for blacklisted domains so that can be used as the 'official website' in items that need an official website. What you seem to be talking about is at the moment far outside of the capabilities of the (global) software regarding the spam-blacklist. --Dirk Beetstra T C (en: U, T) 05:29, 3 February 2020 (UTC)Reply
Support
Not support

Discussion

This section is for discussion of Spam blacklist issues among other users.