Talk:Spam blacklist/Archives/2020-01

From Meta, a Wikimedia project coordination wiki
Warning! Please do not post any new comments on this page. This is a discussion archive first created on 01 January 2020, although the comments contained were likely posted before and after this date. See current discussion or the archives index.

Proposed additions

This section is for completed requests that a website be blacklisted.

Request for 3 domains

As per my request for the account locks here, this is being used to spam some garbage site and is potentially a source for hoaxes. Praxidicae (talk) 15:56, 2 January 2020 (UTC)
@Praxidicae: Added to Spam blacklist. --Martin Urbanec (talk) 16:02, 2 January 2020 (UTC)
@Martin Urbanec: I added another from this spam set that's being used for the same purpose. Running a report now. Please add this as well. Also pinging @Ohnoitsjamie: as this might be of interest to you... Praxidicae (talk) 16:27, 2 January 2020 (UTC)
@Praxidicae: Added to Spam blacklist. --Martin Urbanec (talk) 16:29, 2 January 2020 (UTC)
Good catch. I blacklisted schooltips on en for repeat spamming obits. Ohnoitsjamie (talk) 16:36, 2 January 2020 (UTC)

Indian financial scheme spam

Request moved from English wikipedia SBL per request due to cross-wiki spam. See COIBot reports for and for both English and Hindi wikipedias. Ravensfire (talk) 03:33, 2 January 2020 (UTC)

@Ravensfire: Added to Spam blacklist. -- — billinghurst sDrewth 10:30, 3 January 2020 (UTC)

Redirection service. JzG (talk) 18:11, 4 January 2020 (UTC)

@JzG: Added to Spam blacklist. --Martin Urbanec (talk) 18:15, 4 January 2020 (UTC)

Link shortener reported at enWP blacklist. Can't verify much about it, alas. My filters block it as malvertising. JzG (talk) 10:53, 5 January 2020 (UTC)

@JzG: Added to Spam blacklist. It definitely seems to be a malicious site. It's also not a normal shortener, though. --Martin Urbanec (talk) 12:11, 5 January 2020 (UTC)

Cross-wiki spam. Tgeorgescu (talk) 15:15, 31 December 2019 (UTC)

It was a short burst of spamming from one user, so I am not certain that blacklisting is needed after a single short spell. I would prefer that we monitor rather than blacklist, and have set COIBot to do its monitoring and reporting.  — billinghurst sDrewth 10:24, 3 January 2020 (UTC)
@Tgeorgescu: Declined at this time  — billinghurst sDrewth 03:02, 15 January 2020 (UTC)

Proposed removals

This section is for archiving proposals that a website be unlisted.

This news site is approved by Google News. I think we want this type of site for reference links, but I saw that it is blacklisted. Please remove this site from the blacklist, because anybody would want this type of site for references. —The preceding unsigned comment was added by Wikialinaparker (talk)

@Wikialinaparker: Not globally blacklisted, so there is nothing for us to do here. It looks to be blacklisted at English Wikipedia. Defer to w:en:Mediawiki talk:spam-blacklist  — billinghurst sDrewth 11:29, 16 January 2020 (UTC)

Troubleshooting and problems

This section is for archiving Troubleshooting and problems.

d:Q21980377 (Sci-Hub)

Need to add the new official URL,, with the Recommended rank (other URLs need not be changed as well as their ranks). Sci-Hub is globally blacklisted for a good reason, but this is the element devoted to the website itself, so it needs this link, especially since other sci-hub.* URLs are already there. Perhaps the simplest way to do it is to de-blacklist it temporarily. --colt_browning (talk) 17:32, 10 January 2020 (UTC)

Declined. You can whitelist it locally at Wikidata with mediawiki:Spam-whitelist. --Martin Urbanec (talk) 17:10, 11 January 2020 (UTC)
@Martin Urbanec: No, it should stay blacklisted. Otherwise, people will add links to Sci-Hub to the elements of scientific articles (in good faith) which is bad because Sci-Hub violates copyright (see blacklisting discussion). If there is a way to whitelist a URL for just a single element, please explain how to do it. Or, do you mean temporary whitelisting? --colt_browning (talk) 19:46, 11 January 2020 (UTC)
@Colt browning: temporary whitelisting is an option, but that will affect all wikis that use the wikidata item to display the official link (as it is then on that page, editing on each wiki will be problematic). This likely needs a phab ticket to really get to a proper solution, as any official website that is blacklisted runs into this problem. I don't think we can do anything really here, and would rather strongly suggest against doing something on WikiData due to the effects that will have. --Dirk Beetstra T C (en: U, T) 08:24, 12 January 2020 (UTC)
@Beetstra: Since other URLs in that Wikidata item are also affected by the blacklist, all wikis already cannot use Wikidata to display the official link, so adding the new URL won't harm them. Also, this Wikidata element is used by an external website which is not affected by blacklisting, so there is perfect sense in updating the Wikidata element. With this in mind, maybe still consider temporary whitelisting/de-blacklisting? Anyway, I agree that it needs a proper solution and will prepare a Phabricator ticket. --colt_browning (talk) 09:29, 12 January 2020 (UTC)
@Colt browning: the point is, that the other URLs in that Wikidata item are causing problems on other wikis - they cannot be 'used' on other wikis. Complete de-listing causes a horde of problems (quite some material on that site should never be linked to), local whitelisting will solve the WD problem, but we still have/get the horde of other problems. This really needs another solution - either we need a global whitelist for /about pages so we can avoid this, or a completely different solution (e.g. a flag on WD that the data is there on WD and locally whitelisted, but cannot be 'pulled' onto other wikis). None of that is now there. --Dirk Beetstra T C (en: U, T) 10:55, 12 January 2020 (UTC)
@Colt browning and Beetstra: Maybe use negative lookahead in the blacklist to make sure it's not /about? Not sure what WD wants to link to though. --Martin Urbanec (talk) 11:17, 12 January 2020 (UTC)
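As an illustration of the negative-lookahead idea, here is a minimal sketch in Python. The rule, the `sci-hub.example` placeholder domain and the `is_blocked` helper are all assumptions made for this example; this is not the actual blacklist entry, and the real extension may assemble and apply its patterns differently.

```python
import re

# Hypothetical rule, NOT the real blacklist entry: match any sci-hub.* link
# unless the path is exactly /about (with or without a trailing slash).
# The \b after \w+ keeps the engine from backtracking into the TLD and
# slipping past the lookahead.
RULE = re.compile(r"\bsci-hub\.\w+\b(?!/about/?$)", re.IGNORECASE)


def is_blocked(url: str) -> bool:
    """Return True if the hypothetical rule matches somewhere in the URL."""
    return RULE.search(url) is not None


if __name__ == "__main__":
    for url in (
        "https://sci-hub.example/about",         # excluded by the lookahead
        "https://sci-hub.example/10.1000/xyz",   # still blocked
        "https://sci-hub.example/",              # front page, still blocked
    ):
        print(url, "->", "blocked" if is_blocked(url) else "allowed")
```

Without the second `\b`, the pattern would backtrack by one character and match `/about` links anyway, which is an easy mistake to make with lookaheads placed after a greedy quantifier.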
@Martin Urbanec: A good idea. In fact, if we globally whitelist links to frontpages only (something like \/\/sci-hub\.\w*\/?$), it solves the issue completely. Sci-Hub is blacklisted because of the copyrighted content, so the link to the front page is harmless (and no one is going to spam it, I guess). --colt_browning (talk) 11:27, 12 January 2020 (UTC)
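To see which links the proposed frontpage pattern would let through, here is a small Python sketch. The pattern is the one quoted in the comment above; the `is_frontpage` helper and the `sci-hub.example` placeholder domain are assumptions for illustration, and how a global whitelist would actually be applied is not specified in this thread.

```python
import re

# Pattern copied unchanged from the comment above.
FRONTPAGE = re.compile(r"\/\/sci-hub\.\w*\/?$")


def is_frontpage(url: str) -> bool:
    """Return True if the URL ends at the site root of a sci-hub.* host."""
    return FRONTPAGE.search(url) is not None


if __name__ == "__main__":
    for url in (
        "https://sci-hub.example",              # root, no trailing slash
        "https://sci-hub.example/",             # root, trailing slash
        "https://sci-hub.example/10.1000/xyz",  # deep link
        "https://www.sci-hub.example/",         # subdomain
    ):
        print(url, "->", is_frontpage(url))
```

Because the pattern requires `//` directly before `sci-hub`, subdomain variants are not covered, and `\w*` also accepts an empty TLD. Either detail might need tightening if such a whitelist entry were ever adopted.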
@Colt browning: No, it doesn't. The frontpage is often the one that is the source of the abuse/spam; allowing only the frontpage like you did enables the same rubbish that the blacklist is supposed to stop. My base example here is, the front page is on en.wikipedia only abused a handful of times a day (example: .. why does a Russian submarine need a link to the frontpage of pornhub? Or an Ohio school: And all the spam companies that are being blacklisted.
@Martin Urbanec: Yes, but only on request could we exclude with a negative lookahead (a recurring example where you don't want to do this by default is Encyclopedia Dramatica, who will just abuse what you whitelist). But I agree, that is likely not what WD wants. For the Wikipedias you want a representative landing page, for WikiData you want the data ... --Dirk Beetstra T C (en: U, T) 12:40, 12 January 2020 (UTC)
@Beetstra: Well, Sci-Hub is not Pornhub. Its frontpage doesn't violate anything. It is blacklisted because it gives access to copyrighted content, and users add direct links to the copyrighted content on Sci-Hub. People add links to Pornhub because they think it's funny, it's just common vandalism. Also, there are lots of websites that are interested to have links from Wikipedia (e.g., some news websites), so they add links to particular pages or the frontpages, and this is common spam. But Sci-Hub is a completely different case. It's not spam, it's not vandalism, it's just people linking copyrighted content in good faith. --colt_browning (talk) 12:48, 12 January 2020 (UTC)
@Colt browning: people can also add links to websites to spam, pornhub is indeed a particular example but that does not mean that it does not happen with companies as well - sci-hub would not be the first company that replaces any (wikilink to) Sci-Hub with a link to their frontpage to spam. And if not, then Sci-Hub would be rather an exception than a rule. --Dirk Beetstra T C (en: U, T) 12:52, 12 January 2020 (UTC)
@Beetstra: Yes, it is an exception, and that's exactly what I'm talking about. Also, it's not a company. See w:Sci-Hub. I bet if you check the spam filter hits (I don't know whether it's possible) you will find only direct links to articles on Sci-Hub, not the kind of spamming you're talking about. --colt_browning (talk) 12:58, 12 January 2020 (UTC)
@Colt browning: you don't know how to check, but you bet that it will be only that. I know Sci-Hub is not a company per se, but state-funded museums also need to show that their website is efficient and have been found spamming. Not being a company is not a reason not to spam.
Now I do agree that for Sci-Hub at the very least most of the 'abuse' is on direct linking to (likely) copyvio material. But I'd prefer a proper solution which is more general, and this is not it. (and I am still pondering whether it is possible to abuse this through some template magic). --Dirk Beetstra T C (en: U, T) 13:10, 12 January 2020 (UTC)
"You don't know how to check, but you bet": that's the whole point. If you check my prediction and find it correct, that would convince you better. Scientific method at its best. "Not being a company is not a reason not to spam": I've never said it is (although in this case it's not simply not a company or a state organization, it is a single person who is already banned at least on ruwiki anyway). "I do agree": well, thanks. --colt_browning (talk) 13:20, 12 January 2020 (UTC)

@Colt browning: but that is the point, it is not my job to show that you are correct. It is on you to show that a site is not problematic, and I have given 2 reasons why we do not exclude top domains, and I hinted at an undisclosed reason: excluding top domains opens a technical possibility for abuse (you need to know how, but it is rather easy; people try to evade blacklists all the time, this opens one of the ways of doing so, and I recall having seen this trick before).

It does not matter, I used the word 'organisations' for a specific reason, meaning to cover a one-person organisation as well. The single-person owner of sci-hub did set it up to make a point, and other people who agree with that point (and even those who disagree with that point) can try to force the point. The text 'find <this petition> on' or even 'type in your browser and type <this petition> in the search box and vote' has been used to circumvent that spam blacklist block. People find their ways. There is no need to make that as easy as possible.

Do note that this global spam-blacklist is to protect 800+ mediawiki wikis and thousands of external wikis that chose to use this blacklist. Your mileage may vary but I would need very strong arguments why we have to change this spam-blacklist practice to enable just one of them. You are not convincing me that excluding the top domain is the best solution, but maybe other admins here can be convinced to do exactly that. It is just advice that excluding top domains is in my opinion a very bad idea. This is in desperate need of a proper Phab ticket that solves the problem properly. We likely need a flag on WD that states that certain external links (or other data) on WD should not (i.e. NEVER) be used on any client wiki, as that would block editing the client page on all wikis that are clients. That allows WD to have the data (through local whitelisting of some kind) while not possibly disturbing editing on hundreds or thousands of wikis. --Dirk Beetstra T C (en: U, T) 11:12, 14 January 2020 (UTC)


This section is for archiving Discussions.

This is a very old entry added by Special:Diff/82686 (in 2004!) and we do not have discussion logs on why this was added. A Google search suggests that we probably do not want to remove the entry, but can we at least add a \b so it does not match stuff like (the official site of a Japanese composer)? We have a request on Japanese Wikipedia to add this latter site to the local spam-whitelist, but given the age of the entry I feel it is more prudent to request adding \b instead of blindly adding a whitelist entry.--ネイ (talk) 15:19, 13 January 2020 (UTC)

@ネイ: Done. Good catch, thanks for the notification. I have done some of the others in that time and space, not that I wandered too far around the list.  — billinghurst sDrewth 22:48, 13 January 2020 (UTC)
Thanks - I have replied to the request in jawiki accordingly.--ネイ (talk) 02:00, 14 January 2020 (UTC)
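The effect of the `\b` fix requested in this thread can be sketched in Python as follows. The entry and both domains are placeholders invented for this example (the actual entry is not quoted in this archive), and the `matches` helper only approximates how a blacklist fragment is searched for inside a URL.

```python
import re

# Hypothetical blacklist fragments; the real entry is not shown in the thread.
OLD_ENTRY = re.compile(r"badsite\.example")    # unanchored, as in old entries
NEW_ENTRY = re.compile(r"\bbadsite\.example")  # same entry with a leading \b


def matches(entry: re.Pattern, url: str) -> bool:
    """Return True if the fragment is found anywhere in the URL."""
    return entry.search(url) is not None


if __name__ == "__main__":
    for url in (
        "http://badsite.example/",      # intended target
        "http://www.badsite.example/",  # subdomain of the target
        "http://notbadsite.example/",   # unrelated site sharing a suffix
    ):
        print(url, "old:", matches(OLD_ENTRY, url), "new:", matches(NEW_ENTRY, url))
```

Note that `\b` only requires a change between word and non-word characters, so a hyphenated name such as `not-badsite.example` would still match the anchored entry. A local whitelist entry remains the fallback for such cases.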