Talk:Spam blacklist

From Meta, a Wikimedia project coordination wiki
This is an archived version of this page, as edited by VasilievVV (talk | contribs) at 15:31, 9 May 2008 (→‎good way to prove a point!: re). It may differ significantly from the current version.

Shortcut:
WM:SPAM
The associated page is used by the Mediawiki Spam Blacklist extension, and lists strings of text that may not be used in URLs in any page in Wikimedia Foundation projects (as well as many external wikis). Any meta administrator can edit the spam blacklist. There is also a more aggressive way to block spamming through direct use of $wgSpamRegex. Only developers can make changes to $wgSpamRegex, and its use is to be avoided whenever possible.

For more information on what the spam blacklist is for, and the processes used here, please see Spam blacklist/About.

Please post comments to the appropriate section below: Proposed additions, Proposed removals, or Troubleshooting and problems. Read the messageboxes at the top of each section for an explanation. Please also check back some time after submitting, as there could be questions regarding your request. Per-project whitelists are discussed at MediaWiki talk:Spam-whitelist. Please sign your posts with ~~~~ after your comment. For other discussions related to this list that do not concern a particular link, please see Spam blacklist policy discussion.

Completed requests are archived (list, search); additions and removals are logged.

snippet for logging: {{/request|989786#{{subst:anchorencode:section name here}}}}

If you cannot find your remark below, please do a search for the URL in question with this Archive Search tool.

Spam that only affects a single project should go to that project's local blacklist.

Proposed additions

This section is for proposing that a website be blacklisted; add new entries at the bottom of the section, using the basic URL so that there is no link (example.com, not http://www.example.com). Provide links demonstrating widespread spamming by multiple users. Completed requests will be marked as done or denied and archived.

Multiple sites by Mikhailov Kusserow

User:



Links:

















They all reside on different IPs:

I am closing the 8 SpamReportBot reports (to keep that list clear). Discussion here please. --Dirk Beetstra T C (en: U, T) 14:18, 22 April 2008 (UTC)Reply

All in all, the links seem legit, though in some cases whole linkfarms were added in one edit. Mikhailov Kusserow has a userpage on many of the wikis I checked (SUL?). I have asked id:Pengguna:Mikhailov_Kusserow (which appears to be one of the bigger accounts) to help us out here. Awaiting discussion. --Dirk Beetstra T C (en: U, T) 15:03, 22 April 2008 (UTC)Reply
I agree that the links look relevant, but there appears to be a distinct bias here, which is not cause for blacklisting unless it continues egregiously, but is cause for concern. Very likely the link placement may be unwanted upon closer inspection, but that is not for us to decide here. – Mike.lifeguard | @en.wb 19:12, 24 April 2008 (UTC)Reply

Update - after a message on my talk page I have left another message on the user's en wp talk page (at their request) again pointing to this link. Maybe give it a couple of days but after that I think we must consider listing these in the absence of any explanation --Herby talk thyme 11:23, 28 April 2008 (UTC)Reply

My Mistake

No, I don't think so. We are watching the external link additions on 722 Wikipedias, and saw that you added a handful of new links to a handful of Wikipedias. As the links have not been used by anyone else, that triggers our system. For IPs that generally means that it is spam, but here this was done by an established editor. That normally means that we can ignore it, but Herbythyme found that one of the sites was under development, and raised concerns. I still think all is fine, but your input to clarify this would be great. See m:Talk:Spam_blacklist#Multiple_sites_by_Mikhailov_Kusserow. I am sure that this is a mistake, and a short explanation of these links will clear the matter. Thanks already! --Dirk Beetstra T C 17:18, 25 April 2008 (UTC)

Hi - the best thing would be if you would go to the link that Dirk has provided above & comment there. You have been placing links to eight sites across a number of wikis which does look like excessive link placement to the community there. An explanation of why you see that as relevant will allow us to review the matter. Thanks --Herby talk thyme 11:03, 26 April 2008 (UTC)Reply

Sorry if I made a mistake across all the Wikipedias. I just want to contribute, that's all. I have no plan to spam. Thank you for all your reminders. Mikhailov Kusserow 04:45, 3 May 2008 (UTC)

I'm waiting for your comment on meta.wiki or en.wiki. Mikhailov Kusserow 04:47, 3 May 2008 (UTC)
Thanks for the posting, Mikhailov; however, we do need to know why you think these sites are relevant & why you have been placing links. As I said on your talk page on en wp a few days ago, you need to discuss the matter on this page, not a personal talk page. Thanks --Herby talk thyme 06:54, 3 May 2008 (UTC)

Six links from cross wiki for discussion













There are several users involved; generally the users who are only active on one wiki seem to revert vandalism (as do the bots involved). Users on more than one wiki (not implying that they did something wrong):



But the link sometimes gets reverted while Asikhi was not active with that link on that wiki. Maybe 'older' spammers (pre-database?).

Please provide some discussion, I am closing the reports and will point the discussions here. --Dirk Beetstra T C (en: U, T) 09:11, 23 April 2008 (UTC)Reply

As far as msapubli.com is concerned, the link being placed "looks" relevant as it is quite long. However (for me) it redirects to the home page, which appears to have no relevance to Wikipedia and contains the word "affiliated", which makes me wonder. I will look far more closely at the others. This batch concerns me --Herby talk thyme 09:26, 23 April 2008 (UTC)
I have real doubts about the validity of these sites to Wikipedia as a whole --Herby talk thyme 10:53, 23 April 2008 (UTC)Reply
I agree with Herby, these are not good links, and I would blacklist, but note that there are many links on several wikis that must be removed. – Mike.lifeguard | @en.wb 21:13, 25 April 2008 (UTC)Reply
I think this should be listed but I am concerned about the number of existing links that need dealing with - any ideas? --Herby talk thyme 08:00, 26 April 2008 (UTC)Reply

Italian portal spam

Sites spammed








Related domains


Spammers








See w:WT:WPSPAM#Italian portal spam for more info. All the IPs are cross-wiki spammers. 143.238.211.63 12:51, 26 April 2008 (UTC)Reply

Bump. 143.238.211.63 13:02, 29 April 2008 (UTC)Reply
Sorry, got lost wading through all the reports. Looks good for listing to me. There are some links that will need removal; doing now... – Mike.lifeguard | @en.wb 21:30, 30 April 2008 (UTC)Reply
OK, left itwiki alone; they may well want whitelisting. hewiki wasn't happy about it - I suspect they'll be re-added shortly. Nevertheless, these are clearly cross-wiki spammers. – Mike.lifeguard | @en.wb 22:04, 30 April 2008 (UTC)Reply

referenceforbusiness.com

Please consider blacklisting referenceforbusiness.com. It's the sister site of the previously blocked stateuniversity.com (block discussion here), owned by Advameg (Advameg discussion here). A search on referenceforbusiness.com yielded its insertion into over 20 articles on the English-language Wikipedia, including Company American Pop Corn Company, Frederick W. Smith, Dwight Schar, and List of acquisitions by Symantec. I have not found a specific spam user, but I think instead the site is typically added as a reference by editors who are adding cites to articles and come across referenceforbusiness.com as a Google result for obscure topics. --Zippy 05:55, 29 April 2008 (UTC)

If this is a solely en wp issue then the blacklisting should be local (here) thanks --Herby talk thyme 16:35, 29 April 2008 (UTC)Reply
Thank you for the suggestion - I was not aware of the en.wikipedia list. I did a check, and links to referenceforbusiness.com also appear on de.wikipedia, nl.wikipedia, fr.wikipedia, it.wikipedia, ja.wikipedia, and zh.wikipedia, albeit to a reduced degree compared to en.wikipedia. (links go to search results) --Zippy 22:30, 29 April 2008 (UTC)Reply


175 links on top 10 wikis (here). Certainly needs looking at - out of time myself now --Herby talk thyme 15:50, 30 April 2008 (UTC)Reply

This type is really difficult to deal with. There is certainly some valid use of the link, there is some excessive use of the link.
If we list it some valid editors will be affected, if we don't it is likely the linkage will grow & grow. Other views very welcome, thanks --Herby talk thyme 08:02, 2 May 2008 (UTC)Reply
I can see RfB as being a valid use as in "a user in good faith adds it as a citation for an article," but I don't think referenceforbusiness.com is ever valid as a reference, due to verifiability and reliability concerns. I'm having a hard time coming up with a case where I'd not remove it as a link. --Zippy 11:36, 2 May 2008 (UTC)Reply
I'm moving that way I think. I can see why it might be an idea to use it but equally I would have thought any really important stuff should be found elsewhere? Those interested may want to look here as well. --Herby talk thyme 11:51, 2 May 2008 (UTC)Reply
All over, there are discussions on talk pages about the suitability of this domain as a reference. I don't think it is ok for that purpose. But I'm not sure that necessitates blacklisting globally. – Mike.lifeguard | @en.wb 12:32, 2 May 2008 (UTC)Reply

bloomsdaynyc.com

Please blacklist



which is being spammed by



--Erwin(85) 21:03, 7 May 2008 (UTC)Reply

Added VasilievVV 21:08, 7 May 2008 (UTC)

Cocktailteam.net

Looking at this, it seems to be largely added by IPs



Views welcome (not suggesting blacklisting is necessary but....) --Herby talk thyme 09:03, 9 May 2008 (UTC)Reply

Proposed additions (Bot reported)

This section is for websites which have been added to multiple wikis as observed by a bot.

Items there will automatically be archived by the bot when they get stale.

Sysops, please change the LinkStatus template to closed when the report is dealt with. More information can be found at User:SpamReportBot/cw/about

These are automated reports; please check the records and the link thoroughly, as these may be good links! For some more info, see Spam blacklist/help#SpamReportBot_reports

If the report contains links to fewer than 5 wikis, then only add it when it is really spam. Otherwise just close it; if it gets spammed more broadly, the bot will reopen the report.

Please place suggestions on the automated reports in the discussion section.

List
User:SpamReportBot/cw/nakedafrica.net
User:SpamReportBot/cw/hnl-statistika.com
User:SpamReportBot/cw/therasmus-hellofasite.it
User:SpamReportBot/cw/prolococusanese.interfree.it
User:SpamReportBot/cw/rprece.interfree.it

Proposed removals

This section is for proposing that a website be unlisted; please add new entries at the bottom of the section. Remember to provide the specific URL blacklisted, links to the articles they are used in or useful to, and arguments in favour of unlisting. Completed requests will be marked as done or denied and archived. See also /recurring requests for repeatedly proposed (and refused) removals. The addition or removal of a link is not a vote, please do not bold the first words in statements.

usbig.net

This is a genuine, well-respected organization. It is not clear to me why it would be on the blacklist, and its appearance here renders editing several articles impossible. Guido den Broeder 20:25, 25 April 2008 (UTC)Reply

Same request for: basicincome.com, freiheitstattvollbeschaeftigung.de, globalincome.org. Guido den Broeder 20:34, 25 April 2008 (UTC)Reply

Extensive cross wiki link placement resulted in this listing - see here. I would certainly not be inclined to remove these for now, views --Herby talk thyme 07:47, 26 April 2008 (UTC)Reply
Er, well, they are relevant to all wikipediae. Guido den Broeder 01:44, 27 April 2008 (UTC)Reply
Guido, are you connected with any of these? --A. B. (talk) 22:11, 27 April 2008 (UTC)Reply
Seems possible --Herby talk thyme 07:07, 28 April 2008 (UTC)Reply
No. I am a member of BIEN and of the Vereniging Basisinkomen. Those two are not blacklisted. Guido den Broeder 07:48, 28 April 2008 (UTC)Reply


comicradioshow.com

This is a comic fanzine, running since 1997, that tries to make the art of comics popular all over the world. There is no commercial interest except for the artists themselves. I didn't find any reasons or comments (what does "+1" mean?) why the ComicRadioShow should be on this blacklist. Maqz 11:32, 29 April 2008 (UTC)

Here is the background for the blacklisting: User:SpamReportBot/cw/comicradioshow.com --Jorunn 12:42, 29 April 2008 (UTC)Reply
Aha, there are the reasons, but au contraire mon capitan!! This happened too fast for me! I and many other users posted links to the ComicRadioShow (myself since 2004). I/others added them for interesting information and aspects of the subject: for example, the 9/11 Comic-Report, the Goethe-Comic, the news that the Batgirl series will feature a homosexual superheroine. ComicRadioShow links are not spam! The next should be the one to the current (Paul) Klee Comic! (Article2679) --Maqz 16:05, 29 April 2008 (UTC)
Wikipedia articles can not have external links to every comic, song, video, book, computer game, painting, opera, etc. that covers a (vaguely) related topic to the article. Wikipedia isn't a web directory, please read de:Wikipedia:WEB. --Jorunn 22:14, 29 April 2008 (UTC)
I already read de:Wikipedia:WEB years ago, and I am aware that I should not link everything to my site. I do not run a site like youporn and I do not sell any comics for my benefit. You decided to put me on a spam blacklist that forbids EVERY worthy link (yes, they still exist!) from my site from now on. But I have enough current and permanent comic information (reviews, interviews, essays etc.) that is worthy of inclusion in Wikipedia (national and international). Please think about this and take me off this general blacklist; if there is a link that is not relevant to the topic, there will be enough Wikipedians who will make a specific decision about that one link, not about the whole site. --Maqz 09:12, 30 April 2008 (UTC)
You are the site owner? 80.89.65.153 Thanks --Herby talk thyme 07:17, 30 April 2008 (UTC)
??? You're welcome! Nice tool! So what does that mean? I do contribute on more topics (except the text at timestamp 2006-07-27T08:14:47 Julia Boenisch, 2006-07-27T08:07:39 Peter Boenisch (→ Familie), which I do not know). What do you want from me? No participation anymore? Deleting all the contributions I made in all those years? The articles too? Did I misunderstand this Wikipedia thing? I don't think so! Please get me off this blacklist. --Maqz 12:21, 30 April 2008 (UTC)

Please remove this listing ASAP. --Asthma 02:55, 2 May 2008 (UTC)Reply

You can seek whitelisting on your local wiki. For de:wikipedia you can seek whitelisting at de:MediaWiki Diskussion:Spam-blacklist --Jorunn 08:45, 2 May 2008 (UTC)Reply
Whitelisting has been requested on de.wikipedia, see MediaWiki_Diskussion:Spam-blacklist#www.comicradioshow.com --Jorunn 10:32, 5 May 2008 (UTC)Reply
Yes, and there is still the request to remove comicradioshow from the meta blacklist! Supporting requests can be found in the de.wikipedia discussion! --Maqz 16:43, 05 May 2008 (UTC)
Cross wiki insertions of the link:
The link has also been added to lots of articles in de.wikipedia by the site owner:
Sample edits:
Maqz and IP 80.89.65.145 have also been inserting links to humidoronline.de








--Jorunn 16:01, 5 May 2008 (UTC)Reply
Two more IPs that have been inserting the links:




--Jorunn 00:07, 7 May 2008 (UTC)Reply

Very impressive! So what are these lists good proof of? There is international linking of (in my opinion!) an important collection of German and international webcomics, or a current and informative interview with comic artist Hermann. (No mass spam linking of all our 2600 articles!) All other issues are reasonable too! So where's the beef? AND it's NOT mass cross-wiki spam linking! Massive linking would involve a higher number of links, hmm?! Besides this, you can see that all article admins do (or do not) accept the posted CRS link. This general blacklisting is unnecessary and not productive! One "victim" of this is the banning of the link to the essay by Christoph Kubina (long well accepted on the Comic article). (What does that humidoronline sentence mean? Does that have anything to do with the comicradioshow blacklisting?) --Maqz 10:57, 06 May 2008 (UTC)

More than 90% of your crosswiki edits are for inserting those two links. You have added many external links to websites you run or are otherwise involved with, but very little actual article content cross wiki. You show no understanding for the need to limit external links and the fact that you shouldn't add links to your own website. You behave like a spammer, and that is why the link needs to stay blacklisted. --Jorunn 12:13, 6 May 2008 (UTC)Reply

Typically, we do not remove domains from the spam blacklist in response to site-owners' requests. Instead, we de-blacklist sites when trusted, high-volume editors request the use of blacklisted links because of their value in support of our projects. If such an editor asks to use your links, I'm sure the request will be carefully considered and your links may well be removed.

Until such time, this request is Declined. I would also like to blacklist humidoronline.de; comments on that? – Mike.lifeguard | @en.wb 16:11, 6 May 2008 (UTC)

I can't find that the link humidoronline.de has been inserted since September 2007. Maybe we don't need to blacklist the link now? I suppose the bot will report if the link gets inserted again. --Jorunn 00:07, 7 May 2008 (UTC)Reply

cabinda.net query

See:

We have a query from an obviously frustrated satellite broadband user (likely in Africa -- very slow satellite modems are all many African countries have) wondering why it's blacklisted. It was never logged -- does anyone know anything about it or does someone need to go through the blacklist's 2534 edits to find out how it got there? Or should we just remove it?--A. B. (talk) 15:40, 1 May 2008 (UTC)Reply

Searching is pretty hit & miss really the times I've tried it - date wise it is around the start of 2007. I'm guessing that we would be unlikely to learn anything much interesting. I've rem'd it for now. We can always remove the # if anyone comes up with some good reasons. --Herby talk thyme 15:50, 1 May 2008 (UTC)Reply

petrsoukal.profitux.cz ; japonsko.profitux.cz ; petrsoukal.php5.cz

Dear Wikipedia,

I don't know why links to my pages: petrsoukal.profitux.cz japonsko.profitux.cz petrsoukal.php5.cz

are blacklisted. I just wanted to insert the links manually (containing a lot of pictures from my Ukraine, Marathon des Sables and Fuji Mountain Race) in the sections of Wiki where it makes sense.

let me know petr.soukal@inmail.sk The preceding unsigned comment was added by 88.214.115.209 (talk • contribs) 15:11, 3 May 2008 (UTC)

The reason for the listing is the excessive link placement recorded here (COIBot/UserReports/195.24.158.22) & here, for example. Thanks --Herby talk thyme 15:39, 3 May 2008 (UTC)

Thanks, but what can I do to place the links again? All the inserts I made were related to the wiki content. The content is not spam, and I had on average 20 visits to my pages via the wiki. Kind regards, Petr Soukal

We are writing an encyclopedia here, not an advertisement site or a linkfarm (may I suggest http://www.dmoz.org for that?). Declined --Dirk Beetstra T C (en: U, T) 19:20, 4 May 2008 (UTC)


photovolcanica.com

I request that an authorized editor remove my website from the global spam list. The site was automatically globally blacklisted as far as I can tell, presumably due to multiple international listings. The pages linked to were specific to individual volcanoes with detailed scientific referencing therein (please go to the links under "Volcano Info and Photos" on the front page of the website and then look at the pages on Stromboli, Soufriere Hills, Erta Ale, Dallol etc. to see the high quality of the text involved). The Smithsonian Institute has commented on the high quality of the site, and the site is established in the scientific community and is indeed linked to on many of the sites in volcano sections on Wikipedia. Consequently the site is an important resource also for Wikipedia users. Not all volcanoes have been linked into Wikipedia anyway, since those with less valuable descriptions were omitted. Request comment and removal. Thanks Dr Richard Roscoe

This looks like it could be a useful resource on some projects -- this non-commercial, scientific site contains many unique photos. I suggest removing from this blacklist -- if the site's owner agrees to add no more of these links himself. Here are the relevant guidelines on the English Wikipedia; many others have similar guidelines:
What do others think?
--A. B. (talk) 12:31, 8 May 2008 (UTC)Reply
I agree - this is potentially useful, but I wouldn't want someone affiliated with the site to be adding links. That should be done by neutral editors when they feel that doing so would be of net benefit to the project. – Mike.lifeguard | @en.wb 17:06, 8 May 2008 (UTC)Reply
I'm with Mike here. This is as much about conflict of interest as it is about anything else. Having the site owner place the links would be completely wrong. --Herby talk thyme 07:06, 9 May 2008 (UTC)Reply

Thanks for considering my request so far. I promise not to place any links to the site myself in the future and apologize for doing so in the past. I was merely trying to make the resource available to people in different countries since a lot of research effort has gone into it. There was no ill intent involved and there was no attempt made to conceal the linking. I appreciate that you all seem to have taken the time to look at the actual resource to see that it is a bona-fide site. I will contact the relevant Wikiproject Volcanoes / Geology talk pages as suggested by A.B. once this issue has been resolved. Richard

Done --A. B. (talk) 12:07, 9 May 2008 (UTC)Reply

Troubleshooting and problems

This section is for comments related to problems with the blacklist (such as incorrect syntax or entries not being blocked), or problems saving a page because of a blacklisted link. This is not the section to request that an entry be unlisted (see Proposed removals above).

Discussion

Looking ahead

"Not dealing with a crisis that can be foreseen is bad management"

The Spam blacklist is now hitting 120K & rising quite fast. The log page started playing up at about 150K. What are our options looking ahead I wonder. Obviously someone with dev knowledge connections would be good to hear from. Thanks --Herby talk thyme 10:46, 20 April 2008 (UTC)Reply

I believe that the extension is capable of taking a blacklist from any page (that is, the location is configurable, and multiple locations are possible). We could perhaps split the blacklist itself into several smaller lists. I'm not sure there's any similarly easy suggestion for the log though. If we split it up into a log for each of several blacklist pages, we wouldn't have a single, central place to look for that information. I suppose a search tool could be written to find the log entries for a particular entry. – Mike.lifeguard | @en.wb 12:24, 20 April 2008 (UTC)Reply
What exactly are the problems with having a large blacklist? --Erwin(85) 12:34, 20 April 2008 (UTC)Reply
Just the sheer size of it at a certain moment, it takes long to load, to search etc. The above suggestion may make sense, smaller blacklists per month, transcluded into the top level? --Dirk Beetstra T C (en: U, T) 13:16, 20 April 2008 (UTC)Reply
Not a technical person but the log page became very difficult to use at 150K. Equally the page is getting slower to load. As I say - not a techy - but my ideal would probably be "current BL" (6 months say) & before that? --Herby talk thyme 13:37, 20 April 2008 (UTC)Reply
I don't know how smart attempting to transclude them is... The spam blacklist is technically "experimental" (which sounds more scary than it really is) so it may not work properly. I meant we can have several pages, all of which are spam blacklists. You can have as many as you want, and they can technically be any page on the wiki (actually, anywhere on the web that is accessible) provided the page follows the correct format. So we can have one for each year, and just request that it be added to the configuration file every year, which will make the sysadmins ecstatic, I'm sure :P OTOH, if someone gives us the go-ahead for transclusion, then that'd be ok too. – Mike.lifeguard | @en.wb 22:12, 20 April 2008 (UTC)Reply
A much better idea: bugzilla:13805! – Mike.lifeguard | @en.wb 01:43, 21 April 2008 (UTC)Reply
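The split-list idea discussed above can be illustrated roughly as follows. This is only a minimal Python sketch, assuming each blacklist page is plain text with one regex fragment per line and that `#` starts a comment (another thread on this page mentions removing the `#` to re-enable an entry). The function names and the sample page contents are hypothetical, and this is not the extension's own code; it only shows that several smaller pages could be read and combined into a single pattern.

```python
import re

def parse_blacklist(text: str) -> list[str]:
    """Return the regex fragments on a blacklist page, skipping comments and blanks."""
    fragments = []
    for line in text.splitlines():
        # Everything after '#' is treated as a comment (assumed page format).
        fragment = line.split("#", 1)[0].strip()
        if fragment:
            fragments.append(fragment)
    return fragments

def combine(pages: list[str]) -> re.Pattern:
    """Merge the fragments from several pages into one case-insensitive pattern."""
    fragments = [f for page in pages for f in parse_blacklist(page)]
    # Wrap each fragment in a non-capturing group so the alternation stays well-formed.
    return re.compile("|".join(f"(?:{f})" for f in fragments), re.IGNORECASE)

# Hypothetical per-year pages; the domains are placeholders.
page_2007 = r"""
# entries added in 2007
\bspam-one\.example\b
#\bdisabled\.example\b
"""
page_2008 = r"""
\bspam-two\.example\b
"""

pattern = combine([page_2007, page_2008])
print(bool(pattern.search("http://www.spam-one.example/page")))  # listed entry
print(bool(pattern.search("http://disabled.example/")))          # commented-out entry
```

With this layout a log search tool would still need to know which page an entry lives on, which is the drawback raised in the thread.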

Date/time in bot reports

In the bot reports there's a date and time given for each diff. However the given time isn't actually the time of the revision. Both the hour and the minutes differ so it's not simply another timezone. Does anyone know what the given date/time mean?--Erwin(85) 18:12, 20 April 2008 (UTC)Reply

It is the time on the machine the bots are running on. For me (I am in Wales, UK) it looks like the box is 5 hours and 2 minutes off. We could correct for that, but I guess it is more an indication of the spam-speed than something that is really necessary; the diffs give the correct times. Hope this explains. --Dirk Beetstra T C (en: U, T) 19:51, 20 April 2008 (UTC)
Thanks. --Erwin(85) 12:29, 21 April 2008 (UTC)Reply
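For anyone who wants to line the bot timestamps up with the real revision times, the correction mentioned above is a fixed shift. The following is a small Python sketch under stated assumptions: the 5 hours and 2 minutes figure comes from the comment above, but whether the bot clock runs behind or ahead is not stated, so the direction is a parameter, and the sample timestamp is made up.

```python
from datetime import datetime, timedelta

# Offset quoted in the thread; the direction of the drift is an assumption.
BOT_CLOCK_OFFSET = timedelta(hours=5, minutes=2)

def corrected(bot_time: datetime, bot_is_behind: bool = True) -> datetime:
    """Shift a bot-reported timestamp by the observed clock offset."""
    if bot_is_behind:
        return bot_time + BOT_CLOCK_OFFSET
    return bot_time - BOT_CLOCK_OFFSET

reported = datetime(2008, 4, 20, 13, 10)  # made-up bot timestamp
print(corrected(reported))  # 2008-04-20 18:12:00
```

The diffs themselves remain the authoritative source for revision times, as noted above.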

Feedback on logging & people who "help" please

I know this is not the most popular place in the world to work & I am grateful for almost anyone helping out. However we do have one or two people who are quite intransigent as far as following policy/practice whatever as far as logging is concerned.

Nakon shows pretty much indifference to working with practice in a number of ways. They prefer their own logging method, regex is incomplete so catches innocent sites (I've fixed it). As far as the bot ones tackled were concerned no comment as to why they were either listed or closed without listing was given and quite a few have been removed or re-opened by Drini, Dirk or myself.

Equally Raul654 seems unconcerned about leaving others to sort out any problems caused (& I am not sure they actually understand that this blacklist is only for cross wiki issues).

As I pointed out to Majorly today, if we are preventing a site from being included in something approaching 30,000 wikis then I think our grounds need to be valid & visible except in emergencies (in this case it is merely a misunderstanding of how logging works I think). Otherwise the Foundation could genuinely face some heavy questioning.

Am I getting this completely wrong? If so do say, if not how do we deal with such "help"? Thanks --Herby talk thyme 09:02, 30 April 2008 (UTC)Reply

One of the reasons I avoid this area like the plague is because it is a pretty vile area to work in. Not only is it full of drama and various conflicts and issues, the whole thing is really complex to log, archive etc, without Herby questioning your every move on your talk page (including asking why I blacklisted a well known Wikipedia stalker's website...). I hadn't logged stuff before, and when I did for my latest addition, I followed the format of the entry above me, which inevitably was wrong. I never follow requests from the talk page, so providing diff links would be impossible. I tend to only add stuff asked elsewhere, on IRC or whatever. To me, it was unclear how to format the log (I was unaware they were in groups), and have received no help on this issue ("just log" isn't help). I'd like to help out when I can, but seeing as it appears to have annoyed one of the most active admins on this page, I'll stop helping. Majorly (talk) 09:22, 30 April 2008 (UTC)Reply
30,000 wikis? Are you sure about that? Majorly (talk) 09:25, 30 April 2008 (UTC)Reply
I have pointed out to Majorly that this is not about him. I realised that the logging instructions might not be clear. All help is welcome - however we must respect the fact that if it is not done as well as we can we cause issues for others who follow. As to the number affected A. B.'s comments here are his standard ones! --Herby talk thyme 09:41, 30 April 2008 (UTC)Reply
Well you learn something new everyday! I had thought it was only Wikimedia wikis that were affected. Still, we shouldn't be responsible for every site that isn't in the scope of our project, imo. Majorly (talk) 12:03, 30 April 2008 (UTC)Reply
Regarding logging, though I log everything, the reason that I provide varies. On en.wikipedia I blacklist/RevertList (latter for XLinkBot) with as proof a Special:Contributions of a user, or one of the bot reports. Those items are clear enough. If needed I create a 'request' on the request list, which I close immediately, and link to that. For me, the proof has to be enough, it does not have to be complete. I do believe it is important that the logging is done in the appropriate section (current month in the /log), and that there is a link to any form of proof (COIBot, SpamReportBot, special contributions, luxo, a request, whatever). Items on the blacklist that do not have a link to 'proof' should be removed immediately until proof has been provided. No excuses for that.
Regarding the formatting, practically all rules, except the more complex ones, should be encapsulated in '\b' tags. So example.com becomes \bexample\.com\b (also escape the period; though that is not strictly necessary, it may in some obscure cases be necessary, e.g. 'viacom' would be caught by '\bvi.com\b'). --Dirk Beetstra T C (en: U, T) 10:31, 30 April 2008 (UTC)
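The effect of the '\b' anchors and the escaped period can be checked quickly. This is a small sketch using Python's `re` module rather than the PHP regex engine the extension runs on; for these simple ASCII rules the two should behave the same, but that is an assumption. The helper name `blocks` and the test URLs are made up for illustration.

```python
import re

def blocks(rule: str, url: str) -> bool:
    """Return True if the blacklist-style rule matches anywhere in the URL."""
    return re.search(rule, url, flags=re.IGNORECASE) is not None

# Unescaped period: '.' matches any character, so 'viacom' is caught too.
print(blocks(r"\bvi.com\b", "http://www.viacom.com/"))    # innocent site caught
print(blocks(r"\bvi\.com\b", "http://www.viacom.com/"))   # escaped rule leaves it alone
print(blocks(r"\bvi\.com\b", "http://www.vi.com/"))       # intended target still caught

# Without '\b', a rule also matches inside longer domain names.
print(blocks(r"example\.com", "http://counterexample.com/"))      # unintended hit
print(blocks(r"\bexample\.com\b", "http://counterexample.com/"))  # boundary prevents it
print(blocks(r"\bexample\.com\b", "http://www.example.com/page")) # intended target
```

Note that '\b' only requires a word/non-word transition, so a rule such as \bexample\.com\b would still match a hyphenated name like my-example.com.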


Hi, just my few thoughts on this: I don't think working here is 'vile' or something because of the logging, which is the easiest part of everything. There is even a nice snippet with a link which makes logging very easy. The most time-consuming part is checking the links to see if it is a crosswiki problem, etc., which probably keeps people away as soon as they realize that this actually is hard work.
I think logging should be a matter of course, I don't see any reason to question its need.
The only thing that I never log is an emergency adding that I remove later (tell me if I should), if I don't remove it, I add something to the talk and log it.
While at it, thanks a lot to all that keep this list running!
Best regards, --birdy geimfyglið (:> )=| 11:53, 30 April 2008 (UTC)Reply
I've worked closely with this list, first as just an editor, more recently as an admin, over about 18 months and about 1000 edits. Everything Herby has written is absolutely right (except possibly my guesstimate of the non-Wikimedia wikis affected that he quotes; all we know is that it's all Wikia plus thousands and thousands of others). Failure to properly log blacklist entries today creates a lot of work for others in the future. I've wasted half an hour or more stepping through hundreds of edits in the blacklist's edit history to find out who blacklisted something and why because they didn't bother to properly log an addition. These searches come in response to whitelist requests as well as problems our regular editors have adding unrelated links because of some glitch in the regex.
I encourage meta admins to follow Herby's instructions on this. If you find work here "vile", then just list the domains on the talk page and let other admins handle it. If you are going to blacklist a domain, then please take the time to do it right, including logging. Normally there should be a talk page entry as well that gives edit histories and/or diffs.
It's unfortunate if someone screws this up out of ignorance while trying to be helpful but we are all human and I've made plenty of mistakes. It's less acceptable to snap at Herby or others if they point out your mistake and ask you to fix it. It's downright uncollegial, unhelpful and arrogant to deliberately ignore the necessary processes here once they're pointed out. It's just creating problems for others. I hate to say this, but I don't think someone with such an attitude should be an admin here. --A. B. (talk) 19:29, 30 April 2008 (UTC)
I'll also add that an "IRC request" is an inadequate justification for blacklisting, given the non-transparent nature of IRC and the lack of any record. I love Gladys Knight, but "I Heard It Through the Grapevine" is just a song, not a sound basis for blacklisting for more than a few minutes until a more complete justification can be recorded with cross-wiki diffs.
Links to "attack sites" can be controversial and those domains are best listed here on the talk page for discussion and consensus-building before blacklisting. Otherwise, we subsequently get disputes over "en.wikipedia imperialism" (or es.wikipedia or whatever); the link either gets removed here or else other projects just whitelist it locally. If something is not classic spam, it pays to build consensus here first. For instance, wikipedia-watch.org is blacklisted here but used on multiple other projects.
--A. B. (talk) 20:04, 30 April 2008 (UTC)
In that case, I have made up my mind for sure. I was asked over Skype to add the website of a long-term Wikipedia stalker (has driven several female admins off the project). Even logging it will cause issues, as it draws attention to this user, which is the opposite of what we want. I got asked over IRC yesterday to add a URL that was appearing in HAGGER page moves. Again, absolutely no reason this will ever need removing, and again, logging it just draws attention to the troll behind it. I am not going to put up with doing this anymore if people are unreasonably demanding I log things which really should not be logged at all, and stating IRC request isn't enough. Of course it's enough - I gave a reason for the addition, which is precisely the same as adding it on the page - just wastes more time doing it that way. Anyhow, after an extremely brief time adding stuff to this page, I have decided I will no longer add stuff to this list, as it only causes issues and more problems than it's really worth. Thanks, Majorly (talk) 20:58, 30 April 2008 (UTC)
I have removed all of my contributions to this list. If you're going to insist on following some confusing policy, I'm not going to deal with it any more. Nakon 21:06, 30 April 2008 (UTC)
A link never needing removal doesn't mean that its removal will never be requested. If nobody can find the reason it was added to the blacklist, the link might end up being removed. --Jorunn 21:13, 30 April 2008 (UTC)
Trust me, the stuff I add here will never need removing. I avoid this page, and only add stuff if people specifically ask me - and it's usually really bad stuff. Majorly (talk) 21:19, 30 April 2008 (UTC)
@Nakon: >delisting every one of mine[1] As soon as you press Enter you release your contributions under the GFDL. Where did the additions you made come from: reports by bots/users? If so, removing them disrupts this list and destroys their requests and work on this (some users work really hard to fight spam cross-wiki and to fill out very long reports; in my opinion such action is quite disrespectful). Thanks, --birdy geimfyglið (:> )=| 21:40, 30 April 2008 (UTC)

Re Majorly. I can understand that there are cases where you don't want an entry tied to a user, diff, or actions. In that case I would make a clear entry in the log, that you are the person adding it, stating some reason, and e.g. adding a permanent contact address where you can be contacted (if you decide in ## years to leave, these things may still need to be sorted out afterwards). It may even be logged to a private email sent to the foundation, which does not have to be disclosed to anyone here, as long as the foundation follows up on it when necessary. I think that would be a good solution. --Dirk Beetstra T C (en: U, T) 09:14, 1 May 2008 (UTC)


good way to prove a point!

Nakon's last entry to the blacklist was to block the entire interfree.it domain - the rationale "IRC request from betacommand". Now there is an appeal above (here) from what I imagine was not a subdomain that it was planned to block. The point is "how the hell do we know"! I've de-activated it for now - I guess someone should find out what was intended though frankly I cannot be bothered.

Yes I could be wrong - how the hell would I know & - Yes I am angry --Herby talk thyme 11:53, 1 May 2008 (UTC)

Herby, please assume good faith. How about asking Nakon why it was added instead of cursing here on the talk page about it? Majorly (talk) 12:09, 1 May 2008 (UTC)
Indeed - however previous enquiries on his talk page have met with no success (indeed I see he has now blanked it) and he seems to have detached himself from working here. As such those who do work here are left to deal with the results of other people's work (as I have been doing for some time).
I am certainly prepared to assume good faith as I imagine you might be. However above this section some people are saying that the logging is pointless, useless, unnecessary etc so it would be nice if they all assumed good faith with me & communicated in a constructive & pleasant manner.
Equally (I hope) it does rather prove the point about how logging with full rationale is rather less than optional & is actually vital to those who work on this page. Thanks --Herby talk thyme 12:43, 1 May 2008 (UTC)
His stubborn attitude doesn't help (he removed all his entries from the list last night, then blanked his talk page). However, I don't suppose he'll be adding much else if he's detaching himself from Meta. If I ever add stuff again (which I doubt), I'll log it, but I think the fuss being made here is a little over the top imo. Especially when the stuff I add is stuff that should never be removed. Majorly (talk) 13:08, 1 May 2008 (UTC)

(editconflict)

Please don't ask this of Herby, I think he assumed good faith explaining and asking everyone kindly to just log (his approach has just been blanked from the talkpage, so I understand him not to feel welcomed or expecting an answer on this talkpage), and I understand perfectly well his frustration.
Why, in case of such removal requests, should it be done that way: asking the sysop who added it? Instead of just checking it out by a single click?
That could in some cases also mean:
  • going through the version history, finding out who added it,
  • dropping the one who added it a message on his talk and
  • waiting for his answer - if there ever is one
What if the sysop does not remember why because it was a year ago, or if (which is not so impossible) he has just left the project?
Compare that with copying this snippet into a page, and that is it.
This is imho just stubbornness that causes only redundant work and wastes people's time. Thanks, --birdy geimfyglið (:> )=| 12:54, 1 May 2008 (UTC)
I'm not sure what the problem is, since logging is not that hard to do. I mostly agree with Herby, though I do think things have been done a bit harshly over the last couple of weeks; logging will prevent future disputes. I barely edit the spam blacklist since I'm on dial-up and that page is just too big for my browser (internet connection) to handle, but I have added some entries to the blacklist, and even if it takes me 10 minutes to add one, I still do it happily since, as pointed out above, nearly 30,000 wikis share our spam blacklist and we must make sure that we think of them as well. Logging isn't really a problem, and some tried to do it in a different way. Meta is about coordination, and it would be really nice if everything we do is well coordinated and not lying all over the place...--Cometstyles 13:05, 1 May 2008 (UTC)
I think it is abundantly clear to all that help is not helpful unless we're all working from the same page. Logging is not optional. Please be considerate of those you are working with. Logging might not be policy, but we should all respect standard practice. It's there with good reason; much work is created when things are done improperly. Hopefully bugzilla:13805 will help in this respect, but it is not coming tomorrow. So until something like this is implemented, the current system of logging requests is the only way we can all cooperate effectively on this task. Though w:WP:POINT is only policy at enwiki, this is still unacceptable behaviour. If you are not going to cooperate and collaborate on this with everyone else, then at least do not hinder our efforts. – Mike.lifeguard | @en.wb 15:00, 1 May 2008 (UTC)
Actually WP:POINT seems to be found on all the big Wikipedias (except Volapük):
--A. B. (talk) 15:19, 1 May 2008 (UTC)
PS: I wonder if this means the Volapüktians are especially tranquil and don't need such a rule or … very disruptive and don't want a rule like that? They bear watching.

What about a tool that once a day analyzes the whole spam blacklist history and dumps the author of every line? -- VasilievVV 15:23, 1 May 2008 (UTC)

Useful, but we also need the reason each line (or set of lines) was added. – Mike.lifeguard | @en.wb 15:39, 1 May 2008 (UTC)
Well, VasilievVV's tool would be a start in terms of sorting out the old entries -- at least we'd know whom to contact -- that's if they're still editing here and if they can remember the details. We still need a regularly maintained log, however. --A. B. (talk) 15:44, 1 May 2008 (UTC)
Here's an example that originated today on en.wikipedia of an unlogged Meta blacklisting: Talk:Spam blacklist#cabinda.net query. Anyone care to track this one down -- who added it? why was it added? was the blacklisting justified then? is it still justified? Or do we remove it and move on? --A. B. (talk) 15:49, 1 May 2008 (UTC)

I've done it: Spam blacklist/Blame -- VasilievVV 16:00, 6 May 2008 (UTC)

  • Cool. I'd like to speak up in Herby's defence here, since I work both sides of this equation - requests for clarification as to why something was blacklisted, via OTRS, and requests for blacklisting. It is massively easier to authoritatively answer an OTRS or other complaint from a site owner if we have a proper log entry that you can actually find. I know I sometimes forget as well, I don't do bureaucracy well, but proper logging is an essential part of our public-facing duty. This is not like a single project blacklist where issues are localised and easily fixed, meta is a very small project but with very big impact. Just think of it as change control and learn to love it for the once in a lifetime that it digs you out of a hole. JzG 13:57, 9 May 2008 (UTC)
re: Spam blacklist/Blame -- VasilievVV, is this a list of unlogged blacklistings? If so, I've been a very bad boy and will start working on my share. Thanks for putting this together. JzG is so right in what he says. --A. B. (talk) 15:28, 9 May 2008 (UTC)
I dumped the history of the spam blacklist, parsed the current version and looked through the whole history to find the revision in which each line was added. Several blacklist blankings caused some problems. -- VasilievVV 15:31, 9 May 2008 (UTC)
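The approach described above (take the current version, then search the history for the revision that introduced each line) could be sketched roughly as follows. This is not the actual Spam blacklist/Blame implementation, which is not shown on this page; the `Revision` type, the `blame` function and the sample history are all hypothetical. The sketch attributes each current line to the earliest revision containing it, so a blanking followed by a restore does not shift the attribution, but a line that was removed and later re-added by someone else is still credited to its first author.

```python
from typing import Dict, List, NamedTuple

class Revision(NamedTuple):
    """One revision of the blacklist page (hypothetical structure)."""
    rev_id: int
    author: str
    text: str

def blame(revisions: List[Revision]) -> Dict[str, Revision]:
    """Map each non-empty line of the newest revision to the earliest
    revision in which that line appears. `revisions` must be oldest first."""
    if not revisions:
        return {}
    current = {ln.strip() for ln in revisions[-1].text.splitlines() if ln.strip()}
    first_seen: Dict[str, Revision] = {}
    for rev in revisions:
        for ln in rev.text.splitlines():
            ln = ln.strip()
            if ln in current and ln not in first_seen:
                first_seen[ln] = rev
    return first_seen

# Made-up history: revision 3 blanks the page, revision 4 restores it.
full_list = "\\bexample\\.com\\b\n\\bspam\\.example\\b"
history = [
    Revision(1, "AdminA", "\\bexample\\.com\\b"),
    Revision(2, "AdminB", full_list),
    Revision(3, "SomeoneElse", ""),
    Revision(4, "AdminC", full_list),
]

for line, rev in blame(history).items():
    print(line, "added by", rev.author, "in revision", rev.rev_id)
```

As noted in the discussion, this only answers who added a line and when; the reason still has to come from the log.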

LinkWatchers

I am working on both loading the old database (about 4 months' worth of links) and rebuilding the current database (about 5-6 weeks' worth of links) into a new database.

  • The old database is in an old format, and has to be completely reparsed.
  • The new database had a few 'errors' in it, and I am adding two new fields.

I am running through the old databases by username, starting with aaa.

As a result the new database does not contain too much data yet, and will be 'biased' towards usernames early in the alphabet.

This process may take quite some time, maybe weeks, as I have to throttle the conversion to keep the current linkwatchers 'happy' (they are still running in real-time). These linkwatchers are also putting their data into this new database, so everything after about 18:00, April 29, 2008 (UTC) is correct and complete.

The new database contains the following data, I will work later on making that more accessible for on-wiki research:

  1. timestamp - time when stored
  2. edit_id - service field
  3. lang - lang of wiki
  4. pagename - pagename
  5. namespace - namespace
  6. diff - link to diff
  7. revid - the revid, if known
  8. oldid - the oldid, if any
  9. wikidomain - the wikidomain
  10. user - the username
  11. fullurl - the full url that was added
  12. domain - domain, indexed and stripped of 'www.' -> www.example.com becomes com.example.
  13. indexedlink - rewrite of the fullurl, www.example.com/here becomes com.example./here
  14. resolved - the IP for the domain (new field, and if found)
  15. is it an ip - is the edit performed by an IP (new field)
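The 'domain' and 'indexedlink' rewrites in fields 12 and 13 can be illustrated with a small sketch. This is only a guess at the transformation based on the two examples given in the list; the function names are hypothetical, and stripping a leading scheme such as http:// is an added assumption, not something stated above.

```python
def index_domain(host: str) -> str:
    """Strip a leading 'www.' and reverse the labels, keeping a trailing dot.

    www.example.com -> com.example.
    """
    if host.startswith("www."):
        host = host[len("www."):]
    labels = host.split(".")
    return ".".join(reversed(labels)) + "."

def index_link(url: str) -> str:
    """Rewrite a full URL so that it starts with the indexed domain.

    www.example.com/here -> com.example./here
    """
    # Assumption: a scheme such as 'http://' is dropped before indexing.
    if "://" in url:
        url = url.split("://", 1)[1]
    host, sep, path = url.partition("/")
    return index_domain(host) + sep + path

print(index_domain("www.example.com"))     # com.example.
print(index_link("www.example.com/here"))  # com.example./here
```

Storing the labels in reversed order means that all links to subdomains of one site share a common prefix, which is presumably what makes the field convenient to index and search.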

I'll keep you posted. --Dirk Beetstra T C (en: U, T) 10:31, 30 April 2008 (UTC)