Talk:IP Editing: Privacy Enhancement and Abuse Mitigation

From Meta, a Wikimedia project coordination wiki



IP Editing: Privacy Enhancement and Abuse Mitigation Archive index
This page is to collect feedback for the privacy enhancement for unregistered users project.
Hoping to hear from you. You can leave a comment in your language if you can't write in English.
SpBot archives all sections tagged with {{Section resolved|1=~~~~}} after 7 days and sections whose most recent comment is older than 45 days.

Support and Oppose List

This list documents who supports and who opposes the proposal. It is useful as a numerical summary of the tangled page below, and it also makes clear who thinks what. Feel free to add your own name to it.

Editors in support of this proposal

  1. Yger
  2. Gnom
  3. Jeblad

Editors in opposition to this proposal

  1. Computer Fizz
  2. MER-C
  3. Incnis Mrsi
  4. LightandDark2000
  5. Winged Blades of Godric
  6. Roy17
  7. Johnbod
  8. OhKayeSierra
  9. Vituzzu
  10. Benjamin
  11. 168.244.4.53
  12. Jni
  13. 2001:999:82:9855:DC90:883A:206A:2
  14. GreenMeansGo
  15. Раммон
  16. Veracious
  17. Brandmeister
  18. Millennium bug
  19. PaleoNeonate
  20. Udo T.
  21. Someguy1221
  22. ネイ
  23. Nosebagbear - I'd only support limiting vision to autoconfirmed users.
  24. Ajraddatz
  25. Bbb23
  26. Cullen328
  27. Nick-D
  28. BrownHairedGirl
  29. Ched
  30. Kudpung
  31. Pharaoh of the Wizards
  32. AlasdairW
  33. Llywrch
  34. 2A01:E35:39AE:B540:BD22:78D9:FBE5:D3B9
  35. Camouflaged Mirage
  36. 87.212.10.251
  37. Nyttend
  38. Deepfriedokra
  39. Matthiasb
  40. Blue Rasberry (talk) 20:51, 27 August 2019 (UTC)
  41. Gryllida
  42. Berean Hunter
  43. Edoderoo
  44. Ahmad252
  45. Victar
  46. OJJ
  47. Rschen7754
  48. Smallbones
  49. FocalPoint
  50. Indy beetle
  51. आर्यावर्त (talk) 07:06, 29 August 2019 (UTC)
  52. Arthur Rubin T C (en: U, T). I thought I was clear. I'm opposed, but would be willing to reconsider if better anti-vandalism tools are provided to all (including IPs), and provide better coverage than existing IP-based tools. 05:36, 28 August 2019 (UTC)
  53. Johnuniq (talk) 07:06, 28 August 2019 (UTC) I might have been too subtle below.
  54. Vehemently. Praxidicae (talk) 19:59, 28 August 2019 (UTC)
  55. Strongly, WMF start solving real problems first - you have open bugs and feature requests from 10 years ago. --Dirk Beetstra T C (en: U, T) 07:56, 29 August 2019 (UTC)
  56. Sunny00217
  57. Double sharp (talk) 02:38, 1 September 2019 (UTC)
  58. JFG
  59. Vermont
  60. — Draceane talkcontrib. 07:29, 3 September 2019 (UTC)
  61. Celia Homeford (talk) 12:58, 3 September 2019 (UTC)
  62. Very trivial issue that seems like a pointless chore to implement, let alone put into effect. Do I even need to stress why this will cause more problems than it solves? (Especially when others put it better than I can.) ToThAc (talk) 23:35, 4 September 2019 (UTC)
  63. Vojtasafr (talk)
  64. What a terrible idea Nomad (talk) 16:59, 5 September 2019 (UTC)
  65. Strongly oppose. --Homo ergaster (talk) 17:11, 5 September 2019 (UTC)
  66. All or nothing; implement mandatory registration or leave it as it is. PCHS-NJROTC (talk) 23:46, 8 September 2019 (UTC)
  67. No. Make it possible for the patrollers to also see ip's of logged-in accounts to track puppetry. --Madglad (talk) 03:33, 9 September 2019 (UTC)
  68. * Pppery * it has begun 03:44, 9 September 2019 (UTC)
  69. MrClog (talk) 18:49, 10 September 2019 (UTC)
  70. Trialpears - IP editors are aware that their IP will be shown and if that's a major problem they can register an account without even an email.
  71. Oppose. As noted below, editing without an account is voluntary, and anyone making an anonymous edit is told that the IP address will be visible. As some other editors have pointed out, we can make this clearer to IP editors by adding a checkbox, as suggested below. It is already clearly stated that "Your IP address will be publicly visible if you make any edits." and "Saving the change you are previewing will record your IP address in this page's public edit history.", but we can add a checkbox on top of this to make it even clearer. Pharaoh of the Wizards (talk) 20:12, 15 September 2019 (UTC)
  72. Notrium

Comments/discussion

( incomplete list, will add more later feel free to add your own name ) Computer Fizz (talk) 19:45, 27 August 2019 (UTC)

I have moved this to the top for visibility and collection reasons. Also, to show the WMF because their pronouns and implications imply this is, in some universe, going to pass. Computer Fizz (talk) 02:17, 28 August 2019 (UTC)
Their stated goal here is to develop the anti-abuse infrastructure to the point of no longer needing to reveal IPs. Our anti-abuse infrastructure is not ideal as-is, so this is something that I (and I would hope others) would be open to. But yes, I would oppose implementing IP masking alone. – Ajraddatz (talk) 02:46, 28 August 2019 (UTC)
So are you saying i shouldn't have put your name under oppose? Computer Fizz (talk) 03:18, 28 August 2019 (UTC)
I'm saying that you seem to be reacting to the "privacy enhancement" part of this proposal, not the "abuse mitigation" part. – Ajraddatz (talk) 11:46, 28 August 2019 (UTC)

I don't feel like such lists are useful. This is not a vote, and I, at least, don't want someone adding my name to a list based on how he/she sees my opinion. That's why I took my name off it. Stryn (talk) 20:42, 28 August 2019 (UTC)

Yeah, that's not usually how RFCs work. Normally people add themselves. I think others have objected to being spoken for as well. SQLQuery me! 16:04, 29 August 2019 (UTC)
I never said this list would be the final decider in whether or not it passes. It's just a proof of concept, a visual representation of who thinks what, and something to show to those who act like people enjoy this idea. And I'm sorry if you felt like I was trying to speak for you; when prepopulating the list from the discussion I left out a lot of editors whose views I wasn't entirely sure of. If you want to remove your name I won't be offended, but I also don't see the harm in keeping it. Computer Fizz (talk) 23:26, 29 August 2019 (UTC)
I haven't checked every name in the list, but a quick scan of this page for the exchange between 87.212.10.251 and 24.151.50.175 suggests this list is going to be highly dubious. Let's put it another way Computer Fizz. Do you oppose improvements to anti-vandalism tools if it resulted in increased privacy? It's fine to suggest that you've explored all the options and have ruled out any alternatives to publicly displaying IP addresses. But I don't think that's really the case for the majority here. Not wanting change is an understandable reaction, but at this point we don't even know what the proposed change is going to be, so opposing it without fully understanding all the implications doesn't really make much sense. -- zzuuzz (talk) 23:57, 29 August 2019 (UTC)
My proposed change is to require people to create an account. The only real downside is that those IPs that just make one or two edits, like fixing a typo, and then disappear forever probably won't want to do that, and the typo will just go unfixed. That said, I still think it's miles better than the current proposal. Computer Fizz (talk) 01:24, 30 August 2019 (UTC)

Unique, automatically-generated, human-readable usernames

Sounds good

I proposed this in 2013 Wikipedia:en:User:Sphilbrick/User_naming_convention_proposal At the time I wrote that, I did not focus on the issue of privacy, but I did identify several problems that this approach would address, so while this particular approach may be motivated by privacy issues, if implemented correctly, it will simultaneously solve some other problems.--Sphilbrick (talk) 12:41, 1 August 2019 (UTC)

Sorry but I don't like your idea. It has a lot of problems but I'll just mention one: nobody is going to remember their own username if it's a seemingly random string of characters. -kyykaarme (talk) 20:02, 2 August 2019 (UTC)
I think you are misunderstanding the issue. There is no particular need to remember your username; the entire point of this is to assign a generated username to people who choose not to log in. If you make your first edit without logging in, you will be assigned a name. If at some later point you make another edit, the system will keep track of you and assign the same name, so you won't need to remember it. Secondly, it's not all that hard. It's a string that includes the date you first edited followed by a relatively short sequence, probably a number between one and a couple hundred. My specific scheme can be made a little simpler now, because I originally wrote it before unified login, when it had to keep track of the language. In summary, you don't need to remember your assigned username, but if you want to, it's easy.--Sphilbrick (talk) 21:17, 3 August 2019 (UTC)

It's more human readable, instead of numbers and dots now it's just numbers. Computer Fizz (talk) 17:51, 16 September 2019 (UTC)
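
A minimal Python sketch of the date-plus-sequence naming described above. The exact format (`YYYYMMDD-N`) and the class name `DateSequenceNamer` are assumptions made for illustration; the linked proposal may use a different layout.

```python
from collections import defaultdict
from datetime import date


class DateSequenceNamer:
    """Hands out names made of the first-edit date plus a per-day counter.

    Hypothetical helper: the name format is an assumption, not the
    format used in the linked proposal.
    """

    def __init__(self) -> None:
        # Number of names already handed out for each first-edit date.
        self._issued = defaultdict(int)

    def assign(self, first_edit: date) -> str:
        self._issued[first_edit] += 1
        return f"{first_edit:%Y%m%d}-{self._issued[first_edit]}"


namer = DateSequenceNamer()
first = namer.assign(date(2019, 8, 3))   # first unregistered editor that day
second = namer.assign(date(2019, 8, 3))  # second unregistered editor that day
```

The counter only stays "relatively short" if the number of new unregistered editors per day is small; a busy wiki would need longer sequences.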

IP Encryption?

On the thought of what text string should be presented instead of an IP address, the idea of presenting an encrypted IP address of some sort springs to mind. It's similar to how password checking works - a password is entered and encrypted, and then compared to a stored encrypted version, without the unencrypted version being stored. In the case of IP addresses, as long as a unique encryption algorithm is used, I'm thinking an encrypted IP address could be shown wherever a plain IP address currently is. That should still work for, say, checkuser checks where the checkuser can see that a registered account and an unregistered editor do or do not share the same encrypted IP without being able to see the plain IP, but with privacy protection in that you can't determine location information from it. And, given that there's no longer any private location information, maybe checkusers could then even be allowed to link registered accounts to encrypted IPs, where they're currently forbidden from doing so. You wouldn't be able to do range checks and range blocks from an encrypted IP with current tools, but presumably tools could be developed to, say, "Show contributions made by the /xx range in which the IP associated with this encrypted IP lies", or "Range block the /xx range associated with this encrypted IP". Anyway, it's early where I am and I haven't been up long and I might be talking nonsense - or I might be teaching Grandma to suck eggs. But I thought I'd offer my thoughts. Boing! said Zebedee (talk) 10:08, 2 August 2019 (UTC)

We'd rather hear something we're aware of ten times than miss important input because someone figured we'd already thought about it. (: /Johan (WMF) (talk) 10:13, 2 August 2019 (UTC)
Definitely an idea worth exploring. The problem with encryption is, what if the WMF is hacked and the key is stolen? Then an attacker can decrypt the IPs. On the other hand, if you perform a one-way hash on the IP, like is best practice with passwords, how do the checkusers get the IPs back? Also, hashes of IPs could be precomputed, so you'd need a salt, but then that needs to be kept secret... Basically, good crypto is hard, and if checkusers/WMF have a backdoor, then so does everyone. BethNaught (talk) 11:50, 2 August 2019 (UTC)
@BethNaught: One option would be to store the IP but only display the one-way hash. But this could eventually be broken. GoldenRing (talk) 12:13, 2 August 2019 (UTC)
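
To illustrate the precomputation concern raised above, here is a minimal Python sketch. It assumes SHA-256 for the plain hash and HMAC-SHA-256 with a server-side secret for the keyed variant; `SECRET_KEY`, the function names, and the 12-character truncation are all hypothetical choices, not anything proposed by the WMF.

```python
import hashlib
import hmac
import ipaddress
from typing import Optional

# Hypothetical secret; if it leaks, the keyed hash is no better than the plain one.
SECRET_KEY = b"example-secret-not-for-real-use"


def plain_hash(ip: str) -> str:
    """Unkeyed SHA-256 of the packed address; anyone can recompute it."""
    packed = ipaddress.ip_address(ip).packed
    return hashlib.sha256(packed).hexdigest()[:12]


def keyed_hash(ip: str, key: bytes = SECRET_KEY) -> str:
    """HMAC-SHA-256 of the packed address under a secret key."""
    packed = ipaddress.ip_address(ip).packed
    return hmac.new(key, packed, hashlib.sha256).hexdigest()[:12]


def brute_force_plain(target: str, network: str) -> Optional[str]:
    """Recover an address from its plain hash by hashing every address in a range."""
    for candidate in ipaddress.ip_network(network):
        if plain_hash(str(candidate)) == target:
            return str(candidate)
    return None


displayed = plain_hash("192.0.2.77")
recovered = brute_force_plain(displayed, "192.0.2.0/24")
```

The same loop over the whole IPv4 space is only about four billion hashes, which is why an unkeyed hash offers little protection. The keyed version resists this only for as long as the key stays secret.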
I see two problems with this, one particular to this scheme and one that's been mentioned several times before:
  • Firstly, it's only as privacy-protecting as your encryption is strong. Once the encryption is broken, it will reveal all IP addresses ever recorded using it. The encryption could be broken by (a) newly discovered vulnerabilities in the encryption scheme used (or old vulnerabilities if the choice is made poorly); (b) someone leaking the encryption key; (c) someone stealing the encryption key; or (d) someone breaking the encryption (brute force attacks etc). Only the first of these can be avoided indefinitely, if you happen to choose a perfect encryption scheme with no implementation bugs. All of the rest will happen eventually; all you can do is make them as unlikely and as far-in-the-future as possible. Perhaps you could find ways of changing the key over time, though any scheme I can think of has enormous drawbacks; principally, that anyone who refers to an IP by the encrypted text on-wiki would expect that text to remain constant, so that others will understand in the future what they were talking about.
  • Secondly, WHOIS information is useful. When I'm dealing with an IP, my response will be quite different depending on whether WHOIS says it's a school, a university college, a business, a public library or a consumer ISP. One potential way around this would be to record edits against a CIDR range instead of an individual IP; this would allow us to WHOIS an IP edit and see where it came from but it would make it impossible to conclusively tie an individual edit to an individual editor. It might also make dealing with IP-hopping editors on dynamic addresses easier. But I'm sure others will think of problems with this scheme.
Overall, I have some sympathy for the suggestion above that we just turn off IP editing. I'm not really convinced by the argument that this would make dealing with vandals much harder; for anyone who really wants to avoid this sort of scrutiny, creating an account is already an option and if that makes it so much harder to handle them, it's hard to believe that the determined vandals are not doing this already. On the other hand, the bar of creating an account (as low as it may be) would deter some of the drive-by vandalism we get; it's hard to think that much of what I refer to as the "Welsh school lunchtime" contingent could be bothered creating an account to change the name of their head teacher to something endlessly amusing. GoldenRing (talk) 12:13, 2 August 2019 (UTC)
One problem with turning off IP editing is that lots of good-faith editors start off making a few IP edits. So losing IP editing would raise a barrier against those editors and lose us a proportion of them. While if it is true that vandals and spammers do the minimum needed to do their vandalism or spam, we won't lose any vandalism or spam, we'll just make it a little harder to spot. WereSpielChequers (talk) 09:22, 29 August 2019 (UTC)
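
The suggestion in the second bullet above, recording edits against a CIDR range rather than an individual IP, can be sketched in Python as follows. The prefix lengths (/24 for IPv4, /64 for IPv6) and the function name `edit_range` are assumptions for illustration only.

```python
import ipaddress


def edit_range(ip: str, v4_prefix: int = 24, v6_prefix: int = 64) -> str:
    """Return the CIDR range an edit would be attributed to instead of the IP.

    Hypothetical helper; the default prefix lengths are illustrative guesses.
    """
    addr = ipaddress.ip_address(ip)
    prefix = v4_prefix if addr.version == 4 else v6_prefix
    # strict=False lets us pass a host address and have the host bits zeroed.
    network = ipaddress.ip_network(f"{addr}/{prefix}", strict=False)
    return str(network)


v4_range = edit_range("192.0.2.77")
v6_range = edit_range("2001:db8:1:2:3:4:5:6")
```

WHOIS lookups still work on the stored range, but every editor inside the same range is shown identically, which is the trade-off the comment describes.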

This type of encryption scheme would be unworkable with regard to rangeblocks, for several reasons. Firstly, ISPs are inconsistent with how they both report and assign the subnets from a given allocation. A user could be confined to a private subnet that is not even listed on the WHOIS data, or he could be roaming over a shared pool that is somewhere between the largest and smallest networks listed on WHOIS. A Comcast-USA customer can be blocked with zero collateral, while it is impossible to block a T-Mobile-USA customer without also blocking 15-20 million other customers, and in either case you need to either see who is on the range or simply know about the ISP to configure the block correctly. If they are just going to hand us the ISP information, including location, then there was no point to the fancy encryption scheme to begin with. And if they are going to keep that a secret but let us see everyone on the same /XX range where XX can be anything, privacy is also out the window, because some of those people are not going to be shy about where they are. And if you think you can get around this by having some internal software automatically find the smallest range that catches all of someone's secret IPs, that is even worse. You could have the software report how many people share the range it's chosen, but you're taking a sledgehammer to a problem that could be excised with a scalpel. That is, if someone vandalizes Wikipedia from their home computer, and then they switch to their neighbor's wifi to vandalize it again, and you include both IPs to have some secret serverside thing make a rangeblock, congratulations, you've just blocked all of Chicago. But if you'd known the actual IPs, you could have just blocked the vandal. 
And regardless, even under that scheme where even the admin never knows what the IPs were, privacy is still out the window because when you do take out an entire city or country, a bunch of people are going to file unblock requests, and unless WMF is going to have us handle those blind, we're probably going to know which block caught them, and boom goes the privacy. In brief, there are a lot of very clever ways to mask IP addresses, but we already have an excellent one deployed - registration. Future technology developments should focus on abuse mitigation, not abuse facilitation. Someguy1221 (talk) 23:41, 2 August 2019 (UTC)
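
The "smallest range that catches all of someone's IPs" problem described above can be made concrete with a short Python sketch. The function name `smallest_common_range` is hypothetical, and the addresses in the example are reserved or private ranges, not real ISP allocations.

```python
import ipaddress


def smallest_common_range(ip_a: str, ip_b: str):
    """Return the smallest CIDR network that contains both addresses."""
    a = ipaddress.ip_address(ip_a)
    b = ipaddress.ip_address(ip_b)
    if a.version != b.version:
        raise ValueError("addresses must both be IPv4 or both be IPv6")
    # The highest bit in which the two addresses differ decides how many
    # leading bits they share, and so how long the common prefix can be.
    differing = int(a) ^ int(b)
    prefix = a.max_prefixlen - differing.bit_length()
    return ipaddress.ip_network(f"{a}/{prefix}", strict=False)


near = smallest_common_range("192.0.2.10", "192.0.2.200")
far = smallest_common_range("10.1.2.3", "10.200.2.3")
```

Here `near` covers 256 addresses while `far` covers 16,777,216. A server-side tool that picked ranges this way without showing the addresses would give an admin no means of judging that collateral.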

Synthetic user names

I have written about this before, but can't quite remember where.

This can be done simply by taking the existing IP address and doing a modulus division. The remainder can then be used for lookup in a hashed name table. Doing this with several base numbers will give several names, and joining them will then give a synthetic name. Instead of a cryptic 123.231.012.210 you will get something like “Colorful Moccasin” or perhaps “Hungry Crocodile”. To make the name recognizable as anonymous they should be prefixed with something like “Anonymous”, so full names would be “Anonymous Colorful Moccasin” and “Anonymous Hungry Crocodile”.
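
A minimal Python sketch of the modulus lookup described above. The two word lists are hypothetical and deliberately tiny; their lengths (5 and 7) play the role of the "several base numbers", and a real table would need to be far larger.

```python
import ipaddress

# Hypothetical word tables; the list lengths act as the modulus bases.
ADJECTIVES = ["Colorful", "Hungry", "Quiet", "Brave", "Sleepy"]
NOUNS = ["Moccasin", "Crocodile", "Lemur", "Heron", "Badger", "Otter", "Finch"]


def synthetic_name(ip: str) -> str:
    """Map an IP address to a readable name via two modulus lookups."""
    n = int(ipaddress.ip_address(ip))  # the address as one integer
    adjective = ADJECTIVES[n % len(ADJECTIVES)]
    noun = NOUNS[n % len(NOUNS)]
    return f"Anonymous {adjective} {noun}"


name = synthetic_name("192.0.2.1")
```

With only 5 × 7 = 35 combinations, addresses that differ by a multiple of 35 collide: `192.0.2.1` and `192.0.2.36` get the same name in this sketch. The mapping is also fully deterministic, so anyone holding the tables can narrow down which addresses could have produced a given name.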

The synthetic name can be stored as a cookie on the machine, and as long as the cookie exists, the same synthetic name is reused for editing from that machine.

To avoid two different users on the same IP address being given the same synthetic name, the IP address should be merged with a slowly changing random number. It should still update faster than the lease time on the IP address. That will make the system generate new synthetic names for each anonymous machine over time.

A variation is to replace or augment the random number with a digest of the user agent's header fields. That will make a more machine-specific synthetic name, but note that nothing stops someone from using the exact same machine, OS, and browser as someone else from within the same school, university, or company.
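
A minimal Python sketch of mixing the address with a slowly changing value and, optionally, a user-agent string before the name lookup. Using SHA-256 as the mixing step is an assumption (the comment above only says "merged"), and `period_salt`, `mixed_seed`, and the word lists are hypothetical names.

```python
import hashlib
import ipaddress

# Hypothetical word tables, kept tiny for illustration.
ADJECTIVES = ["Colorful", "Hungry", "Quiet", "Brave", "Sleepy"]
NOUNS = ["Moccasin", "Crocodile", "Lemur", "Heron", "Badger", "Otter", "Finch"]


def mixed_seed(ip: str, period_salt: bytes, user_agent: str = "") -> int:
    """Combine the IP, a per-period random value, and the user agent into one integer."""
    packed = ipaddress.ip_address(ip).packed
    material = period_salt + b"|" + packed + b"|" + user_agent.encode("utf-8")
    # Assumption: a hash is used as the merge step so the inputs are not recoverable
    # by simple arithmetic on the seed.
    return int.from_bytes(hashlib.sha256(material).digest()[:8], "big")


def synthetic_name(ip: str, period_salt: bytes, user_agent: str = "") -> str:
    seed = mixed_seed(ip, period_salt, user_agent)
    return f"Anonymous {ADJECTIVES[seed % len(ADJECTIVES)]} {NOUNS[seed % len(NOUNS)]}"


this_week = b"salt-for-period-1"  # would be regenerated each period
seed_a = mixed_seed("192.0.2.1", this_week, "Browser A")
seed_b = mixed_seed("192.0.2.1", this_week, "Browser B")
seed_next = mixed_seed("192.0.2.1", b"salt-for-period-2", "Browser A")
```

Two machines behind the same address get different seeds when their user-agent strings differ, and every seed changes when the period value is rotated. As the comment notes, identical browsers on the same network would still be indistinguishable.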

A very interesting, yet somewhat difficult, solution is to make a user-specific fingerprint. It will be difficult to make it stable enough that the same user is assigned the same synthetic name in each session. It is, however, possible to compare fingerprints to see whether two sessions are likely to belong to the same user. A fingerprint will typically be w:keystroke dynamics, which will be collected in the browser and transferred to the server as the content of a cookie. A user will then slowly, over time, establish a stable fingerprint. — Jeblad 14:39, 5 August 2019 (UTC)

This doesn’t address long-term abusers who use the “private” mode in browsers, or who may delete cookies manually. These concerns are expressed on this talk page so many times (and by different authors) that I won’t bother to find a specific link for Jeblad. Moreover, some IP ranges are infested with physically distinct abusers, with schools and cheap/leaky hosting services as the best-known examples. Incnis Mrsi (talk) 16:35, 5 August 2019 (UTC)
A cookie can be used to maintain the same pseudonym on a laptop or similar device, but it cannot be used for abuse mitigation. [Put slightly differently: you can maintain trustworthy information about a good editor on the editor's device, but you cannot maintain the same trustworthy information about a bad editor on his device.] That is actually a faulty assumption made by the developers of the present “cookie ban approach”, aka Community health initiative/Partial blocks. I have made a note at Community health initiative/Partial blocks#User specific blocks won't work alone. I have also made a whole bunch of comments at Talk:Community health initiative/Blocking tools and improvements; they should be pretty informative about why you can't use a cookie to maintain a block. — Jeblad 17:32, 5 August 2019 (UTC)

Regarding the automatically generated handles

> Q: If we don’t see IP addresses, what would we see instead when edits are made by unregistered users?

> A: Instead of IP addresses, users will be able to see a unique, automatically-generated, human-readable username. This can look something like “Anonymous 12345”, for example.

I don't like this handle, would suggest

  • If random, make it non-numeric, i.e. 'Mooing lemur' or 'Flying eggs' or another nickname that is more readable.
  • Anything customizable and marked as non-registered could also work if they store it in a cookie.

Thanks and regards, --Gryllida 00:00, 6 August 2019 (UTC)

Assign unregistered log ins a guest name

I have raised this issue a number of times, so I'm pleased something is now being done. It's quite unusual for a website to intentionally publicly reveal people's IP address, and to do so without giving users a full and appropriate explanation of the consequences of having their IP address revealed to everyone permanently.

Each unregistered log in can be assigned a guest name and number that is associated with that IP address. The guest name being selected randomly. So an IP address, such as 172.16.254.1, could become, for example, Guest NW567, and each use of 172.16.254.1 would be shown as Guest NW567. SilkTork (talk) 09:23, 6 August 2019 (UTC)

We could create a new user field - GuestUser, to run alongside User. This would immediately identify the user as someone who has intentionally not created an account, and such users would have the same limited access as IP users. Where an IP address is known to belong to a school or library, this could be identified as GuestUserLibrary and GuestUserSchool. SilkTork (talk) 09:33, 6 August 2019 (UTC)
Great idea. Perhaps we can add country code as well GuestUserUk, GuestUserUs etc. followed by Com, Edu, Lib, Xxx whatever so we have an idea where Nomen Nescio hails from.
Greetings & salutations, Klaas `Z4␟` V:  10:11, 6 August 2019 (UTC)
@SilkTork: sure, but it is also quite unusual for a website to intentionally allow anyone to edit.
Also, people do get warned that their edits are associated with their IP address ("Saving the change you are previewing will record your IP address in this page's public edit history"). I can agree that that warning could (or even should) be more elaborate, but it is there. --Dirk Beetstra T C (en: U, T) 10:22, 6 August 2019 (UTC)
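
The guest-name assignment proposed at the top of this section can be sketched in Python as a lookup table from IP address to a randomly chosen label. The `GuestNameRegistry` class and the two-letters-plus-three-digits format (modelled on the "Guest NW567" example) are assumptions for illustration.

```python
import random
import string
from typing import Dict, Optional, Set


class GuestNameRegistry:
    """Assigns each IP address a random, stable guest name.

    Hypothetical sketch: the mapping lives in memory and uses the `random`
    module, which is not suitable where names must be unpredictable.
    """

    def __init__(self, seed: Optional[int] = None) -> None:
        self._rng = random.Random(seed)
        self._by_ip: Dict[str, str] = {}
        self._used: Set[str] = set()

    def name_for(self, ip: str) -> str:
        if ip not in self._by_ip:
            # Draw random labels until we find one not already taken.
            while True:
                letters = "".join(self._rng.choices(string.ascii_uppercase, k=2))
                digits = self._rng.randint(100, 999)
                name = f"Guest {letters}{digits}"
                if name not in self._used:
                    break
            self._used.add(name)
            self._by_ip[ip] = name
        return self._by_ip[ip]


registry = GuestNameRegistry(seed=1)
first_visit = registry.name_for("172.16.254.1")
second_visit = registry.name_for("172.16.254.1")
other_guest = registry.name_for("172.16.254.2")
```

Because the name is chosen at random rather than derived from the address, it reveals nothing about the IP on its own. The privacy of the scheme then rests entirely on who can read the stored table.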

L10n/i18n

Now I know I'm getting way ahead, but remember that MediaWiki is proud of its internationalisation and localisation, supporting hundreds of languages. The specific suggestions above are only starting points, I know, but any final solution will at least need to be:

  1. language-neutral and script-neutral or localised before it's deployed;
  2. gender-neutral (if we need to ask for gender preferences we defeat the purpose of non-registration);
  3. and yet sufficiently stable to not break signatures, SUL and so on.

There is no way we're going to have a sufficiently advanced dictionary to support a string of the kind "Anonymous <adjective> <noun>" in 400 languages, so we can discard that option outright. We could use the "User" namespace translation plus a number (and then format the number according to each language's convention), but this would already break (3): changing translation would be impossible because we'd need to rename the user across all wikis.

So in practice there aren't many options: we'll probably need the username to be just a number, or some neutral Unicode character plus a number, although the display might be different (cf. #IP address attribution is just rubbish). This is what StackExchange does, unsurprisingly, although they prefix the number with "user" because they're basically English-only. Nemo 16:24, 6 August 2019 (UTC)

We don't localize user names, or demand that user names must be localizable, so a claim that we must localize anonymous (or pseudonymous) user names is invalid, at least for now and the foreseeable future. I'm not even sure it would be possible, as user names are often constructed from real names, and we cannot and should not try to localize real names or demand that they be localizable. We can, however, create a subsystem for transliteration that can handle real names and pseudonyms.
If we want to create synthetic names and make them translatable (which would be contrary to the present handling of pseudonyms), then we can use existing translated sets of terms. Note that we don't have to translate "Anonymous <adjective> <noun>" in 400 languages, which, as the previous participant argues, would create a problem, but only translate "<adjective>" and "<noun>" in 400 languages, which would be a much simpler problem. This neglects problems with languages that have changes in grammatical case depending on the adjective or noun, but I'm not sure we must (or need to) support this.
In short; I believe this is a constructed problem, and it is an attempt to escalate non-existing problems to keep the current system. — Jeblad 12:16, 16 August 2019 (UTC)
We do localise the part of "usernames" that is software-made, i.e. the namespace. We don't need to translate the usernames because it's the users' choice to use whatever language is appropriate to their activity. Nemo 07:54, 26 August 2019 (UTC)
While we might not have "Anonymous <adjective> <noun>" in 400 languages, it's okay to have graceful degradation and fallback to either a neutral number or an English user name when there's no translation into the language. ChristianKl❫ 17:01, 27 August 2019 (UTC)
Not a fallback to English, because that would produce a mass of untranslatable software-generated "interface"; yes a fallback to a language-neutral method. However, I see little point in developing a method which doesn't scale language-wise: better focus on a method which can scale. Nemo 19:06, 10 September 2019 (UTC)

Premise of the idea

Since this point has apparently not been considered, what about the people who don't mind seeing their IP address broadcast? The right of privacy is a right, not an obligation. Jo-Jo Eumerus (talk, contributions) 18:15, 1 August 2019 (UTC)

@Jo-Jo Eumerus: Once IP addresses are no longer shown by default for anonymous users, I imagine we could remove the IP address pattern from the username blacklist. Then you could just set your username to your IP address if you really wanted to. The only reason we don't allow that currently is because it confuses people into thinking that a logged-in editor is a logged-out editor (which would no longer be an issue). Ryan Kaldari (WMF) (talk) 19:24, 1 August 2019 (UTC)
I don't think that would work, as we have far too many existing edits attributed to IPs for it to make sense for some to suddenly be registerable. Note the description of this project specifically says that existing edits aren't going to be reattributed, which makes sense since replacing signatures in talk page histories would be far too disruptive. Anomie (talk) 21:18, 1 August 2019 (UTC)

Huh? How do we know that forcing people to register accounts - which that would require - isn't going to cause problems with participation? To say nothing that IPs are not fixed... Jo-Jo Eumerus (talk, contributions) 21:25, 1 August 2019 (UTC)

I can't think of a non-wiki website that displays the IP of contributors. I suspect that some forums still do, but I haven't encountered it for years. Bitter Oil (talk) 21:30, 1 August 2019 (UTC)
That's because there's hardly a blog or a forum nowadays that does not require registration. Kudpung (talk) 19:52, 3 August 2019 (UTC)
I'm a "moderator" (loosely equivalent to Sysop) on a bulletin board (which requires registration). (I can delete, edit, move, and release moderated posts, but I cannot block; only "admin"s can block, limit, or "moderate" users.) IP addresses are available to me on any flagged posts or posts marked for moderation. I don't recall signing a non-disclosure agreement....
I believe the sign-up process does require the user to (somehow) acknowledge that the IP address is stored. We find the IP list quite helpful. Of course, there are no specific tools which scan the full database, so I cannot tell whether such tools would help find sock puppets better than observation. — Arthur Rubin T C (en: U, T) 05:47, 28 August 2019 (UTC)

Simpler solution - turn off IP editing

All of the privacy concerns identified here could be addressed by disallowing anonymous editing. In 2019, registering an account (with no requirement for email or any other kind of validation) is a ridiculously low bar. There would be substantial cost savings, since all of the design, specification, and creation of software necessitated by this "enhancement" would be eliminated. More to the point, Wikipedia volunteer administrators would not experience a substantial shift in their decades-long practice. All of this seems obvious, but someone had to say it. Bitter Oil (talk) 18:34, 1 August 2019 (UTC)

As someone who has spent a lot of time blocking autoconfirmed vandalsocks, I have to also say that blocking new accounts is like playing Whac-A-Mole. There are good reasons for keeping unregistered users on IP addresses - namely being able to effectively block them. zzuuzz (talk) 18:47, 1 August 2019 (UTC)
If anonymous editing were just turned off, or changes were sufficiently disruptive, I could foresee that on EnWiki the practice of warning vandals before blocking would be abandoned under a deluge of vandalism. Sentimentally, I would support a zero-tolerance approach to vandalism, but in reality this would breed even more trigger-happy admins, further reducing the wiki's welcomingness and diversity. BethNaught (talk) 19:14, 1 August 2019 (UTC)
I think the WMF is telling us that they are going to hide IP addresses. Leaving things as is does not seem to be an option. Turning off IP editing is much less work to accomplish the same goals. Perhaps they could use the budget for this project to work on better tools for admins instead. Bitter Oil (talk) 20:27, 1 August 2019 (UTC)
Your edit summary is unappreciated. I know very well that the WMF can play games, but I also know that the community is a match for them. Whether the WMF is being deceptive or not with this initiative, we have an opportunity to steer the direction or even stop it. As to whether they are being deceptive, I am not going to engage in any drama-mongering at this stage.
Ironically, if it hadn't been for the edit summary, I would have been sympathetic towards your comment. BethNaught (talk) 20:41, 1 August 2019 (UTC)
Community consultations don't have sections called "implementation plans" and don't say things like "We'd like to figure out sensible early steps that a development team could work on soon" and don't have the "Director of Engineering" fielding questions. It isn't drama-mongering to point out this is a done deal. Accepting that will greatly reduce the drama potential. Bitter Oil (talk) 21:21, 1 August 2019 (UTC)
I can really see where this looks like a done deal. It looks like that to me too. The phab ticket looks like a developed plan to implement it, the research paper looks like a prepared justification, it appears input is mostly being solicited on how to implement it - and not if it should be implemented, and it looks like this has been discussed privately for at least a year if not longer. Even as someone that thinks that this could be implementable with lots of hard work, a few to several years down the line - that's how this appears. SQLQuery me! 02:09, 5 August 2019 (UTC)
Of course, the drawback of "accepting something to avoid drama" is that the accepted thing may be a bad one. And turning off IP editing is likely going to be bad, since it will reduce participation, and we don't know what the impact will be. Jo-Jo Eumerus (talk, contributions) 21:34, 1 August 2019 (UTC)
  • Turning off IP editing is an awful idea that goes against Wikimedia's ethos of "anyone can contribute". We already have so many editor retention problems across the spectrum that adding another bar to initial participation would severely exacerbate our biggest problem. — Bilorv (talk) 18:44, 2 August 2019 (UTC)
  • I support this proposal. Speaking as an EnWiki admin, vandals seem much more likely to operate from unregistered accounts than registered ones. Requiring people to take the simple step of registering an account will considerably thin out the amount of vandalism volunteer admins need to respond to, for almost zero loss to content improvements. As noted above, it would also be simple and very low cost to implement. Nick-D (talk) 02:06, 3 August 2019 (UTC)
  • While the concern of barring new editors further is valid (there are people who prefer not to have an account tied to their edits), considering most people already make accounts for social media and other websites, it won't be a barrier to most. Thus, the removal of anon editing may be the right choice.
Additionally, just hiding IP addresses will hurt our current system (at least, on the English Wikipedia) - for instance, marking shared IPs or finding proxies will be exponentially more difficult as we will only be able to find them based on behavioral evidence, which is subjective and not as good as a "whois" search that can provide a definitive answer.
Both of these do provide a good reason to remove anon editing, but there is the spirit of Wikipedia at stake here as well. Ultimately it will be up to those who created said spirit to decide if it will change. Kirbanzo (talk) 14:13, 9 August 2019 (UTC)
  • As a member of the English Wikipedia, I endorse this idea for ironically the same reason User:Bilorv opposes it. In the past few years, I've been seeing some absolutely ridiculous blocks on shared IPs and even /16 ranges by ignorant administrators who are more interested in attempting to prevent school children (and immature adults for that matter) from ever being able to write silly comments on articles (which will never work on an open wiki) than promoting the fundamental Wikipedian philosophy of an open encyclopedia project where any person in the world can contribute to human knowledge. It's 2019, an increasing number of people only have internet through their cell phone, yet admins are thinking nothing of blocking /16 ranges belonging to the top four American cell companies because the likes of some punk 13 year olds circumventing their six year school block grinds their nerves so much that stopping them outweighs recruiting and retaining new users in their minds. It only takes a minute to register an account, and doing this will restrict IP blocks to CheckUsers, who should be experienced in recognizing patterns of abuse rather than blocking four year high schools for "repeat vandalism" because some girl wrote how awesome she thinks she is on an article *one time* after the expiration of a five year block. There has to be a better way to address vandalism than these massive blocks, and with privacy issues in mind, requiring registration seems to be a no-brainer at this point. PCHS-NJROTC (talk) 11:48, 8 September 2019 (UTC)
    It takes one minute to create an account and then you forget the details the next time you come back and have to sign up again, and that's too much hassle and then we lose an editor. The solution to admins making uneducated blocks is to educate those admins. Even 60 seconds to create an account is a genuinely very high barrier to the average first productive edit, which is something like "this sentence is missing a full stop" that the reader expects to take 10 seconds to fix, not a full 3 minutes (sign up, now it's redirected you to another page so find your way back, try to find the mistake again and click edit and then the code on the editing screen makes no sense to you so you either give up or have to spend another minute working out which bit the actual text is). I think it's quite rare for someone to go "I absolutely and unequivocally want to start editing Wikipedia as a hobby". More common is someone seeing things they can improve and gradually being drawn in. Forcing account creation cuts off that low entry bar. — Bilorv (talk) 15:03, 8 September 2019 (UTC)
    Finally someone gets it. I've literally made the same argument about people not waking up in the morning with a burning desire to become a Wikipedian, yet the admins on the English Wikipedia continue to say "oh, if someone wants to edit from the IP I blocked that represents 50,000 people, they can always submit a request to create an account." I particularly see this with school networks, but it is becoming an increasing problem on wireless networks and corporate networks too; admins argue that schools do nothing but vandalize, but that's simply not true. Personally, if a school gives us 15 edits in a month, 14 of which are vandalism, I think it's worth minimizing blocks on it because that one person who fixes vandalism for the first time from the school library may become a valuable administrator who fixes 100,000 articles but may never make that first edit if greeted with "to edit, please create an account at home and log in with it here" while the 14 troublemakers can always find another network to act stupid from. Not to mention, it's easier from a vandal fighter's perspective when youth do their vandalism from school than at home or somewhere else because I can always watch the /24 range for Sarasota County Schools or the /16 range for the Florida Information Resource Network for silliness and get rid of it, whereas a DSL range with a couple of regular contributors who refuse to create an account is more difficult to monitor for garbage. Unfortunately, no matter how much I scream at admins for making foolish blocks, their behavior continues and I begin to look like the bad guy over there beating a dead horse.
Eliminating IP editing knocks out two birds with one stone by addressing this privacy concern and discouraging administrative stupidity, and while having to create an account probably will create an inconvenience for newbies, people who are serious about wanting to participate are more likely to take a minute to create an account the usual way than to request an account be created for them. PCHS-NJROTC (talk) 18:02, 8 September 2019 (UTC)

Make registration compulsory

I feel it is better to make registration compulsory rather than this. No email or verification is required for anyone wishing to edit this project, and anyone can register within a minute. Further, vandalism from an account is easy to detect, and total privacy is preserved. Pharaoh of the Wizards (talk) 11:51, 2 August 2019 (UTC)

Never ever. Fundamentally anti-wiki. Winged Blades of Godric (talk) 12:16, 2 August 2019 (UTC)
It could be an automatic registration, with an automatic username linked to the IP address, which is used every time this IP address uses WP without registration. This way, the IP is protected the same way a registered user is, and these accounts are managed with the same tools as registered users. No big changes, but the IPs are protected. --Jean-Christophe BENOIST (talk) 12:35, 2 August 2019 (UTC)
How do we determine sock-hops or do range block and all that ..... Winged Blades of Godric (talk) 13:14, 2 August 2019 (UTC)
Maybe a smart algorithm can generate usernames from which we cannot infer the IP, but that are close together when they are in the same IP range. Feasible for IPv4 IMO but hard for IPv6. --Jean-Christophe BENOIST (talk) 08:25, 3 August 2019 (UTC)
For example, if we use "WhoIs" on an IPv4 or IPv6 address, it gives us the CIDR/range of the IP network. The algorithm could generate a random ID for this network. Then, the low-order bits (the host's place in the network) can also be obfuscated in a deterministic way (a hash, for example). For example, WhoIs 92.154.22.48 => 92.154.22.0/24 => AEXFUU4XAB. 48 => EFFC8. Username => AEXFUU4XAB-EFFC8. MediaWiki of course retains a private table which can do the reverse. --Jean-Christophe BENOIST (talk) 08:42, 3 August 2019 (UTC)
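The scheme described just above can be sketched roughly as follows. This is only an illustration under stated assumptions, not anything MediaWiki implements: the class name IPMasker, the keyed hash (HMAC-SHA256), the ID lengths, and passing the WHOIS range in as a plain CIDR string are all hypothetical choices made for the example.

```python
import hashlib
import hmac
import ipaddress
import secrets
import string

ALPHABET = string.ascii_uppercase + string.digits
NETWORK_ID_LENGTH = 10  # matches the length of "AEXFUU4XAB" in the example
HOST_ID_LENGTH = 8      # assumption: longer than "EFFC8" to make collisions rarer


class IPMasker:
    """Hypothetical sketch: random ID per network, keyed hash per host."""

    def __init__(self, secret_key: bytes):
        self._key = secret_key
        self._network_ids = {}  # network -> random ID (private)
        self._reverse = {}      # username -> IP address (the private table)

    def mask(self, ip: str, cidr: str) -> str:
        # `cidr` stands in for the range a WHOIS lookup would return.
        network = ipaddress.ip_network(cidr)
        address = ipaddress.ip_address(ip)
        if address not in network:
            raise ValueError(f"{ip} is not inside {cidr}")

        # One random ID per network, so editors in the same range share a prefix.
        net_id = self._network_ids.get(network)
        if net_id is None:
            net_id = "".join(
                secrets.choice(ALPHABET) for _ in range(NETWORK_ID_LENGTH)
            )
            self._network_ids[network] = net_id

        # Deterministic obfuscation of the host part with a keyed hash.
        host_bits = int(address) - int(network.network_address)
        digest = hmac.new(
            self._key, f"{network}|{host_bits}".encode(), hashlib.sha256
        ).hexdigest()
        host_id = digest[:HOST_ID_LENGTH].upper()

        username = f"{net_id}-{host_id}"
        self._reverse[username] = address
        return username

    def unmask(self, username: str):
        # Only holders of the private table can go back to the IP.
        return self._reverse[username]
```

With a sketch like this, two addresses in 92.154.22.0/24 would get usernames sharing the same first part, which is what would let range-level patterns stay visible without exposing the range itself. A truncated hash can still collide, so a real design would need to handle that case.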
I agree that compulsory registration goes against the wiki way, and I think that the value of allowing anonymous and easy contribution has been discussed often over Wikip/media's existence. The idea that Jean-Christophe puts forward is one possible solution, though it would still make identifying contributions from a range very difficult. A compromise might be to strongly encourage account creation, through a pop-up message when saving an edit while logged out that says "hey, this will make your IP address visible, are you sure you want that?" That combined with making the account creation process easier (for example, a check to confirm username availability before pressing the register button would be nice) might be the right balance between privacy and maintaining our anti-abuse systems. – Ajraddatz (talk) 12:55, 2 August 2019 (UTC)

Wake up folks, it's 2019. People have no problem registering for Instagram, Telegram, Facebook, etc. Registering an account is a very low bar to participation. Make up a username, solve a captcha, you're done. All the concerns expressed here about privacy and access to information are solved. This project is an unnecessary waste of resources. Bitter Oil (talk) 14:43, 2 August 2019 (UTC)

Sure but from an enwiki perspective (forgive me) it will result in a lot of backlogs, particularly at SPI. If I had to guess, a lot of IP edits are one off interested editors correcting a minor spelling error, adding a source, generally improving an article. So let's say someone finds a spelling error at Apple, they'd be forced to make an account to correct it. They come back 3 weeks later, can't remember the password, didn't register with an email, so they create another, then another, then another and it looks like sock puppetry. I don't see an issue for the most part with the current IP editing system and think that creating barriers to good faith editors (particularly those that work in anti-spam and anti-vandalism) is better for any project. The exception to this is the handful of sensitive IPs, like people editing accidentally (or intentionally) from their work address but that doesn't seem like a good reason to throw out the entire system. I would be really interested in a report on IP statistics before any serious changes are even proposed that outline the percentage of IP edits vs. account edits (if this already exists, my apologies.) Praxidicae (talk) 14:54, 2 August 2019 (UTC)
  • I think there are (at least) two objections to making registration compulsory. One is political, in that "the encyclopedia anyone can edit" is a key part of the Wikipedia proposition. OK, I know it's not literally anyone and never has been, but eliminating unregistered editing would be a big political move and I think many Wikipedia critics would see it as evidence of failure. The other objection is as Praxidicae suggests, that it prevents people who are predominantly readers from fixing typos, spelling and grammar errors casually in the course of their reading - and having to register an account to fix a typo you've just spotted would surely make it so that a lot of people wouldn't bother. Do we have any stats anywhere for the number of edits made that way? I suspect it's a lot. Instagram, Telegram, Facebook etc are all well and good, but people want the full social service from those and unregistered use of them doesn't make any real sense. Boing! said Zebedee (talk) 15:47, 2 August 2019 (UTC)
One would think that the quantity and quality of IP edits would be something well understood by the WMF, especially before proposing a project like this. But it is not. If you read the research report associated with this project, it states

A 2007 study, conducted by a Wikipedia user, indicated that about 80% of vandalism comes from unregistered users, but that vandalism represented only about 20% of total edits made by unregistered users.

But that 2007 study involved a total of 250 edits (of which 89 were from anonymous users). A tiny study over a decade ago is a very questionable basis for any assumptions. It seems to me that the idea of IP editing being beneficial is an ideological one, not an observed conclusion. Bitter Oil (talk) 16:05, 2 August 2019 (UTC)
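To put a rough number on how little a sample of 89 edits can pin down, here is a small sketch of a 95% Wilson score interval for the share of unregistered edits that are vandalism. The count of 18 vandalism edits is an assumption (about 20% of 89); the study's exact count is not quoted above.

```python
import math


def wilson_interval(successes: int, n: int, z: float = 1.96):
    """Wilson score interval for a binomial proportion (z=1.96 is ~95%)."""
    p = successes / n
    denom = 1 + z * z / n
    centre = (p + z * z / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return centre - half_width, centre + half_width


# Assumed count: roughly 20% of the 89 anonymous edits in the 2007 sample.
low, high = wilson_interval(successes=18, n=89)
print(f"95% interval: {low:.1%} to {high:.1%}")
```

Under that assumption the interval runs from roughly 13% to roughly 30%, so the "about 20%" figure is compatible with quite different pictures of IP editing.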
The point discussed here is that editing is not anonymous for an IP, only for a registered logged-in user. Pharaoh of the Wizards (talk) 20:12, 2 August 2019 (UTC)
I'd like to dispute "People have no problem registering for Instagram, Telegram, Facebook, etc." on the grounds that a) they are social networks and not encyclopedias so not an apples to apples comparison and b) we have no evidence that Instagram, Telegram, Facebook et al would not receive more input if they allowed unregistered users. It's a quantitative judgment that is needed here. Jo-Jo Eumerus (talk, contributions) 20:20, 2 August 2019 (UTC)
I was not suggesting that the sites are the same, just pointing out registering for sites is something that people understand in 2019. This was not the case when Wikipedia began, but much has changed since then. I have no doubt that Instagram, Telegram, Facebook et al would receive more "input" if they allowed unregistered users but it would not be the kind of content that they want. Bitter Oil (talk) 21:38, 2 August 2019 (UTC)
  • You realize that compulsory registration will create exactly the same problems as masked IP editing, don't you? There's no difference in the impact of anti-vandal work, of identifying socks, of not being able to track edits by a specific IP or IP range whether the IP is masked or the potential editor has to create an account. The only difference is in the likelihood that someone will succeed in making the first edit without jumping through the "create an account" hoop. Risker (talk) 01:40, 4 August 2019 (UTC)
    I agree on that: registering an account only masks the IP (which can be done in other ways as well). This limits just the occasional vandalism, for which CUs and other elaborate verification are pointless. It's true that there is a limit on the number of accounts that an IP can make, but I don't think that's sufficient to limit the attacks from LTAs. --Ruthven (msg) 14:03, 6 August 2019 (UTC)
  • Requiring registration would be a negation of the entire point of the project. It would be completely unacceptable. --Yair rand (talk) 02:23, 5 August 2019 (UTC)
  • Each site may separately decide whether to allow edits (or other activity) without registration. It is up to the whole Wikimedia people to decide on specifications of unregistered edits—where they are permitted—because fighting abuse requires coordination. Imagine editors’ IPs were hashed/encrypted on the site X, left open on the site Y, and hidden from the world on the site Z. Confusion and chaos among Wikimedia volunteers would arise from it. Incnis Mrsi (talk) 08:36, 5 August 2019 (UTC)
  • There seem to be at least two theories as to what would happen if we switch off IP editing and require everyone to create an account. The optimists think that none of the good-faith edits will be lost and that many vandals will be deterred by this step. The pessimists fear that some readers will remain readers and not make those first edits as an IP that precede creating an account, and that the vandals and spammers will still be with us, but just a little harder to spot as they will now all create throwaway accounts. I'm very much with the pessimists, but willing to see a proper research project to test this out. But if it is true that IP editing is part of the "secret sauce" of this site and is one of the things that allowed us to see off rivals such as Citizendium, then it would be a mistake to lose it. WereSpielChequers (talk) 04:55, 7 August 2019 (UTC)
  • I remember a study being posted in en:WP:SIGNPOST maybe a year ago that found that a few registered editors were responsible for more than half the total edits, but that their edits were mostly WikiGnoming and the burden of content creation actually fell on IPs and accounts with a couple edits. The IPs' adding a sentence here and there well outweighed all the concerted content creation efforts such as FA/GA. DaßWölf 08:25, 10 August 2019 (UTC)
  • Re: the idea of IP editors willing to jump through more hoops to edit, the general UX wisdom in the e-commerce world is that hoops such as registration before the call to action can easily drive off the vast majority of the potential users. While WP's IP editors might be more "ideologically motivated" so to say, that sentiment can realistically take them only so far. DaßWölf 08:25, 10 August 2019 (UTC)


In my personal experience as a long-time volunteer and bureaucrat on he.ws, lots of our hard-to-detect minor typos and scanning errors are fixed by a random passer-by who came to read a text and encountered the typo in the middle of reading. This user will most definitely not bother to register (or log into their account even if they do have one already) just to fix that typo. The benefits of allowing anonymous editing far outweigh the cost of monitoring them.--Naḥum (talk) 16:31, 21 August 2019 (UTC)

To be honest, I don't like this idea, however, if it ties the hands of two specific administrators on the English Wikipedia (who I am not going to name as a courtesy to them) who think making a significant portion of the world population request an account in order to edit Wikipedia is good for our project, I'm for it. Some admins seem to see blocking as punishment rather than prevention, and they can't accept the fact that some vandalism is going to slip through the cracks and letting some things go is how the project remains open as intended; limiting IP access to CheckUsers (and keeping current CheckUser policies) will ensure that editors are judged by their contributions, not where they are editing from, and that a maximum number of the world population has the ability to contribute. PCHS-NJROTC (talk) 00:18, 9 September 2019 (UTC)

Research on the Value of IP editing

Several people asked me to respond to the sub-proposal about making account creation mandatory (#4 in NKohli (WMF)'s summary & response) in my capacity as an academic researcher who has studied non-registered contributions to wikis. I have worked with a number of collaborators to put together a response at User:Benjamin Mako Hill/Research on the value of IP Editing which I will eventually move to a page in the Research namespace. The very short version is that we think that research suggests mandatory registration will deter a large number of valuable contributions. We think it is a bad idea.

The page is not an argument against the current proposal to obscure IP address information. In fact, at least one of the papers we refer to would provide evidence in support of the idea that the proposed change could encourage valuable contributions from anonymity-seeking users. Our argument is focused on the value of unregistered contributors rather than the importance of identifying these editors by their IP addresses.

The page we've put together includes citations to studies that we believe provide evidence for the claims that (1) many IP-based edits are valuable; (2) blocking IP edits may discourage newcomers from making their first edit; (3) contributing without an account opens a pathway to deeper contributions; (4) past trials of this idea have been damaging; and (5) several other considerations.

I think the strongest evidence is simply that similar experiments have resulted in substantial decreases in valuable contributions. This includes both (a) experiments where users were asked (but not forced) to register before editing Wikipedia and (b) evidence from 130+ Wikia wikis that switched to requiring accounts. Both experiments deterred vandalism and caused enormous collateral damage. Some users will create accounts in order to make valuable contributions. Many others will not.

The page we've put together was written by and has been signed by a number of academic researchers who study Wikipedia contribution dynamics. —mako 00:54, 11 September 2019 (UTC)

I haven't yet read the paper, but this paper would be an argument against IP masking, as Wikipedias would be required to revert sensible statistic edits to avoid having random statistic edits remain in the Wikipedia. There would be even less of a way to determine whether (say) a change of birth date from April 10 to April 12 is a vandal or an intelligent edit than there is now. — Arthur Rubin T C (en: U, T) 00:57, 12 September 2019 (UTC)
Possibly? It seems like it would depend quite a lot on details of how it is done that haven't been figured out yet. In any case, the summary doesn't attempt to speak to the IP masking proposal in any systematic way. Just to the counter-proposal to make accounts mandatory. We think the evidence pretty compellingly suggests that doing that would incur serious costs. —mako 02:55, 12 September 2019 (UTC)

identification from external web sites

Also, did you consider OpenID login? Or login with Facebook, GitHub, Twitter, Mastodon, or another external site?

Best regards, --Gryllida 00:01, 6 August 2019 (UTC)

Now, this is something that may make sense. --Dirk Beetstra T C (en: U, T) 06:12, 6 August 2019 (UTC)
I think that there were concerns about this sort of thing on the following fronts:
  1. Partnership with big corporate tech giants.
  2. Privacy/doxxing concerns about linking Wikipedia identities to real life identities
  3. Privacy concerns about the data the tech giants want back in such arrangements
  4. Our commitment to open source software
WereSpielChequers (talk) 05:03, 7 August 2019 (UTC)
The WMF already has partnerships with "big corporate tech giants". Google and Facebook and Amazon are big donors as well as big consumers of Wikipedia information. OAuth and OpenID are both open standards, by the way. I would be very interested in learning more about "data the tech giants want back in such arrangements". Bitter Oil (talk) 22:24, 8 August 2019 (UTC)


I support this. Autoblocks and IP blocks would still be a thing so "whack-a-mole" is not the case, and would make it a lot easier to identify who made what edits. Or...maybe all IP edits could be pending changes? Computer Fizz (talk) 18:25, 13 August 2019 (UTC)
It could be an option, but it should not be required. Facebook, Twitter, etc are not accessible everywhere Wikipedia is. PCHS-NJROTC (talk) 18:34, 8 September 2019 (UTC)

IP editing is the opposite of anonymous

This has been well-known for a long time. When you edit under an account, you're as anonymous as you want to be, at least to the public face of Wikipedia (users with Checkuser have more visibility, but not a lot). When you edit logged out, your IP address is revealed every time you make an edit. If it's the goal of the WMF to better secure anonymous editors' IP addresses (which they should, we're years behind the times here) disabling IP editing is the simplest approach.

But we want people to be able to edit anonymously, without jumping through the hurdle of creating an account, right? It's one of our fundamental principles that anyone can edit. Well, that's easy too. The social network Whisper is designed to be fully anonymous, and they do something like this: when you install the app and make a first post, an account is automatically generated with a human-readable name (I'm pretty sure they just pick two random dictionary words and mash them together) and that account is linked to your device. No login, no choosing a name or generating a password, nothing to remember at all. Wikipedia should be easily able to do the same: when someone wants to edit but doesn't have an account, we just randomly make one up and assign it to them for that session. If they edit a bit and decide they want to keep that account (to use later, for posterity, whatever) we give them the option of actually creating a proper account under that name, or rename it to a different permanent name if they want (and the name is available). If not, when their session expires the account becomes inaccessible because there's no way to log in to it, pretty much the same as when a dynamic IP is reassigned. We of course log the IP address and make it available to checkusers for abuse management, the same way we log all account contributions now, but that log is effectively hidden to the vast majority of users and access is pretty tightly controlled.

Yes, it makes some forms of abuse detection marginally more difficult, but that is going to be a thing with IP addresses hidden no matter what way we do it. The "temporary" account's IP is still available to the software so we can still do things like autoblocking an account's underlying IP (so new "temporary" accounts can't be created for the duration of the block) and checkusers could still disable account creation on problematic IPs and ranges. Yeah it's some technical work but that seems to be why we're here. Ivanvector (talk) 16:13, 22 August 2019 (UTC)
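The session-bound "temporary" account idea described above can be sketched roughly like this. It is only an illustration under assumptions: the class and method names, the tiny word list, the numeric suffix, and the one-hour session lifetime are all hypothetical, and nothing here reflects how Whisper or MediaWiki actually work.

```python
import secrets
import time
from dataclasses import dataclass

# Assumption: a tiny stand-in for a real dictionary word list.
WORDS = ["amber", "birch", "cobalt", "dune", "ember", "fjord", "garnet", "heron"]


@dataclass
class TempAccount:
    name: str          # public, human-readable name shown in page history
    ip: str            # logged privately, as for registered accounts
    expires_at: float  # after this the session can no longer use the account


class TempAccountRegistry:
    """Hypothetical sketch of auto-generated accounts tied to a session."""

    def __init__(self, session_lifetime: float = 3600.0):
        self.session_lifetime = session_lifetime
        self._by_token = {}        # session token -> TempAccount
        self._names = set()        # names already handed out
        self._blocked_ips = set()  # IPs where account creation is disabled

    def block_ip(self, ip: str) -> None:
        self._blocked_ips.add(ip)

    def account_for(self, session_token: str, ip: str, now: float = None) -> TempAccount:
        now = time.time() if now is None else now
        account = self._by_token.get(session_token)
        if account is not None and account.expires_at > now:
            return account  # same session keeps the same name
        if ip in self._blocked_ips:
            # Mirrors "new temporary accounts can't be created" during a block.
            raise PermissionError("account creation is disabled for this IP")
        account = TempAccount(self._new_name(), ip, now + self.session_lifetime)
        self._by_token[session_token] = account
        return account

    def _new_name(self) -> str:
        # Two random words plus a number, retried until the name is unused.
        while True:
            name = (
                secrets.choice(WORDS).capitalize()
                + secrets.choice(WORDS).capitalize()
                + str(secrets.randbelow(10000))
            )
            if name not in self._names:
                self._names.add(name)
                return name
```

In this sketch the IP never appears in the public name; it lives only in the private record, which is the part that would be restricted to checkusers in the proposal.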

This doesn't solve the problem that of 934 projects, only 36 have checkusers. GMGtalk 17:31, 22 August 2019 (UTC)
And that even on en-wiki, only a couple of dozen are Checkusers, and half are Arbs who rarely exercise it day to day. And they're busy with just the current style of sock investigations atm. Nosebagbear (talk) 18:09, 22 August 2019 (UTC)
"Yes, it makes some forms of abuse detection marginally more difficult, but that is going to be a thing with IP addresses hidden no matter what way we do it." Or, speaking as a former enwiki checkuser, you're making that out to be a much bigger problem than it actually is. Ivanvector (talk) 13:39, 23 August 2019 (UTC)
Removing wholesale the ability for 96% of our projects to issue range blocks is hardly a trivial matter. GMGtalk 13:53, 23 August 2019 (UTC)

Administrator efficacy

I'm a cynic, but I'll try and approach this with an open mind. I have a few comments which I'll put in sections below. For those who haven't read the IP Masking Impact Report overleaf, whether you think it's a good idea or not, the report is worth a read. My general comment on the subject relates to this quote: "we should expect that masking will significantly disrupt workflows and general administrator efficacy.", and the comment on the project page which says, "a critical part of this work will be to ensure that our wikis still have access to the same (or better) level of anti-vandalism". If you know anything about enwiki, you'll know that we probably don't have the bandwidth for the extra abuse we'd be dealing with if we couldn't identify and target IP blocks. If anything we should be taking more action directed at ISPs instead of hiding them. I therefore wish this project luck, but hope that "no substantial change" will be seriously considered as a potential outcome for this project. zzuuzz (talk) 18:47, 1 August 2019 (UTC)

As a former member of Wikipedia:Abuse response, I can appreciate Zz's position on this. Some ISPs do act on abuse reports; in fact, many of them have an entire team of people whose only job is to work on abuse reports. Comcast, CenturyLink, and Time Warner Cable are examples of this, and all three have worked with me on abuse issues relating to Wikipedia and Conservapedia. I once proposed that we create a tool to automate reporting abuse to ISPs, kind of like Twinkle, but that was shot down because people thought sending abuse reports was pointless. A lot of this is based on people not understanding how IPs work. People would submit reports on school IPs to WP:AR expecting these IPs to magically never vandalize again because of an abuse report, but the reality is kids are going to be kids, immature adults are going to be immature adults, and even if a school or employer takes action, it's only going to be against the ones that they have on record being silly. The time for abuse reports isn't when there are 100 instances of random vandalism from an IP used by 100,000 K-12 students; the time for abuse reports is when there is a pattern of abuse. For example, I've contacted Sarasota County Schools three times about patterns I observed, and in each instance, the patterns ceased but people still occasionally write "hi" or "poop" on an article. I monitor the /24 range from that district due to them attempting to work with us in '05 and due to it being local to me in real life. I don't think it's worthwhile to write to them every time a kid writes something like "megan is the awesomeist cheerleader" on a page as the schools and ISPs will start ignoring us; however, when I see one hundred edits from a school referencing Megan over the course of three months, Megan or her admirer needs to lose her internet privileges at school. PCHS-NJROTC (talk) 00:51, 9 September 2019 (UTC)

Strong oppose

This is a terrible idea for many reasons, and is the kind of nonsense one would expect from WMF staff that are totally out of touch with regards to how these projects are run. In addition to the reasons pointed out above:

  1. IP addresses being public has provided public scrutiny and deterrence of COI editing and other abuse, see e.g. w:United States Congressional staff edits to Wikipedia.
  2. The ability to WHOIS IPs and be able to tell whether they are a school, a webhost, a dynamic residential IP and so forth is essential information in the fight against abuse.
  3. Being able to get a new identifier by clearing cookies is so utterly trivial to any experienced vandal. In fact it is a clear regression over the current system where one has to both clear cookies and get a new IP address (range).
  4. Range blocks become impossible.

This project should be immediately confined to the trash can and never brought up again. We need to make anti-abuse editing easier, not harder. MER-C (talk) 18:48, 1 August 2019 (UTC)

Hi MER-C. Please avoid calling this nonsense or saying that the WMF are out of touch. We have behavioural expectations on Meta, and I would appreciate it if you would make an effort to frame your comments in a more positive and less attacking way. Your feedback on the specifics is appreciated, but can be made without the theatrics. Thanks, – Ajraddatz (talk) 18:57, 1 August 2019 (UTC)
"Please avoid calling this nonsense or saying that the WMF are out of touch" -- seriously? I am inclined to point at the en-wiki article about snowflakes but it might breach urbanity (which seems like the perpetuation of a colonial stereotype against rural folks), in light of its political co-relations. Also, in my part of the world accusing someone of indulging in theatrics is a clear-cut breach of social etiquette. Winged Blades of Godric (talk) 19:09, 1 August 2019 (UTC)
I understand that it is the wiki way to use strong and aggressive language to get your point across. I'm saying that it is unnecessary and unwelcome here. Please keep comments civil and respectful; your feedback will be considered without an abusive tone (what I meant by theatrics, apologies if you are actually concerned with my word choice). – Ajraddatz (talk) 19:24, 1 August 2019 (UTC)
These are unhelpful and poorly-phrased comments, which explicitly intend to have a chilling effect, and have partly succeeded. Johnbod (talk) 02:59, 2 August 2019 (UTC)
The only intent is for people to contribute in a productive and civil manner, and that same standard applies to you. – Ajraddatz (talk) 03:19, 2 August 2019 (UTC)
There is a fine line between encouraging civility and suppressing dissent. Caution is in order from both sides. Cullen328 (talk) 23:45, 2 August 2019 (UTC)
@Ajraddatz: It boggles my mind how you do not see the chilling effect of your comment (particularly when coming from a "power-user" such as yourself) - it is pure suppression of dissent! Calling this proposal "nonsense" or the WMF "out of touch" is valid criticism, especially in context. Notrium (talk) 12:34, 19 September 2019 (UTC)
There is an expectation of respectful participation on Meta. As I've said elsewhere, criticism is very welcome, but this discussion should not devolve into name calling and rants. – Ajraddatz (talk) 12:38, 19 September 2019 (UTC)
  • Agree with MER-C. They ought to spend more effort on developing more anti-abuse features (as MER-C and Dirk Beetstra have been pleading for years) rather than these. Winged Blades of Godric (talk) 19:09, 1 August 2019 (UTC)
  • I totally agree with MER-C. This is nonsense. I come from a small wiki whereby admin credibility is questionable. Restricting access to IP info to this small bunch of admins is much worse than the status quo. On one hand, there is no guarantee this small group of admins would fulfil anti-vandalism tasks. If only they have access, it's an unnecessary hurdle for non-admins to help counter vandalism. On the other hand, when the small group goes rogue and abuses the private info, it's too hard to hold them accountable. Out of 300 Wikipedias I'd say ~250 are small. Most non-Wikipedia wikis are small too. --Roy17 (talk) 19:10, 1 August 2019 (UTC)
    • Replies to MER-C's comments:
      1. This project won't necessarily hide that information. IF we end up hiding IP addresses from administrators (which is a big "if"), we could probably still provide the ISP of the user based on their IP address.
      2. See answer to #1 above.
      3. As the FAQ says, the generated identifier may be associated "with a cookie, the user’s IP address, or both". If associating it with only a cookie is too easy to circumvent, we won't do it that way. In fact this could be an opportunity to make anonymous editors easier to track by making their identity more sticky than an IP address (which isn't very sticky these days due to mobile editing, VPNs, etc.). We just have to be careful about collateral damage on shared computers.
      4. As the FAQ says, "We hope to restrict IP address exposure to only those users who need to see it." If administrators need IP addresses in order to do range blocks, and there is no better way to do it (like automatically creating masked range blocks based on a set of users from the same ISP), then we won't be hiding IP addresses from administrators. We have no intentions of destroying the community's ability to fight vandalism.
    • Ryan Kaldari (WMF) (talk) 19:46, 1 August 2019 (UTC)
This does not address my first and foremost concern. The proposal harms the scrutiny of inappropriate edits by the public, and that scrutiny deters abuse. The Foundation knows, or should have known, about the Congressional IP editing scandals. Public interest journalism is not harassment.
> we could probably still provide the ISP of the user based on their IP address
Not sufficient. Geolocation is also necessary. Think Comcast or any large ISP. At this stage, you might as well publish the whole IP address. As for making identities more sticky - I'll believe it when I see it. The Foundation haven't even bothered to make available to Checkusers the entirety of the HTTP request headers, which are sent automatically on every single page load. (In)actions speak louder than thought bubbles. MER-C (talk) 20:37, 1 August 2019 (UTC)
MER-C Those are good points we will need to take into consideration. Editing from Congressional IP addresses is mentioned in our research report, but it only discusses how administrators interact with that information, not the public. Are there any ways that we could retain some sort of public scrutiny (via something like ISP + state-level Geolocation) without revealing IP addresses to everyone (or exactly where a person is editing from)? In other words, do you think it is possible to improve anonymous editor privacy without losing public scrutiny or are they mutually exclusive in your opinion? Ryan Kaldari (WMF) (talk) 21:03, 1 August 2019 (UTC)
Mutually exclusive. State level is also not good enough - think California and its three large cities, or Guangdong. Also, good luck in getting the community to accept zombie cookies. MER-C (talk) 21:08, 1 August 2019 (UTC)
Also, if you disclose the ISP and geolocation, the whole change becomes unfit for your intended purpose - protecting from the danger of people being at risk of government persecution. The ISPs in third world dictatorships are always controlled by said dictatorships and keep extensive logs - and therefore the proposal is pointless. You cannot say we are going to publish all US IPs but hide all Russian ones because the public has a genuine interest in knowing whether our projects are being (clumsily) manipulated by state-sponsored disinformation troll farms. MER-C (talk) 11:10, 2 August 2019 (UTC)
Indeed: the information available for enforcement would have to at least be the same as what is already known about an IP address (including block/network relationship for range blocks and identifying problematic networks). Moreover, if addresses, or any of that information was restricted to administrators, this would reduce the efficiency of patrollers (most are not administrators, many do not intend to be, but they resort to them when necessary, in which case technical knowledge helps to produce better reports for faster processing, mark schools as such for future patrollers, etc)... PaleoNeonate (talk) 03:57, 27 August 2019 (UTC)
  1. There are IPs and IP ranges associated with long term abuse, sometimes sporadic, and therefore IP information must be retained indefinitely. w:Special:Contributions/195.171.136.42, w:Special:Contributions/122.166.47.217 come to memory. There are likely hundreds more. MER-C (talk) 20:51, 1 August 2019 (UTC)
    The WMF Security team also needs long-term information about certain abusive editors/hackers, so I imagine we will need to solve that one way or another, perhaps by flagging accounts for long-term tracking or something similar. Let me know if you have any ideas on that. Ryan Kaldari (WMF) (talk) 21:12, 1 August 2019 (UTC)
    VxFC was specifically relegated to WMF. What did you do? Winged Blades of Godric (talk) 05:48, 2 August 2019 (UTC)
    How am I supposed to tell whether there is an abuse pattern lasting years based on only 90 days of data? Look at the 195.171.136.42 link above: four spam links, then nine months of inactivity, then three spam links, then another six months of inactivity. If the person on the other end of this IP wanted any semblance of cybersecurity, they would have updated their browser in the meantime = different user agent = different hashed identity. As for the solution, it is obvious - keep the current system. MER-C (talk) 09:13, 2 August 2019 (UTC)
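    The "different user agent = different hashed identity" point can be illustrated with a minimal sketch. It assumes, purely for illustration, that the generated identifier is a truncated SHA-256 hash of the IP address and the user-agent string; the proposal does not specify any such scheme, and the function name and user-agent strings below are made up.

```python
import hashlib


def masked_identifier(ip: str, user_agent: str, length: int = 12) -> str:
    """Hypothetical identifier: truncated SHA-256 of IP address plus user agent.

    This is an assumed scheme for illustration only, not the WMF design.
    """
    digest = hashlib.sha256(f"{ip}|{user_agent}".encode("utf-8")).hexdigest()
    return digest[:length]


ip = "195.171.136.42"  # the address cited in the comment above
old_id = masked_identifier(ip, "ExampleBrowser/68.0")  # made-up user agent
new_id = masked_identifier(ip, "ExampleBrowser/69.0")  # same IP, updated browser

# Same IP address, but the identifier no longer matches after the update.
print(old_id, new_id, old_id == new_id)
```

    Under that assumption, a routine browser update is enough to break the link between the earlier and later edits, even though the IP address never changed.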
  • Doesn't sound like a good idea. If people are concerned about their privacy, they should become registered. I don't know if we warn ip editors their address will be visible, but if not we might consider that. Johnbod (talk) 02:59, 2 August 2019 (UTC)
  • Attempting to edit while logged out is always met with "Warning: You are not logged in. Your IP address will be publicly visible if you make any edits. If you log in or create an account, your edits will be attributed to your username, along with other benefits." Though it seems like a lot of people just ignore this warning, based on how many emails oversight gets along the lines of, "I had no idea my IP address would be visible if I edited." If there is a concern about people doing this unknowingly, then just make the warning huge and intrusive. Someguy1221 (talk) 04:58, 2 August 2019 (UTC)
I endorse the idea of a click-through warning. It is both the easiest and the best solution to this "problem", if it really is a problem. Unregistered users can either affirmatively consent to public logging of their IP addresses, or register an account. Let's not pursue complex, expensive and divisive solutions when a very simple solution is readily available. Cullen328 (talk) 23:40, 2 August 2019 (UTC)
@Someguy1221: Attempting to edit while logged out is always met with a warning, BUT only when trying to edit enwiki. It is easy to inadvertently edit other wikis without realizing you are not logged in. Ottawahitech (talk) 02:16, 16 August 2019 (UTC)
@Ottawahitech: That does not seem to be accurate. The message "anoneditwarning" displayed above the edit field is a standard part of MediaWiki, not an enwiki customization. It's possible that some wikis have disabled the message entirely, but that's the non-standard customization. Anomie (talk) 13:42, 16 August 2019 (UTC)
@Anomie: Yes, you are correct. I don't know why I did not see it before. Sorry for the wild-goose chase. Ottawahitech (talk) 14:36, 17 August 2019 (UTC)
Just to add, though. Sometimes when I am logged in, something happens and I get logged-out without noticing. I don't know if it is a glitch that no longer happens, but I remember distinctly using HOTCAT (this happened years ago), and being logged out in the middle of an update. When I checked the history it clearly showed a HOTCAT update made by an IP, something that should never happen, I think? Ottawahitech (talk) 14:44, 17 August 2019 (UTC)
@Ottawahitech: It is possible that HotCat is or was coded in such a way that it would still make an edit if you became logged out. If it's still the case, that should be fixed in HotCat, for example by checking the user when (re-)fetching an edit token. Anomie (talk) 15:03, 17 August 2019 (UTC)
  • I concur with User:MER-C and everyone above. As a non-admin (and as someone who doesn’t want to be an admin) that assists in investigating sockpuppets and long-term abuse when needed, losing access to IP data (and consequently, WHOIS and geolocation information) would absolutely cripple that work in investigating abuse, not to mention the invaluable work that the Open Proxies wikiproject does. Even in cases of simple vandalism, there have been times where I’ve reached out to the abuse contact listed on an IP’s WHOIS (mostly schools) to let network administrators know about the vandalism issue, which is invaluable for anti-vandalism efforts and stopping long-term abuse. I’m also concerned at the fact that non-administrators weren’t seemingly taken into account with the report, and I would vehemently urge the team involved to keep us plebs in mind going forward with any decisions made. OhKayeSierra (talk) 07:55, 2 August 2019 (UTC)
  1. > we should expect a transitional period marked by reduced administrator efficacy
Translation: we are deliberately compromising the integrity of our projects and deliberately increasing administrator burnout. Let's say the WMF actually gets the implementation order correct and deploys the anti-abuse enhancements first. That leaves a change that in isolation should not be deployed ... in isolation and therefore should not be deployed.
  1. > We intend to implement some method to make the generated usernames at least partially persistent, for example, by associating them with a cookie, the user’s IP address, or both.
Say hi to zombie cookies and invasive web analytics, because that is the only way this will work in any acceptable manner. MER-C (talk) 09:28, 2 August 2019 (UTC)
I look forward to WMF enabling zombie cookies! ;-) Winged Blades of Godric (talk) 12:11, 2 August 2019 (UTC)
  • @MER-C: where can I voice my opposition to this wrecking privacy-above-all initiative? Many Wikimedia sites, such as en.Wikipedia and Commons, are already cluttered with dumb sock puppeteers; CheckUsers’ backlog is clogged due to scores of puppets produced by various morons handy enough to register multiple accounts. If, in addition, every random person had his/her IP range concealed, then the ruin would come to anti-vandal, anti-copyvio, and anti-pushing work. Do any means exist yet to counteract this vandal-friendly infiltration into the Wikimedia government? Incnis Mrsi (talk) 11:55, 2 August 2019 (UTC)
    • You just did so. But I suspect that the outcome of this "consultation" is predetermined, and to stop it will require escalation to Board/Jimbo level (again). MER-C (talk) 12:15, 2 August 2019 (UTC)
      I am committed to oppose and expect to see instructions by MER-C. Thanks a lot for early comments, by the way. Incnis Mrsi (talk) 12:37, 2 August 2019 (UTC)
    To prove this isn’t a rant by yet another incompetent wiki user: I caught seven socks of the LTA Pakistanpedia, with my naked hands and IP-range tips. Two (previously unnoticed) copyvio files were deleted from Commons consequently. And it’s certainly not my unique achievement. Incnis Mrsi (talk) 14:21, 2 August 2019 (UTC)
  • Hey WMF, we're not close at all to getting rid of abuse, so you don't need to help abusers *so much*, leaving us fighting bots by hand is already enough! --Vituzzu (talk) 14:23, 2 August 2019 (UTC)
  • I have to agree that this would be a huge detriment to attempts to deal with abuse. The only exception would be if the IP address information was still available to experienced editors, which would mean the whole thing would become a bit pointless. I have generally supported the right of unregistered editors to edit but I think if making IP addresses accessible really is unacceptable then a better solution would be to require registration. Hut 8.5 19:04, 2 August 2019 (UTC)
  • If you want to hide IPs from edit histories because of privacy concerns then make it a requirement to register an account. I don't even want to think about how easy it would be for vandals to vandalize if this is going to happen. WMF should try alone to fight against vandals for a few weeks and then make that kind of proposal again. Stryn (talk) 04:48, 3 August 2019 (UTC)
  • Very Strongly Oppose – For many, many reasons, including those listed at the top of this section and others brought up by other users above. As an experienced anti-vandalism user, I can personally attest that these changes will make fighting vandalism/abuse much more difficult, especially when dealing with experienced LTAs/trolls. Anyone determined enough could easily find loopholes - in fact, vandals and LTAs will have many more options to vandalize and harass others while the anti-vandalism work of most other non-admin users would be effectively crippled. Any small benefits from these changes would be vastly outweighed by the downsides to this ill-conceived plan. If you want to be "anonymous" (not reveal your public info), you should sign up for an account, simple as that. The WMF has already screwed up badly on the execution of Fram's "partial Foundation Ban" (regardless of the merits); we do not need to see another WMF screw-up after the last public relations disaster. On the same topic, I also oppose making registration mandatory. We already have plenty of measures in place for dealing with IP/new account vandalism, and if someone wants to abuse one of the projects using their IPs, then they can suffer the consequences. Per @Incnis Mrsi:, I would also like to hear the opinions of other experienced anti-vandalism users and admins. Also, I think that it would be very beneficial if we keep all of our discussions in one thread. LightandDark2000 🌀 (talk) 18:15, 3 August 2019 (UTC)
  • What is not clear from this research is why it is necessary to do this in the first place. It couldn't be easier to create an account. Those that have a privacy or security issue are surely doing this already. So what is the great evil that is coming about from exposing IP addresses? What is badly needed much more than hiding IPs is some defence against problem editors who avoid scrutiny with dynamic IPs on mobile devices. We need a statement of the problem that is being fixed before discussing the fixes. SpinningSpark 21:54, 4 August 2019 (UTC)
    My understanding, from the first sentence of the page in question, is that the world is becoming more conscious. </s> Killiondude (talk) 05:06, 5 August 2019 (UTC)
  • Agree with MER-C. Benjamin (talk) 05:03, 5 August 2019 (UTC)
  • Yes, my first instinct was to think of how en:Church of Scientology editing on Wikipedia and en:United States Congressional staff edits to Wikipedia would have resolved (or not resolved) very differently had the general proposed changes here been in effect at those moments in time. I don't know how we can resolve what the WMF wants and how the WMF wikis are actually fighting vandalism and COI edits and so on. Killiondude (talk) 05:06, 5 August 2019 (UTC)
  • I also oppose in agreement with MER-C. Keeping track of IPs helps us spot COI and subtle vandalism, and gives us more tools for dealing with it. Obscuring it would be a boon to vandals and abusers. For the sake of irony, I submit "my" contributions on EN-WP as evidence. 168.244.4.53 02:59, 7 August 2019 (UTC)
  • Agree with MER-C. If someone is worried about their privacy, they should create an account. It is really that simple! There is no problem to solve here. jni (talk) 05:20, 7 August 2019 (UTC)
  • Actually, the w:CongressEdits Twitter bot was no deterrent, but rather was weaponized to broadcast personal information about senators, resulting in some jail time for a staffer, and it was shut down at Twitter. Apparently your deterrence theory will need another example. Slowking4 (talk) 01:45, 9 August 2019 (UTC)
  • Strong oppose. This is absolute nonsense; you should not waste any resources on this. Don't fix it if there is nothing to fix. --2001:999:82:9855:DC90:883A:206A:2 14:47, 21 August 2019 (UTC)
  • Just for the record, yes, I would overall prefer the status quo and that the Foundation simply do nothing in this regard. If users are concerned about their privacy, then there is an easy currently-existing solution: register an account. For that matter, if they wish, they could register a new account consisting of entirely gibberish for each editing session, and then throw it away when they're done. There's nothing AFAIK in any local or global policies preventing them from doing so, so long as they're making productive contributions. The process takes a few seconds and doesn't even require an email address. I don't see that as an onerous burden for those concerned about privacy.
All in all, this seems like a lot of discussion about how to solve all these various problems when the easiest solution seems to be not creating all these problems that need solving in the first place. GMGtalk 15:19, 21 August 2019 (UTC)
  • Please, do not hide IPs from registered users, because hiding them will make the fight against vandalism much harder. Раммон (talk) 08:40, 22 August 2019 (UTC)
  • Agree with MER-C. The current system is already good, why should we change it? Veracious (talk) 16:10, 23 August 2019 (UTC)
  • Oppose per all good points made by MER-C and Cullen328. A public IP is a handy and speedy tool for non-CU regulars in terms of sock investigations, COI, trolling, threats, harassment, etc., and we already encourage those who don't want a publicly visible IP to create an account. This is a solution in search of a problem, where the modern privacy trend is stretched beyond necessity. Brandmeister (talk) 20:40, 25 August 2019 (UTC)
  • Strong oppose per MER-C. Millennium bug (talk) 03:03, 27 August 2019 (UTC)
  • Strongly discourage per my above comment in this same thread. PaleoNeonate (talk) 04:04, 27 August 2019 (UTC)
  • Strong oppose per MER-C, Vituzzu, Stryn and also in view of the comments from Rschen7754, billinghurst and Dirk Beetstra in #Petition to WMF. --Udo T. (talk) 11:17, 27 August 2019 (UTC)
  • No bueno, bad idea -Indy beetle (talk) 03:45, 28 August 2019 (UTC)
  • Seems a pharisian idea from a deeply buried thinker who has no idea of real-life process of the simple contributor or administrator. In the age of VPN (living inside the Great Firewall, I use it every day), the clever vandal has everything to continue. Concealing the IP will remove information needed to deal with the naïve vandal (or erratic beginner). --Wuyouyuan (talk) 07:47, 29 August 2019 (UTC)

My views

Placeholder. I have to digest this and rationally give opinions. I will urge WMF to engage local communities; I didn't know about this until I saw someone editing this when doing RC patrol. I didn't even see it in my homewiki village pump. I hope a mass message can be sent to projects for such changes if they are to be implemented, for early community comments. Thanks and appreciate it. --Cohaf (talk) 19:21, 1 August 2019 (UTC)

Hi Cohaf, this page is less than 24 hours old, when it was started as a draft – we're in the process of spreading the word and we'll be talking about this for quite some time. /Johan (WMF) (talk) 19:28, 1 August 2019 (UTC)
This crazy idea has long been discussed on phabricator. Several concerns were raised; none of them was addressed. --Vituzzu (talk) 14:27, 2 August 2019 (UTC)
Wow, I hope all or most of the concerns on phab will be taken into consideration, thanks Vituzzu for the heads up that this isn't a draft less than 24 hours old and was rather discussed elsewhere in many other tasks. I will give my comments in another section. Regards, --Cohaf (talk) 12:00, 3 August 2019 (UTC)
phab:T20981 and phab:T133452 mainly, but also mw:Requests for comment/Exposure of user IP addresses and this. This proposal comes out of the blue in relation to the amount of concerns expressed over more than ten years. --Vituzzu (talk) 19:59, 3 August 2019 (UTC)
Cohaf, yes, of course, the discussion is much older – this has been a perennial debate as long as I've been around the Wikimedia movement. We link to some of the previous discussions on the project page. This particular page, however, inviting comments, is new. Danny has written an account of the project so far in a comment further down. /Johan (WMF) (talk) 11:47, 5 August 2019 (UTC)
This is going out in Tech News later today. /Johan (WMF) (talk) 11:49, 5 August 2019 (UTC)
@Cohaf: Thanks for posting a notice of this discussion on the Simple English Wikipedia. And to WMF staff: if you are truly interested in getting feedback from ALL contributors you have to spend some effort figuring out how to introduce such discussions to the masses, not just insiders who spend enormous amounts of time finding out where the action is. I am not even sure you have an interest in my views on this topic. BTW this is not my first attempt to participate in a Metawiki discussion. Ottawahitech (talk) 21:17, 15 August 2019 (UTC)

Full support

We already see a lot of dynamic IPs, meaning it is hard for our admins to keep track of vandals' edits. I perceive we are already losing our grip today on IP edits in general. If we instead go according to this proposal and allocate an automatically generated username to these edits, it gives us new opportunities to track vandals and bad edits. We can then evolve the "automatic username generator" to be intelligent (AI) and recognize a user using different IP addresses. Yger (talk) 19:31, 1 August 2019 (UTC)

I agree as far as that the “IP masking project” should only be realized if it will make acting against abusers and vandals easier and more effective than it is today! For every user! --Gretarsson (talk) 12:07, 22 August 2019 (UTC)

Who are "we"?

"We believe..." In whose voice is this written? Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 09:37, 2 August 2019 (UTC)

The team at the Wikimedia Foundation working on the project. /Johan (WMF) (talk) 10:10, 2 August 2019 (UTC)
I counted something like five different "we" in the initial paragraph, so I've clarified it. It would be useful to know whether each time "Wikimedia Foundation" is the subject that's actually a decision of the entity itself (a board-sanctioned goal, maybe in the annual goal?) or something else. Nemo 13:59, 2 August 2019 (UTC)
DannyH (WMF) wrote a bit about this in a comment here: Talk:IP Editing: Privacy Enhancement and Abuse Mitigation#"IP address masking product development meetings last year"?. /Johan (WMF) (talk) 11:52, 5 August 2019 (UTC)
Thank you, that was useful. Nemo 16:02, 6 August 2019 (UTC)

Open proxies

I'm still a cynic, but in the spirit of being open minded I offer this niche topic in case it's informative. I hope we can all agree that it's important to identify if an IP is an open proxy. We actually use a wider definition in our policies, which is "open or anonymising proxy". I'd like to just discuss a few issues involved with this. Obviously most of the time we need the IP address to identify an open proxy. One might think that the WMF could implement some sort of check, similar to various other blacklists, which says whether an IP is an open proxy. I've always said, and continue to say, that these lists (though sometimes useful) are often unreliable and there is no substitute for a human when doing this task.

  • Access to the IP gives us access to open ports, which is a common indicator for an open proxy. For certain types of proxy it is the only way to verify the proxy and which IP should be blocked.
  • Access to the IP gives us access to other IPs in the range. This is most useful for identifying web hosts. I've actually been dealing with a few abusive NordVPN proxies recently. These are often identified by following a simple formula: Put the IP address in your address bar, subtract 1 from the final digit, and hit enter. This is generally the only way to distinguish NordVPN, and something a non-human is very unlikely to do.
  • Access to the IP allows us to search Google for the IP address - this is probably one of the best indicators we have, especially when it comes to transient proxies such as zombies and dynamic VPNs.
  • Access to the IP gives us access to various blacklists. Blacklists will often disagree with each other, and many will be unreliable, but it is all useful information to factor in.
  • Access to the IP tells us the geolocation. If the same user is editing from two different countries, then one of them is probably open (or anonymising).
  • Access to the IP tells us (through WHOIS) about the owner of the IP - whether it's a hosting company, or a cloud company, or an anonymising proxy service (as well as being dynamic, residential, etc).
  • Access to the IP gives us rDNS information, which might eventually resolve to unblock-wikipedia.com (or similar).
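Two of the checks listed above can be sketched with the Python standard library alone, without any network access. This is only an illustration: it reads "subtract 1 from the final digit" as "take the address one below", and it uses an address from the reserved documentation range 192.0.2.0/24 instead of a real proxy. The function name is hypothetical.

```python
import ipaddress
from typing import Tuple


def neighbour_and_rdns_name(ip: str) -> Tuple[str, str]:
    """Return the address one below `ip` and the rDNS (PTR) name to query.

    Assumption: "subtract 1 from the final digit" means plain address
    arithmetic, so x.y.z.0 would borrow from the third octet.
    """
    addr = ipaddress.ip_address(ip)
    neighbour = addr - 1          # address arithmetic on the whole IP
    ptr_name = addr.reverse_pointer  # name a reverse-DNS lookup would use
    return str(neighbour), ptr_name


neighbour, ptr_name = neighbour_and_rdns_name("192.0.2.10")
print(neighbour)  # the address a human checker would paste into the address bar
print(ptr_name)   # the PTR record name for the original address
```

The sketch only computes which address and which record to look at. Deciding whether the result actually indicates a proxy remains the human judgement described in the list.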

One could argue that checking proxies is a very technical subject which most admins won't do, but it is an important thing that some users do do. If we are hiding IP addresses from regular users, then we will either lose the ability to check for open proxies, or have to rely on checks being done by software, which are going to be sub-optimal. zzuuzz (talk) 12:43, 2 August 2019 (UTC)

This is a serious concern that would need to be addressed before rolling this out. There is a fair amount that could be done via automated tools that is not currently being done (for instance, WHOIS searches, see: local and global), but as Zzuuzz mentions above, a lot of open proxy detection involves work that really should be performed by a human. SQLQuery me! 01:54, 5 August 2019 (UTC)
@Zzuuzz: Procseebot does blocks of IP addresses from open proxies. In University of Virginia/Automatic Detection of Online Abuse our university team reported that Procseebot does most of the blocks on English Wikipedia. Everything you are saying is correct but I am not clear that you have seen what a huge influence and culture change Procseebot is bringing. You understand the issues in this space and I hope you are part of any future discussion. Blue Rasberry (talk) 21:11, 9 August 2019 (UTC)
@Bluerasberry: ProcseeBot is indeed a blocking legend, and I've no doubt it's made a significant impact (and - WMF - I'm not sure what will happen if its semi-inactive owner retires). However, don't overestimate what it does. ProcseeBot has blocked around 3 million IPs in all of history. In one day I once blocked a /12 range, which is over 1 million IP addresses. I've blocked many other ranges containing hundreds of thousands of addresses each, and there are other admins who have blocked lots more similar ranges. See what User:SQL says about this below. For example we are currently blocking almost all of Amazon, Microsoft, and Google's cloud offerings, among many others. I don't know how many millions that is, but it's a lot. ProcseeBot doesn't block ranges. It doesn't block a lot of open proxies but only checks one particular subset of open proxies - HTTP proxies, and it doesn't even block all of them. Nor does it include VPNs, CGI proxies, Web hosts, VPSs, compute hosts, or any other proxies. -- zzuuzz (talk) 00:07, 10 August 2019 (UTC)
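The arithmetic behind the figures in the comment above can be checked with a short sketch. The 10.0.0.0/12 prefix is an arbitrary example chosen here (the comment does not say which /12 was blocked), and the 3 million figure is the approximate one quoted above.

```python
import ipaddress


def range_size(cidr: str) -> int:
    """Number of addresses covered by a CIDR range."""
    return ipaddress.ip_network(cidr, strict=False).num_addresses


slash12 = range_size("10.0.0.0/12")   # example prefix, not the real block
procseebot_total = 3_000_000          # approximate figure quoted in the comment

print(slash12)                     # an IPv4 /12 covers 2**(32 - 12) addresses
print(procseebot_total / slash12)  # how many /12 blocks that total equals
```

An IPv4 /12 leaves 20 host bits, so it covers 2^20 = 1,048,576 addresses, which matches the "over 1 million" in the comment. On these numbers, roughly three such range blocks cover as many addresses as the bot's quoted all-time total.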
To illustrate Zzuuzz's point, here's a list of all active ipblocks on enwiki, broken out by admin, with all ranges totalled up. SQLQuery me! 14:28, 11 August 2019 (UTC)

Oppose

There's far too many sections on this page, which is why I'm creating yet another. Hopefully, this plan will never be implemented. If it is, hopefully I will no longer be editing en.wiki. As a CU, all I see is (a) complexity and (b) making my job much harder than it already is. As others have also said, I don't see the need for it. I don't see how the present system punishes IPs except when they merit punishment. Finally, although I know it's a complete pipe dream, I too believe in forced registration. And it doesn't change the rather tired and inaccurate mantra "the encyclopedia that anyone can edit". Anyone still can edit (except in all the places where they can't) with the incredibly low bar of registration, which doesn't mean they are not anonymous, just more accountable (excuse the pun).--Bbb23 (talk) 17:43, 2 August 2019 (UTC)

I endorse Bbb23's analysis. I have been an administrator on English Wikipedia for two years and Bbb23 has been an administrator there for seven years. He is very productive and highly respected for his constant work to defend the encyclopedia from vandals, trolls, spammers and kooks. Please pay very close attention to what he says. Cullen328 (talk) 23:08, 2 August 2019 (UTC)
I also agree with the above (also as an EnWiki admin since 2007). Nick-D (talk) 02:11, 3 August 2019 (UTC)
  • support Bbb23 and agreement from me as well Ched (talk) 20:40, 3 August 2019 (UTC)
  • Totally agree. Kudpung (talk) 21:00, 3 August 2019 (UTC)
  • I agree with Bbb23. The work of checkusers is relentless, unglamorous, thankless, and absolutely crucial. I firmly oppose anything which makes their job harder. --BrownHairedGirl (talk) 21:24, 3 August 2019 (UTC)
  • Oppose The direction of this text is en:not even wrong - there are fundamental misconceptions here because this originates in the absence of understanding Wikipedia community culture. I support the development of this discussion but this particular proposal needs to be deconstructed into its many unrelated controversial parts and handled one part at a time. I agree that discussion is worthwhile but simplify it so that more Wikimedia Foundation people can understand the topic as well as Wikimedia community members. Blue Rasberry (talk) 21:17, 9 August 2019 (UTC)

It is voluntary and it is disclosed that the IP address will be visible to anyone anon making an edit.

  • You are not logged in. Your IP address will be publicly visible if you make any edits. If you log in or create an account, your edits will be attributed to a user name, among other benefits.
  • You are not logged in. Saving the change you are previewing will record your IP address in this page's public edit history. Please log in or sign up to have your edit associated with a user name, among other benefits.

It is voluntary, and anyone editing clearly knows that their IP address will be publicly visible: this is displayed to them twice before their edit is saved. I feel we should either leave it as it is or make registration compulsory. Pharaoh of the Wizards (talk) 18:03, 2 August 2019 (UTC)

A big part of the problem is that good faith readers who want to make an edit to Wikipedia often believe falsely that unregistered editing is more "anonymous" than editing with an account. We perpetuate that by casually referring to IP editors as "anonymous editors". We should stop talking that way. The fact is that editing with an account and not disclosing personal information is by far the most anonymous way to edit. The notices quoted immediately above should be edited to clarify that fact. In my opinion, IP editors should affirmatively have to click a box saying "I agree to having my IP address recorded in this page's public history" before their edit goes through. Cullen328 (talk) 22:57, 2 August 2019 (UTC)
We had a session a few years ago on enwiki where we went around replacing most instances of "anonymous" with "unregistered" or similar, especially in the interface and templates. It's not really used in any public-facing places any more. There's only so much you can do to stop the locals using the language. Maybe these messages should be moved (or duplicated) closer to the "Publish" button. At the moment they're stuck at the top of the page where no one looks. zzuuzz (talk) 23:14, 2 August 2019 (UTC)
Thank you, Zzuuzz. That is why I carefully chose the word "casually". What do you think about the concept of an affirmative click-through warning for IP editors? Cullen328 (talk) 23:53, 2 August 2019 (UTC)
It doesn't strike me as particularly unreasonable, though I can think of two issues: if you're ticking that box, there will be an imperative to have another box agreeing to copyright, or T&Cs. And it will be annoying if you have to tick it for every edit. zzuuzz (talk) 00:13, 3 August 2019 (UTC)
I think it's possible to cover the bases without making it so long that people universally ignore it. Checking the box could also place a cookie that prevents it from appearing again until the cookie expires or is deleted. Concept notice:
By checking this box you agree to the following:
  1. Your IP address will be published with your edit, which may reveal your location or other personal information. (You may register an account to avoid this.)
  2. You may not copy text from other websites, or commit other copyright violations.
  3. You agree to all other requirements of the terms of use. [No one will actually read it, but it just seems so ordinary to include it.]
Part of the reason I framed it this way is that it occurs to me, we make it very clear that if you edit Wikipedia without an account, your IP address will be published. But a lot of the same people who freak out that their IP address has been published don't really know what it is or how it works. Rather than provide a long explanation that gets ignored, we can just plop down the worst likely outcome. Someone who is totally clueless about how the internet works still won't understand it, but at least they should understand along the lines of "something something privacy", which I think is good enough - the goal is just to make sure people know that editing Wikipedia is not necessarily anonymous. Someguy1221 (talk) 00:39, 3 August 2019 (UTC)
IMO length and wording are only likely to be a minor part of the problem. The bigger issue is that people have been trained over many years to ignore boilerplate. I find it doubtful that ticking will make much difference. For most people the sole task will be box hunting, and as for the text it will still be a case of "sure, whatever, DGAF", that is, until they have submitted and later find out they revealed something they didn't expect to. I mean, even if you posted something completely ridiculous like 'by submitting you agree that we hunt down and permanently keep and redistribute any content including photos and videos you have submitted anywhere on the Internet including all social media and chat apps regardless of any privacy settings', and that's the only thing we show, the number of people who either don't submit and maybe make a freak-out post somewhere, or do submit and tell us "haha, funny joke", will be small, because very few read it. This doesn't mean that such proposals are useless, but simply that we shouldn't overestimate the effect they will have on people not realising what they are doing and being annoyed when they find out. Nil Einne (talk) 04:55, 3 August 2019 (UTC)
However, if someone clicks a graphically bold and large and briefly and clearly worded disclaimer, they have no basis for complaining later about their IP address being logged. If we lose a few typo corrections while also continuing to effectively block and deter endless legions of vandals, trolls and spammers, then I see that as a net benefit for the encyclopedia. Cullen328 (talk) 06:35, 3 August 2019 (UTC)
Support This actually sounds like a reasonable idea, unlike the original proposal of making it as hard as possible to stop vandals. The relevant part should probably be red for danger, so it has more chance of being noticed the first time by people who don't read everything in detail. It could even flash mildly for a moment when the box is ticked, but that's probably going too far. Cyp (talk) 07:35, 3 August 2019 (UTC)
I agree with Cullen328 that the term “anonymous editor” is harmful. Historical engine developers probably thought of anonymous ≝ ¬named definition, but with respect to Internet privacy this terminology is stupid. Incnis Mrsi (talk) 08:47, 3 August 2019 (UTC)
  • This is the best solution on this page. Add a checkbox below the minor edit one that says "I consent to having my IP address published alongside this edit, and acknowledge this may reveal my location or other personal information. (Please register an account to avoid this.)". The "Publish Changes" button is disabled unless this checkbox is checked. MER-C (talk) 16:03, 5 August 2019 (UTC)
  • We should display the actual IP address in the message, as many readers won't have a clue what it is. Ideally there would also be a link to show the sort of information that this would reveal, and to show edits recently made by this IP. There should also be the opportunity to log in or create a username at any stage in the edit process without having to start again from scratch. At the moment if I have previewed changes and then click on log in, I get a popup saying I am leaving the site. AlasdairW (talk) 22:29, 21 August 2019 (UTC)
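
As a rough illustration of the checkbox-plus-cookie idea discussed in this section, the following Python sketch shows one way the "Publish" decision could be gated. Everything named here (`EditRequest`, `may_publish`, the 30-day cookie lifetime) is a hypothetical assumption for illustration; nothing in the thread specifies a lifetime or an implementation, and this is not how MediaWiki works today.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional

# Hypothetical lifetime of the "already consented" cookie; the thread does not give one.
CONSENT_COOKIE_LIFETIME = timedelta(days=30)


@dataclass
class EditRequest:
    logged_in: bool
    consent_box_checked: bool
    consent_cookie_set_at: Optional[datetime]  # None if no consent cookie is present


def consent_cookie_valid(req: EditRequest, now: datetime) -> bool:
    """True if a consent cookie exists and has not yet expired."""
    if req.consent_cookie_set_at is None:
        return False
    return now - req.consent_cookie_set_at < CONSENT_COOKIE_LIFETIME


def may_publish(req: EditRequest, now: datetime) -> bool:
    """Decide whether the Publish button should be enabled for this request."""
    if req.logged_in:
        return True  # account edits do not publish an IP address
    # Unregistered editors must tick the box, unless a still-valid cookie
    # records that they already did so recently.
    return req.consent_box_checked or consent_cookie_valid(req, now)
```

With this shape, the box would only need to be ticked again once the cookie expires or is deleted, which addresses the concern about having to tick it for every edit.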

Complete anonymous editing is not a right

If there is no law requiring WMF to hide the IP addresses of anonymous editors, then why do it? Furnishing some mark of identification -- such as an IP address -- ensures responsibility. Allowing complete anonymous editing opens the gates to all of the vandals & trolls, & will drive away all of the volunteers who handle these destructive edits. But if the Foundation insists on going this route, I invite them to take on the duty of finding & undoing them. Because the rest of us will shut down our accounts & move on to another hobby. -- Llywrch (talk) 22:18, 2 August 2019 (UTC)

Anybody editing with an account is completely anonymous to the public unless they choose not to be. 2A01:E35:39AE:B540:BD22:78D9:FBE5:D3B9 12:16, 3 August 2019 (UTC)
Sorry, but it is pretty clear that willfully publishing the IP address is problematic, see for example Fieldfisher: Can a dynamic IP address constitute personal data? (English). Note also Hva er en personopplysning? (Norwegian) and Dynamiske IP-adresser skal regnes som personopplysninger (Norwegian). The last one points to Court of Justice of the European Union – PRESS RELEASE No 112/16: The operator of a website may have a legitimate interest in storing certain personal data relating to visitors to that website in order to protect itself against cyberattacks [1] (English). This refers to “Directive 95/46” aka w:Data Protection Directive [2]
In my opinion, we have a legitimate right to keep the IP address of editing users to prevent vandalism, and to give authorized users access to that information, provided they are identified and their activity tracked. We are not allowed to publish the IP address of editing users, whether they are good editors or vandals. — Jeblad 10:40, 22 August 2019 (UTC)

Oppose

Just to make it clear, I have made this a separate section. The lack of community discussion in the early phases on Phab makes me a little sore. Per Vituzzu in the section above mine, this is a crazy idea. I have always known Wikipedia as the open encyclopedia which everyone can edit, anons included. From my experience, the current setup doesn't deter anons from editing. Take Simple Wikipedia for example: we have very productive anon editors and quite a lot of fine articles are from them. I don't see anons being deterred from editing, and the edit notice is so obvious (I see it when I am accidentally logged out, for instance). For those who don't want to create an account, I don't think their privacy will be compromised that much. If it is, we have stewards, oversighters and admins who are willing to carry out that suppression fast. I find my suppression requests are often handled immediately when it comes to logged-out editing. I will say I have also at times made logged-out edits (in compliance with socking policies) when I am in places where it is unsafe to log in, or simply caught in system lags, and I am fine with them. I know it may be because I am from a country with the densest population, 5.6k in 700+ square kilometers, and there isn't a way to find me that easily. However, the deal isn't great. I just cannot see how this will enhance privacy. In fact, if you want to reduce privacy concerns, remove the cookie that tracks every one of my edits (which is what CUers use) and which gets flagged as a privacy concern every time I do a virus scan with AVG. Privacy enhancement and abuse mitigation are polar opposites on different planes; you can't have both together. We are facing constant abuse every day, and as SWMT members, we are on the frontline almost daily. When tools such as guc break down, or any of the xtools are not working, we are plain helpless.
We need more aid to recognize vandals, and with IDs being masked, existing tools will be deprecated or need upgrading, which will almost always come later than these new rollouts. I will say this will open Pandora's box for more abuse, not less. However, if the Foundation wishes to push through, here are some suggestions. First, extensive community consultation needs to be undertaken, and I expect no less than 1 year, with almost every community having someone represented; although flawed in some ways and not perfect, something like the vote to start the GS group would be ideal. Second, I expect that some trusted members of the community will still be able to see the information. We already have a backlog at SRCU; I hope it will not be only stewards who are able to see the information, as that would overburden them too much (can we have a way to really stop spambots?). For individual projects, it would be good if administrators (and Edit Filter Managers) were allowed access. For those who are afraid that there will be sysop abuse, we can have something like OC for CU/OS. For crosswiki work, I hope that GRs will have the right to see IPs crosswiki; if you don't trust GRs, then at least GS. All in all, Oppose. --Cohaf (talk) 12:18, 3 August 2019 (UTC)

It is said that this is only a rough proposal telling the community what WMF is going to do in the following years. There's no plan yet. It is also promised that "we will not be making any firm decisions about this project until we have gathered input from the communities". From my perspective, WMF is determined to hide IPs at some point in the future, though the timeline is totally unclear. As the discussion goes deeper, we could probably find solutions to the concerns you mentioned above. Of course it is impossible to hide IPs on the basis of current user group settings and counter-vandalism tools. --Tiger (talk) 07:36, 24 August 2019 (UTC)

IP address attribution

Is this compatible with the CC-licensing?

The reason all editors are recorded is to provide the necessary attribution with respect to CC BY-SA, is it not? I'm not a copyright lawyer but if IP addresses (that clearly can identify a person making an edit) are replaced with random strings, I fail to see how anyone reusing material would be able to successfully fulfill the "BY" part of the license... Regards SoWhy 19:42, 3 August 2019 (UTC)

IP addresses are not tied to a specific editor either, so it's a moot point. Whether it's an IP address or a randomized name tied to said address, the effect is the same. — The Hand That Feeds You:Bite 20:39, 3 August 2019 (UTC)
It would be logical to change IP edits to CC0 instead of CC-BY-SA. You can then add "If you want your edits to be attributed, please create an account and choose a username for that attribution to be made to". Attribution to a pseudonym makes sense, but there is a paradox in combining attribution and anonymity. WereSpielChequers (talk) 05:30, 7 August 2019 (UTC)

IP address attribution is just rubbish

In my opinion, the most important thing is to stop using IP addresses as attribution for anonymous edits. An IP address is a terrible user identifier, because they have a many-to-many relationship with users. Some IP addresses have multiple users, and pretty much all users have multiple IP addresses. Some users migrate from one IP address to another on a timescale of minutes, as they move from one cell to another. Among other problems, this breaks user talk page notification, making it nearly impossible to contact them.

The IP address is confusing to non-technical users -- a constantly-changing bunch of numbers. As long as we treat the IP address as a username, the IP address will be sprayed all over discussion pages, in signatures and replies. Using IP addresses for attribution prevents the smallest move towards privacy. We can't even hide them from Google. They are stuck in the discussion pages, archived and searchable forever.

The minimum change that we can make here is to stop using the user's IP address as a username. Instead, we will use an identifier linked to a cookie. This will greatly improve the stability of anonymous usernames, as seen in history and discussion pages. The argument against it is that technically-competent vandals will clear their cookies -- but surely such vandals are already creating throwaway accounts? We can rate-limit anonymous session creation just as we currently rate-limit account creation.

If we stop using IP addresses as attribution, then we'll be able to control who gets to see them. I would be fine with allowing all logged-in users to see the IP addresses of anonymous users, and to search for such contributions by IP range. It would be a big enhancement to privacy to just get the IP addresses out of the search engines and the talk pages, and to implement expiry, so that anonymous IP addresses are deleted after three months, as for CheckUser data.

This is a summary of the position I put forward when I was privately consulted on this initiative. It is my personal opinion. -- Tim Starling (WMF) (talk) 11:36, 5 August 2019 (UTC)
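
To make the cookie-linked identifier and the rate-limiting of anonymous session creation described above more concrete, here is a minimal Python sketch. It is only an illustration under stated assumptions: the class `AnonSessions`, the `Anon-N` naming scheme, and the limit of three new sessions per IP per hour are all hypothetical and do not come from the post or from MediaWiki.

```python
import secrets
from collections import defaultdict, deque
from typing import Deque, Dict, Optional, Tuple

# Hypothetical limits; the post only says session creation could be rate-limited.
MAX_NEW_SESSIONS = 3
WINDOW_SECONDS = 3600.0


class AnonSessions:
    """Maps a cookie token to a stable display name, independent of the IP."""

    def __init__(self) -> None:
        self._names: Dict[str, str] = {}  # cookie token -> display name
        self._created: Dict[str, Deque[float]] = defaultdict(deque)  # IP -> creation times
        self._counter = 0

    def identify(self, cookie: Optional[str], ip: str, now: float) -> Optional[Tuple[str, str]]:
        """Return (cookie token, display name), or None if this IP is rate-limited."""
        if cookie in self._names:
            # Known cookie: same name even if the IP has changed since.
            return cookie, self._names[cookie]
        recent = self._created[ip]
        # Forget session creations that fall outside the sliding window.
        while recent and now - recent[0] >= WINDOW_SECONDS:
            recent.popleft()
        if len(recent) >= MAX_NEW_SESSIONS:
            return None  # too many new anonymous sessions from this IP
        recent.append(now)
        self._counter += 1
        token = secrets.token_urlsafe(16)
        self._names[token] = f"Anon-{self._counter}"
        return token, self._names[token]
```

In this sketch the IP address is used only for rate-limiting new sessions, never as the displayed name, so a user who moves between mobile cells keeps the same identifier as long as the cookie survives.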

Thanks for your summary. After reading it I feel a bit positive about this proposal. It's true that leaving a message for IP's is almost useless, because they will change, and later someone else, an innocent person using the same IP, will see all the warnings. Stryn (talk) 11:55, 5 August 2019 (UTC)
Totally agree. Use real pseudonyms, and give the user some way to assign pseudonym edits to a real account. IP addresses as so-called “anonymous edits” are plain nonsense. Local law in Norway may be interpreted to allow IP addresses as pseudonyms, but I doubt it was the intention. It says “the author is entitled to be named as good practice indicates, if naming is practically possible” (opphaveren [har] krav på å bli navngitt slik som god skikk tilsier såfremt navngivelse er praktisk mulig[3]) — Jeblad 15:33, 5 August 2019 (UTC)
Note that I only partly agree to “we will use an identifier linked to a cookie” as I don't believe this is feasible for anyone but well-behaved anonymous editors. It will not solve the vandalism-fighting problem, but Tim's post is not about vandalism-fighting – it is about attribution. — Jeblad 12:23, 16 August 2019 (UTC)
Masking IPs is one thing but if "anonymous" edits are tracked with a cookie that links the edits together even if the user changes IP address (does the cookie follow to a different network, too?), then don't we risk linking edits that should not be linked for privacy reasons? What if a good faith editor purposefully wants to split their edit history by getting a new IP, but the cookie links their "anonymous" edits together? There could be cases where even a registered user's anonymity is at risk, if they accidentally log out and make an edit that is easily linked to them, and then don't realize that a cookie is now connecting all their edits, even if they switch to a different IP. -kyykaarme (talk) 22:12, 5 August 2019 (UTC)
I agree with Tim that "IP address attribution is just rubbish". (Particularly with IPv6!) As discussed at mw:Requests_for_comment/Exposure_of_user_IP_addresses there are better ways. Nemo 15:45, 6 August 2019 (UTC)
Some thoughts on this
What if a good faith editor purposefully wants to split their edit history by getting a new IP, but the cookie links their "anonymous" edits together?
They would use the "private browsing" feature in their browser. That would isolate the cookie. That's much easier than changing IP addresses. Also, the edit page warns anon users about their IP being recorded - in the future, the edit page can just show the anon user ID that will be recorded (along with a warning that some group of users will still be able to see the IP, at least for a while - I'd assume this would use the checkuser data, which is purged after 90 days iirc).
There could be cases where even a registered user's anonymity is at risk, if they accidentally log out and make an edit that is easily linked to them, and then don't realize that a cookie is now connecting all their edits, even if they switch to a different IP.
I think that's already the case - the session ID persists even after logout, I think -- but I may be wrong about this. In any case, fingerprinting techniques make it pretty easy to track individuals, if someone puts their mind to it. WMF has the theoretical ability in any case. The difference is what information is made available to other users.
Overall, I agree with Tim: we should stop using IPs as user names. To preserve vandal fighting abilities, the IP addresses of anonymous users should be accessible to a restricted user group for a limited time. Probably not all users, but at least admins. Perhaps some new group that can be used to give some non-admins the right as well. (This is my professional opinion, but doesn't reflect any official stance of the WMF) -- DKinzler (WMF) (talk) 12:26, 8 August 2019 (UTC)
Nitpicking: IPv6 is often set up so that a single home connection keeps the same block of addresses (usually /64 or something like that). In that way IPv6 addresses are at least as attributable as static IPs. If there's only one person using that connection in that home, that person is all but deanonymised to any organisation that can correlate the first digits of IPv6 with their own access/user/shopping, etc. logs. Daß Wölf (talk) 07:23, 19 August 2019 (UTC)
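
The point about /64 blocks can be illustrated with Python's standard `ipaddress` module. This is only a sketch: the helper names `ipv6_prefix64` and `group_by_prefix` are made up for the example, the addresses come from the IPv6 documentation range, and the /64 size is the typical allocation mentioned above, not a guarantee for every ISP.

```python
import ipaddress
from collections import defaultdict
from typing import Dict, Iterable, List


def ipv6_prefix64(address: str) -> str:
    """Return the /64 network that contains the given IPv6 address."""
    # strict=False lets us pass a full host address and have the host bits dropped.
    return str(ipaddress.IPv6Network(f"{address}/64", strict=False))


def group_by_prefix(addresses: Iterable[str]) -> Dict[str, List[str]]:
    """Group IPv6 addresses by their /64 prefix (likely the same connection)."""
    groups: Dict[str, List[str]] = defaultdict(list)
    for address in addresses:
        groups[ipv6_prefix64(address)].append(address)
    return dict(groups)
```

Two addresses that differ only in their last 64 bits land in the same group, which is why an apparently "constantly changing" IPv6 address can still point at a single household.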
DKinzler (WMF), I understand where you're coming from, one thing however: I'm not sure that the privacy protection issue is more important than addressing vandalism (much of which is impulsive/spontaneous rather than intelligent), sophisticated disruptive editing, PoV pushing, paid editing, and vote stacking, all of which are far more sinister and can go undetected for years, and which pose an enormous burden on our small teams of dedicated unpaid maintenance workers and admins. Whichever, this entire issue requires first-hand, empirical experience from the people at the WMF charged with this proposal. A few solid hours at each of New Page Review, Articles for Creation, and Pending Changes is needed, together with a first-hand feeling for the challenges our Check Users are faced with nowadays (e.g. the 90 day limitation, among others).
Secondly, while the creation of a new user group who can view the IP addresses might sound feasible to the WMF, in actual fact our en.Wiki is growing increasingly hostile to the creation of even more user rights and with it, the increased bureaucracy. Also, our admins are probably not favourable to their remit being increased with additional tasks.
As I've said already, the only real and practical solution is to require registration from everyone. The Wikipedia would still be The encyclopedia anyone can edit - if they first register then comply with our policies and guidelines. That old mantra was created in the days nearly a generation ago when no one realised the global size and impact Wikipedia would have today and when new users were urgently needed to grow the corpus and the content of its stubs. It was a great mistake not doing it in the first place, but there are a lot of people who still ideologically interpret the semantics of that slogan very fiercely and it might not be easy to convince the community that now is the time for change to registration for everyone. With good arguments, it should be tried first, before any top-down technical solutions are imposed on the volunteer communities under the guise of privacy. The WMF alone, whatever their opinion, would not be entitled to make such a decision against registration - it would be a local en.Wiki policy. Kudpung (talk) 03:09, 12 August 2019 (UTC)
Hi Kudpung! Privacy protection isn't more important than vandal fighting - nobody proposed to make vandal fighting impossible to satisfy the desire for privacy. The question is how we can have a maximum of privacy while keeping vandal fighting as easy as possible. I hear you about the burden of admins and editors - both Tim and myself had been active Wikipedia admins for a long time before we became hired staff (though admittedly, our experience may be dated).
As I understand Tim's proposal, it would provide anonymity against the public, while still providing access to anons' IPs to any user group that the local community thinks should have it. It could be granted per default to admins, or to autoconfirmed users, or to all users, at the community's discretion. So basically, it would mean that anons can't see other anons' IPs, but vandal fighters still could (at least for 90 days, but that period could be adjusted as well). In contrast, forcing people to create an account would potentially lead to a flood of fake accounts, with no way to block IPs (or IP ranges) from creating more (at least not without the involvement of people with checkuser rights).
These are considerations from the technical side - I won't go into whether it would generally be good or bad to require editors to register, as I'm not an active editor any more. -- DKinzler (WMF) (talk) 10:39, 13 August 2019 (UTC)
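
A rough Python sketch of the access model described in this reply (visibility limited to locally chosen user groups, for a limited retention period) might look as follows. The names `AnonEdit`, `visible_ip`, `GROUPS_ALLOWED_TO_SEE_IP` and the group names are illustrative assumptions; only the 90-day figure comes from the discussion, and the reply itself notes that the period could be adjusted.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional, Set

# The 90-day figure is taken from the discussion; a community could choose another value.
RETENTION = timedelta(days=90)
# Hypothetical per-wiki setting: which user groups may see anonymous editors' IPs.
GROUPS_ALLOWED_TO_SEE_IP: Set[str] = {"sysop", "checkuser"}


@dataclass
class AnonEdit:
    display_name: str       # the public, non-IP identifier shown in history
    ip: Optional[str]       # None once the stored IP has been purged
    timestamp: datetime


def visible_ip(edit: AnonEdit, viewer_groups: Set[str], now: datetime) -> Optional[str]:
    """Return the IP if this viewer may see it and it is still retained, else None."""
    if edit.ip is None or now - edit.timestamp > RETENTION:
        return None  # purged or past the retention period
    if viewer_groups & GROUPS_ALLOWED_TO_SEE_IP:
        return edit.ip
    return None  # everyone else sees only the display name
```

Changing `GROUPS_ALLOWED_TO_SEE_IP` to include, say, autoconfirmed users would correspond to the "at the community's discretion" option mentioned above.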
DKinzler (WMF): This is the best response a staffer has provided to anything on this page so far, thank you. First, I think that 90 days is not long enough, and I'm not sure anything is long enough. The most subtle vandals are often able to go years without being detected. Being able to look at all of the thousands of edits they have made, ever, is tremendously useful for figuring out which subtle vandalisms never got reverted. Obviously this is a limit that checkusers have to put up with regarding registered users, and must allow some people to slip by, but this would just allow even more bad editors and bad edits to slip by. Now, I am pleased that at least some of the people inside this proposal would be okay with local communities deciding who gets to see the masking information, but I honestly wonder if you would actually be making the privacy problem worse. Right now we have an issue with people who either A) ignore the warning; or B) don't know what an IP address is (and a small subset, C, of people who know exactly what it is, but after getting upset at being reverted want the evidence deleted). Anyway, the people in either group then freak out when they notice their IP address is in a bunch of places (and/or find out what it is), and worry about privacy. If you mask IPs, those people probably won't realize. They will think they are anonymous when they are not, and they may say/reveal things with that false sense of security that they would not otherwise. It brings to mind some recommendations for your team: A) Just do nothing. Absolutely nothing; B) If you ignore "A", keep IP data for masked users for a timescale of years, at least, if not simply forever. If they want more privacy they can register an account; C) the mask should not be a completely random thing, but rather should be partially connected to ISP/subnet/geography even after the underlying data is removed.
That way we can still search the range for years-old vandalism after a pattern has been established, such as realizing that every edit to articles on mountains from a specific mobile provider in a specific city was subtle vandalism. Even if it's a busy range with many users, you can search the contribs and find those, and I have done it myself; D) The IP user should not realize that their IP is masked. Since so many ISPs have very rapidly changing IPs now, this would have to be tracked with a cookie or else each edit seems like it's coming from a new user. But basically, when the server gets any request for a page that contains revision-author data from a user with an anonymous-editing cookie, it should remember which edits that user is responsible for, and make sure that those edits have the IP unmasked. This avoids the problem of people thinking they are more anonymous than they really are. They can still ignore warnings and not know what an IP is, but that issue already exists anyway, and this won't make it worse. This does nothing for people who don't allow cookies. Someguy1221 (talk) 23:39, 13 August 2019 (UTC)
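
Recommendation C above (a mask that stays partially tied to the subnet) could be sketched in Python roughly as below. This is one possible reading, not a worked-out design: the keyed hash, the /24 and /48 granularity, the `Anon-<range>-<user>` format, and the names `range_label` and `masked_name` are all assumptions made for illustration.

```python
import hashlib
import hmac
import ipaddress

# Hypothetical server-side secret; a real deployment would need proper key management.
SECRET_KEY = b"example-key-not-for-production"


def range_label(ip: str) -> str:
    """Stable label for the network an address belongs to (assumed /24 or /48)."""
    addr = ipaddress.ip_address(ip)
    prefix_len = 24 if addr.version == 4 else 48  # assumed granularity
    network = ipaddress.ip_network(f"{ip}/{prefix_len}", strict=False)
    digest = hmac.new(SECRET_KEY, str(network).encode(), hashlib.sha256).hexdigest()
    return digest[:8]


def masked_name(ip: str, session_id: str) -> str:
    """Public name: a range-derived part plus a per-session part."""
    user_part = hmac.new(SECRET_KEY, session_id.encode(), hashlib.sha256).hexdigest()[:8]
    return f"Anon-{range_label(ip)}-{user_part}"
```

Because the first part depends only on the network, old contributions from the same range remain searchable by that label even after the raw IP is deleted. The trade-off is that anyone holding the key could enumerate networks and recover the range, so such a label is a pseudonym for the range rather than true anonymisation.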
@DKinzler (WMF): On the “fingerprinting techniques make it pretty easy to track individuals”; when people on- and off-wiki talk about fingerprinting they are referring to several different techniques. We should perhaps be clearer on what is fingerprinting the user, the device/browser, and the content. I know I am slightly sloppy myself when referring to “fingerprinting”. There are also several different techniques for fingerprinting the user, but most references about use on text-based systems like Wikipedia discuss w:keystroke dynamics. Even for keystroke dynamics there are different algorithms available. Some articles seem to imply that current implementations of non-cooperative user fingerprinting work well enough to create privacy problems. Used together with other techniques, like browser fingerprinting, it seems likely we would be able to track anonymous users. — Jeblad 17:00, 16 August 2019 (UTC)
@Jeblad: Why are you even mentioning "keystroke dynamics"? That's just ridiculous in the context of this proposal. Bitter Oil (talk) 22:03, 16 August 2019 (UTC)
Thank you for your friendly reply. If you look at the first line of my post you see I mention “DKinzler (WMF)”, and then I quote what he said. The rest of my post follows from that. Keystroke dynamics is one of several techniques to track users and thereby do abuse mitigation. I believe that is one of the two core problems the subject page tries to address. — Jeblad 23:08, 16 August 2019 (UTC)
Analysis of keystroke dynamics is one of several things that could be done to "fingerprint" users but won't be done for pragmatic reasons. This proposal isn't really about "abuse mitigation". It's about hiding IP addresses. That in itself is creating the need for anti-vandal tools which, if provided, will leave the community only slightly less able to deal with vandalism. I'm sorry if that sounds cynical or if my response seemed unfriendly, but I see no reason to pretend that this is anything other than a discussion of how to hide IPs while causing the least harm to the community. Bitter Oil (talk) 02:08, 18 August 2019 (UTC)
If you go to the top of the page you find the title. It reads “Privacy Enhancement and Abuse Mitigation” (my emphasis). You can enhance privacy by hiding the IP addresses, but then you make the current techniques for abuse mitigation difficult. Somehow you need to figure out where the vandalism is done, whom you can trust and whom you distrust. The current tools are not very good, and it is not very difficult to make better tools. A lot of the people arguing on this page seem to assume the current tools are flawless and excellent, but they are not. They give a false impression of assurance and more often than not they fail badly. — Jeblad 09:22, 18 August 2019 (UTC)
@DKinzler (WMF): You say that "forcing people to create an account would potentially lead to a flood of fake accounts". Any account created is a real account. I believe you are using "fake accounts" to mean accounts created by vandals. Have you taken a look at the account creation logs on English Wikipedia? Hundreds, if not thousands, of accounts are created every day. Many of these accounts are never used to edit. I have never seen a credible explanation for this phenomenon. Perhaps analysis of the current situation should be done before embarking on this project. Bitter Oil (talk) 22:39, 16 August 2019 (UTC)
A very interesting research topic, indeed, but I'm pretty sure it is way outside the scope for this proposal. — Jeblad 23:11, 16 August 2019 (UTC)
It is outside of the scope of the project, but it should not be. We don't seem to know why so many accounts are created but never used. We don't seem to know if IP edits are a net benefit or a net detriment. These are really basic things to understand before you start talking about changing how things work with IP editors and creation of pseudo accounts. Bitter Oil (talk) 02:19, 18 August 2019 (UTC)
  • I dispute the idea that anyone technically adept enough to avoid cookie/session tagging would also make multiple accounts. Just an incognito mode would be enough to stop the cookie method and is very easy, whereas more effort must be utilised to make accounts. Additionally, some individuals set their systems to block all cookies anyway - they'd trip over session-rate limits. Nosebagbear (talk)
  • @DKinzler (WMF): I think as has been mentioned below, there'd probably be room for compromise if viewing was attached to an automatic right - I think only seeing at autoconfirmed would be definitely agreed and probably at ext-confirmed (plus the user can see their own, so they're aware others can). But that would need to be granted as a basis before any progression - I would really not want the project to go "okay people are happier now, let's progress" then get to the end and T&S have talked to you and changed it to admins-only. Nosebagbear (talk) 08:36, 19 August 2019 (UTC)

What about third-party edit-tracking?

Among the most valuable third-party tools for Wikipedia are the Twitter bots which monitor edits from within parliaments and governments to Wikipedia pages.

These bots include:

The code to do this is open-source, at https://github.com/edsu/anon

As far as I can see, this proposal will break these very valuable tools. Has anyone even notified the developers and bot-owners? --BrownHairedGirl (talk) 21:36, 3 August 2019 (UTC)

Hi BrownHairedGirl, yes, these accounts are common in a number of languages and we're aware of how they work. This is in the very early stages and we are inviting the communities to give feedback because we genuinely don't have a plan for where to draw the line and what will happen. They have not specifically been notified, because nothing has been decided and this will take quite some time to figure out. /Johan (WMF) (talk) 23:36, 3 August 2019 (UTC)
@Johan (WMF), I am glad you are aware of them. Thanks for confirming that.
However, as a matter of principle I would very strongly urge you to notify all stakeholders at this preliminary stage, rather than waiting until you have made a decision which would affect them.
May I ask whether the WMF has done any research to map out:
  1. How many external projects use the IP data?
  2. How the IP data is used on-wiki by people without checkuser authority?
It seems to me that both groups are likely to be extensive, and that hiding the IP addresses will mean that #1 causes a big loss of functionality, and #2 both dumps a lot of extra burden on checkusers such as User:Bbb23 and significantly reduces the ability of the community as a whole to analyse the activities of anonymous IPs.
If such a map exists, it needs to be shared.
If it doesn't exist, it needs to be made. Soon. --BrownHairedGirl (talk) 23:49, 3 August 2019 (UTC)
Characteristically, none of the items is related to editing from within any body in the United States of America. Incnis Mrsi (talk) 07:57, 5 August 2019 (UTC)
Personally, I would be fine with extending these facilities to also track logged-in edits. We could apply a change tag to all the edits that come from certain configured ranges, so that such edits would be tagged in Recent Changes whether or not the user is logged in. The Twitter bots could then query the change tags instead of filtering all edits by IP range. But I gather this is not a very popular position. -- Tim Starling (WMF) (talk) 10:59, 5 August 2019 (UTC)
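To make the change-tag idea above concrete, here is a minimal sketch of how edits from configured ranges could be matched to a tag regardless of login state. The tag names, the `TAGGED_RANGES` table and the `tags_for_ip` helper are hypothetical, and the ranges are reserved documentation addresses rather than real institutions; none of this reflects an existing MediaWiki configuration.

```python
import ipaddress

# Hypothetical configuration: tag name -> list of CIDR ranges.
# These are reserved documentation ranges, not real institutions.
TAGGED_RANGES = {
    "example-parliament": ["192.0.2.0/24", "2001:db8:1::/48"],
    "example-ministry": ["198.51.100.0/25"],
}

# Parse the ranges once so each lookup only does containment checks.
_NETWORKS = {
    tag: [ipaddress.ip_network(cidr) for cidr in cidrs]
    for tag, cidrs in TAGGED_RANGES.items()
}


def tags_for_ip(ip_text):
    """Return the sorted list of tags whose configured ranges contain the address."""
    ip = ipaddress.ip_address(ip_text)
    return sorted(
        tag
        for tag, networks in _NETWORKS.items()
        if any(ip.version == net.version and ip in net for net in networks)
    )
```

A bot would then only need to query edits carrying the tag, without ever seeing the underlying address.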
In some countries it may not be legal at all to publish IP addresses, because they are viewed as information that can identify a person. Either the user chooses to self-identify, or you should provide real anonymity. Tagging a person with a location, even if it is just something that resembles an IP address, could be a breach of current law. I have always found the WMF's policy on this very strange.
A very real example: Some years back the w:Norwegian Police Security Service pushed very hard to get the IP address of a person who published internal information. That stopped abruptly when that person accidentally logged in so-called anonymously. Later I asked the w:Norwegian Data Protection Authority about the use of IP addresses, and they told me that it was okay if the use of IP addresses was rooted in a real need and limited to internal use. We publish the IP address openly. Real need? Internal use? Err… — Jeblad 15:24, 5 August 2019 (UTC)
As noted at #IP address attribution is just rubbish by Tim, changing the display name (ex rev_user_text, now actor_name) does not automatically imply that the IP address would be hidden from everybody except checkusers, only that it would not be irreversibly propagated everywhere as it is now. Between the privacy currently given to registered users and the total lack of privacy for unregistered users, we have a sea of options.
It should also be said that this kind of transparency only targets the least harmful individuals, those who don't know what they're doing. It also doesn't work as a deterrent, because if a person knew their IP address could be attributed to their employer they would register. Nemo 16:00, 6 August 2019 (UTC)

"IP address masking product development meetings last year"?[edit]

@Ryan Kaldari (WMF): On July 18 on Phabricator, tstarlings asked:

As discussed in the IP address masking product development meetings last year with @kaldari and @DannyH, I don't think it is appropriate to change actor names to a hashed IP address, as you propose for phase II, when we can't practically change the IP addresses in page content, for example in talk page signatures.

So there were development meetings in 2018 about something that according to the August 1st announcement "is currently in very early phases of discussions". It might be useful if the minutes of those meetings were made available so that editors can see what has already been discussed, debated, and proposed. Thanks. Bitter Oil (talk) 03:44, 4 August 2019 (UTC)

@Tim Starling (WMF), Ryan Kaldari (WMF), Johan (WMF), NKohli (WMF), DannyH (WMF), MER-C, Risker, Pigsonthewing, and Bitter Oil: Ideas and suggestions are fine, but this proposal is one on which the volunteer editors and admins who do the actual work and who have the best empirical experience at vandalism combat, should have the final say if-and-what solutions would be appropriate. In view of sunk cost, it's essential to inform the community/ies of a cost/benefit/user time study before any of the donations generated by community volunteers' work are allocated to this development. Kudpung (talk) 05:33, 4 August 2019 (UTC)

Hi @Bitter Oil: and @Kudpung: We've been thinking about this problem for a while. Ryan and I had meetings with people from the Technology and Community Engagement departments in the summer of 2018 to talk about whether going forward with this project was even possible, from a technical and social/cultural point of view. Tim made a proposal for how the back-end change could work, and we talked about what the general impact could be. We needed to talk to the WMF executive team about whether this was a good idea or not, and it took several months to get to a go/no-go decision. Once the decision was made to go forward, we added it to the annual plan for a product team to work on in the new fiscal year, which started last month (July 2019).
We did reach some conclusions during those summer 2018 meetings that are discussed on the project page here. One was that changing the IP addresses in signatures on every revision of every existing talk page would be completely infeasible, as Tim says on that ticket. Another was that before we make any real plans, we needed to evaluate the use of IP addresses in anti-vandalism/sockpuppet workflows (which Claudia posted at Research:IP masking impact report), and then talk with the community about our ideas. I wouldn't call those meetings "product development meetings", because there wasn't a product team involved, and we were just exploring the possibilities.
I agree that talking to the community is absolutely essential, and we posted this page as soon as we could. Claudia did the pre-research in May and June, and wrote the research report in July, so that we could post it here when we made the first announcement. The Anti-Harassment Tools team has been working on this for about a month so far, and what you see on the page is as far as we've got, besides some initial technical thought-experiments on that Phabricator ticket. I understand the desire for the WMF to be completely transparent on the initial discussions of a project, but we talk about possible projects all the time, and most of them don't go beyond the discussion stage. We don't want to waste the communities' time on half-baked ideas that we're not sure we believe in ourselves yet. On this project, we wanted to do the preliminary research and have a team in place, so that we were prepared enough to know what we're talking about when we started talking with the community. -- DannyH (WMF) (talk) 18:27, 4 August 2019 (UTC)
Some questions that might help with understanding from the community side: What has executive signed off on? From the perspective of whoever approved this, is the intent to protect the privacy of anonymous users, or to fundamentally restructure how we allow contributions and respond to abuse? What sort of resources do you have available to revamp/change/expand anti-abuse tools to the extent necessary to compensate for this change (which would be very significant)? – Ajraddatz (talk) 18:53, 4 August 2019 (UTC)
Ajraddatz: The execs signed off on protecting the privacy of anonymous users, and in order to do that, making sure that we can improve/build anti-abuse tools to compensate for the change. We expressed very clearly at all levels that this is a very big project with very big risks, and it requires a serious investment in a product team dedicated to the problem full-time. We can't deploy a quick change that puts the wikis at risk of being completely overrun with vandals and bad actors; it has to be approached with caution, and deep partnership with the communities. -- DannyH (WMF) (talk) 19:14, 4 August 2019 (UTC)
Perfect, thanks for the clarification. – Ajraddatz (talk) 19:32, 4 August 2019 (UTC)
Thanks, DannyH (WMF). Your answer allays my concerns. As long as you were just exploring the possibilities and that your talks were not detracting from actual urgent software development work, that's perfectly fine. I hope that the WMF does not ignore the fact however, that a great many volunteers are also very competent at coming up with ideas too - especially in the areas of control over vandalism and new content. Their disadvantage is they are not funded to have meetings in an office and discuss them. I hope this current discussion will eventually record a consensus for the best way the community would like to go. Kudpung (talk) 18:57, 4 August 2019 (UTC)
@DannyH (WMF): What's your reasoning for thinking it would be a waste of time to have transparent discussion about new ideas when they are in the draft phase? It might be a waste of time to notify a lot of people, but if the draft ideas were in a place where only people who actively want to engage with them would see them, I only see benefits. Involving the community in the idea draft phase would reduce the community's feeling that the WMF has a different agenda than the communities do. ChristianKl❫ 08:08, 27 August 2019 (UTC)

Archiving this proposal

This is a non-existent problem, since a workaround has existed for almost two decades. The easiest way to mask one's IP is to create an account. Problem solved. End of discussion proposed. --Matthiasb (talk) 01:28, 5 August 2019 (UTC)

Why should anybody administratively shut the discussion up? Pro-(unregistered privacy) activists are unable to say anything but “allocate an automatic generated username to these edits it gives us new opportunities to track vandals and bad edits”, even though they have been given a say for a while. It is good – Meta-wiki is visibly not under the heel of any clique, even stewards (some of whom vehemently oppose this stuff). Incnis Mrsi (talk) 08:12, 5 August 2019 (UTC)
The more I think on it, the more I see Matthiasb has a point: what is the point to anonymizing IP addresses? So a user with the IP address (to pick one at random) 127.0.0.1 is now known as Anonymous Editor 666. What privacy is gained that is different from having that person create a throw-away account? Doing this work makes no sense to me. I'd like one of the people advocating for this change to explain how it is a benefit. -- Llywrch (talk) 22:09, 9 August 2019 (UTC)
You can't do a whois search on Anonymous Editor 666, whilst you can find the approx location and ISP of an IP. OxonAlex (talk) 05:25, 19 August 2019 (UTC)

Tools first

It would probably be best to improve, develop and test these new and existing tools to deal with vandalism + abuse first, and then talk about this idea. This could be a good idea if carefully planned, with mature, tested tools in place well before implementation. SQLQuery me! 01:37, 5 August 2019 (UTC)

I was amazed that I had read as much as I had and not seen someone point this out and was thinking I'd have to do it. So I agree with SQL that the tools should be the first part of the development. Best, Barkeep49 (talk) 04:02, 5 August 2019 (UTC)
I'd like to third that. Having had more time to mull this over, I think the reason I find myself so opposed to this project, and I assume I'm not alone in this, is that I'm always wondering how much vandalism and spam we are missing. We routinely uncover things that slipped us by for years, so it could be a lot, and I don't want the foundation making it even harder for us. But if we got to a point where we could honestly say, "actually, masking IP addresses from most non-admins won't have an impact", then poof goes that opposition. I don't know how the foundation could actually get us there, but that is where we need to be. Someguy1221 (talk) 04:08, 5 August 2019 (UTC)
I also echo SQL. Tools need to be developed first. Even with the current set of tools, vandalism/spam goes unnoticed, and having this implemented before that will make it worse. ‐‐1997kB (talk) 04:35, 5 August 2019 (UTC)
I think before a final decision is made here, it's essential that we know what tools need to be made in order to help editors and admins still identify problem editing patterns before they have to request a checkuser - and those tools should already be designed and tested on multiple projects (including large projects) to ensure they're fit for purpose. (They'll be useful regardless, even if we don't decide to mask IPs). - From my comments above. Yes, we really need to have the tools first, designed and tested. Risker (talk) 04:39, 5 August 2019 (UTC)
Develop tools before we even talk about it? Err… — Jeblad 14:50, 5 August 2019 (UTC)
I share the concern that the privacy "fix" should not be implemented without appropriate tools being introduced either before or at the very least at the same time. But the tools required will depend largely on the model used for the privacy fix. – Ajraddatz (talk) 14:58, 5 August 2019 (UTC)
@Jeblad:, Yes - once the tools are in place it should be easier to decide if this is workable. I don't really even see any cogent ideas for tools yet. Regardless of the form that this takes, there are many complex common problems that need tackled - such as the Open_proxies section above that appears to have been completely ignored. I'd be happy to help write tools (I've got some experience) if desired. SQLQuery me! 14:24, 6 August 2019 (UTC)
@SQL: You discuss ideas, prod them to find faults and alternate approaches. If you don't discuss an idea it will lead to design flaws. — Jeblad 16:11, 6 August 2019 (UTC)
Great, let's discuss ideas for tools that can support this. Let's develop them, test them, let them mature in a live environment until consensus is that we no longer need access to IP data, and then we can discuss this idea. Win-Win. SQLQuery me! 01:12, 7 August 2019 (UTC)
  • I have to agree with SQL. I have some reservations about removing the IP information, especially from vandal fighters. Supposedly we're a long way off from this, but things seem to "pop up" on wiki. IJS. Ched (talk) 17:45, 5 August 2019 (UTC)

Fails to stop governments or enterprises knowing IP

One of the stated goals of this is to prevent governments from knowing the IP number of somebody editing. However this proposal does nothing to stop that. Any government that cares already would have access to data travelling on the internet in their country and would be easily able to tell the IP address doing the actions. You may think that the contents of the session are encrypted. But the number of server requests, the size of the data and the time it happens will be easily matched to logs to see what was changed. This can be determined in the reverse direction too. So if an eavesdropper wants to know who did an edit to an article with an obscured IP address, they just need to look at the time it happened and then check if their internet traffic matched the edit to find the IP address. Then using other metadata they could track down the actual location and perhaps person. This is one of the main reasons given to obfuscate addresses but cannot achieve that purpose.

Private companies and other agencies from which people may edit internally will also have their data going through a company firewall that will be logging similar information. So those organisations, too, will be able to figure out who the internal person is that is editing Wikipedia. Obfuscation will make little difference to this. The main people that will lose access to the IP behind the editing would be the non privileged majority and the readers of Wikipedia (and those mostly don't care). I will quote Luke 8:17: "For nothing is secret, that shall not be made manifest; neither any thing hid, that shall not be known and come abroad." Graeme Bartlett (talk) 12:28, 5 August 2019 (UTC)

Use of private proxies makes the matter less obvious than explained above. I know a user from Europe who evades Wikipedia blocks via such proxy. Neither the user’s employer nor the local government may learn of the fact using such a straightforward data mining… yet some attentive and experienced Wikipedians may identify the person (with the present wiki engine) by IP information. Incnis Mrsi (talk) 13:03, 5 August 2019 (UTC)
Given that proxies are generally blocked on sight at Wikipedia, I'm not sure that argument holds up that well. Nosebagbear (talk) 13:26, 5 August 2019 (UTC)
This is another problem, and I don't think it is within the scope of this proposal. Only real solution I know of is onion routing, aka TOR, but that comes with its own problems. — Jeblad 14:47, 5 August 2019 (UTC)
… compared against diff which blazingly falls within the scope of this page. Incnis Mrsi (talk) 16:13, 5 August 2019 (UTC)
I must say I find your way of arguing quite intriguing! Mitigations for communication introspection and man-in-the-middle attacks are quite another thing. Especially forced use of state-issued root certificates is highly problematic. It should be done, but probably not within this project. — Jeblad 17:42, 5 August 2019 (UTC)
@Nosebagbear: the proxy from my example is not blocked and may not be blocked as such. Again, it is not an open proxy, it is a private proxy. Please, re-examine relevant policy. Incnis Mrsi (talk) 16:41, 5 August 2019 (UTC)
Graeme Bartlett: This is a relevant comment and we should be aware that there are limitations to what IP masking can do, just as registering a user won't necessarily hide you from a state actor, but having worked in this sector before joining the Wikimedia Foundation, for a while with trying to keep the wrong government-owned ISPs from getting their hands on the wrong technology, I left that field with these impressions a couple of years ago:
  • There's a rather large difference in capabilities of different bad actors, including governments, even on this level. We might not be able to do anything about actor X, but the same measures could be more effective against actor Y.
  • It matters if we make something easy, or slightly more difficult. There's a cost to looking up things, and there's always a budget. If we make it more time-consuming, something that is possible can become far less likely.
  • On that topic, for example a combination of a small token of randomized size for each request together with perfect forward secrecy would make this more expensive to track, especially for single edits.
  • There's quite some difference between knowing where to look because you've got the IP address, and not knowing to look because you don't automatically know what originates from a specific ISP or within your country.
  • (And of course, not all bad actors have access to the traffic – some are outsiders, using the IP to home in on a location. But that was not what you argued against, just mentioning it here as a reminder in the general discussion.)
Others – including you, most likely – are better suited to comment on the specific technical questions, but it's not just about technically possible or not technically possible. /Johan (WMF) (talk) 19:33, 5 August 2019 (UTC)
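As a rough illustration of the "token of randomized size" point in the list above, the sketch below attaches a random-length padding field to each request so that request sizes alone are a weaker signal for traffic matching. The `padding_token` and `build_request` helpers and the 64-byte ceiling are assumptions made for this example, not an existing Wikimedia mechanism, and the sketch does not cover perfect forward secrecy, which is a property of the TLS configuration.

```python
import secrets


def padding_token(max_bytes=64):
    """Return a hex string encoding between 0 and max_bytes random bytes."""
    n = secrets.randbelow(max_bytes + 1)  # uniform in [0, max_bytes]
    return secrets.token_hex(n)


def build_request(payload, max_padding_bytes=64):
    """Wrap a payload with a random-length padding field (hypothetical format)."""
    return {"payload": payload, "padding": padding_token(max_padding_bytes)}
```

Padding of this kind raises the cost of size-based matching but does nothing about timing, so it would only be one part of any real mitigation.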

There is another aspect to governments knowing and monitoring edits that cannot be easily addressed by any of these efforts. A government willing to invest time and effort in monitoring its residents is also willing to invest time and effort infiltrating the wiki community and gaining direct access to the information this project is trying to hide. A false sense of anonymity is worse than the understanding that there is no anonymity for IP edits. -- Dave Braunschweig (talk) 23:16, 5 August 2019 (UTC)

It is possible that governments may ask those with advanced permissions for information or actions. I expect that WMF will be asked on occasion by governments too. The initiative to provide information to government entities may also go the other way. For example someone seriously threatens another person on Wikipedia and then the local police should be informed. But which ones to contact? Graeme Bartlett (talk) 00:43, 6 August 2019 (UTC)
Wikimedia Foundation: Transparency Report. — Jeblad 16:20, 6 August 2019 (UTC)
They could ask politely with a $5 wrench. Pelagic (talk) 05:44, 31 August 2019 (UTC)
See also what "we" called "rubber hose cryptography" on usenet:sci.crypt. — Arthur Rubin T C (en: U, T) 04:39, 1 September 2019 (UTC)

Wrong approach on cookies for blocking

Lately, abuse mitigation has been attempted by setting a cookie. This is a bad-boy cookie. When the user purges the cookies and requests a new IP address, he is clean. This is a slightly bad situation.

Instead of using a bad-boy cookie there should be a good-boy cookie. An anonymous user should be given a cookie and then accumulate karma. As long as he doesn't have enough karma he is suspicious and should be tagged as such. The user could also be given a challenge to type a predefined text. That text could be used as a fingerprint, and if the fingerprint is found to be sufficiently similar to that of a known good user, a set level of karma can be transferred.

It is left as an exercise for the reader to figure out why a challenge won't work together with a bad-boy cookie. — Jeblad 17:21, 5 August 2019 (UTC)

Wikipedia editors have steadfastly refused to allow the WMF to record more data on them (remember, they only formally get: an email, an IP, and a session ID). There is absolutely no chance that this would be agreed to, especially given the current Community/WMF trust levels. It can also be way too easy to lose your karma through computer switches, geographic moves and other issues, making a "fingerprint" too hard to demonstrate. Nosebagbear (talk) 21:17, 5 August 2019 (UTC)
I'm not sure you understand how this works. The present partial blocks are “bad-boy cookies”: they live as long as the user does nothing to get rid of them. A “good-boy cookie” lives because the anonymous user wants it to survive. If the anonymous user doesn't want the cookie, then (s)he is free to remove it, but then any edits will keep being flagged as suspicious. The karma associated with the cookie is very short-lived, typically 10-15-20 edits, so whether it is lost due to computer switches does not really matter that much. Geographic moves do not matter at all. Creating fingerprints from edits is a pretty well-known technique too, as are the problems associated with it. If some user tracking fails, then the only problem for the anonymous user would be that (s)he is flagged as suspicious, nothing more. Hardly a critical situation. Note also that providing fingerprints by keystroke dynamics is an inherently opt-in process; building such fingerprints involuntarily is an extremely slow process. — Jeblad 22:44, 5 August 2019 (UTC)
How does one accumulate karma? Why would an anonymous user care whether their edits are flagged suspicious or not? -kyykaarme (talk) 23:02, 5 August 2019 (UTC)
Some unregistered editors are proud of their IP number and even go so far as to make sure their signature also includes a former IP number so they are connected. This is the type of person that we would really like to register an account, but who has no wish to. But these people would have fixed IP numbers. Graeme Bartlett (talk) 11:43, 6 August 2019 (UTC)
It is darn easy to accumulate karma. Edits that do not lead to reverts will count up, edits that lead to reverts count down. We can add several additional measures. This is already done for autoconfirmed users.
The transfer from a so-called anonymous user to a pseudonym account should be easy, and edits done as an anonymous user should be reassigned to the pseudonym account. — Jeblad 16:17, 6 August 2019 (UTC)
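The counting rule described above (edits that survive count up, reverted edits count down, and the cookie stays "suspicious" until enough karma has built up) could be sketched roughly as follows. The `CookieKarma` class and its default threshold of 10 are assumptions for illustration; the thread only mentions a horizon of roughly 10 to 20 edits and does not specify any implementation.

```python
from dataclasses import dataclass


@dataclass
class CookieKarma:
    """Karma attached to a hypothetical good-boy cookie."""

    threshold: int = 10  # assumed value; the thread suggests roughly 10-20 edits
    score: int = 0

    def record_edit(self, reverted: bool) -> None:
        """Count an edit up if it survived, down if it was reverted."""
        self.score += -1 if reverted else 1

    @property
    def suspicious(self) -> bool:
        """Edits stay flagged until the score reaches the threshold."""
        return self.score < self.threshold
```

A user who discards the cookie simply starts again from zero and is flagged as suspicious, which matches the opt-in character described in the thread.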
Are you familiar with ORES? Maybe it's similar to what you're suggesting, if the point is to make it easier to spot bad edits? -kyykaarme (talk) 09:29, 7 August 2019 (UTC)
Yes, I know about ORES. No, it is not similar to what I describe in this thread. It can, however, be tweaked to do the second approach in #Detecting similar content. — Jeblad 11:13, 7 August 2019 (UTC)

Split the project

A bit of meta-discussion.

The discussion above is evidently going to nowhere. I hereby propose to split the “IP Editing: P.E. and A.M.” project (and move some of associated talk threads manually) to several pages on related, but distinct problems.

One problem: what to do with IP addresses? To show only to trusted users, to encrypt, to make rDNS and whois queries, etc.

Another problem: what to do with the unregistered users’ edit form? E.g. to disable saving until confirming a privacy disclaimer, or begin to close some sites from unregistered users altogether.

Yet another problem: supplementary tools for abuse mitigation, such as cookies or content analysis.

Let each discussion progress its way and focus on its specific topic. Incnis Mrsi (talk) 17:34, 5 August 2019 (UTC)

While the above discussion is a bit messy, and becoming difficult to follow, I think that it probably fits the purpose of an initial consultation. To the WMF folks monitoring: what do you think about breaking this out into a more structured environment? Would you want to wait until more initial input has been provided? – Ajraddatz (talk) 17:49, 5 August 2019 (UTC)
Hello - the team will be meeting to discuss next steps for this consultation in the next few days. In the meantime, we'd welcome your thoughts on how to best split or organize the discussion, for example by splitting the talk page into subpages or topic-based sections. —User:CLo (WMF) (talk) 19:59, 5 August 2019 (UTC)
Might be worth splitting into topic-based sections, or adding a summary section (and updating it) to the top of the page. Or both. – Ajraddatz (talk) 23:44, 6 August 2019 (UTC)

@Ajraddatz and CLo (WMF): how about IP Editing: Privacy Enhancement and Abuse Mitigation/header? May it be deployed or not today? Incnis Mrsi (talk) 11:44, 6 August 2019 (UTC)

@Incnis Mrsi: Hello, thank you for making that header! We've discussed it in the team, please go ahead and deploy it when you wish. —User:CLo (WMF) (talk) 15:14, 7 August 2019 (UTC)

Transparency

Generally I dislike the idea of hiding IPs. What distinguishes a wiki from another platform such as Tumblr is its transparency. As remarked above, even generating a unique nickname based on IPs would already make it difficult to fight vandalism. A large part of vandalism fighting comes from non-privileged users reporting potential abuse which they were able to see. A few remarks below. Regards, --Gryllida 00:12, 6 August 2019 (UTC)

Evidence of concern

I would like to see evidence of users finding it worrisome that their IP is visible. They are already given a banner with an invitation to register above every edit. In my opinion this is more than enough. If registration is hard, then they can be presented with a more interactive registration wizard: above or below the edit box, anonymous users can be presented with two text boxes for creating their account and their password. Regards, --Gryllida 00:12, 6 August 2019 (UTC)

Honestly I suspect even more would be concerned if they thought their IP had been masked but then found out that basically everyone can see it. The way we have things now, it's obvious to anyone paying attention that their IP is visible. Someguy1221 (talk) 03:09, 6 August 2019 (UTC)
Yeah, I think so. There are other ways to authenticate, though, like OpenID. What do you think? Gryllida 00:46, 7 August 2019 (UTC)
Using OpenID without hiding the IP address would be bad, but perhaps you would use only a federated username? There are several systems we could use for federated identities, and some of them have much better security than our present system. — Jeblad 00:22, 10 August 2019 (UTC)

Email verification instead of password

Another possibility is that registering users are not required to set a password; instead they are required to confirm their email address to verify their edit. This is what StackOverflow does. It asks for an email address and a username, and after the user has submitted the edit, it sends a verification email and also suggests setting an account password. While the password is not yet set, the account remains attached to a cookie and to the email address. Regards, --Gryllida 00:12, 6 August 2019 (UTC)

This could be a way to connect edits across IP addresses, but it must be used together with a shared secret (cookie) to make a unique ID. If it is not used with a shared secret, giving someone's email address could be sufficient if that address is compromised. A sufficient implementation should be possible though. — Jeblad 00:38, 10 August 2019 (UTC)
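One way to read the "email plus shared secret" suggestion is to derive the identifier from both values, so that knowing the email address alone is not enough to reproduce it. The sketch below uses an HMAC for that; the `new_cookie_secret` and `editor_id` helpers are hypothetical names chosen for this example, and the thread itself does not prescribe HMAC or any other specific construction.

```python
import hashlib
import hmac
import secrets


def new_cookie_secret():
    """Generate a random secret to be stored in the editor's cookie."""
    return secrets.token_bytes(32)


def editor_id(email, cookie_secret):
    """Derive a stable identifier from the email address and the cookie secret."""
    normalized = email.strip().lower().encode("utf-8")
    return hmac.new(cookie_secret, normalized, hashlib.sha256).hexdigest()
```

The same email address with a different cookie secret yields a different identifier, which is the property the comment above asks for.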

Against separate user right

I would advise against a separate user right to view IPs. If IP visibility is to be limited, the most I would like is to require that a user logs in and opts in to viewing the IPs of others. Regards, --Gryllida 00:12, 6 August 2019 (UTC)

User rights are used for authorizing access to functionality. It should not be possible to view any information that is deemed private without being properly authorized. Access to private information must be logged and the authorized users must be identified. That is if we want to do this the right way and not just some way. — Jeblad 00:27, 10 August 2019 (UTC)
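The two requirements stated above (authorization before access, and a log entry identifying who looked) can be sketched as a single gate function. The right name `view-ip`, the `reveal_ip` function and the in-memory list used as a log are all assumptions for illustration; they do not correspond to an existing MediaWiki right or API.

```python
from datetime import datetime, timezone

VIEW_RIGHT = "view-ip"  # hypothetical right name


class NotAuthorized(Exception):
    """Raised when the viewer lacks the right to see IP addresses."""


def reveal_ip(viewer, viewer_rights, masked_name, ip_by_masked_name, access_log):
    """Return the IP behind a masked name, logging who asked and when."""
    if VIEW_RIGHT not in viewer_rights:
        raise NotAuthorized(f"{viewer} lacks the {VIEW_RIGHT} right")
    # Log before returning so every successful lookup leaves a trace.
    access_log.append({
        "viewer": viewer,
        "target": masked_name,
        "time": datetime.now(timezone.utc).isoformat(),
    })
    return ip_by_masked_name[masked_name]
```

Whether the right is granted automatically (for example at autoconfirmed) or manually is a separate policy question that this sketch leaves open.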

This is a problem creating solution in search of a problem

I utterly dislike this idea. I cannot foresee any easy way to rangeblock a set of IPs if we do not see the IPs. Systematic vandals (a problem that has been utterly ignored by the WMF for way over 10 years) will embrace this to the fullest. These systematic vandals will not have to go through the difficulty of making accounts; they can just edit using IP ranges and no-one can connect them. Especially when speed is key in protecting the encyclopedia, this causes enormous delays. A simple block of a range and you seriously prevent an influx of problems. Now you CANNOT block anyone even if you would have a tool to tell you that IP-user#1 is in a /24 with IP-user#2, since you do not know whether IP-user#3 is a good editor. You cannot block IP-user#4 with account-creation disabled because that is the IP of a school. You will however block IP-user#5 who turns out to edit from the White House and should have been reported to WMF.

And I don't see what problem you want to solve.

It is simple: If you are concerned that you have your privacy invaded because you are editing Wikipedia under an IP, then a) don't edit or b) create an account.

WMF is once again seriously disconnected from the community. Your aim is to keep editors at ALL costs in here, not realizing that you then also keep all these systematic vandals in here. Do you REALLY not realize that if you don't keep the rubbish out, then editors (newbies and long term alike) will get frustrated and walk away? And all you do is ignore the defenses further or weaken them even further.

Strong oppose --Dirk Beetstra T C (en: U, T) 06:11, 6 August 2019 (UTC)
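For readers unfamiliar with the /24 scenario in the comment above, the check "is IP-user#1 in the same /24 as IP-user#2" is simple when the addresses are visible, as this minimal sketch shows. The `same_range` helper is a hypothetical name, and the addresses in the usage note are reserved documentation addresses.

```python
import ipaddress


def same_range(ip_a, ip_b, prefix=24):
    """Return True if ip_b falls inside the /prefix network that contains ip_a."""
    # strict=False lets us build the network from a host address.
    network = ipaddress.ip_network(f"{ip_a}/{prefix}", strict=False)
    return ipaddress.ip_address(ip_b) in network
```

For example, `same_range("192.0.2.10", "192.0.2.200")` is true while `same_range("192.0.2.10", "192.0.3.1")` is false. A tool could expose only this yes/no answer without showing the addresses, but, as the comment points out, that still does not tell anyone who else shares the range.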

Rangeblocks can be done by checkusers easily, it'd be a similar problem we have with logged in editors. --Martin Urbanec (talk) 09:32, 6 August 2019 (UTC)
Yes, but these editors generally chose to use IP ranges and not make accounts. If this is implemented, literally everything has to go through checkusers. Do you know what the average waiting time is at the moment at a checkuser? My last 2 CU requests had respectively 4 and 6 days waiting time before a checkuser came to it. Now you add ALL IP ranges (LiWa3 caught 4 yesterday alone, and it does not catch all and that is only spammers) .... For accounts we do not even bother to ask for checkuser, it is already impossible to do. --Dirk Beetstra T C (en: U, T) 10:14, 6 August 2019 (UTC)

I agree. I've been told that there is very little chance that this will be canceled, but I sincerely hope the WMF will listen to us. Trijnsteltalk 19:58, 6 August 2019 (UTC)

I see mixed signals here. It is called a proposal and it seems that Kudpung was given some assurance above that led him to believe that the community's will is what will be done. Can we get some clarity on this? After all, it isn't in good faith to call it a proposal if it is a done deal. So far, I'm in the strong oppose camp and wanting to see the tools first that would make such a plan viable. The narrow benefits that would be gained are perceived but not validated in any quantitative measure presented. Meanwhile, the disruption that this would cause is easy to see. Please don't fix that which isn't broken and don't break that which is working.
This is the community's decision, right? If the community's consensus is no then that is the outcome, right?
⋙–Berean–Hunter—► ((⊕)) 23:14, 6 August 2019 (UTC)
Yeah I hope consensus matters. Gryllida 00:43, 7 August 2019 (UTC)
@Berean Hunter and Gryllida: I'm afraid not. I've been told that while it technically could be possible, we shouldn't count on rejecting the project if there's no consensus. There is already money reserved for it, so it has to be done. We can only tell them how it should be done. Trijnsteltalk 22:55, 9 August 2019 (UTC)

If I don't want my ip to be known

I can create an account. So why do we need to protect IP-users? And why is that an issue, if we do not take action? Again, I see a solution looking urgently for a problem. Edoderoo (talk) 12:40, 6 August 2019 (UTC)

I would do that too (as I did). I don't understand: we warn IPs that their IP will be known once they make an edit, and they actually choose to do that. I still can't understand *why* we would need this. If someone consciously chooses to expose their IP address, why should we keep them from doing that? WMF wikis have been fighting vandalism for years, and now WMF comes up with this "IP Masking" idea that has no solution for this vandal-fighting problem. I would strongly oppose this unless WMF proves that we can fight vandals (especially IP-vandals) without knowing their IP just like the way we do right now (or better). Ahmad252 (talk) 19:37, 6 August 2019 (UTC)
given the Bassel Khartabil case [4], i suspect that there are some more recent cases of editors being harmed by ip tracking. apparently the WMF feel a duty to those editors who might be killed. Slowking4 (talk) 00:06, 7 August 2019 (UTC)
I agree that people who do not want their IP to be visible can create an account.
Bassel could have created an account too, no? Gryllida 00:45, 7 August 2019 (UTC)
oh really? is the burden always on the dead? i do not have much confidence in the SUL to not logout and disclose an ip. is there no burden on the site, and the admins? Slowking4 (talk) 01:51, 9 August 2019 (UTC)

alternative proposal - have IP data of good edits expire over time

We only need to know the IP addresses of vandals and spammers. DE wiki and some others have a system of flagged revisions whereby all IP edits are flagged as accepted or rejected. If that system was changed so that all edits from "new" IP addresses had a mask ID assigned to them, then the IP address could default to being displayed to anyone looking at the contributions of that mask ID, except if all edits by that mask ID were accepted as goodfaith and the last was more than three months ago. At that point the IP address would be deleted and no longer displayed, and the next time an edit came from that IP address it would be treated as a "new" IP address and a new mask ID would be generated. On EN Wikipedia and others without flagged revisions IP addresses of "goodfaith" editors would be rather slower to expire, but if info from Pending changes and some vandalfighting tools was fed into the system it could work there too. WereSpielChequers (talk) 06:00, 7 August 2019 (UTC)
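The expiry rule proposed above can be sketched in a few lines. This is only an illustration under stated assumptions: the class and field names (`MaskRegistry`, `Mask`, `Edit`) are hypothetical, "three months" is approximated as 90 days, and the review outcome is passed in at edit time, whereas a real flagged-revisions workflow would record it later.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta
from itertools import count
from typing import Dict, List, Optional

EXPIRY = timedelta(days=90)  # assumption: "three months" approximated as 90 days


@dataclass
class Edit:
    timestamp: datetime
    accepted: bool  # assumption: review outcome is known when the edit is recorded


@dataclass
class Mask:
    mask_id: int
    ip: Optional[str]  # set to None once the IP has been deleted
    edits: List[Edit] = field(default_factory=list)


class MaskRegistry:
    """Hypothetical store mapping IP addresses to mask IDs."""

    def __init__(self) -> None:
        self._ids = count(1)
        self._active: Dict[str, Mask] = {}  # IP -> mask that still holds that IP
        self.masks: List[Mask] = []         # every mask ever issued

    def expire(self, now: datetime) -> None:
        # Delete the IP of any mask whose edits were all accepted and whose
        # most recent edit is older than the expiry window.
        for ip, mask in list(self._active.items()):
            all_good = all(e.accepted for e in mask.edits)
            last = max(e.timestamp for e in mask.edits)
            if all_good and now - last > EXPIRY:
                mask.ip = None
                del self._active[ip]

    def record_edit(self, ip: str, when: datetime, accepted: bool) -> Mask:
        self.expire(when)
        mask = self._active.get(ip)
        if mask is None:
            # A "new" IP (never seen, or previously expired) gets a fresh mask ID.
            mask = Mask(next(self._ids), ip)
            self._active[ip] = mask
            self.masks.append(mask)
        mask.edits.append(Edit(when, accepted))
        return mask
```

In this sketch a mask with even one rejected edit never expires, so the IP behind vandalism stays visible, while a good-faith address that goes quiet for 90 days is forgotten and comes back under a new mask ID.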

Someone could be collecting all IP addresses and their masks before the addresses are deleted. If we're concerned about the privacy of IP users, the addresses can't be public at any point. They have to be restricted, maybe only to checkusers. Restricting them to admins is better than nothing, but admins would probably have to sign an NDA, which I'd imagine many won't do. Maybe a new user group could be created, in a similar way that interface admin rights were split from admin rights. Deleting the IP addresses after 90 days might feel nice to a random IP user who gets spooked when they realize their IP is public, but that kind of a user has nothing to fear even if their IP is public forever, and those users who might have something to fear gain nothing from their IP address being deleted 90 days later, because the damage is already done. -kyykaarme (talk) 07:51, 7 August 2019 (UTC)
OK so restrict the data to only being shown to admins and rollbackers. Still far less public than at present, but workable, and if people want their IP address to be private they can always create an account. WereSpielChequers (talk) 14:23, 7 August 2019 (UTC)
I don't see how this would work when applied to smaller projects; any result we have from this would need to work everywhere. Projects with small communities which may still have serious spam/vandalism issues (and usually do) would be completely unable to manage such a process. Best regards, Vermont (talk) 03:04, 9 August 2019 (UTC)

The Anti-abuse Tools we need

From the above discussion, it is clear that this WMF project has no support from the community. Forcing such changes upon the Project without any sort of agreement with the community would just be a brutal and anti-democratic imposition. I don't think that it is part of the Foundation's principles to behave in that way... at least I hope it isn't.

From the comments above from the WMF staff, it has been said several times that alternative solutions are to be found together, in order to mitigate this imposition from above. As I wrote at the beginning of the discussion, I would like to use these exchanges as an opportunity to enhance the tools we have, in order to reduce the burden on the CUs, sysops, and patrollers. Here's my list of ideas of what we need to address before doing anything aiming to hide IPs:

  • More support for anti-abuse filters (including an easier interface for regex)
  • Filters IP and UA based
  • Automatic identification of open proxies
  • Automatic labelling (via whois maybe) of accounts/IPs from schools or other institutions
  • Automatically suggest to block the full IP range of a blocked ISP
  • Storing IP for more than 3 months (how can we fight LTA that persist over the years if we have a 3 months window?)
  • Make available the full HTTP request headers to CUs
  • Global inter-wiki CU interface
  • ...

These are some aspects that I consider necessary (even with full IP disclosure) to perform the anti-vandalism job. I invite you all to add comments, ideas for anti-vandalism tools, or to highlight more technical aspects of the matter. I strongly recommend addressing such aspects before changing anything, because if these requests seem absurd to you, it means that you don't know how much abusive users affect the Project, and how limited the anti-vandalism tools actually are. Thanks :) Ruthven (msg) 13:38, 7 August 2019 (UTC)

Also, making sure content filters are up to scratch, such as the spam and title blacklists and the abuse filter (they aren't). Or even just adding a simple checkbox to the delete form "delete the associated talk page". This is not hard. MER-C (talk) 13:50, 7 August 2019 (UTC)
For checkusers, it would be helpful to have specific device blocking mapped to certain pages, namespaces or categories (such as Category:The Troubles (Northern Ireland) or Category:Arab–Israeli conflict). This should be independent of IP addresses used so that it defeats IP hopping or use of VPNs/proxies.
⋙–Berean–Hunter—► ((⊕)) 14:03, 7 August 2019 (UTC)
I definitely do not agree that "it is clear that ... no support from the community". I find myself and many others in support, even if the majority is negative. And if the feedback is broken up per language version, I find constructive input from dewp and other non-en versions, and am not even sure if there is a majority of negative feedback from non-en communities. Enwp is the biggest community and, as always in these discussions, they dominate even more than their share. But please do not see enwp as "the community". Yger (talk) 15:18, 7 August 2019 (UTC)
@Yger: Small projects would suffer even more if the IPs are masked without providing efficacious tools to prevent vandalism. On small projects you have few sysops and no CUs, and the regular users are the ones that perform the rollbacks, and identify abusers by their IP range. --Ruthven (msg) 07:50, 8 August 2019 (UTC)
On a small project like svwp, we have 60 sysops, which is quite enough. And we patrollers are able to monitor all changes being made, with or without special tools. I myself have for five years monitored all changes being made 24/7, and checked those I find strange (with big help from ORES) as an after-patroller. And as I said, to me the proposal looks good, as I expect the automatic IP-number converter will be smart and help identify users using different IP numbers, and identify suspected vandals (while identifying IP numbers that make OK edits as being OK). For us the most powerful trick has been to quickly find IP numbers from vandal-ridden schools and block them for a year or two. Here only a signal of the origin of the IP number (whois) is needed. Yger (talk) 09:47, 8 August 2019 (UTC)
A project with 60 sysops is NOT a small project, unless all the active users are sysops (which is still a medium project; to give you an idea, on nap.wiki there must be 500 edits/month and 2-3 sysops). Yours is a possible solution: to flag all the active users with the ability to see the IPs, leaving out only newly registered and anonymous users (or the dynamic ones). Another good point you mention is having a system that automatically indicates the location, ISP, and all the possible socks of an anonymous user: this would be a useful tool. I reckon that we should push in that direction. --Ruthven (msg) 11:29, 8 August 2019 (UTC)
  • Just some comments. Sorry, they are all arguments against the points:-
  • More support for anti-abuse filters (including an easier interface for regex) → Can't really see how the regex can be made simpler; regexes are pretty complex and if you don't understand them you should not fiddle with them.
  • Filters IP and UA based → This will expose private info to filter editors, which is pretty bad.
  • Automatic identification of open proxies → Pretty hard to automate, but could be done in some cases. It is not necessary, though, to know whether it is a proxy or not; it is a single IP address. A lot of requests come through proxies, and the difference between an open and a closed proxy is extremely blurry.
  • Automatic labelling (via whois maybe) of accounts/IPs from schools or other institutions → This does not work for most of the world. It is simply not something that is advertised.
  • Automatically suggest to block the full IP range of a blocked ISP → This will be pretty bad for projects with admins that tend to block whole IP ranges of major ISPs. Often only a limited IP range is available to a device, but that limited IP range can be somewhat hard to figure out.
  • Storing IP for more than 3 months (how can we fight LTA that persist over the years if we have a 3 months window?) → It does not make sense to keep the IP address for a longer time than the lease time, unless the address is used for edits. Keeping IP addresses for random visits just in case someone is a vandal has far worse consequences than random vandalism. Jeblad 00:14, 10 August 2019 (UTC)
@Jeblad: Ok, so if there are no useful tools to be implemented to fight vandalism as you're saying, this proposal is only making things worse and should not be implemented. I was thinking that the proposal wanted to do both things: guaranteeing some privacy to anonymous users, and providing the necessary tools to protect the project. Btw, I don't see the point on many of your comments:
  • there are no bad consequences for the project in keeping the IP + UA + any other data of a vandal;
  • proxies' addresses appear in freely usable tools on the Net;
  • Filter editors are also admins, so they must have access to private data anyway (we were also talking about creating a special "see IP" group, which could fit with this filter task);
  • Whois provides a lot of information on IPs, so automatically labelling schools or companies is easy actually;
  • etc. Ruthven (msg) 12:33, 17 August 2019 (UTC)
  • No, this is not what I'm saying: “…if there are no useful tools to be implemented to fight vandalism as you're saying…”
  • You can't claim you are only storing personal data for a vandal, when you don't know who the vandals are. To safeguard and store data for everyone is a pretty bad approach. The argument that you only store info about vandals will simply not work, but assume you can track some users as vandals: how would you make sure you have up-to-date and correct information? An IP address is leased to someone else, and then you end up claiming that the new lessee is a vandal. That is slightly problematic. We should avoid creating data sets that can be interpreted as “known vandals”. We can, however, track “known users” and thereby take action on “unknown users”.
  • Yes, proxies are common, but vandal fighting by trying to find and block every (more or less known) proxy won't work. Mostly because it won't scale very well, but in particular because a whole lot of proxies do not advertise themselves as proxies. It may be possible to detect proxies by timing analysis, but it is quite difficult to do it right in a production environment. Measuring and estimating how long it should take before a reply is available is non-trivial, and the increase in response time is quite small.
  • Editing filters is a special right that can be assigned to any user group. Any user given this right must then sign a confidentiality agreement. Any filter using IP addresses, or any other privacy information, should be non-public with all the known problems (read abandoned filters, and filters based on crystal ball knowledge).
  • Outside the US, Whois is mostly used by ISPs; that is, you get information about high-level assignments of IP addresses, not who actually leases the IP address. You also get information about addresses used by public-facing servers, but a lot of users will enter the internet through gateways and proxies that aren't reported by whois. I should also add that regularly hammering whois for every edit will most likely get us blocked, and should not be done proactively. One of the best approaches I know of for abuse detection is to build a posteriori (dis)trust in gateways on a route, and then flag edits that have a sufficiently low rating. That too has problems and should not be done proactively, but without being used proactively we can't build the a posteriori (dis)trust metric. — Jeblad 14:07, 17 August 2019 (UTC)

Whatever you do, don't roll it out at once

No matter what solution may be implemented here (from disabling anonymous edits altogether, to creating system-generating masking IDs for each IP, to masking IP data after a certain number of months, etc.) please, pretty please don't roll it out on all projects at once! Select one or two large projects (maybe enwiki and dewiki), roll it out there and let it be for a number of months (not days, not weeks, months) and then based on the feedback from the communities as well as the technical folks, decide to expand it or modify it or ditch it altogether. Huji (talk) 14:27, 7 August 2019 (UTC)

The problem with this is, going from past practice with the WMF, this might help handle bugs, but if it's rolled out and turns out to be a disaster - they won't agree to cancel it after the sunk costs. Nosebagbear (talk) 08:38, 9 August 2019 (UTC)

Replies to some of the points raised in the discussion so far

First off, thanks to everyone who has volunteered their time and thoughts on this page and elsewhere. It is exactly the kind of feedback and discussion we (Anti-Harassment Tools team) were hoping to have. I really appreciate it. Before I reply to specific concerns being raised, I want to lay out some general principles of this project:

  • First, we will absolutely not be implementing anything without building or augmenting tools and workflows that help the community defend our projects from vandalism and abuse. It follows that improvements to anti-abuse and anti-vandalism tools will happen before masking IP addresses. That is the primary reason that the Anti-Harassment Tools team at Wikimedia Foundation is responsible for leading this project. Our first priority will be to improve existing tools and build new ones to improve the state of anti-vandalism on the wikis.
  • Second, there is a range of acceptable solutions to this problem. Nothing is pre-decided and everything is laid out on the project page. The goal of this is to figure out the best possible solution that proves a win-win for all. Any implementation will be rolled out in stages and piloted on some projects first. For instance, reducing IP visibility to logged-in users only would be a possible first step towards increasing protection for IP editors.
  • Third, as mentioned on the project page, there is no concrete timeline on this project at the moment. Our goal is to seek community feedback on all aspects of this project and come to a collective decision about next steps.

I have tried to summarize some of the bigger concerns/questions that have come up so far and address them below. While I don’t have an answer for each of these, I have tried to answer what I can. If I missed responding to something - that is completely unintentional and I apologize in advance. If you point it out to me, I will be happy to reply to that.

1. IP addresses are currently very useful for non-CheckUsers and non-Admins to flag vandalism. Restricting IP address visibility will hamper their ability to do so.

This is a very valid concern that we have heard from more than a few users. We’re aware of how IP addresses are critical to the workflows. We have a volunteer checkuser and admin working on the team with us to help out as well. As mentioned above, there is a range of possible solutions on the table. For example, we could have an additional user right that allows users to see IP addresses, granted to people who need access to IP addresses as part of their work on the projects, with the understanding that they will use this information responsibly.

2. IP addresses provide a variety of important information such as - whether the edit happened from a proxy, if the IP address is on a blacklist, if the owner of the IP is an institution, what’s the user’s geolocation etc. Not being able to access IP addresses will cause a loss of this information.

It is possible that we could surface the same information that is provided by the IP address without revealing the IP address itself with the help of a tool, to a trusted group of users (as defined by the communities). We can take this a step further and surface other “anonymous” users who have edited from the same geolocation/institution etc. This can help anti-vandalism efforts on the projects.

3. What about the people who don't mind seeing their IP address broadcast?

IP addresses are not well understood by a lot of people on the internet. Often people do not understand what is at stake when they choose to be “anonymous” and sign their edits with their IP addresses. We also want to protect these people, who may not be technically adept enough to know why protecting your privacy online is important but could still face very real consequences.

4. Turn off IP editing/Make registration mandatory to edit

This is a legitimate ask. As others have pointed out, it is a founding principle of our projects that anyone can edit without registering. However, after reading everything on this page, it is clear that we should gather some research and data about the usefulness of anonymous editing on our wikis and potential impact if we disallow IP editing. I will be talking to the Research, Growth and other teams about this. Another option, if we choose to not completely disallow IP editing, is that we can use this opportunity to better encourage users to create accounts. There is a lot of scope to make that process more seamless and easy.
Another thing that has already been mentioned is that it is not necessary that making registration mandatory will lead to a reduced workload for CheckUsers and Admins. It will potentially drive up sign-ups but that would also mean vandals create throwaway accounts to make unconstructive edits, creating an extra workload for people who have access to IPs to detect sockpuppets and block ranges.

5. If people are concerned about their privacy, they should register.

Again, a lot of people don’t really know what they are giving up when they choose to not sign up. Banner blindness has been mentioned, which is a known phenomenon. Saying that an editor who edits without logging in has taken the conscious decision to broadcast their IP address may not always be correct. As part of this project, we will be looking at making it clearer to IP editors why it is a good idea to sign up and also looking at making improvements to the sign up process. A lot of people who use the internet do not understand the ramifications of disclosing their IP address, and we fear that by showing that IP address to everyone, we’re penalizing them for their lack of technical knowledge.

Open questions:

Based on all of the above, we have some questions we'd like to get your feedback on:

  1. What tools do you use as part of your anti-vandalism workflows that could benefit from feature improvements? We have heard about CheckUser, AbuseFilter, GlobalUserContributions (guc). Are there other tools or user scripts that you use for this purpose?
  2. What potential tools can you think of, that don’t currently exist but could help with anti-vandalism efforts, especially if IPs were to be masked on the projects? Some suggestions we’ve heard include more sophisticated filters (e.g. easier regex interfaces, filtering by UA as well as IP), automatic flagging of institution type (e.g. schools, government offices), and automatic open proxy detection.
  3. What do you do right now with IP addresses? We’ve heard about whois checks and looking at that data to see who owns the IP or where it is located. Another thing that has been mentioned is using the IP to find other IPs editing in that range. Are there other ways you use IP addresses? Can some of these workflows be automated to make your work easier?

My apologies for using "I" and "we" interchangeably in the post above. The post was drafted by me on behalf of the Anti-harassment Tools team at WMF. -- NKohli (WMF) (talk) 17:03, 7 August 2019 (UTC)

  • As a volunteer admin and checkuser, I could find the following uses of IPs in my workflows:
    • Check previous edit history of an IP. Some IPs are very static, I know people who edited anonymously from the same IP for years and created dozens of articles from their IP (but also possibly violated rules). Being able to interact with such people unwilling to register is clearly useful. However, we need to keep their identifier static and not change it daily, they might well edit the same article from the same IP a few years in a row.
    • Check range contributions. If an IP is dynamic, it is very useful to know if there is any activity from a neighbouring range. For instance, if a vandal with a particularly annoying pattern (e.g. changing dates in articles) is active in a dynamic range, getting all edits from this range to check them is clearly necessary.
    • Set an abuse filter on a range. If there is a particular pattern of vandalism from a range (e.g. use of certain words that might be appropriate in some but not all articles), we might have to disallow editing with this specific pattern to this range. This is an alternative to a block of the entire, potentially large range, and to disallowing potentially useful edits to all users.
    • Check global contributions. It is extremely important to keep identifiers consistent between wikis for fighting cross-wiki vandals. This is particularly the case of cross-wiki spammers who may insert spamming links from the same IP to multiple wikis.
    • Check if an IP is a proxy, VPN or Tor node. This is usually more advanced than automatic tools can allow, particularly in cases when people use proxies or VPNs to hide links with their main accounts in an abusive way. Sometimes I literally google an IP to find if I happen to find it in some proxy or VPN list.
    • Check if users/IPs belong to same network/geography. Some providers use multiple ranges with very different IP patterns (like a 128.*.0.0/16 and a 192.*.0.0/16), and a user (both registered and anon) might move from one to another without notice. Some users (both registered and anon) use two different providers (like home and mobile) but in a very specific location, and we can link accounts by this location. For example, if two IPs from the same town but different networks in Malaysia participate in the same discussion in Ukrainian Wikipedia, they very likely belong to the same person.
    • Check location of an IP. Unlike the previous case, location can be used in a positive context. For instance, an IP adding information about some obscure politician in China is possibly vandalism. However, a Chinese IP adding information about a Chinese politician is less likely to be reverted.
    • Check organisation of an IP. This is needed for paid editing / COI matters. For example, an edit to an article about an MP made from the Parliament's IP (be it a registered or an anon user) is very likely undisclosed paid editing or a COI and requires relevant actions.
    • Any combination of previous factors above can be needed. For example, I had a case of a user switching between an IP of a specific provider and VPNs who had abusive anonymous edits in multiple wikis from IPs of a different provider in the same (small) location and had sockpuppets on IPs from different ranges but belonging to the same VPN network. Yes, that's hard even with current tools, I really hope it will not be way more complex once this change is public.
    I might have forgotten something but this should cover around 99% of my cases — NickK (talk) 19:03, 7 August 2019 (UTC)
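The range-based checks in the list above (is an address inside a known range, and which edits come from neighbouring addresses) boil down to a few operations on IP networks. A minimal sketch using Python's standard `ipaddress` module follows; the function names are hypothetical, the addresses are reserved documentation ranges, and the /24 default is only an assumption, since real dynamic ranges vary by provider.

```python
import ipaddress
from typing import Dict, Iterable, List


def in_range(ip: str, cidr: str) -> bool:
    """Return True if the address lies inside the given CIDR range."""
    return ipaddress.ip_address(ip) in ipaddress.ip_network(cidr)


def enclosing_range(ip: str, prefix_len: int):
    """Return the network of the given prefix length that contains the address."""
    # strict=False lets us pass a host address and have the host bits zeroed.
    return ipaddress.ip_network(f"{ip}/{prefix_len}", strict=False)


def group_by_range(ips: Iterable[str], prefix_len: int = 24) -> Dict[object, List[str]]:
    """Bucket addresses by enclosing range, e.g. to review a dynamic range together."""
    groups: Dict[object, List[str]] = {}
    for ip in ips:
        groups.setdefault(enclosing_range(ip, prefix_len), []).append(ip)
    return groups
```

A tool that hides the addresses themselves could still expose the result of such grouping (for example "these three masked users share a range"), which is the kind of signal the workflows above rely on.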
  • > What tools do you use as part of your anti-vandalism workflows that could benefit from feature improvements?
    All of them. Every single one of MediaWiki's anti-abuse and admin tools is not fit for purpose and a mess code-wise.
  1. Undeletion needs to be completely rewritten - every aspect is so thoroughly broken from the UI all the way to the database tables (including deleted title search). Deleted revisions and live revisions should be indistinguishable from a feature point of view.
  2. Likewise, the spam and title blacklists are both hacks and also need to be completely rethought. Make them infinitely scaleable, add antispoof to the spam blacklist, make logging less hacky (should really be a special page)
  3. Clean up technical debt and write unit tests for the abuse filter. Add subroutines and shared variables. Find a way to properly combine AntiSpoof and regular expressions (see e.g. T147765).
  4. Fixing our goddamn CAPTCHA so that stewards don't have to deal with spambots manually (!!!)
  5. Special:Linksearch and the external links table are unfit for purpose - you know you've got problems when the way to find external links is the normal search engine. The external links table should have columns for both the protocol and the domain.
  6. Make full HTTP headers available to checkusers.
  7. Search CU records by user agent.
  8. Checkuser needs a complete rethink as well. This is a little drastic for the features above, but it allows you to back Checkuser with a graph database.
  9. In fact, the entire anti-abuse effort needs a fully integrated graph database (with one click block/revert/delete/blacklist links) - this will be a great improvement for dealing with sophisticated abuse spanning tens, if not hundreds of accounts, titles, spam links, etc.
  10. Private long term abuse documentation - solved in general by T230668
  11. API improvements - fetching range contributions, deleted title search, Special:Recentchangeslinked, ...
  12. Notify me when this block/protection is about to expire.
  13. Give the option that when a protection expires, revert to the level it was previously.
  14. Batch deletion, undeletion, blocking, unblocking and reverting being part of core software.
  15. Minor workflow improvements - "delete talk", "delete subpages", "RevisionDelete title/creation edit summary everywhere" checkboxes on the delete form, "RevisionDelete username everywhere" on the block form, jump to the page history near this revision on the diff page and contributions page, checkboxes on Special:Contributions for bulk revision deletion, ...
  16. Impose a lower rate limit on the number of accounts created per session.
  17. Machine readable diffs - if you're going to use natural language processing or machine learning to identify socks, you better be able to parse diffs easily otherwise you're in for a world of hurt when it comes to training.
  18. [More pending]

You have also not addressed the point that the public needs to know about certain abusive IP address editing. MER-C (talk) 19:23, 7 August 2019 (UTC)

Thanks MER-C. This list is very useful. I heard similar requests from people at Wikimania about several of the tools on your list. Do you typically use these tools for enwiki or also on other wikis?
To your point about the public needing to know about IP editors - can you clarify a bit more on why this is the case? The entire point of this project is to protect our unregistered editors from the harassment and potential persecution they could face because their IP addresses are public. In an ideal world, we would have a way to protect vulnerable editors while also being able to accomplish what we currently can with public scrutiny of IPs. -- NKohli (WMF) (talk) 20:55, 27 August 2019 (UTC)
@NKohli (WMF): What I use and where is irrelevant. Every single item on the list impedes the ability of admins, global admins, checkusers or stewards everywhere to fight abuse. Therefore, I expect you to implement all of them.
As I've stated numerous times on this page, public scrutiny that deters abuse e.g. w:Church of Scientology editing on Wikipedia, w:United States Congressional staff edits to Wikipedia and the various bots on Twitter and elsewhere looking out for politician and other COI edits are a legitimate use of this information. Investigative journalism and banning of spammers is not harassment or persecution. No compromise is possible, and the concept underlying this project is fundamentally unsound because creating an account is so easy. MER-C (talk) 19:58, 31 August 2019 (UTC)
  • I will also say that despite mentioning the importance of community consultation several times, and making it clear there are no concrete plans at the moment, your every comment gives me the distinct impression that IP masking will be implemented at some point in the future, whether or not it has community support. It's distressing to see that WMF once again appears intent to force its will. But on the subject of needed tools, I would recommend the following:
1. Make it possible to search the contributions of IP ranges with prefix lengths other than what is currently supported.
2. Do something to either speed up or kill the process that tries to list the IPs that have been used from an IPv4 range, as for busy ranges this process can freeze the page. Give it a time limit or something if making it better is hard. Or just have a button to make it run, instead of automatic.
3. Make it possible to search the deleted contributions of an IP range.
4. Make it possible to search the abuse filter hits of an IP range.
5. If this hasn't been fixed already, get rid of whatever code causes the abuse filter to occasionally drop fields from log entries - that makes it very difficult to understand some false positives/negatives.
6. Generally, give us much better tools for communicating with IP users, masked or not:
a. Many ISPs now switch users to new addresses so quickly (some every few minutes), that it is basically impossible to contact them through user talk pages. It should be easy to leave a message for an IP range. For example, User_talk:127..0/16 should literally act as a talk page for that entire range. Now, especially for busy ranges we don't want to alert everyone simply reading Wikipedia from that range - we already get enough readers freaking out about warnings, sometimes years old, left on IP talk pages. Perhaps a message on a range page could send a notification only to users on the range who open the edit window or have a cookie that says they did, and could make it clear it's a message for everyone on a range ("Have you been editing Wikipedia? You may have messages: Click here", and then there is a standard banner atop the page about shared and dynamic IP ranges). It may do well to limit creating talk pages for very large ranges to admins or another group, to prevent abuse. This will unfortunately mean that a single IP could simultaneously have several talk pages that the notification thingy will need to handle, but if they don't like the confusion they can register an account.
b. Many ISPs actually have predictable behavior, and there should be a way to bake this into the IP-username-assignment system. It will not be one-size-fits-all and IP allocations can also change, so it can't be permanent either. It has to be something we could set, change, and remove. Specifically, there are ISPs, typically residential high speed internet, that assign a single /64 to each customer that is stable over months or years. Let's say we know that a certain /32 is divided up in such a manner. We should be able to make MediaWiki treat every /64 in that range as a single user. Have an option to see the full address, but everything (signature, IP as listed in recent changes and similar, talk page links, block button, etc.) should default to using and showing the /64 subnet. There are some very active anonymous users, both good faith and bad, who have gone through hundreds of addresses from one static subnet. This type of method would shrink that to behaving as just one address, and greatly decrease confusion. Setting this sort of behavior would be a powerful tool that would have to be limited to a user group, and unsetting it should be retroactive (perhaps it could just be the way links and such are displayed, but regardless, if this is implemented it will need its own consultation).
c. Other ISPs allow an absurd number of customers to freely roam a single range with no way to distinguish them (I'm looking at you, T-Mobile). These should not be treated as the previous type of ISP. However, admins and checkusers collect a lot of information about all types of ISPs. It would be helpful if that information could somehow (either in conjunction with the idea above, or independently) simply be displayed for people looking at IP contributions or other IP-related pages without making them go to WHOIS and look at the range and figure it out all over again. That is, some knowledge blurb like "This IP is in the xx/## range, shared by 20 million T-Mobile customers - individuals cannot be isolated." "This IP is in the xx/## range, shared by 120 million Comcast home internet customers, most of whom have a stable /64 subnet." We'd be able to set it for the whole range, and the message just shows up on contributions, talk page, etc. There are admins who won't do rangeblocks because they are concerned they will mess it up, and figuring out how a network behaves is not always easy. So help us make it easy by letting us share knowledge straight to everyone who could use it, without making them look for some tool or list of information that may or may not have what they want, and that they may even be unaware of. This actually doesn't just help admins who are hesitant about rangeblocks - it helps every admin by saving us from having to duplicate efforts over and over again. It could also actually reduce collateral damage from rangeblocks - for example, I have seen admins block essentially entire cities because they set a range well beyond what was necessary, not realizing how the network actually behaves.
d. In light of the awful type of ISP, perhaps consider reaching out to those ISPs directly. Especially ones that have actually been blocked in their entirety because of abuse. Let them know that it is so damned impossible to single out one user on their network we routinely just prohibit all of their customers from using parts of our site. Maybe they will care and try to do something on their end. Probably not but worth a try.
At the moment that's all I can think of other than what has been mentioned already. Someguy1221 (talk) 07:16, 8 August 2019 (UTC)
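To illustrate point 6b above, here is a minimal sketch of how addresses from a range known to hand out stable /64 subnets could be collapsed to one display key per customer. This is not MediaWiki code; the names `STABLE_SUBNET_RANGES`, `display_key` and `group_by_display_key` are hypothetical, and the /32 used is the IPv6 documentation prefix standing in for a real ISP allocation.

```python
import ipaddress
from collections import defaultdict

# Hypothetical, editable configuration: ranges whose customers are believed
# to each hold a stable subnet, mapped to that subnet's prefix length.
# 2001:db8::/32 is the IPv6 documentation prefix, used as a placeholder.
STABLE_SUBNET_RANGES = {
    ipaddress.ip_network("2001:db8::/32"): 64,
}


def display_key(address: str) -> str:
    """Return the subnet to show for an address, or the address itself."""
    ip = ipaddress.ip_address(address)
    for network, prefix_len in STABLE_SUBNET_RANGES.items():
        if ip.version == network.version and ip in network:
            # strict=False lets us pass a host address and get its enclosing subnet.
            subnet = ipaddress.ip_network(f"{ip}/{prefix_len}", strict=False)
            return str(subnet)
    # No configured behaviour for this range: fall back to the single address.
    return str(ip)


def group_by_display_key(addresses):
    """Group raw addresses under the key they would be displayed as."""
    groups = defaultdict(list)
    for address in addresses:
        groups[display_key(address)].append(address)
    return dict(groups)
```

Because the mapping lives in a table rather than in the stored edits, removing an entry would change how old edits are displayed as well, which is roughly the "unsetting should be retroactive" behaviour suggested in 6b.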
  • .
    • Are there other tools or user scripts that you use for this purpose? - I hate to plug my own tool, but ipcheck. It takes data from many API-based sources, and presents them all in one place. It can also do limited portscans.
    • automatic open proxy detection - That is far more difficult than it sounds, and really needs a human a lot of the time. There are large swaths that can probably be automatically blocked based on a whois regex-search, and some based on ASN - mostly webhosts. I'm working on a machine-learning solution to this baked into the above-mentioned tool.
    • What do you do right now with IP addresses? We’ve heard about whois checks and looking at that data to see who owns the IP or where it is located. Another thing that has been mentioned is using the IP to find other IPs editing in that range. Are there other ways you use IP addresses? - I block a LOT of open proxies, and hosting ranges. Of the ~13000 rangeblocks active on enwiki [5], ~5000 are mine [6]. Googling the IP is very helpful, running the IP over various abuse databases such as stopforumspam, and the cbl is helpful. I might do a limited portscan in some cases. I'll resolve the range around the ip to understand how the range is being used. SQLQuery me! 15:32, 8 August 2019 (UTC)
@NickK and SQL: Thanks for listing out your use cases for IPs. This is extremely helpful. I will be compiling a list of use cases for IP addresses on the project page for posterity. This would be a great starting point. -- NKohli (WMF) (talk) 20:55, 27 August 2019 (UTC)
  • I perform CU and RC patrolling on it.wiki; on other smaller projects where I volunteer as sysop, it is sufficient to do retro-patrolling using Special:RecentChanges.
    • What tools do you use as part of your anti-vandalism workflows that could benefit from feature improvements? Are there other tools or user scripts that you use for this purpose?
      • For RC patrolling, 90% of the checks are done using LiveRC (a useful tool whose code needs a serious refresh and update!)
    • What potential tools can you think of, that don’t currently exist but could help with anti-vandalism efforts, especially if IPs were to be masked on the projects?
      • I listed some of them above. Automated range detection for ISPs and proxies could be very useful. Open proxies should be globally blocked, so a page on which to list all of them, where the check and block are performed without taking up the stewards' time, would be useful.
    • What do you do right now with IP addresses? Can some of these workflows be automated to make your work easier?
      • I check their ISP, the edits in the range, look for LTAs linked to the IP. A searchable private database of LTAs with their behaviour could be very useful to trace harassment and abuses. This means to perform radical changes to the checkuser wiki.
      • I check for open proxies using online IP checkers, and abusers/spammers on blacklists. These checks can be done automatically (even if blacklists are not to be 100% trusted).
      • It would be useful to have a direct line to abuse@ the ISPs, and have our voice be heard (because generally they ignore our messages completely).
Cheers, --Ruthven (msg) 14:19, 9 August 2019 (UTC)
  • Just to demonstrate how much of a regression this proposal is: I just blocked 194.181.146.128/27 for five years on en.wp because it was a company address range (see WHOIS) spamming said company for five years. How am I going to detect such long term abuse if this proposal is implemented? MER-C (talk) 16:19, 9 August 2019 (UTC)
  • Another point is that many editors notice IPs who spam or change numbers, or who otherwise degrade articles. Many such editors notice when similar IPs are used and they request range blocks; they can also check other IPs in the range themselves. If someone sees Anonymous123 changing numbers and Anonymous987 doing similar, they won't know if the IPs are related. Wikipedia works by empowering its users—we cannot put all the work on a handful of people with special rights to investigate the hundreds of disruptive anonymous edits that occur every day. Johnuniq (talk) 04:50, 30 August 2019 (UTC)
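For readers unfamiliar with range notation, the following sketch shows what the /27 mentioned above covers and how two addresses can be checked against the same range, which is the kind of comparison that is not possible from opaque masked names alone. It uses only Python's standard `ipaddress` module; the helper names `in_blocked_range` and `related_in_range` are made up for illustration.

```python
import ipaddress

# The range quoted in the comment above.
BLOCKED_RANGE = ipaddress.ip_network("194.181.146.128/27")


def in_blocked_range(address: str) -> bool:
    """True if the address falls inside the quoted /27."""
    return ipaddress.ip_address(address) in BLOCKED_RANGE


def related_in_range(addresses):
    """Return only those addresses that share the quoted /27."""
    return [a for a in addresses if in_blocked_range(a)]


# A /27 leaves 5 host bits, so it spans 2**5 = 32 consecutive addresses.
print(BLOCKED_RANGE.num_addresses)         # 32
print(BLOCKED_RANGE[0], BLOCKED_RANGE[-1]) # 194.181.146.128 194.181.146.159
```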

Likely error in the report - the conversation around anonymous editing

The PDF report says: The IP masking project is very unlikely to change the conversation around anonymous editing. I believe that is wrong, possibly spectacularly wrong.

While the intent of the IP masking project is noble, it appears inevitable it would cause significant pain and difficulties regardless of whatever tools are provided. That will definitely shift the conversation around anonymous editing. Currently there is limited support for banning IP editing. There would certainly be wider support for banning masked editing. This project might effectively lead to a ban on unregistered edits.

If this project effectively drives a ban against unregistered edits, all of the masking software and tools for dealing with unregistered/masked editors would end up worthless and unused. If that is indeed the result, the more efficient path would be to skip all the development work and just jump right to banning unregistered edits. Alsee (talk) 00:05, 9 August 2019 (UTC)

Yes, the most predictable outcome I see from this is even more innate suspicion of anonymous editors. The threshold for running CU (or some CU-lite that only works on masked IPs) will be approximately non-existent, and volunteers will be quicker to revert and block. If access to IP data is limited to admins who sign the NDA, then I think your scenario becomes even more likely - there is sometimes simply too much vandalism to justify waiting for someone with exclusive tools to come and take care of it. So I would anticipate some ISPs just being permanently blocked, more or less. Someguy1221 (talk) 05:26, 9 August 2019 (UTC)
This is a point that I hadn't considered but you're dead right - the Community might be in favour of the "wiki way", but the Community also voted to stop non-autoconfirmed from creating articles (3 times) after they caused too many problems to handle. I'm in favour of IP editing, but anything more than de-masking being tied to extended-confirmed rights would make the problem so big as to not be worth handling (even if I got a userright, let alone it being limited to admins/CUs) Nosebagbear (talk) 08:44, 9 August 2019 (UTC)
On the specific idea of putting the de-masking userright at extended-confirmed, that would indeed reduce the downsides of masking to a largely negligible level. However it would so thoroughly destroy the purpose of masking that masking would be nothing but worthless complexity. I therefore consider it a non-option.
I question whether masking would even make sense if we set the unmask right at admin-level. While admins are considered "trusted" to use the tools to keep the wiki running, it's not generally considered a confidential position. There are a lot of admins, and while it takes a lot of work to get admin on EnWiki it requires little more than showing up on a small wiki to get admin there. A tiny-wiki admin can unmask any global editor simply by locally pinging them with a question and managing to attract a local reply. (Actually, it's not clear that a local edit would even be necessary.)
I'd say the choices are (1) effective-and-disruptive masking set above admin level, (2) keep the status quo, or (3) end unregistered editing which renders the issue moot. Alsee (talk) 13:44, 9 August 2019 (UTC)
@Alsee: A person who can see the IP addresses of logged-in users can unmask any global editor in the described way, but the same isn't true for the ability to see IP addresses of users that aren't logged in.
Seeing the IP addresses of not logged-in users could be a right that's its own special permission similar to the rollback right. That would allow any Wiki to make policy to decide for themselves how many people should have access to the IP addresses of logged-in users on the Wiki. One Wiki might keep the right only to a subset of admins while another Wiki might believe that all admins and some non-admins need the right. Then it would be on the WMF to provide tools that make it seem to the individual Wikis that it's necessary to give fewer people access to the right because seeing the IP addresses becomes important for fewer people.
I see no good reason why the restrictions to access IP addresses on individual Wikis should have to follow a global policy.
From a usability perspective "Synthetic user names" might make looking at edit histories nicer for users who don't care for IP addresses. Having a cookie-based link to a synthetic user name could help in some cases with sending an individual user notifications. ChristianKl❫ 11:59, 22 August 2019 (UTC)

A Compromise, with some issues

While @NKohli (WMF)'s summary was decent, it still had an inclination that they're willing to bend the criteria but not ditch it, despite a good 85% of the participants stating that that is what must happen (NKohli has noted the issues but taken very positive outlooks towards them, which hinders it as a summary)

I think the Community as a whole would be fine with only autoconfirmed users being able to see IPs. en-Wiki could probably roll with it being tied to extended-confirmed, I don't know about the other (and smaller) Communities.

But anything that required it to be acquired as a userright would have a major hit on acceptability - remember lots of reports that use IP aren't just individuals hunting but people coming across it in their day to day activity. So asking for an advanced user to come have a look is too slow, unwieldy and irksome for it to help.

NKohli (WMF) - one issue with any internal limitation, and an issue you did miss off the summary - what about the public benefit from knowing IP-edits, such as those linked to Parliament or Congress, that has caught some public issues before? Nosebagbear (talk) 08:56, 9 August 2019 (UTC)

Comment on Foundation engagement, for editors: I have a lot of experience dealing with community-foundation issues. In the past the Foundation has had the engagement and collaboration style of a freight train. A freight train with no brakes. Once an internal plan had been set, the only way to stop the train was with a major trainwreck. However recently I've seen very real efforts by the Foundation to genuinely engage and listen to the community. There are still problem areas (a long-term strategy to effectively sabotage wikitext being one of them), but my informed !vote is that we try to allow and support more positive engagement when the Foundation opens a dialog with us.
That said, my comment to staff is that the available comments from the Foundation are ambiguous at best. Painful experiences can result in long memories. A lot of editors are still inclined to suspect unfavorable interpretations of ambiguous statements. My assessment of the situation is that editors generally understand the good intent here, but the current consensus-position appears to be that even good-or-great tools to work with masked editors will make things unacceptably more difficult. We take an extraordinarily-open step of allowing unregistered/IP editing, and we have little tolerance for anything that would make it more difficult to deal with the resulting problems. It already soaks up large amounts of volunteer-time dealing with these issues. It is miserable, frustrating work. Every hour of labor spent on vandals, socks, and disruptive/abusive block-evaders is an hour we can't spend doing more rewarding work like building and improving articles. I don't think this idea is going to be accepted, not unless there is a large and unexpected shift in the conversation.
My assessment is also that the available information and comments from the Foundation seem to have a tone that this is something that the Foundation plans to do, rather than the Foundation opening a dialog on whether we want to do this. My assessment is that that perception is creating discomfort, worry, and stress. If this is a "plan", then it is best to be open and clear about that. If this is investigating whether the community is on board with the idea, it would help to make that more clear. Alsee (talk) 14:49, 9 August 2019 (UTC)
@Alsee: - From what I've heard the WMF will certainly do something in this regard; sorta the t/p consultation where it was explicitly said that status quo ain't one option. We, at best, have a chance to specify and influence the specifications of that something. Winged Blades of Godric (talk) 12:01, 10 August 2019 (UTC)
For en.Wikipedia to survive, even if anti-vandalism tools are improved beyond the point that I believe possible, at least extended confirm users need, not only to see IP history as at present, but to see talk pages as if the IPs were not masked. Otherwise, I would recommend that IP users be banned from all except a few pages to make requests. If the status quo is not an option, neither is IP editing. — Arthur Rubin T C (en: U, T) 18:42, 11 August 2019 (UTC)
@Alsee: I cannot stress enough how important it is for us to host an actual dialogue here. My colleagues Johan and Sydney and I had a lot of fruitful discussions at Wikimania about this very project. We are going to experiment with hosting discussions on individual projects as we heard some hesitations about people coming to meta and talking about it here. Unsurprisingly, people expressed being much more comfortable talking in their local communities. I'm going to try to experiment some more in that direction. But the bottom line is that we are not going to be rolling out anything without explicit discussion with the communities. -- NKohli (WMF) (talk) 22:49, 27 August 2019 (UTC)
Hi Nosebagbear. Thanks for your comment. I understand your concern about public scrutiny of IP addresses - however that can just as easily lead to harm like it could lead to good. We need to balance the privacy considerations for our unregistered contributors against the benefits we hope to gain from such public scrutiny. It seems to me like it is our duty to protect those who make good-faith edits to our projects despite risks and do their bit to contribute to the sum of all human knowledge from government persecution and unlawful arrests. -- NKohli (WMF) (talk) 22:49, 27 August 2019 (UTC)
@NKohli (WMF): - hi there. 3 issues with your last: "however that can just as easily lead to harm like it could lead to good" - no it can't. Vastly more frequently it leads to good, rather than bad, and it's easier for individuals like me to use it for good than for harm. You then mention that masking IPs would allow individuals at risk from governmental persecution to still edit. Except, that doesn't hold up. The masking only happens internally (and they'd be equally protected with an account). Any government would be able to identify the individuals regardless of how we concealed it on our end. Lastly: Our duty is to the encyclopedia ("collect and develop educational content", in the WMF's less than pithy mission statement). You say risks, but they aren't risks, they're assured negatives. The negatives to the encyclopedia would heavily outweigh your mooted benefits. Nosebagbear (talk) 09:04, 28 August 2019 (UTC)
@Nosebagbear: When you say "vastly more frequently it leads to good, rather than bad" - can you give me some more context behind what makes you say that? My sense is that more often than not harassment (both on and off-wiki) goes unreported. I don't have any numbers or data to back up that statement except for anecdotes. Maybe you can shed some light on what your sense is on this.
IP addresses can reveal location, registrant of the IP range, their exact address, phone number and much else. I'm sure every single person on this page uses this information responsibly and in good-faith but what's stopping someone from misusing this very-visible information for harassment?
I'm not clear on your argument about masking only happening internally. Speaking with my technical hat on: right now we can pin-point exactly what an IP address wrote on the wiki because the diff associates the text added/removed with the IP address. If we stop doing that, there is no way (to the best of my knowledge) to be able to associate a diff with an IP address unless the seeker has access to our private databases.
To your last point, we are really not going to let this project negatively impact the projects. IP addresses are already highly dynamic and it's increasingly a whack-a-mole game to fight vandals. We hope to reach a point where our anti-vandalism tools can work independent of the IP addresses. I want us, the community and the WMF to work together to make this possible. -- NKohli (WMF) (talk) 03:07, 29 August 2019 (UTC)
@NKohli (WMF): A flick through a dozen pages of en-wiki ANIs shows IPs tagged together, and on a quantified research front, you're right, we are lacking. However, the same applies to the pro-masking argument. Do we have any hard numbers on the number of IPs who have made accurate cases of harassment (vetted for legitimate complaints), what is the source of those (government, non-AC users, other IPs, admins, CUs etc), and what % of (accurate) harassment claims aren't raised?
By "internal" I meant that in countries that place control of the internet system itself, within their borders (or by requirements placed on ISP providers), associating an individual with an account (or a masked IP) would be very easy. Remember we block almost all proxies (and all TOR nodes) from editing because of the avalanche of vandalism from them with addresses that can't be specified more clearly, so they can't utilise them and edit, so it'd purely be the masking.
Strictly speaking, any negative edits that made it through because of the change would be a negative impact to the project (as tools can (and should) be made regardless). At least some of the editors here, some of those talked to at Wikimania, and an unknown proportion of the whole community, would be willing to accept some tradeoff, but it would need to fall below the Admin level (leaving AC, EC, and a new userright)
I think there's some general concern that we need to see more ideas on how these tools could work. Obviously some technically adept vandals (usually long-term wiki-abusers) can already get round it. But there are concerns that the non-functionary methods named, by anyone, (linking IPs, through session IDs or behaviour) are either more easily avoidable (the former) or a very concerning idea (the latter). Functionary time is a finite resource, anything that needs more of it is going to be an issue.
It's fantastic that you are working with us to provide the solutions (and your communication is strongly appreciated, especially in comparison to the working group recommendations), but there needs to be confidence that these methods would work before product development to actually make them kicks off, and we've not seen either the team or users suggest anything mutually accepted as viable. Nosebagbear (talk) 08:19, 29 August 2019 (UTC)
@Nosebagbear: I admit I don't have any numbers to share either. Harassment reports are dealt with in a confidential manner and my understanding is that not much can be shared on that front. My team does not have access to any of that data.
Even in countries which control the internet - it will not be an easy task to associate an exact edit with an IP address. Thanks to HTTPS enabled not too long ago on our projects, a third-party may be able to tell that an IP address visited our sites but it is impossible to know whether that IP address visited a specific page or made a specific edit. Because the edits happen on our website, the data about which IP address made which edit is only under our control and will reside only on our servers. Any third-party entity would not be able to associate an IP address with an anonymized user.
I think a new user-right or some other way to provide IP (or equivalent info) access to non-functionaries would be an acceptable solution. To be honest, anything we can do to increase protection for unregistered editors is a step in the right direction. I feel we haven't yet fully explored the possible solutions for how we could find other ways, perhaps better than just using an IP address to link unregistered user accounts. There are a lot of good ideas already on this page. And a lot of good-faith, passionate people. I am hopeful we can come up with a mutually acceptable solution. -- NKohli (WMF) (talk) 17:46, 29 August 2019 (UTC)
@NKohli (WMF): you said "IP addresses can reveal location, registrant of the IP range, their exact address, phone number and much else". Generally speaking, there is no connection between the person using the IP and the registrant. It isn't the contact information of the anonymous editor. How does knowing the contact information for the company that has been assigned the IP range compromise the anonymous editor? Bitter Oil (talk) 04:25, 31 August 2019 (UTC)
@NKohli (WMF): - I'm just repinging because I was interested in the answer to the above question by Bitter Oil. Nosebagbear (talk) 10:45, 11 September 2019 (UTC)
@Bitter Oil: I'm sorry I missed this. Thanks for the ping, Nosebagbear. You are correct that the registrant is the company that has been assigned the given IP range. For example, Wikimedia Foundation is the registrant of the IP address I am editing from. While this is okay for a company with many people, it can be used to identify someone working in a small company. A startup may only have 20 employees and if you know someone who works for that company is active on the wikis, the likelihood that it was them editing from an IP registered to that company is very high. So that person can give away their information if they accidentally edit while logged out. That person also cannot remain deliberately anonymous if they need to, for political/governmental reasons. Does this answer your question? Thank you. -- NKohli (WMF) (talk) 23:59, 11 September 2019 (UTC)

Outreach

I've seen a couple of questions around what we do to let the communities know this discussion is taking place, so I thought a list might help. So far, we've

  • Included it in Tech/News
  • Reached out specifically to stewards and checkusers
  • Posted on some Village Pumps where we happened to speak the language

For the next step, we'll be posting to most Village Pumps (excluding some communities we've already notified, e.g. I won't post on Swedish Wikipedia again). If you feel like helping out translating, there's a message here: User:Johan (WMF)/Tools and IP message. /Johan (WMF) (talk) 15:51, 9 August 2019 (UTC)

Very glad to see you are reaching out to the community, I will try to translate to zh (if someone else doesn't beat me to it). This kind of communication will be very positive. One note though: when WMF employees post to a village pump and the account isn't locally registered, it causes some confusion. Please do a global SUL (Krinkle tool?) to ensure the local communities know who you are when addressing the message to them, thanks. --Cohaf (talk) 19:04, 9 August 2019 (UTC)
Thank you! And yes, I've run Krinkle's useful SUL tool. (:
(For everyone else: see m:User:Krinkle/Tools/Global SUL.) /Johan (WMF) (talk) 22:28, 9 August 2019 (UTC)
@Johan (WMF): Done zh translation, and thanks once more for taking the lead to communicate with the community. --Cohaf (talk) 07:07, 10 August 2019 (UTC)
Johan (WMF) outreaching more loudly and more broadly is unhelpful if the outreach is unclear. Please clarify:
  • Is this an announcement that the Foundation is going to mask, whether we want it or not? Or;
  • Is this asking the community whether masking is a workable and acceptable idea?
Alsee (talk) 16:44, 10 August 2019 (UTC)
I know a lot of people would love a straight answer to this question - myself included. It should be a pretty easy one to answer. SQLQuery me! 22:35, 10 August 2019 (UTC)
I am very worried that this simple question "Is the WMF asking IF this should be implemented, or asking for opinions on HOW they will implement this?" has gone unanswered after 12 days. I understand that wikimania is going on, but unless we're worrying about a diplomatic way to say "We are going to implement this no matter what you say", it should be very easy to say something along the lines of "We're just floating an idea to see if the community is interested in this". If you're planning on implementing this no matter what, just say so. If you're asking if this should be implemented, it would turn the heat down a lot to also just say so. SQLQuery me! 03:53, 23 August 2019 (UTC)
I do not in any way speak for the WMF, but my sense is mostly that this is likely to be implemented at some point, and they're trying to figure out how to mitigate the impact. Bottom line, if our servers were in Europe, this would already have been implemented (regardless of whether or not anyone had figured out how to mitigate the impact) because of the legal changes there. My sense is also that there really is a strong willingness to work with the community to find solutions *now* and start identifying the needs, building the tools, and finding other ways to lessen the impact, because it's probably just a matter of time before external pressures come to bear that will force their hand. The fact that we have known for years that it is technically possible to do this creates a situation wherein the WMF probably doesn't have a good defense against complaints, regulations, legislation, or lawsuits.

There are a lot of very good minds and very knowledgeable and experienced people participating in this discussion. I am hoping that we get some more hard data (I was told explicitly that work has already started on this) so that we get a better sense of what kinds of tools to design, and what kinds of changes should be considered in things like user rights in order to address the issues that have been identified. I think in a couple of years, we'll be talking of the "old days" when IP addresses were visible, and we have the opportunity to actively participate in determining how we get there, so we'd better take advantage of it. I did talk to some of the team involved at Wikimania, and it was really clear that they understand how big a change this is, and that they really have to work closely with the online community to get things in order before pulling the plug. It looks like they've got some reasonable resources to work with (financial and developer) so it can be done well. Risker (talk) 04:34, 23 August 2019 (UTC)

Don't fall into the trap of believing that seeing IP addresses is THE way to do abuse mitigation. It is not the only way, and it is not the best way either. — Jeblad 20:06, 23 August 2019 (UTC)

Something is certainly going to happen

in light of the recently published Working Group recommendations on Community Health and safety which mentions Anonymising of IP addresses in public domain to protect IP contributor privacy as a core objective. Winged Blades of Godric (talk) 16:50, 10 August 2019 (UTC)

Was there ever any real doubt from the moment this was floated that they intended to implement it, no matter what the community said about it? I think not. Beyond My Ken (talk) 01:11, 11 August 2019 (UTC)
That's a point, as Trijnstel alluded to :-( Winged Blades of Godric (talk) 12:48, 11 August 2019 (UTC)
  • @NKohli (WMF) and CLo (WMF): - the lack of a firm answer isn't helping. Please give a non-wordy answer, do you plan on implementing IP Masking even if the broad editor consensus on this consultation page (and its lingual equivalents) is against it? Nosebagbear (talk) 13:10, 11 August 2019 (UTC)
@Nosebagbear and SQL: My apologies - please allow me some time to respond as I am currently traveling to Wikimania. This is not one person or one team's decision to make. Hence I don't readily have an answer for you at the moment. I will get back to you on that as soon as I can. Thank you. -- NKohli (WMF) (talk) 21:59, 12 August 2019 (UTC)
I'm just going to say here that I sincerely hope nothing is done about this. I'm sure the WMF/employees are aware that barely anyone is going to see this page compared to the tens of thousands of editors whose method of work will be altered in the event it is completed. Vermont (talk) 22:24, 12 August 2019 (UTC)
Well we first need to develop tools that mostly address the issues currently addressed by access to IP addresses and this will require close collaboration between our communities and WMF to determine what else will work (especially since IP addresses are becoming less useful anyway). But we do need to at least IMO partly mask IP addresses to bring location to a country level at least. Our current practices raise reputational risks with all we have seen happen to FB. Doc James (talk · contribs · email) 09:53, 17 August 2019 (UTC)
I take that as a yes to Nosebagbear's question.
In any case, masking the IP address to bring IP disclosure to country level is as good as masking it in its entirety. At least city-level disclosure is needed, but then it gets too close to where we currently stand. Any form of masking affects the scrutiny (and thus the deterrence) of inappropriate edits by the 'public', and I firmly believe that improving anonymous editor privacy and retaining 'public' scrutiny are mutually exclusive. Winged Blades of Godric (talk) 13:04, 17 August 2019 (UTC)
Regarding countries: If you edit English Wikipedia from the US, you're fairly anonymous on the country level. If you edit Norwegian Wikipedia regularly from Estonia, on a specific topic, this might very well be enough to identify you. /Johan (WMF) (talk) 13:41, 21 August 2019 (UTC)
@Doc James: Let's be real here. Most of the reputational risk that exists for Wikimedia is WMF pushing things through against the wishes of the community and this resulting in bad headlines. If the WMF would actually reason well about reputational risk this proposal wouldn't look the way it does. ChristianKl❫ 11:09, 22 August 2019 (UTC)
User:ChristianKl not clear what you mean? We of course need better tools to detect problems. These tools should be built first and then IP masking is reasonable IMO. Doc James (talk · contribs · email) 12:00, 22 August 2019 (UTC)
@Doc James: The post says "we should expect a transitional period marked by reduced administrator efficacy". This suggests that the person who wrote this expects that there's a period where admins have a less efficient workflow than they have currently. This matches past efforts of the WMF to deploy software in a state where it's not as efficient as the existing methods for some tasks, like the VE integration.
When it comes to a task like creating bans of IP ranges, I could imagine a new tool that provides an admin who uses it all the necessary information to make the ban without telling him any IP addresses that are involved. If the tool is well developed and an improvement in the current workflow we would expect admins to use it instead of using the existing tool. We could wait till there are a few months without any IP range bans made with the old tool (or till there's an RfC to remove the old tool). If we wait till that point till we retire the ability to see the IP addresses there would be no period of reduced administrator efficiency, so that's not what this proposal seems to intend.
Solutions that will create such a period of reduced administrator efficiency will likely lead to community backlash. ChristianKl❫ 13:43, 22 August 2019 (UTC)

Let's get some hard data to really analyse the effect of this proposal

There's a nice report that has largely been duplicated by the comments on both the content and talk pages here; that is, it seems that there's widespread acknowledgement on a number of factors that will be affected by masking IP addresses used for editing. What isn't discussed effectively in the report, and is pretty difficult to analyse, is what the *actual* data are when it comes to editing by unregistered users.

I suggest that there be a serious study of the editing of IP editors carried out in a recent month. Specifically, I recommend April 2019 because it appears to be the most recent month for which there is good solid data on Wikistats2 that can be effectively analysed. [7] There was a total of 49,415,814 edits across all wikis in that month, of which 2,087,011 edits were performed by IPs (4.2% of total edits). I would like to see a table attached as a subpage here that will provide the following data:

  • The data from Wikistats for each of the different project types for the month of April 2019, by individual wiki, specifying
    • total number of edits on each wiki that month
    • edits in both numbers and percentage by "anonymous", group bot (i.e., registered bot), "name bot" (i.e., account that has the phrase "bot" in its username) and registered users

After that, the next level would be to select examples, at least one from each project type, as well as a range of Wikipedias of various sizes that have average or greater than average IP editing, and look at all IP edits for a selected period, with a minimum of 100 IP edits per project examined. I would suggest that projects with greater than average IP editing receive particular attention.

This shouldn't be hard to do for someone who has the right accesses, which I assume at least some of the people participating in this discussion have.

When I look at the same month of edits by each project type, I find that 14% of edits on all Wikibooks, 20% of edits to all Wikiquotes, 1% of edits to all Wikisources and Wikinewses, 11% of edits to all Wikiversities, 6% of edits to all Wikivoyages, 5% of edits to all Wiktionaries, and 12% of edits to all Wikipedias were performed by IPs. Even I can see that these numbers don't make a lot of sense, if the "all wiki" IP edit percentage is only 4.2%.

We need to better understand, by looking at IP editing on a variety of projects, what happens to those edits. How many are kept? How many reverted? How many result in blocks to the IP address? What kinds of problem edits are we seeing, and how would we be able to identify patterns related to those problem edits?

We've all been talking about IP editing in the abstract, or based on certain principles or anecdotal evidence. Let's get some hard data in play so that we can really figure out what we're dealing with. It's all guesswork at this point, because the last time anyone seems to have done even small-sample work on this was better than 10 years ago. If the WMF has done more in-depth study of this, then they need to publish this data and provide the links. Frankly, I'm kind of surprised that this proposal was made without this really obvious information already published and available for discussion purposes. Risker (talk) 23:14, 12 August 2019 (UTC)

Regarding the numbers not making sense, this is heavily distorted by a small number of projects having a huge number of bot edits. For example, Cebuano Wikipedia has about the same number of articles as the English Wikipedia, but less than 1% as many editors, and with over 99% of edits done by bots. Someguy1221 (talk) 01:09, 13 August 2019 (UTC)
Well, I went back and made up a chart. Here's the relevant table:
Project type      Total edits, April 2019   IP edits, April 2019   % of edits by IP
All Wikinews                       28,348                    330                1.2
All Wikivoyage                     41,179                  2,652                6.4
All Wikiversity                    26,019                  2,836               10.9
All Wikisource                    309,959                  3,470                1.1
All Wiktionary                    748,925                 35,313                4.7
All Wikibook                       30,301                  4,274               14.1
All Wikiquote                      38,031                  7,763               20.4
All Wikipedia                  15,518,067              1,894,047               12.2
SUBTOTAL                       16,740,825              1,951,685               11.7 (avg)
ALL WIKIS                      50,781,814              2,087,011                4.1

As can easily be seen, there's a 34,040,989-edit gap between the total of the main "content" wikis and the "all wikis" total, which I suspect is *likely* Wikidata for the most part (Meta and some other public wikis didn't have separate lookups either), and that most of the edits there would be bots, which would artificially dilute the "all projects" average percentage of IP edits. I am kind of surprised that, even almost 20 years into the project, we're still getting greater than 10% IP edits; I was guessing it would be a lot closer to 5%. It does go to show how important it is to really look closely at the numbers, and the nature of the edits being performed. Most projects can't afford to lose 10% of their edits - unless further research shows that most of those edits are "problems". Risker (talk) 02:21, 13 August 2019 (UTC)
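The per-project percentages and the size of the gap can be re-derived from the table's own figures. The following is a minimal Python sketch, not an official analysis: the names `ROWS`, `ip_share` and `gap` are made up for illustration, and the numbers are copied by hand from the table in this thread rather than fetched from Wikistats.

```python
# Figures copied from the table in this thread (April 2019).
# Each entry is (total edits, IP edits). Nothing here is fetched from Wikistats.
ROWS = {
    "All Wikinews": (28_348, 330),
    "All Wikivoyage": (41_179, 2_652),
    "All Wikiversity": (26_019, 2_836),
    "All Wikisource": (309_959, 3_470),
    "All Wiktionary": (748_925, 35_313),
    "All Wikibook": (30_301, 4_274),
    "All Wikiquote": (38_031, 7_763),
    "All Wikipedia": (15_518_067, 1_894_047),
}

# Totals as stated in the SUBTOTAL and ALL WIKIS rows of the table.
SUBTOTAL_EDITS = 16_740_825
ALL_WIKIS_EDITS = 50_781_814


def ip_share(total_edits: int, ip_edits: int) -> float:
    """Return IP edits as a percentage of total edits, rounded to one decimal."""
    return round(100 * ip_edits / total_edits, 1)


# Percentage of edits made by IPs, per project type.
shares = {name: ip_share(total, ip) for name, (total, ip) in ROWS.items()}

# Edits counted under "all wikis" but not under the listed project types.
gap = ALL_WIKIS_EDITS - SUBTOTAL_EDITS

if __name__ == "__main__":
    for name, pct in shares.items():
        print(f"{name}: {pct}%")
    print(f"Edits outside the listed project types: {gap:,}")
```

Because the unlisted edits make up roughly two thirds of the "all wikis" total, any project in that gap with few IP edits pulls the overall percentage well below the per-project figures.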

Hi Risker! Thanks for taking the time to talk to me at Wikimania and also for compiling this data. It was very useful to hear your insights and concerns about this project. I agree that there is a need for more data in order to understand the usefulness of IP editing and the consequent impact of this project. I’m trying to formalize the exact data request in a phabricator ticket. Here’s what I have seen mentioned so far:
  • For one or more selected months (including April 2019), broken down by the project, we want to be able to see:
    • Number and % of edits made by unregistered users on the project
    • Number and % of these edits that have been reverted
    • Number and % of these edits that led to a block on the IP address
    • Possible % of good-faith and bad-faith edits based on ORES prediction model (depends on where ORES is enabled)
Did I miss something? We can always get more data - this could be an initial starting point for now. I'm planning to talk to our data analyst this week and get this going. Thanks so much for pointing this out. -- NKohli (WMF) (talk) 23:03, 27 August 2019 (UTC)

Geolocation and blocks

(Scattered discussions above, so I'll start a dedicated thread)
I often identify rogue users based on editing behavior and similar geolocation, so hopefully that functionality remains (mostly) intact if IPs are masked. Editors these days go from home/business IPs to mobile to public wifi, and it would be a huge blow if admins could no longer detect that. We don't want to create a CU bureaucracy. It seems that grouping users on related IPv6 /64 ranges should be a requirement as well. Bagumba (talk) 04:52, 13 August 2019 (UTC)
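To make the /64 grouping concrete, here is a minimal Python sketch using only the standard `ipaddress` module. It is an illustration under assumptions, not a description of any existing or planned MediaWiki tool: the function name `group_by_slash64` is hypothetical, the sample addresses come from the reserved documentation ranges, and IPv4 addresses are simply left ungrouped (treated as /32).

```python
import ipaddress
from collections import defaultdict


def group_by_slash64(addresses):
    """Group address strings by IPv6 /64 network; IPv4 addresses stay as /32.

    Hypothetical helper for illustration only.
    """
    groups = defaultdict(list)
    for text in addresses:
        ip = ipaddress.ip_address(text)
        prefix = 64 if ip.version == 6 else 32
        # strict=False lets us pass a host address and get its enclosing network.
        network = ipaddress.ip_network(f"{ip}/{prefix}", strict=False)
        groups[str(network)].append(text)
    return dict(groups)


if __name__ == "__main__":
    sample = [
        "2001:db8:1:2::1",
        "2001:db8:1:2:aaaa:bbbb:cccc:dddd",
        "2001:db8:1:3::1",
        "192.0.2.10",
    ]
    for network, members in group_by_slash64(sample).items():
        print(network, members)
```

The first two sample addresses land in the same /64 group while the third falls in a neighbouring one, which is the kind of relatedness signal that would be lost if masked identifiers carried no range information.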

i feel like this will make the IP problem even worse

the anonymous thing shouldn't be assigned by a cookie first of all IMO. that makes it way too easy to get a new one. it should just be a representation of your ip address. also, why are you going to take away checkusers' ip access? that makes it nearly impossible to detect account abuse.

ip's already have too much ability to wreck a wiki with little to no repercussions. and this is just going to make it even worse. i strongly oppose this. Computer Fizz (talk) 18:19, 13 August 2019 (UTC)
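One possible reading of "a representation of your ip address" is an identifier derived from the address with a keyed hash, so that the same address always yields the same label without the label revealing the address. The Python sketch below only illustrates that idea and is not something the proposal specifies: the function name `masked_id`, the `Guest-` prefix, the 10-character length and the secret key are all assumptions. A key is used because a plain, unkeyed hash of an IPv4 address could be reversed by simply trying all possible addresses.

```python
import hashlib
import hmac
import ipaddress


def masked_id(ip_text: str, secret_key: bytes) -> str:
    """Derive a stable label from an IP address using HMAC-SHA256.

    Hypothetical sketch: the name, prefix and label length are illustrative.
    """
    # Normalise the address so different spellings of one IP give one label.
    canonical = ipaddress.ip_address(ip_text).compressed
    digest = hmac.new(secret_key, canonical.encode("ascii"), hashlib.sha256)
    return "Guest-" + digest.hexdigest()[:10]


if __name__ == "__main__":
    key = b"example-only-secret"  # assumption: a server-side secret, never published
    print(masked_id("2001:db8::1", key))
    print(masked_id("2001:0db8:0000:0000:0000:0000:0000:0001", key))  # same label
    print(masked_id("192.0.2.10", key))
```

A label built this way survives cleared cookies but changes whenever the address changes, and on its own it carries no range or location information, so it would not replace the range-based checks discussed elsewhere on this page.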

Hi Computer Fizz, there's no plan to remove IP access for checkusers? /Johan (WMF) (talk) 13:35, 21 August 2019 (UTC)
@Johan (WMF): Thank you for the reply. I was under the impression that no one would be able to see ip's. either way i still think that as a consensus driven project you should not implement this because of the almost unanimous consensus on this page. If you're looking to prevent the whole "oh crap, this shows my ip?" thing, maybe for their first edit they have to tick a box saying "i acknowledge this will display my ip address publicly", and it'll place a cookie until it expires? Computer Fizz (talk) 15:28, 21 August 2019 (UTC)
Checkboxing and session cookie would be the logical route. It causes some minor issues but so much less than the masking proposal Nosebagbear (talk) 15:31, 21 August 2019 (UTC)
Given that there's a proposal about limiting the amount of time IP data is stored, that would affect CheckUsers if they can access less IP history. ChristianKl❫ 15:58, 21 August 2019 (UTC)
It seems like people in the discussion believe that because IP addresses are something they usually don't see, they must be like magic numbers that never fail and uniquely stick to users. They are not. An IP address can be shared and it is short-lived. Static IP addresses are often shared through a proxy or gateway, and dynamic IP addresses are leased for a rather short time. Most users/admins have no idea whether the IP address is static or dynamic (leased), and if it is leased, for how long. When the lease expires, you can't say anything about who the user (vandal) behind it was. Repeated activity within the timeframe of a typical lease is a strong indication that the ID is static, but random activity over months or years is not. As most of the world uses dynamic allocation, it simply makes no sense to store the information for longer than strictly necessary. — Jeblad 09:49, 22 August 2019 (UTC)
Your argument is a bit self-defeating. Essentially you argue that there are no privacy implication of long-term data storage because long-term storage of the data can't be used to make decisions that are linked to an individual user and at the same time you argue that it's important to remove the information.
It's not a requirement for an IP address to stick to a user to be useful information. Knowing how many edits were done from a specific IP address range in the last years is useful information when thinking about whether or not to ban that IP address range.
It's okay when most user/admins don't understand IP addresses. Checkusers on the other hand should understand how they work. Finding sock-puppets is a task that requires pattern matching where having more data allows you to see patterns better. I used to do moderation on a forum outside the Wikisphere and the ability to look at IP addresses was very valuable even when it comes to users who don't have a static IP. ChristianKl❫ 11:33, 22 August 2019 (UTC)
I have no idea how you came up with this; “Essentially you argue that there are no privacy implication of long-term data storage”. There are a lot of problems with storage of IP addresses, but far worse implications when publishing them. Read my answer in #Complete anonymous editing is not a right.
If you find vandalism from an IP address range over years, then you have an atypical situation where schools have assigned whole IP ranges. Schools in most of the world are behind a single static or dynamic address, and if that IP address is blocked then they are gone. You don't need to know the address, you need to know whether an edit is vandalism.
I heard the claim “Finding sock-puppets is a task that requires pattern matching where having more data allows you to see patterns better.” but, knowing a bit about fancy pattern matching, I still haven't found anyone who can describe a working algorithm to find a sock-puppet by “pattern matching”. The assumption that some specific articles are vandalized by a single vandal over several years is simply invalid. Even when a returning anonymous user does the exact same thing in the same article from the same IP address ten minutes later, you can't be sure it is the same vandal. As time goes by, it becomes less and less likely to be the same vandal. Once you are outside the normal lease time, it starts to become unlikely.
Stop fighting an imaginary vandal, and figure out how you can create a system your opponent (a badly behaving user) can't bypass. It is often called a physical countermeasure as it has some physical properties that can't be bypassed. The badly behaving user should not be able to define or redefine those properties. The IP address is something (s)he can redefine (ie. cycle the power switch on the router) so using the IP address as a means for abuse mitigation is wrong. — Jeblad 20:46, 23 August 2019 (UTC)

┌────────────────┘
This has gone off on a tangent. I can't even tell which of you is supporting what. Either way I think the main point here is that different IP addresses can be useful information, not just randomized strings, and that the "Anonymous user blablabla", while a proposal in good faith, should not be implemented in the opinion of myself and a lot of other people as well, as there are lots of major problems with it and the original problem it's trying to solve can be solved in different and better ways. Computer Fizz (talk) 05:30, 24 August 2019 (UTC)

Impact on non-Wikimedia wikis

From this proposal, it would seem that the reduction in IP visibility will become part of MediaWiki core. This raises several concerns with non-Wikimedia projects which may operate in different environments. I shouldn't need to apologize for not reading thoroughly the entire discussion above and potentially making points duplicating those already raised.

  1. Will the IP-masking functionality become a part of MediaWiki core enabled by default, meaning that non-WMF wikis wishing to use some newer versions of MW will need to address the change in configuration?
  2. If (1) is true, will there be a setting or other form of uncomplicated configuration that allows reverting completely to the old functionality (where IPs display unmasked to everyone)?
  3. If (1) is true, how will the new system work for wikis or wiki farms with no or nearly no user groups between admins and staff members? For example, on one wiki/wiki farm, bureaucrats are almost nonexistent, and there are no checkusers or any other special groups available to regular users.

If non-WMF wikis will be unable to configure the abuse mitigation part of the new system to work better (or at all acceptably) with their environment, they will have to choose between:

  1. upgrading MW as is, while potentially causing a substantial net negative for their users due to their inability to properly re-equip their problem user fighters;
  2. indefinitely stopping updates to MW at the version before the one introducing the changes, meaning that users will have no access to any newer features and possibly security enhancements;
  3. forking MW and modifying the code as necessary; this requires a substantial one-time effort and complicates any further updates, meaning that some non-WMF wikis will be unable to take this option.

In other words, if the new system doesn't allow non-WMF wikis to customize it as necessary (up to the ability to disable the new system entirely), these non-WMF wikis will have to choose between reducing their ability to combat problematic behavior, forgoing security updates, or allocating developer resources to maintaining custom code. Every option has a substantial negative impact on users of affected wikis or wiki farms.

P. S. At the moment, given some other interactions I am aware of, it seems that non-WMF wikis are at best second-class citizens when it comes to MediaWiki development. It remains to be seen how far they will drop after IP masking is introduced. --AttemptToCallNil (talk) 10:46, 14 August 2019 (UTC)

AttemptToCallNil: I'm pretty sure this can be made a configuration option, so it shouldn't be a primary concern when it comes to decide on adoption of the feature for Wikimedia projects. --MarioGom (talk) 12:22, 16 August 2019 (UTC)

WMF Enactment

Well we first need to develop tools that mostly address the issues currently addressed by access to IP addresses and this will require close collaboration between our communities and WMF to determine what else will work (especially since IP addresses are becoming less useful anyway). But we do need to at least IMO partly mask IP addresses to bring location to a country level at least. Our current practices raise reputational risks with all we have seen happen to FB.

-Doc James

So taking Doc James' statement as a "yes" to whether it will be implemented top-down, we seem to have some key questions. Obviously the WMF team is very distracted by Wikimania at the moment, but on their return, pending questions both above and below need to be considered and answered.

I've just made a bit of a summary of the above (removing pure support/opposes for now) - please feel free to add major areas I've missed. Thank you for reading Nosebagbear (talk)

Why Exactly are they doing this?

Doc James says it's because of the reputational risks à la Facebook, but why is the non-masking of IP addresses being compared to Facebook? As far as I know, all of FB's issues are due to concealed privacy subversions, whereas ours is admitted and actively stated. Why are they equivalent? Have we been getting significant numbers of data protection complaints?

Research Report/WG Recommendation Non-Covered Aspects

The report/WG states that it is aware of significant admin handicaps, at least in the short term. However, why does the report not consider the issues of non-admins not being able to provide IP issues to admins and checkusers, as there are a significant number? Is there a long-term consideration (as since mitigation tools would logically be fully ready before any implementation was even carried out, short and long term should be similar)?

What is an acceptable loss of time for the limited number of admins and checkusers/acceptable increase in vandalism/disruption?

A question for both communities and the WMF: any change at all, let alone something in line with the rough recommendations in the report, would cause significant delays in combating vandalism (meaning more edits make it through) and draw admin/CU time away from their current roles. This holds however good the adaptation tools are. How much should be accepted in exchange for a change (or for prohibiting all IPs)?

Third-Party Issues

Various wider-scale issues have been seen, with political bodies being tied to IPs and biased reporting discovered by third parties. If this is being dropped as a possibility, should they be notified? Will other users of the MediaWiki software be able to disable any new settings as part of their configuration?

Who can see - other options

A specific question by the WG, which is fundamental - who will see. It suggests that it's possible not even all admins will be able to see "We hope to restrict IP address exposure to only those users who need to see it." - if individuals at a non-admin level who hunt rotating IPs actively need to see it, would they be able?

Effective Tools, more ideas needed for consideration

Currently the earliest of ideas for tools, such as cookie/session tracking, increased admin intuition (raised in the report), linking to a /64 cluster (insufficient masking?) have all had fairly significant issues raised with them. Either those concerns need to all be handled or more ideas need to be suggested (either by the WMF team or by our more technical community editors) for thought.

Some of the questions have public answers, WMF needs to answer the rest.
  • Data protection complaints based on the right to be forgotten, from the EU alone, numbered 11 from July to December 2018, and none were granted (source: transperancy.wikimedia.org/content.html). A more detailed answer from the WMF would be nice, though.
  • Wikis with few or no local admins are managed by the groups global sysop and global rollbacker. There are over 300 WMF content wikis (excluding wikimania, anniversary, foundationwiki etc.) with no local admins at all. These wikis are managed by 19 global sysops and 83 global rollbackers.
  • Wikis without checkuser use stewards instead. According to CheckUser policy there are 36 CheckUser wikis. So, a vast majority of WMF content wikis do not have local checkusers, 785 wikis. That is 95% of WMF content wikis. There are 36 stewards managing this group.
Content wikis with no local admins.
Here are estimates of wikis with no admins. The real number is higher. For example, it is missing Icelandic wikis on wikisource and wikiquote - which have two abuse filter accounts as admins.
Wikipedia 76
Wikibooks 75
Wikinews 8
Wikiquote 43
Wikisource 14
Wiktionary 87
Wikiversity 1
Wikivoyage 5
Total 309
--Snaevar (talk) 17:54, 18 August 2019 (UTC)

Replies

Thank you for your patience, folks. The team working on this has been at Wikimania and then travelling back, thus the lack of replies. We'll get back to you soon, and update you on our conversations. /Johan (WMF) (talk) 13:42, 21 August 2019 (UTC)

How is this aligned with the opposite recommendations from the diversity groups in the strategy process (which I strongly disapprove of)? They are thinking about identifying personal traits, and you about increasing privacy. --Barcelona (talk) 09:12, 22 August 2019 (UTC)

Communication

The draft pdf report suggests the "guest user" accounts would be easier to communicate with. This seems like a strong positive aspect of the proposal, but I'd like to see more detail about how that might work. Would we be able to "ping" these accounts, and they would receive notifications? Presumably edits on their talk page would be highlighted for them. If a "guest user" changed IP and was using a different browser (no cookie continuity) would there be a way for them to continue using the same account, or link the two accounts? ArthurPSmith (talk) 17:43, 21 August 2019 (UTC)

Hate it

Hate it. I think this change will inadvertently promote vandalism, thanks to the extra privacy it affords. I am in favor of grouping IP accounts using cookies though -- that would be fantastic! --Victar (talk) 18:22, 21 August 2019 (UTC)

I'm not entirely sure "inadvertently" is correct. — Arthur Rubin T C (en: U, T) 04:56, 22 August 2019 (UTC)
It is though, unless you're saying WMF wants pages vandalized. Computer Fizz (talk) 06:20, 24 August 2019 (UTC)
Possibly more correct to say the working group doesn't care about the products the Foundation supports, only about IP privacy. — Arthur Rubin T C (en: U, T) 23:19, 24 August 2019 (UTC)
I also dislike ip masking and favour account grouping like that, seems pretty difficult for it to trigger a false positive? Gryllida 23:11, 24 August 2019 (UTC)

Provide new tools before taking away capabilities

I see no reason why new tools should be introduced in a way that temporarily takes away admin efficiency. First introduce your new tools as an alternative to the status quo. If you manage to provide tools that provide better information than the existing tools, and admins are using them as a replacement for the existing tools, then that is the time to discuss getting rid of the old tools and the ability to view IP addresses. ChristianKl❫ 09:04, 22 August 2019 (UTC)

+1. If adequate tools are not provided, I will certainly recommend that "anonymous" editors be made unable to edit en.Wikipedia articles. We might negotiate as to whether they will be allowed to edit talk pages. — Arthur Rubin T C (en: U, T) 23:22, 22 August 2019 (UTC)
Absolutely. As I said above, let's get this to the point where we truly no longer need the IP's, and at that time, let's start the conversation about masking them. SQLQuery me! 03:48, 23 August 2019 (UTC)
Exactly. I mean, we've been seeing IPs for more than 15 years. Yes, I deeply understand WMF's concern about privacy, and I want to say that we do care about privacy, but there is something more important: We must ensure that our projects, massive efforts that we've made over years, will remain safe under any circumstances. That's why we block an IP range that affects thousands of users for one day, because we need to do it in order to keep our community safe. Tell me then, how can we find out the range of an IP that is masked? And if we do (indirectly), how can we estimate the damage that blocking it will cause?
I think you are going way too fast. Take it easy, give us those tools, let us see how they work and do it in a "win-win" way. Ahmad252 (talk) 10:13, 23 August 2019 (UTC)
This is the only way to do things. Any other approach deliberately compromises the integrity of our content, is contrary to your team's purpose (anti-harassment) and risks destroying what little confidence the community now has in the WMF. MER-C (talk) 14:47, 23 August 2019 (UTC)
@ChristianKl, Arthur Rubin, SQL, Ahmad252, and MER-C: I'm sorry if this wasn't made clear before - this is a given. We will be building tools first. As part of working on these tools, we hope to be able to automate some of the manual processes that users have been following for years - including doing whois on IP addresses. The aim is to get us to a place where the tools are good enough that users don't need to rely on looking up information about an IP address manually. This includes checking if the IP address is in a VPN/Tor list or if the IP address is from a certain region etc. As a first step, I am looking for ways in which IP addresses are being used currently as this will heavily impact the tools we can build to help editors in their work. I posed some questions here and got some replies but if you have anything to add that's not already covered, please do. Thank you so much. -- NKohli (WMF) (talk) 15:58, 29 August 2019 (UTC)
@NKohli (WMF):What exactly did you mean in the post with "we should expect a transitional period marked by reduced administrator efficacy"? What scenario were you envisioning where that happens? ChristianKl❫ 18:40, 8 September 2019 (UTC)
And @NKohli (WMF): as a follow-up to that, why does it say "administrator efficacy", when many of the issues raised point out that it's a handicap for everyone other than CUs, including the experienced editor corps, that makes up the majority of Counter-vandal work? Nosebagbear (talk) 19:08, 8 September 2019 (UTC)
@ChristianKl and Nosebagbear: When we wrote up the research report (couple months ago), we were not sure to what extent IP addresses are used on our projects. The statement about transitional period refers to the fact that we will be upgrading our existing tools and adding more of them which will require people to make changes to their workflows. This transitional period may cause some disruptions in the ways people are used to working right now. We have learnt a lot of things from this talk page, including the fact that IP addresses are heavily used by non-admins/CheckUsers for anti-vandalism work. I acknowledge that we should rework that statement now that we know more. -- NKohli (WMF) (talk) 18:53, 9 September 2019 (UTC)

In Czech language

Please use https://translate.google.cz/ or I'll translate it later. --OJJ (talk) 11:27, 22 August 2019 (UTC)

With all due respect, this is idiocy squared. I still have not understood why we should (again) change something that has worked well for almost twenty years, and spend what is surely no small amount of money on it that could be used otherwise. Isn't a disclaimer enough, the one that pops up at me anyway whenever I edit while logged out? This approach seems to me like excessive care over something that an ordinarily thinking user understands and acts on accordingly, and it rather reminds me of the famous story of the woman who dried her cat in the microwave (RIP cat).

Take just school IP addresses, to put it very simply: school addresses will then be displayed to us (admins, patrollers?). Apart from the further waste of data, this again clutters the recent changes pages. Ha, and if I want to write to that school, how is the headmistress supposed to identify the editor?

Another thing. An article like [1]. There are simply colleagues who have not registered, do not want to register (for some reason), or forget to log in. So when I open such a page history, with a bit of luck I manage to contact that editor and fix an error in the article. If all I have there is Editor123, and after some time nothing at all (what about copyright!), what good is that to me?

Another gem is persecution by governments; that made me laugh. :) So let that handful of editors who want to edit (From where? Is there even one such case where an editor was persecuted, did it actually happen, or are we marching into battle against some unknown class enemy here?) either create an account or simply not edit, and that's that. Even if we hide their IP address behind an account, that apparently does not offer much protection. Some degree of anonymity on the internet is perhaps provided by Tor (network), and that in turn is blocked on the wikis in general. If someone has the intelligence to edit where they cannot, they certainly also have the intelligence to register and always log in.

In short, I see in this only more bother for admins, more bother for checkusers, and little to no positive benefit. --OJJ (talk) 11:27, 22 August 2019 (UTC)

Another problem: rangeblock. OJJ (talk) 13:15, 22 August 2019 (UTC)

From google translate:

Excuse me, this is idiocy. In fact, I have not yet understood why there should be (again) something that has worked well for nearly twenty years, and certainly to develop considerable funds that can be used differently? Isn't it just a certain exclusion of responsibility that ultimately runs out on me whenever I edit an unsigned? This style seems to me to be overly caring for something a normal thinking user understands and handles accordingly, and reminds me altogether of the famous story of a woman drying a cat in a microwave (RIP cat).

I'll just take school IP addresses if I take it very simply; we will see school addresses for us (administrators, patrolmen?). Regardless of further waste of data, it is again a weed of recent changes. Ha, and if I want to write to the school, how does the headmaster know?

Another thing. Article as [1]. They are simply colleagues who have not registered, do not want to register (for some reason) or forget about logging in. So when I open such a page history, with some luck, I manage to contact the editor and fix the error in the article. If I have Editor123 there and after a while nothing at all (what a copyright!), What good would I do?

Another intelligence is the persecution by the government, I laughed. So let the handful of editors want to edit (Where? Is there at least one case where the editor has been persecuted, has happened, or are we dragging into an unknown class enemy here?) Either creating an account or just needing to edit and done. Even if we hide his IP address under an account, it does not seem to be a big protection. A bit of anonymity on the Internet is perhaps Tor (network and it is blocked on the Wiki in general. If someone already has the intelligence to edit where he can not, he certainly has the intelligence to be able to register and always log on.

In short, I see only another annoying of administrators, annoying checkusers, a positive contribution little to no. --OJJ (talk) 11:27, 22 August 2019 (UTC)

Next issue: rangeblock.OJJ (talk) 13:15, 22 August 2019 (UTC)

I hope that was reasonably accurate, OJJ. If the translation was unwelcome, or inaccurate - please remove it. SQLQuery me! 20:44, 23 August 2019 (UTC)

The overall reaction of the Czech Wikipedia community to the proposal is negative (with some proposals to restrict IP editing altogether). ([8]) — Draceane talkcontrib. 07:34, 3 September 2019 (UTC)

Abkhazian wikipedia

Hi! On Abkhazian Wikipedia I have had two problems with unregistered users. The first was (in my opinion) one person whose IP addresses geolocate to Kazakhstan (example). From the beginning he kept using different IP addresses (why? I don't know!). Later he decided to talk with me about life and other unnecessary, unencyclopedic things, and gave me jobs: do this, do that, then that (as if he were a sysop)... There were some moments when he was right. I told him to start working as a registered user so that I could write to him directly in one place. He didn't register and continued to change his IP address on every edit, saying that I can't catch him, that he is untouchable, and that he will do whatever he decides. Then I started blocking his IP addresses, but only those Kazakhstan IP addresses that I could identify as his (because he writes to me like my manager). I blocked about 20 of them, but left twice as many unblocked. His work was mostly useful for Abkhazian Wikipedia and I didn't reject all his changes, just those which were grammatically incorrect.

For example, I blocked this person's IP addresses indefinitely. I don't know; possibly I am wrong. If so, please contact me, and if I agree I will change my decisions!

The second unregistered person, whose IP address geolocated to the USA, added BC years. He started from zero and wanted to continue to the dinosaur era, but I stopped him at 2000 BC. He tried to continue his "job", but I stopped him with blocking, batch deletion, and IP blocking. In the end he got tired and stopped.

In the end, about unregistered users I can say only that every situation is unique and the sysop should decide what to do. For example, if a user can change his IP address every time and has around 100 newly created IPs in total, it isn't possible to catch him. I think the better way is to use these persons on behalf of Wikipedia. Some day they will recognize that they are an instrument who spent time on Wikipedia. Possibly some day they will become good Wikipedians. God knows!

But the main question for me is the following: were those persons really from Kazakhstan and the USA, or were they one person from some other country? I really don't know! --Surprizi (talk) 08:41, 23 August 2019 (UTC)

It is possible to do timing analysis to locate devices on the internet. I don't know what kind of resolution is possible in Abkhazia and nearby areas, but it usually is around 10-100 km. On sites like Wikipedia you would simply hash the location and put a temporary ban on a small IP range or the browser fingerprint in the hash bin. It is extremely effective, but we don't do it. — Jeblad 20:16, 23 August 2019 (UTC)
It seems like timing analysis would be pretty easy to get around, especially if the measurements are made against only a few network locations. Even Chrome comes with a built-in tool to adjust bandwidth/latency characteristics for the purposes of testing. Bawolff (talk) 09:44, 24 August 2019 (UTC)

Petition to WMF

The below editors intend to resign all sysop and higher rights if this proposal is implemented as written.

  1. Rschen7754 (talk · contribs) (English Wikipedia/Wikidata administrator, former steward) - I deeply believe in privacy, having been the victim of harassment on other sites. And I think that there is some room for masking the IPs of editors where government censorship is an issue (though a lot of spam comes from Chinese IP addresses, so that might be tricky). But this proposal will very quickly decrease our ability to deal with basic vandalism and spam by taking away the ability to handle rangeblocks, open proxy checks, VPS checks, blacklist checks, etc. Seeing how the WMF rolls out things like Visual Editor, and Flow (where you could not even block editors from making Flow posts in code deployed to production), and how poorly infrastructure like WMF Labs is run, I have little faith that these so-called tools will help us. And if we really care about privacy, we should think about the cross-wiki abusive editors who will post the residential addresses of users from many mobile IPs, and who will not be able to be blocked if this proposal is enacted. If administrators and some specific global groups were given the ability to see these IPs, I could be convinced to support, but without these assurances, I cannot - and I would find the tools not of much use and would resign. Rschen7754 00:31, 26 August 2019 (UTC)
    I will add: this proposal shows the severe disconnect between WMF and the base of the volunteers who do the heavy lifting of keeping the site running. It and the Fram debacle are very discouraging as it is. --Rschen7754 00:34, 26 August 2019 (UTC)
    Some damn fine points here.

    Plus, WMF wikis have spambots flooding the system, both as IP addresses and by creating accounts to spam, and the WMF shows no inclination to fix those issues. To then take away some of the few defences that exist makes it quite dispiriting to see this sort of proposal. If IP edits are a problem, then maybe it is time to remove IP edits and force accounts. (Noting that I am not threatening to resign my rights, but if you make it harder to stop spam, I definitely won't be helping further in that space.)  — billinghurst sDrewth 02:07, 26 August 2019 (UTC)

    What does it even mean to be "implemented as written"? There's no description of something to implement in the current text, either in the form of architecture/design or technical specifications. Nemo 07:53, 26 August 2019 (UTC)
    At the end of the day I don't know how you do rangeblocks if you don't have the raw IPs. Not to mention that the history of IP editing before and after this changeover will be fragmented (and that detail is clearly specified in what is written). --Rschen7754 18:26, 26 August 2019 (UTC)
    The new model. Software design by resignation ;) —TheDJ (talkcontribs) 13:16, 26 August 2019 (UTC)
    It is indeed sad that it has come to this. I am a software engineer, and in my software engineering course important principles were listening to the customer and getting feedback, and implementing it. WMF has done a poor job of this on Flow, on VisualEditor, on MediaViewer (and there's certain things on Wikidata that they didn't listen to feedback on, either). Sometimes, I wonder why I still care about this project. --Rschen7754 18:26, 26 August 2019 (UTC)
    I must agree that I would have supported a petition to push WMF to finally resolve old issues with the software (including working on material that is seriously outdated) instead of working on the implementation of new things with questionable use. I am sure that this is going to be written and implemented, and then the big wikis will !vote against it being enabled leaving the use to some small wikis (and through that seriously hampering any crosswiki antivandalism work). Point is that the petition would likely gather some votes, but there is no way to 'enforce' the outcome anyway (see this indication of how much success that would have).. --Dirk Beetstra T C (en: U, T) 14:01, 26 August 2019 (UTC)
    Obviously this didn't get the response that I was hoping for, whether it be because it was one of almost a hundred threads and nobody saw it, or because nobody felt that strongly about the matter. Truth be told, if WMF does a crummy job with this I will still resign all my rights and leave the site - out of disgust. But I'm striking this because I don't want to box myself into a course of action for something that won't happen until 2020 or 2021, and because in the meantime I do want to at least feel some motivation to make this site a better place, with the little time that I have (read: if nobody's going to join me, it is probably better to retain the tools until WMF actually does something). --Rschen7754 05:42, 6 September 2019 (UTC)


Comments

I am not sure how good an idea this is. Many editors with elevated permissions have already fallen on their swords over the Fram affair, but have the Wikimedia Foundation's actions relating to Fram and other cases not shown that this is exactly what the WMF wants: their volunteers (editors) to be just a free workforce without any influence? Thus this petition could actually be counterproductive. Notrium (talk) 12:27, 19 September 2019 (UTC)

Reduce user experience in the name of privacy

Some vandalism is traced and found not with specific anti-vandalism tools but through the existing user-contribution interfaces. Granting exemptions through so-called whitelists while limiting the common public interface is not a desirable approach.

When a user edits from an IP address, it means either that he doesn't mind the problem at all, or that this is exactly what he intends: using an IP address that might be a proxy. --Cwek (talk) 01:45, 26 August 2019 (UTC)

Here is one application: my local project gets the editing IP address through the user contribution interface and then looks it up on toollabs:whois to learn the country, ISP, and CIDR range of the IP. If a specific pattern of vandalism occurs from different IP addresses, you can cross-analyze that information to judge whether it is the same human editor and whether a proxy is being used.--Cwek (talk) 01:54, 26 August 2019 (UTC)
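
As a rough illustration of the cross-analysis described above, the following Python sketch groups editing IPs by the CIDR ranges a whois lookup might return. The function name `group_by_range` and the sample data are hypothetical (the addresses are reserved documentation ranges, not real editors), and this is not how toollabs:whois itself works; it only shows the range-membership step.

```python
import ipaddress
from collections import defaultdict


def group_by_range(ips, ranges):
    """Map each known CIDR range to the editing IPs that fall inside it."""
    networks = [ipaddress.ip_network(r) for r in ranges]
    grouped = defaultdict(list)
    for ip in ips:
        addr = ipaddress.ip_address(ip)
        for net in networks:
            # Only compare addresses and networks of the same IP version.
            if addr.version == net.version and addr in net:
                grouped[str(net)].append(ip)
    return dict(grouped)


# Hypothetical sample data: documentation-reserved addresses only.
edits = ["192.0.2.17", "192.0.2.200", "198.51.100.5"]
whois_ranges = ["192.0.2.0/24", "198.51.100.0/24"]
print(group_by_range(edits, whois_ranges))
```

Edits that land in the same range are only a hint that they may share an ISP or institution; they do not by themselves show that one person made them.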

Slashdot turns off anonymous editing

I'm surprised that no one has mentioned Slashdot's recent decision to curtail anonymous posting and commenting. Users without an account could previously post as "anonymous coward" (no IP addresses revealed, either). I guess anonymous posting became too much of a problem despite Slashdot's community moderation of posts and comments. Bitter Oil (talk) 17:15, 26 August 2019 (UTC)

I've been arguing on several occasions that we should make it more worthwhile to register an account, even using real names, and tie it closer to other accounts out on the net, thereby triggering more willingness to invest in the registered account. It does not seem to attract much attention. Given that people at various Wikimedia projects want “anonymous” editing, it is better to find solutions than to keep on arguing for something that will not get traction. That is why I dropped arguing for registered accounts to have higher access rights than anonymous accounts. — Jeblad 13:59, 30 August 2019 (UTC)

Summary notes from Wikimania discussions

Wikimedia Foundation staff working on IP Editing: Privacy Enhancement and Abuse Mitigation who attended Wikimania 2019 in Stockholm had many discussions with attendees about the project. Here is a link to summaries written by Niharika, Johan, and myself about our conversations. I'm not going to make a summary of the summaries in this post. :-) But we are glad to discuss these notes if there are questions. SPoore (WMF) Strategist, Community health initiative (talk) 17:19, 27 August 2019 (UTC)

  • Thanks guys for these summaries, they are helpful to have and raise a couple of interesting points as regards the discussion process and elsewhere. I'm particularly interested that you seem to have run into quite different consensuses of views. @NKohli (WMF), SPoore (WMF), and Johan (WMF): (I've pinged all of you since I thought you might have different answers). Did you get different levels of pro/anti-masking (and other differences, like primary concerns) from "advanced permission" users (CUs/Admins etc) than from "rank and file" editors? Would you say most of the latter were Swedish? Nosebagbear (talk) 09:28, 28 August 2019 (UTC)
@Nosebagbear: I am not familiar with the term "rank and file" editors. Can you explain that briefly? :) Speaking for myself, most of the pro-IP masking users were European but not necessarily Swedish. I recall talking to German, Dutch, Swedish, Italian and some US and Asian people. I actually only found one person at the conference who was not in favor of IP masking. Is this helpful? Thanks. -- NKohli (WMF) (talk) 23:31, 28 August 2019 (UTC)
@NKohli (WMF): - sure, it means those of us editors who don't have advanced permissions (in effect, functionaries + Admins). While I'm a bit surprised you found such general unanimity for the base concept (given the large number of "straight nos" here), obviously there are different variants "IP/non-ACs can't see, non-ECs can't see, non-Admins can't see etc" - I was looking to see whether the Admins et al you talked to gave different specifics to the non-Admins. Nosebagbear (talk) 07:54, 29 August 2019 (UTC)
@Nosebagbear: Ah, okay, I understand your question now. I didn't organize a very formal feedback session on this - instead I had casual hallway conversations with a large number of people just to get a sense of what the general feeling is. One thing I heard loud and clear is that people on smaller projects are more hesitant about this because they don't have as many (or any) CUs and admins like the bigger wikis do. This completely makes sense and again nods at the fact that there will have to be a way for non-functionaries to be able to see the IPs or the equivalent information (location, institution etc.). It could be a new user-right that is granted by the community or automatically granted once the user has been "trusted" (several possible ways to do that) or it could be claimed via access to a special page that asks them to be responsible in their use of this information etc. I also talked to a couple of stewards after that who are effectively the ones taking care of these smaller projects and they also mentioned that such a project should not proceed without tools like global checkuser and blocking being built first. -- NKohli (WMF) (talk) 15:40, 29 August 2019 (UTC)
Nosebagbear: Part of it might be that people simply behave differently when they're facing someone in the flesh. We're more likely to try to bridge the gap between us and less inclined to dismiss what the other person is saying, which makes "only if you do this" a more likely answer than "no". To answer your question, maybe ten percent of the people I spoke to about this were Swedish. I'd say I got more concerns from the vandal fighters, which include CUs and admins to a higher degree. /Johan (WMF) (talk) 10:04, 9 September 2019 (UTC)

Testing the change with JS/CSS?

Hi,

so I did relay the news to the French Wikipedia, but while we discussed it, I wondered if it wouldn't be worthwhile to start by giving people a way to test the change without too much work. For example, someone could develop a quick JS extension that masks IP addresses in page histories and blocks some URLs (like whois, etc.). Volunteers could then use the extension for one week, being forced to rely on existing tools for that week, and report how this impacted their work: what was missing the most, how many times they had to ask someone else to look up an IP for them, etc. This seems to be a good way to get data on what people use without starting lots of development and lots of tiring discussion. Then, once the tools are working well enough, someone can start working on the change. That seems like a win-win situation. --Misc (talk) 17:20, 27 August 2019 (UTC)

Sounds like an interesting idea, but it would be pretty easy to bypass. Computer Fizz (talk) 15:08, 28 August 2019 (UTC)

Why? oh why? oh why?

Sounds like a lot of work and wasted money that will just make vandalism and harassment easier.

The obvious alternative is just to require registration. That would be essentially a $0 cost.

Smallbones (talk) 21:42, 27 August 2019 (UTC)

How will we keep vandalism and harassment off our wikis? If somebody does not want his/her government to know his/her IP address, he/she just needs to create a user account and nobody will see the IP address! --Jalu (talk) 20:53, 1 September 2019 (UTC)

IPs do not need to be protected, Wikimedia projects do

Why remove our capability to track basic data given by IPs? We need as much information as possible, especially for abusive IPs. We need information for POV-pushing IPs. We must not hide this information from the users. It is just the opposite: we need to make it more evident. IPs are warned before editing. They do not need any privacy. If they need privacy, they can create an account. --FocalPoint (talk) 22:28, 27 August 2019 (UTC)

Word, bro :-) Tbh, I consider “out-logged users” who are mainly active in non-article-related namespaces (i.e. “meta IPs”) a pest (sorry for harsh language). Not all, of course, but there’s a handful of toxic users who, already, are utilizing all possibilities of modern internet access, including smart phones and WiFi. Especially smart phones allow quick IP change and even provider change (open WiFi, multiple SIM cards). Such users are already hard to track. I really wonder how it might work to track them when IPs and geolocations are masked. So please bury that project! It probably will not help to gain users in the long run, but it might lead to a severe loss of exactly those users who provide most of the new content and who care for updates of the existing content! It is exactly those users who deserve protection, not the occasional ones! --Gretarsson (talk) 00:08, 30 August 2019 (UTC)
Read my answer in #Complete anonymous editing is not a right. — Jeblad 13:49, 30 August 2019 (UTC)

Split project and talk page

Hello all! As apparent, there are way too many things to unpack here. I created two sub-pages: First to talk about privacy enhancement for unregistered users and the second to talk about the tools we can fix and build to make our projects more resilient against vandalism. The general, broader conversation about this project should stay on this page. There is a header, created originally by Incnis Mrsi (thank you!) that is present on top of all these pages for easy access.

I will be copying over some of the great ideas raised in this page to those project pages. Please feel free to move relevant talk page threads to the appropriate page. I appreciate all the help and involvement. :) -- NKohli (WMF) (talk) 18:07, 29 August 2019 (UTC)

Good vs bad editing from a common IP address

A quite common situation now is that a user, probably a kiddo, is bored at school and posts some rant on a page. Then another user, probably a teacher or another kiddo, from the same school cleans up the page. Then an overeager admin comes along and blocks the IP address, thus blocking both users. This is bad, as we should be able to distinguish the users. If we use synthetic names, the name given to the good editor could be made to stick as “non-reverted new user”, bad editors would be “reverted new user”, and new anonymous editors (probably the bad ones) would be visible as “unknown new user”. Imagine the first marked as green, the second as red, and the last one as yellow, or with alternate icons. It should make it much easier to spot vandalism, and avoid blocking good editors. — Jeblad 14:12, 30 August 2019 (UTC)
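
The three-way labelling suggested above could be sketched roughly as follows in Python. The names `EditorLabel` and `label_editor` are hypothetical, and the rule that a single revert is enough to mark an identity red is an assumption made for illustration; the post does not specify thresholds.

```python
from enum import Enum


class EditorLabel(Enum):
    """Labels and colours taken from the suggestion above."""
    NON_REVERTED_NEW_USER = "green"
    REVERTED_NEW_USER = "red"
    UNKNOWN_NEW_USER = "yellow"


def label_editor(total_edits, reverted_edits):
    """Label a synthetic identity from its edit history.

    Assumption: any reverted edit marks the identity as reverted.
    """
    if total_edits == 0:
        return EditorLabel.UNKNOWN_NEW_USER
    if reverted_edits > 0:
        return EditorLabel.REVERTED_NEW_USER
    return EditorLabel.NON_REVERTED_NEW_USER


print(label_editor(0, 0).value)  # no history yet
print(label_editor(3, 0).value)  # edits stood
print(label_editor(3, 1).value)  # at least one edit reverted
```

A real rule would probably need to weigh good-faith reverts differently from vandalism reverts, which this sketch does not attempt.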

Vandalism fighters as a very vocal group

Note that different activities at Wikipedia attract different types of people, and some of the groups tend to be more vocal about their activity. My impression is that vandalism fighters tend to be much more vocal than other groups, and tend to give their own activity much more focus than other groups. This may lead to a bias towards Abuse Mitigation, and not so much focus on Privacy Enhancement. We should probably try to rethink how we operate, not just add some tools to satisfy the vandalism fighters.

I should probably also say that some years back I did a lot of vandalism work, writing bots to track users and so forth. — Jeblad 14:31, 30 August 2019 (UTC)

When someone plans to run a freeway through a neighborhood, it might be understandable that those who live there are the ones complaining the loudest at the city council meeting. GMGtalk 15:15, 30 August 2019 (UTC)
Not sure whether the comparison makes sense, as there will be a lot more than vandal fighters living in the neighborhood. The vandal fighters are just a very vocal minority group. — Jeblad 15:29, 30 August 2019 (UTC)
I would say it does in a sense - the wider community (the full neighbourhood) would be affected but the ones who lose their workplace at the same time (the vandal-fighters) are the ones most affected and rightly aggrieved. Nosebagbear (talk)
By the way, I don't mind losing my job, if someone else can replace me well. Which is why we need good mitigation tools and if they are implemented well and vandals don't come, it will be perfectly fine to mask IPs. --Cohaf (talk) 15:39, 30 August 2019 (UTC)
In the analogy, the vandal fighters, the ones "in the neighborhood" are the ones who stand to lose something and have their lives become a great deal more difficult. Everyone else may have leave to consider how the freeway might affect their commute, and the niceties of marginal increases in privacy, without really needing to concern themselves with the negative consequences. GMGtalk 15:46, 30 August 2019 (UTC)
So, one way to prevent the negative externalities that this group of residents faces from the freeway is to build it underground: it is more expensive and construction takes longer, but it's a win-win solution. I hope that is the case here.--Cohaf (talk) 16:11, 30 August 2019 (UTC)
If this is supposed to mean tools that don't have any negative effect, then yes that'd be fine, but we've not seen suggestions of tool solutions that haven't immediately had issues pointed out. Alternatively, the neighbourhood wouldn't mind if only other commuters and non-resident inhabitants were inconvenienced, so long as the problems didn't spread wider than that.
There are several solutions to abuse mitigation, but our choice is to allow IP editing and that leads to vandalism. — Jeblad 19:54, 30 August 2019 (UTC)
That statement is being made without any factual basis to back it up. My own experience is that at least 60% of vandalism is done by registered accounts; it may be correct that unregistered editors are somewhat more likely to vandalize, depending on the project, and the WMF is now working to start putting some statistical data on the table so that there is a shared and realistic understanding of what unregistered editors do and don't do. There's also an unfortunate tendency to equate any unhelpful edit to vandalism, even though we know that people constantly add things in good faith, not knowing exactly the "rules" that apply to that edit. There's also a pretty wide variation in what projects would even consider an unhelpful edit; for projects that do not demand a high level of referencing, many edits that get reverted on English or German or Italian Wikipedia would be considered quite acceptable. Risker (talk) 20:10, 30 August 2019 (UTC)
  • GMG has an excellent point, but I'd also say that comparatively few editors are solely (or near-solely) vandal fighters. I do occasionally see privacy related tickets at OTRS, but only 1 has ever been from someone unhappy about the lack of IP-masking (said in different words). I also work in GDPR for a living (the ironies of posting that online do occur), so privacy thoughts do occur to me. But it's that side of my experience that makes me aware that it's not really use of data that makes people angry or have a problem. It's hiding usage, lying and abusing that data that makes individuals enraged. Wikimedia does not do any of those. It's blatant, up-front and uses the data in extremely minimalised ways. Nosebagbear (talk) 15:31, 30 August 2019 (UTC)
  • If that is so, let us not fight vandalism anymore; I don't know what the wiki will look like if there aren't any counter-vandalism users. Per Nosebagbear, few users are exclusively into vandal fighting; what I care about is that my work and other people's work will not be affected by vandalism. You cannot do content work without counter-vandalism efforts, by the way, unless you are working on very secluded articles. --Cohaf (talk) 15:35, 30 August 2019 (UTC)
  • It isn't simply adding some tools; it's making anti-vandal work without knowing the IP(s) as easy as knowing it. If it becomes too difficult to track patterns of vandalism then there simply won't be anyone doing it, and that is most definitely a negative result that can and should be avoided. I echo GMG's concerns. Vermont (talk) 16:55, 30 August 2019 (UTC)
  • Way to attempt to marginalize the legitimate concerns of your fellow editors. --Rschen7754 18:20, 30 August 2019 (UTC)
  • Rschen has put it succinctly. (Disclaimer:- I do fight vandals (often exploiting IP addresses in the process) who tend to disrupt articles on my watch-list but have not made my wiki-career of Huggle/STiki. I thus believe that our benevolent taxonomist ought to enroll me in some other species than Pediana vandal vocalis.) Winged Blades of Godric (talk) 19:16, 30 August 2019 (UTC)
  • The posts above show the problem very clearly; vandal fighting is defined as the most important thing, but it is not. This happens when a support activity is seen as the core activity. Our core activity is editing article content and providing read access to that material. Fighting vandals comes as a choice because we allow IP editing, and we allow IP editing because a very small minority fixes hard-to-find errors. We must figure out how we can change IP editing so vandalism is mitigated. Defining the current status as the gold standard is pretty dangerous, and will not lead to any improvement. — Jeblad 20:01, 30 August 2019 (UTC)
  • Above consists of a whole bunch of editors stating it's not the most important thing. Certainly, upping the proportion of IP editors who are productive (with its commensurate downwards push on negative IPs) would be great. Do we have any particularly good ideas on how to make large enough inroads into that issue to render this moot? I've not heard any but I'm certainly game to hear out any decent proposals. Vandal fighting would have to remain at its current quality until that was achieved though - it can't effectively be reduced as a means. Nosebagbear (talk) 21:11, 30 August 2019 (UTC)

Pass the IP-user an edit token on email

If we pass the IP-user an edit token by email, (s)he must provide a valid email address, and we can then block the email if necessary. We can create a request that repackages the whole edit, sending a reply back to the provided email address, and then doing the edit from an “accept” link within the email. We don't have to keep the email address; we can keep an MD5 or some other digest and block on that digest. The actual edit does not have to be in the email; it is sufficient to refer to it somehow. The workflow is virtually the same as creating an account on a listserver, but with an additional number to visually identify the edit.

This will make it more cumbersome to edit as an IP user, but it will still be possible.

Having hashed email addresses for IP-users could make it interesting to demand email address from registered users, and holding them as unconfirmed users until an email address is provided. If the provided address is previously blocked, then the block is transferred to the unconfirmed account. That would partially block dormant vandal accounts from being used. — Jeblad 15:25, 30 August 2019 (UTC)
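
To make the digest-based block in this proposal concrete, here is a minimal Python sketch. It swaps the MD5 mentioned above for a keyed HMAC-SHA256, on the assumption that a plain digest of an email address would be easy to guess by hashing candidate addresses. `SECRET_KEY`, `email_digest`, `block_email`, and `is_blocked` are hypothetical names, and lowercasing the whole address is a simplifying assumption, since some mail systems treat the local part as case-sensitive.

```python
import hashlib
import hmac

# Hypothetical server-side secret; a real deployment would not hard-code it.
SECRET_KEY = b"example-key-not-for-production"

# Only digests are stored, never the addresses themselves.
blocked_digests = set()


def email_digest(email):
    """Return a keyed digest of a normalized email address."""
    normalized = email.strip().lower()  # assumption: case-insensitive match
    return hmac.new(SECRET_KEY, normalized.encode("utf-8"),
                    hashlib.sha256).hexdigest()


def block_email(email):
    """Record the digest of an address that should be blocked."""
    blocked_digests.add(email_digest(email))


def is_blocked(email):
    """Check a newly supplied address against the stored digests."""
    return email_digest(email) in blocked_digests


block_email("Vandal@Example.org ")
print(is_blocked("vandal@example.org"))  # same address, different casing
print(is_blocked("other@example.org"))   # unrelated address
```

This only shows the lookup mechanics; as the replies in this thread point out, throwaway addresses would sidestep such a block.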

Surely canning IP-editing would be more logical than this option - it seems to have all of the negatives without any of the positives?
This directly violates the Privacy policy, which specifically does not require an email address from anyone. Keep in mind that there are already over a million accounts, most of which have been dormant for years, whether they were blocked as vandals or not, so it is not something that can be retrofitted. Risker (talk) 15:35, 30 August 2019 (UTC)
You will not be required to provide an email address, but if you do not provide one you will never be autoconfirmed. Dormant accounts will not be autoconfirmed, but already confirmed accounts will stay confirmed. Yes, you can create an account without an email address, but no, you cannot have full access. A lot of the projects have already limited access to some roles, only granting access to those who have provided a valid email address. — Jeblad 19:48, 30 August 2019 (UTC)
Aside from the fact that requiring email addresses of all users, including registered users, is out of scope for this discussion, it would "de-autoconfirm" about 30% of all accounts, based on a random sampling of 50 currently autoconfirmed users on enwiki. Given how incredibly easy it is to create a throwaway email address (there are literally dozens of organizations and groups that offer this), it's not a security or anti-vandalism measure that will do anything other than annoy good users. Risker (talk) 20:03, 30 August 2019 (UTC)

Surfacing more information, not less

What if we extracted some of the information that could be retrieved for an IP address and displayed it publicly?

Would that serve as an alternative to having people recognise related IPs? Would it deter some CoI/PoV edits if the person sees their company name before hitting Publish?

  • Anonymous Tiny Rabbit via Parliament House; Canberra, AU
  • Anonymous Flappy Bird via Comcast; Ontario, California, US
  • 123.145.167.189 via SingTel; Singapore, SG
  • Anonymous Crazy Heron via NordVPN; Manchester, England, UK/GB (oops don't want to open that can of worms)
  • Anonymous Purple Donkey via BlueCoat Threatpulse Web Protection; AAPT, Auckland, NZ

I've gone for "Anonymous adjective animal" here, but it could just as well be done with the actual IP address as in one of the examples. Wiki-defenders (squashers of vandals and investigators of harassment) would need more detailed info, of course.
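One way the "Anonymous adjective animal" labels above might be derived is a keyed hash of the IP address, so that the same address always gets the same label without the label revealing the address. The sketch below is only that: the word lists, `SECRET_KEY`, `pseudonym` and `display_name` are all hypothetical, and the network and location strings are assumed to come from some separate lookup.

```python
import hashlib
import hmac
import ipaddress

# Hypothetical word lists and key, for illustration only. Real lists would
# need to be far larger to keep different addresses from sharing a label.
ADJECTIVES = ["Tiny", "Flappy", "Crazy", "Purple", "Pelagic"]
ANIMALS = ["Rabbit", "Bird", "Heron", "Donkey", "Jellyfish"]
SECRET_KEY = b"replace-with-a-real-secret"


def pseudonym(ip: str) -> str:
    """Map an IP address to a stable 'Anonymous Adjective Animal' label."""
    # ip_address() validates IPv4/IPv6 input; .packed gives a canonical byte
    # form, so differently written forms of one address get the same label.
    packed = ipaddress.ip_address(ip).packed
    digest = hmac.new(SECRET_KEY, packed, hashlib.sha256).digest()
    adjective = ADJECTIVES[digest[0] % len(ADJECTIVES)]
    animal = ANIMALS[digest[1] % len(ANIMALS)]
    return f"Anonymous {adjective} {animal}"


def display_name(ip: str, network: str, location: str) -> str:
    """Combine the label with network and location supplied by the caller."""
    return f"{pseudonym(ip)} via {network}; {location}"


print(display_name("123.145.167.189", "SingTel", "Singapore, SG"))
```

With only 25 possible labels in this toy version, collisions would be frequent; it shows the shape of the idea rather than a workable naming scheme.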

Anonymous Pelagic Jellyfish via TPG Internet; Seven Hills, NSW, AU. 02:27, 31 August 2019 (UTC)

Geolocation

We already have a GeoIP cookie, mine currently is "AU:NSW:Westmead:-33.81:150.99:v4" (it's over 10 km from my actual location, so I'm not worried about any of you turning up on my doorstep tomorrow!).
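For anyone wanting to work with that value, a minimal parsing sketch follows. The field meanings (country, region, city, latitude, longitude, IP version) are inferred from the single example above and are an assumption, not a documented format; `GeoIP` and `parse_geoip_cookie` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GeoIP:
    # Field names are guesses based on the one example value.
    country: str
    region: str
    city: str
    latitude: float
    longitude: float
    ip_version: str


def parse_geoip_cookie(value: str) -> GeoIP:
    """Split a colon-separated GeoIP cookie value into named fields."""
    parts = value.split(":")
    if len(parts) != 6:
        raise ValueError(f"expected 6 colon-separated fields, got {len(parts)}")
    country, region, city, lat, lon, version = parts
    return GeoIP(country, region, city, float(lat), float(lon), version)


geo = parse_geoip_cookie("AU:NSW:Westmead:-33.81:150.99:v4")
print(geo.city, geo.latitude, geo.longitude)  # Westmead -33.81 150.99
```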

Is this data logged and made available for anti-abuse work?

See mw:Universal Language Selector/Compact Language Links#How do you determine my location?

Pelagic (talk), AU:NSW:Westmead 4RRG5X00+. 03:06, 31 August 2019 (UTC) 13:06 (AEST)

Actually, I'm in the Underworld. I have simply possessed a human on the surface in order to interface with the network. Broadband quality down here has, um, gone to hell. Mwuahahaha! (ha) 03:23, 31 August 2019 (UTC)
How about a tool that lets you select a group of edits/revisions and displays them on a world map using the lat:long? Pelagic (talk) 05:39, 31 August 2019 (UTC)

Threat actors

I'm not a cybersecurity expert myself, but from what I've read they seem to be fond of asking "who is the adversary that you're trying to defend against"?

So, who would be misusing IP addresses, that we should hide them?

  1. An oppressive state where Wikipedia is one of many sites banned for the potential to spread wrong-think amongst the populace. All internet traffic is channelled through state-owned conduits where it is inspected and/or blocked. People there who do manage to get to WP would already be using VPN or some other circumvention that would put their IP address somewhere else. The state is probably able to track them down regardless of whether that IP address is shown or hidden on WP.
  2. A theoretical oppressive state that doesn't have a Great Firewall or TLS interception, but does access ISPs' records (logs and customer lists) and/or compels citizens to use a mandatory propaganda app on their phones every day (which also tracks the device's location and matches citizen ID to IP address). They could maybe correlate an edit to a real person or a group of suspects based on timestamp (depending on the size of the population and volume of traffic), but having the IP address would make their effort a hell of a lot easier.
  3. Another state that compels ISPs to hand over their customer records and IP addresses, but not traffic logs. They can now match the Wikipedia edits to real people via the publicised IP addresses en masse.
  4. A government bureaucratic or "law-enforcement" body gets a court order for the ISP to reveal the customer records for specific IPs involved with a particular set of edits. Having the IP addresses up front allows them to filter out people who are outside their jurisdiction and focus on those who they can prosecute. This could be used for preventing terrorist or insurgent actions, or for persecuting those who embarrass the powerful.
  5. A company that does TLS inspection of its employees' web traffic by deploying a trusted root certificate on their computers would be able to report on what pages those users are visiting. They don't need the IP address that's listed on WP, and that would be a common NAT address or the addresses of the security provider anyway.
  6. A company that doesn't do TLS interception would only see the time and the destination IP, which could be identified as a WMF server, but not the specific page. The source IP on WP would be a NAT address: it could confirm that the edit came from the company but wouldn't identify the user. It could be a handy means for the company to identify all edits that were made from their infrastructure: just look at the contribs for that IP-user. Matching to a real person would require correlating timestamps.
  7. Journalists or activists who report on edits coming from political party offices and government offices as a matter of public interest. How dare they?!
  8. Anybody with a website can collect IP addresses from their own logs, match them with their user records (if they have a user sign-in function), and also match them with Wikipedia edits.

The last point above is alarming. Can you imagine a Cambridge Analytica style actor who got more-than-reasonable access on a major social platform and cross-matched that data to the open Wikipedia history?

I take special note of Johan's comments above at 19:33, 5 August 2019 (UTC). Especially: "There's a rather large difference in capabilities ... It matters if we make something easy, or slightly more difficult." I find it reassuring that we have someone involved who has experience in the area he mentioned. But we must also be mindful of the possibility that Dave Braunschweig raised: people infiltrating the wiki community and gaining direct access to the information this project is trying to hide.

Pelagic (talk) 05:34, 31 August 2019 (UTC)

If the threshold is autoconfirmed to access this information, there is really no point having a barrier at all. MER-C (talk) 20:03, 31 August 2019 (UTC)
I tend to agree — there really is no point to having a barrier. Unless the new tools work better than I believe possible, "extended confirm" seems the maximum restriction that would allow the Wikis to operate. I suspect, although it is difficult to prove, that more IPs perform anti-vandal edits (using existing anti-vandal tools) than are vandals. — Arthur Rubin T C (en: U, T) 01:30, 1 September 2019 (UTC)

I think my answer here would probably be: I would not have had much resistance to this proposal if WMF had seriously increased (and/or kept up with) the capabilities to combat undesired behaviour on-wiki. Our tools are crude, our tools are outdated. Sockpuppets have free rein until someone notices something and checkusers them (I have a sockpuppet that I can recognize, but seeing that they have enormous ranges of IPs at their disposal, it is 'dweilen met de kraan open' (mopping with an open tap)); the edit filter, the spam blacklist, the blocking tool are heavily outdated. There is a community wishlist but I have no confidence that significant time is being put into that (and that has been publicly stated). I understand the problem of saving your edit as an IP, but as it currently stands I cannot support this proposal as our other tools cannot keep up with this. As usual, the approach is top-down, not bottom-up. --Dirk Beetstra T C (en: U, T) 07:10, 1 September 2019 (UTC)

Dirk Beetstra: Reasonable. Just making sure: you are aware we are not proposing to do this without actually addressing the tools first? /Johan (WMF) (talk) 09:56, 9 September 2019 (UTC)
@Johan (WMF): That sounds great!!! But overhauling some of these tools is going to take a significant amount of time (one of them has basically been waiting to be overhauled for >10 years ..) .. and it would have made the choices here much easier. I am worried about shooting down this project at this time because it is 5 years too soon.... Maybe it is a good idea to start off with those projects then ASAP? --Dirk Beetstra T C (en: U, T) 10:26, 9 September 2019 (UTC)

Who would benefit from that?

As I've mentioned in the cswiki discussion on this proposal, there are only three significant groups of users who would benefit from IP masking. These are:

  1. registered users, failing or forgetting to log in
  2. occasional contributors unwilling to register
  3. IP vandals

I don't see any reason to encourage any user from these groups to continue contributing anonymously. The first group can be handled through oversight and should be more careful about logging in, users from the second group are editing Wikipedia at their own risk (knowing that their IP will appear), and the third group shouldn't edit Wikimedia projects at all.

— Draceane talkcontrib. 13:59, 3 September 2019 (UTC)

About the oversight thing, the wmf are hoping this will cut down on that because 95% of their oversight requests are this kind of thing. The second one, some people have banner blindness and I suggest solving that with a tickbox (i.e. something like "I acknowledge pressing Submit will show my IP"). Agree about the third one, idk why the wmf wants to spread their ass wide to vandals but I don't like it. Computer Fizz (talk) 03:12, 4 September 2019 (UTC)
A tickbox would actually solve both the 1st and the 2nd groups. By WMF, who don't directly handle oversighting, do you mean the Stewards, or the WMF just wanting to need fewer people with OS rights? Nosebagbear (talk) 08:31, 4 September 2019 (UTC)
I mean, they said it'll cut down on that which you can hope without having oversight. It could either mean the local oversighters (for large wikis) or the stewards. Computer Fizz (talk) 16:56, 4 September 2019 (UTC)
@Computer Fizz:. Are you saying this will cut down on the anti-vandalism that can be done without the oversight and/or checkuser power? If so, I agree, but it doesn't really seem clear what you are saying. — Arthur Rubin T C (en: U, T) 00:30, 5 September 2019 (UTC)

Note from Legal

Hi all. I’ve seen some of the discussion on this page focus on the impact of publishing IP addresses of unregistered contributors on user privacy, so I’m providing some additional thoughts from the Foundation Legal team’s perspective.

Part of our job is to track regulatory, policy, and societal trends that impact the projects, movement, and Wikimedia users. We’ve watched as people around the world have started expressing more general concern about online privacy and safety, and we receive questions every day about privacy and our data handling practices — including questions about editor IP addresses on wiki pages. I’m excited that we’re looking at our use of IP addresses with fresh eyes, and working with the community to consider new approaches to enhance the privacy of unregistered users.

Our commitment to user privacy includes continuously reevaluating and improving our privacy practices. Reducing public use of IP addresses would be a significant step towards improving our protection of unregistered users. Of course, we are also committed to providing contributors with the tools to fight harassment and vandalism. We want to find a way to do both — well. As the Anti-Harassment Tools team continues this conversation and starts designing and refining their ideas, the Legal team will act as a sounding board, and share our perspective — including here on-wiki (as we are able). Thanks to everyone who has contributed to this important discussion!

--TSebro (WMF) (talk) 23:47, 4 September 2019 (UTC)

Morning coffee commentary

There have been a few comments to the effect of "well we'll just stop fighting vandalism then". It doesn't seem particularly likely that that's liable to happen. What seems more likely is that if this is implemented poorly, it's liable to embolden a popular movement to ban anonymous editing entirely. Now, the Foundation might take exception to this, but apply a little bit of common sense here folks. We don't really need the Foundation's approval; we can already do this with existing functionality. What that looks like is dramatically lowering the bar for semi-protection across projects, and indefinitely semi-protecting massive swaths of pages.

So let's be clear on exactly what's at stake here. I don't think any of us want to be the trigger-man in semi-protecting entire projects, but we can do it. The only thing that's required here is a local consensus that any vandalism whatsoever warrants indef semi, and if you take away our scalpels and leave us only with hammers, then hammers it is, and a whole lot of problems are going to start looking like nails.

Lest someone try to paint me as a singularly focused vocal vandal fighter, if anybody wants to have a cross-wiki pissing contest about content creation then I'll step right up and we can compare resumes. So please take it seriously when I say on behalf of our content creators, that we'll be damned before we'll pour a hundred hours into a product only to see it smashed by some kid with a smart phone. GMGtalk 13:53, 8 September 2019 (UTC)

It doesn't take a lot of imagination to revert to pre-1996 ways of doing things. Nemo 19:02, 10 September 2019 (UTC)

How is this going to affect people editing from Shared IP addresses?

Take for example the corporate proxy for Advance Auto Parts, seen here. There's a mixed bag of good edits and stupidity, probably from employees at stores all over the United States connecting through one IP address at corporate. Would all of these edits go through the same unique identifier? Currently we have templates like Shared IP corp and Shared IP edu to remind people not to bite the head off of newbies from these IPs over things that other people have done from the same IP (not that anyone on English Wikipedia pays any attention to that; the edu one has become more of a "block this IP longer" tag to admins), what happens when we have no way of knowing what the IP belongs to? Thinking about it, it might be a good thing for admins to no longer know when editors are coming from K-12 schools, both for the contributor's privacy and for Wikimedia's sake to eliminate bias, but will the editing community know to apply common sense that these unique identifiers might represent thousands of people rather than just one person? As a side note to those unfamiliar with me (which is probably most because I'm nothing special), I'm not in high school anymore; the user name comes from my alma mater where I was over ten years ago when I became part of the Wikimedia community. PCHS-NJROTC (talk) 13:35, 9 September 2019 (UTC)

@PCHS-NJROTC: We are thinking about having a way to surface information about the underlying IP (such as location, organization/school etc) without exposing the IP address. That will provide better privacy to unregistered editors than we have right now. We are still thinking about how this would be implemented and what information will be surfaced and how. Thanks for your question! -- NKohli (WMF) (talk) 23:49, 11 September 2019 (UTC)
@NKohli (WMF): With all due respect, what is even the point of this if you're just going to tell people what school someone is editing from? That would actually reduce privacy by making it easier for people with less technical knowledge to know where our editors work, go to school, etc, because people who have absolutely no knowledge of IP addresses or how to run a WHOIS query would then be able to see "Anonymous User from Charlotte County Public Schools" (for example) wrote something on Wikipedia, making it even easier for our contributors to be doxxed or harassed for their actions. I mean, imagine the scenario where a middle school student contributes negative but factual and well referenced information on the article of a large corporation and some company exec who knows little about computers suddenly sees that and pays someone to identify that kid and "make him pay" for it. That's already theoretically a possibility with IPs being exposed, and we already make it easier for that to happen by tagging schools with Shared IP edu, but we don't need to make it even easier by having our editors' schools, workplaces, or geographic locations appear in the edit history like that. I don't like this idea at all. PCHS-NJROTC (talk) 13:40, 12 September 2019 (UTC)
@PCHS-NJROTC: We will probably not be exposing the name of the school but rather make it clear that it is a school and provide some sense of location. Please do keep in mind that these are just ideas at the moment, and what is feasible from a technical point of view might change things. We are also probably not going to be recording the school/workplace/location identifiers in the edit history but rather have a way for users to be able to see that if they need to. It doesn't take a lot of technical skill to do a whois, with the huge number of tools available on the internet now. You could put an IP address into a search engine and it will give you a lot of information. There are a lot of things still to be figured out. But trust me, we are not going to make it easier for users to be harassed. Thank you. -- NKohli (WMF) (talk) 22:25, 16 September 2019 (UTC)

Consider the effect on researchers using Wikimedia data/metadata

Although not included in NKohli (WMF)'s detailed half-time summary, a number of people have pointed to the potential impact this proposal might have on people outside of the Wikimedia community who have relied on public IP data and/or the specific way it is published and who have created things of value. Many of the examples I've seen have been things like w:United States Congressional staff edits to Wikipedia.

I wanted to urge the folks involved to carefully consider the impact of any change on researchers who rely on Wikimedia data. Although it's not always visible to participants, Wikimedia projects serve as the single most important laboratory for social scientific, computing, and informatics research in the world. There are literally thousands of papers published about Wikipedia and that use Wikipedia data. A major change to the way that contributions are attributed will likely affect many external researchers' ability to learn with Wikipedia and compare data collected before and after any change. It could make it difficult or impossible to replicate previously published studies in the future.

Without a concrete proposal, it's hard to know what the impact would be. Based on some of the suggestions floated in the proposal, I can easily imagine that change could mean that:

  • researchers become limited in their ability to compare the numbers of non-registered contributors before/after the proposal is implemented;
  • researchers cannot allocate/attribute contributions to individuals/users in ways that are consistent or clearly explainable;
  • researchers cannot study geographic concentration of contributions (e.g., urban/rural divides; global inequality in participation, etc).
  • ...and so on.

I believe that these examples, and countless others we've not imagined yet, represent real costs to the broader value that Wikipedia provides to the world through its utility as a source of research data. To be clear, I'm absolutely not categorically opposed to incurring these costs. Protecting contributor privacy and mitigating/reducing harassment is clearly very important too! It is clear from the discussion on this page that seeing any version of this proposal through is going to be an exercise in making difficult tradeoffs. I only want to make sure the team designing this system knows what the researchers outside the foundation will be giving up.

I know that the WMF team shepherding this proposal has been in touch with people from the WMF research team. From what I've seen on this page, this has so far been focused on identifying research that would inform this proposal rather than on understanding the effect of the proposal on future research.

My suggestion here is for the folks in WMF working on this to connect with LZia (WMF) and others on her team to ensure that any proposal is informed by a solid sense of what the effects will be on future research both inside and outside WMF. Maybe consider running proposals by the wiki-research-l list? I'm happy to help out. —mako 23:56, 10 September 2019 (UTC)

@Benjamin Mako Hill: Thanks for looping me in. I had a chat with part of our team (Research) about it as well as NKohli (WMF), and I understand you had a separate conversation as well. The assessment of our team aligns with yours in that we believe it's very important to formalize/acknowledge the impact on the research community, especially in light of the fact that the data has been used extensively by them. (I'm also with you that the point is to make sure the effect is considered in any decision making.) I've offered our team's support to NKohli (WMF) et al. if they have questions about this part of the assessment. --LZia (WMF) (talk) 21:39, 12 September 2019 (UTC)