Talk:IP Editing: Privacy Enhancement and Abuse Mitigation

From Meta, a Wikimedia project coordination wiki



IP Editing: Privacy Enhancement and Abuse Mitigation Archive index
This page is to collect feedback for the privacy enhancement for unregistered users project.
Hoping to hear from you. You can leave a comment in your language if you can't write in English.
SpBot archives all sections tagged with {{Section resolved|1=~~~~}} after 14 days and sections whose most recent comment is older than 120 days. For the archive overview, see Talk:IP Editing: Privacy Enhancement and Abuse Mitigation/Archives.

Please remember that this page is used by people from a number of communities, with different native languages. If you avoid using acronyms from your home wiki, that will help them participate in the discussion.

October FAQ Update Questions

Thanks for the new update. If I have the ability to unmask and I have opted in to that preference does that mean that all IPs are automatically unmasked? How would this tie into session based unmasking? Depending on answers to this I'm likely to have some more questions. Thanks and best, Barkeep49 (talk) 20:20, 29 October 2021 (UTC)[reply]

Thanks for the questions, Barkeep49. We're working on getting Product, Legal and Trust & Safety together on a proposal for how to best unmask a number of IPs, but we're looking more at a one-click solution than a preference which works on all IPs forever. For one thing, we need the accesses to be logged. /Johan (WMF) (talk) 12:17, 1 November 2021 (UTC)[reply]
(For those who didn't see it, there is an update on how to handle masking at IP Editing: Privacy Enhancement and Abuse Mitigation#IP Masking Implementation Approaches (FAQ). It'll go out in Tech News on Monday. Johan (WMF) (talk) 12:24, 1 November 2021 (UTC))[reply]

Why?

If you know an IP, you know only the provider, so there’s no problem with the IP being known. 217.117.125.83 16:05, 9 November 2021 (UTC)

We know a fair bit more than the Internet Service Provider, actually. But most importantly, for legal reasons: IP Editing: Privacy Enhancement and Abuse Mitigation#Motivation. /Johan (WMF) (talk) 16:24, 9 November 2021 (UTC)[reply]
In some cases an IP number can be directly linked to a person. For this reason the Dutch Privacy Authority considers IP addresses personal data that require special care under EU law. --MarcoSwart (talk) 18:40, 9 November 2021 (UTC)
It means that only Dutch IP numbers must be masked. Carn (talk) 11:51, 17 November 2021 (UTC)[reply]
Catering to the demands of weak and cowardly politicians ⟨…⟩ is not the Wikimedia way. 217.117.125.83 14:38, 10 November 2021 (UTC)[reply]
We adhere to a lot of requirements, even in cases where we as a movement disagree with them (e.g. copyright laws in countries lacking the freedom of panorama) with more consensus than we have around showing IPs. This is quite different from censorship of articles, which is the context of the quote above. Nor do I think we'd generally describe the evolving online privacy regulations and norms as weak and cowardly, whether we agree with them or not, or agree with the specific implementations. /Johan (WMF) (talk) 16:21, 10 November 2021 (UTC)[reply]
Privacy is a human right. Respecting human rights is the Wikimedia way. --MarcoSwart (talk) 15:39, 12 November 2021 (UTC)[reply]
Why do you think that it’s privacy? 217.117.125.83 18:33, 17 November 2021 (UTC)[reply]
Let's suppose we have an editor from China, who says something rude about the President Xi or questions Party Policy. The Chinese secret services have great skill in tracking IP movement and then matching that to an individual. There even has to be a concern about Checkuser, because they have the patience to play the long game but we have to balance that against effective management of the project. Our US friends say, well that's China, would never happen here! but you came very close to a successful coup d'etat a year ago and a similar action (burning of the Reichstag) brought dictatorship to Germany less than a century ago. There is autocratic rule to a greater or lesser extent in Turkey, Egypt, Myanmar, North Korea and India (Kashmir). This is not about places like the Netherlands, where there is respect for the rule of law: it is about the places where the law is shaped to serve the governing power. --John Maynard Friedman (talk) 20:35, 4 January 2022 (UTC)[reply]
So China (and some others) have found a tool to destroy the WMF entirely by Wikipedians' own hands: creeping vandalism by bots, which administrators could not eliminate, because in a massive attack non-admins, who would not be able to identify the vandal bots, could not help the admins (not enough time). So a dilemma arises: is one person's security (does this tool even guarantee their security, apart from making it a little bit safer?) worth more than the existence of the WMF? What about cross-project vandalism? What about smaller projects with few or no admins? --Kusurija (talk) 06:17, 5 January 2022 (UTC)
  • If an editor edits from China, the Chinese government will know who it is regardless of whether we implement IP masking or not. They monitor ALL traffic within the country and from timing data of the metadata alone they could figure out who made the edit. Further, it is naive to assume that none of our admins, who can still see the IP addresses, are covert agents for China. Bottom line: unless a user has been really really careful, any edit can be traced back to them. So, IP masking, as per your example, may actually lead people into a false sense of security. Jason Quinn (talk) 07:21, 16 January 2022 (UTC)

Contributions

How will I be able to find my contributions? Will there be a way to tell the site that I want to save my edit under my IP? 217.117.125.83 17:25, 10 November 2021 (UTC)

Much like today! They will be tied to your identity; how said identity is to be defined is up for discussion (see above and the update on the main page). It won't last forever, of course, but then again – no unregistered identity does. Including IPs. /Johan (WMF) (talk) 01:12, 11 November 2021 (UTC)[reply]

It’s the way to finally force anyone who needs that to register. — 188.123.231.59 23:50, 10 November 2021 (UTC)[reply]

That's not our intention here, at least; then we wouldn't spend so much time and effort trying to make masking work. /Johan (WMF) (talk) 01:12, 11 November 2021 (UTC)[reply]

Everyone is a sockpuppet

If a pack of newly registered accounts enters a discussion, everyone knows how to handle that. From now on, whenever an honest EX-IP-user tries to discuss, everyone will ‘know’ they’re nothing more than a sock manipulating with cookies. Way to go. — 188.123.231.59 00:00, 11 November 2021 (UTC)[reply]

Well, first of all, a lot of people would have access to the user right where they can see the masked IP; there won't be anything anywhere near the checkuser process. So any such suspicion would be easy to check and confirm or dismiss, at least to the degree we can do so with IPs. (IP jumping is, of course, a thing. But that's the case today too.) Second, whether to base it on a cookie or on an IP is up for discussion. They both have their drawbacks and benefits. /Johan (WMF) (talk) 01:15, 11 November 2021 (UTC)
Which makes participants a lot less equal, again. Equality in discussions would require masking registered accounts names as well, and from everyone, so that only arguments would matter in dispute resolution, not reputations. As for IP jumping it’s only easy for some, while cookie exploiting is for everyone. — 188.123.231.59 08:05, 11 November 2021 (UTC)[reply]
I'd say cookie exploiting still is for a minority – most people don't have any idea of how cookies work. But of course you're right that it makes it easier, and I won't deny that we have a problem with unregistered users not being fully respected partners in a conversation (which, at least on the wikis I know best, is true today too). Just to make sure I understand your feedback correctly, out of the two alternatives presented, you're arguing for the identity to be tied to the IP because you think a cookie-based solution will lead to even more issues for unregistered users in trying to be part of the conversation, is that correct? /Johan (WMF) (talk) 13:43, 12 November 2021 (UTC)
True, I believe encoding an IP in a way that changes almost nothing and only hides the actual ISP / geolocation data from most viewers would be the best solution. IP editors don’t need any control that would let them abuse the new system. Just always encode the same IP address the same way, another IP address another way, and so on. If the person behind it wanted to keep control, they would create an account to control personally. -- 188.123.231.59 20:31, 12 November 2021 (UTC)
Detecting the newly created accounts without rolling over their name would be nice. I always try to be polite to everyone but being able to instantly see who is new would be nice.... Hmm if the anonymous is going to have a number does that mean we can tell? Wakelamp (talk) 01:25, 21 November 2021 (UTC)[reply]

Cyphering IPs

It isn’t interesting to anyone that the IP starts with 217; it’s interesting only that 217.117.125.72 is close to 217.117.125.83 but 117.117.125.72 isn’t. So why can’t we cypher IPs like 1fc4b57.125.83? 217.117.125.83 11:16, 11 November 2021 (UTC)

Do I understand you correctly that you are suggesting we only mask the first two octets of an IP address and expose the rest?
I can imagine scenarios where this could be misleading. Say if 217.117.125.72 and 117.117.125.83 are active on the page and they are cyphered as 1fc4b57.125.72 and 7c5f6e2.125.83 people might assume they are closer to each other but they are not really. NKohli (WMF) (talk) 01:24, 17 November 2021 (UTC)[reply]
But if the IPs are 217.117.125.72 and 217.118.125.83, we have the same problem now. 217.117.125.83 18:34, 17 November 2021 (UTC)
Unless you salt the IP with something that changes over time, it'd be easy to perform a known-plaintext attack on that system by seeing what random public IPs resolve to (e.g. 217.117 → 1fc4b57). There are only 65k such /16s, so with a bit of determination you could figure out most of them fairly quickly. If you rotated the cipher somehow, you'd lose the ability to know that any IPs having the same first two octets are in the same /16 (or at least you could only make some kind of time-boxed statement of sameness), which defeats the main point of the system. And even then, you're vulnerable to the same attack within the rotation (so if I can find any IP that ciphers to 1fc4b57 quickly enough, I still find your IP). Inductiveload (talk) 19:49, 4 January 2022 (UTC)
Yes, anyway, edits by IPs are voluntary. So why is it not enough to inform them that their edits will be associated with their IP address? IPs are anonymous. Greetings 84.186.35.187 17:37, 5 January 2022 (UTC) (That was me Redlinux (talk) 17:39, 5 January 2022 (UTC))
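
The known-plaintext concern raised in this thread can be illustrated with a minimal Python sketch. It assumes a hypothetical scheme in which the first two octets are replaced by a truncated keyed hash that never changes; `SECRET`, `mask`, `learn` and `unmask` are illustrative names, not part of any proposed implementation.

```python
import hashlib
import hmac

# Hypothetical server-side secret; the observer never learns it.
SECRET = b"hypothetical-server-secret"


def mask(ip: str) -> str:
    """Replace the first two octets with an unsalted, keyed 7-hex-digit token."""
    a, b, c, d = ip.split(".")
    token = hmac.new(SECRET, f"{a}.{b}".encode(), hashlib.sha256).hexdigest()[:7]
    return f"{token}.{c}.{d}"


def learn(observations):
    """Build a token -> /16 table from (known IP, masked form) pairs."""
    table = {}
    for ip, masked in observations:
        a, b, _, _ = ip.split(".")
        table[masked.split(".")[0]] = f"{a}.{b}"
    return table


def unmask(masked: str, table):
    """Recover the full IP if the token has been seen before, else None."""
    token, c, d = masked.split(".")
    prefix = table.get(token)
    return None if prefix is None else f"{prefix}.{c}.{d}"


# One edit from an address the observer controls reveals the whole /16.
table = learn([("217.117.1.1", mask("217.117.1.1"))])
recovered = unmask(mask("217.117.125.83"), table)
```

Because the token never changes, a single known address per /16 is enough; the secret key does not help once input and output pairs are public.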

Authorship concerns

  1. An IP address can be assigned to different users, so edits from an IP are unquestionably anonymous and will stay so virtually forever. When you choose a meaningful ID for a user, you make edits "pseudonymous", not anonymous. Right now you cannot register a username that resembles an IP address. With the introduction of IDs, I see no technique to prevent a user from passing themselves off as a "temporary" user. I am not confident about this.
  2. If you assign IDs through cookies, what will prevent people from stealing such cookies to identify themselves as another person? For example, if I want to delete the sole contribution of an ID-assigned user, I can use their cookies to make a speedy deletion request "by author".
  3. Will all the IDs be unique? IPs are known not to be unique, which makes them truly anonymous. If there are repeated IDs, one can pretend to be the author of another's work and, for example, make speedy deletion requests "by author" without any right to do so. I am really concerned about whether you can make IDs unique, because this makes them exhaustible and vulnerable to simple brute-force attacks. Otherwise they will be too long or too ugly.
  4. My last concern depends on the uniqueness of IDs; if they are unique, there will be no problem. But this is a legal question. An author's name is protected by law. After you assign it to one user, it is doubtful you can freely assign the same name to another user.
--Igel B TyMaHe (talk) 12:20, 11 November 2021 (UTC)[reply]
@Igel B TyMaHe Hello.
  1. We will make sure that unregistered usernames look distinct from registered usernames. They will also be randomly auto-generated and assigned but not self-selected by the end user. This can change though, if needed.
  2. I am not sure if I understand your question. How can someone steal cookies?
  3. IDs will always be unique to the cookie. They could be an auto-assigned ID number or such. Can you elaborate on why IDs will be vulnerable to bruteforce attacks? What would that achieve? --
NKohli (WMF) (talk) 03:01, 17 November 2021 (UTC)[reply]
(To be clear, I think we're talking about session hijacking. Please confirm so we all understand each other.)
Re: the legal aspect of authorship, nothing should change here. IPs are already not unique, for that matter. /Johan (WMF) (talk) 17:33, 18 November 2021 (UTC)[reply]

Use case for unmasking IPs

Editors who partake in anti-vandalism activities, as vetted by the community, can be granted a right to see IP addresses to continue their work.

What's the benefit of unmasking the actual ip of a user? Wouldn't the better thing be to create a tool wherein a user who has been granted the proposed right, would input two masked identities say User:Anon3406 and User:Anon5538 and an output would say whether these two have the same ip, so just a "match" or an "unmatch", instead of revealing the actual ip. - hako9 (talk) 17:09, 12 November 2021 (UTC)[reply]

String comparison of IP addresses only gets you so far. To be able to implement a rangeblock (at the smallest size) or check for open proxies, you need the whole address. AntiCompositeNumber (talk) 14:49, 13 November 2021 (UTC)[reply]
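
To make the difference between the two positions above concrete, here is a minimal Python sketch. It assumes a hypothetical server-side `lookup` table from masked names to addresses; `same_ip` and `smallest_common_range` are illustrative names. The first function is all a "match"/"unmatch" tool could answer, while the second needs both full addresses to propose the smallest CIDR range covering them.

```python
import ipaddress


def same_ip(masked_a, masked_b, lookup):
    """Answer only 'match' (True) or 'unmatch' (False), revealing nothing else."""
    return lookup[masked_a] == lookup[masked_b]


def smallest_common_range(ip_a, ip_b):
    """Return the smallest CIDR network containing both addresses.

    Returns None when one address is IPv4 and the other IPv6.
    """
    a = ipaddress.ip_address(ip_a)
    b = ipaddress.ip_address(ip_b)
    if a.version != b.version:
        return None
    # Bits from the first differing bit onward are the host part of the range.
    differing_bits = (int(a) ^ int(b)).bit_length()
    prefix_len = a.max_prefixlen - differing_bits
    return ipaddress.ip_network(f"{a}/{prefix_len}", strict=False)


# Hypothetical mapping held by the server, never shown to the patroller.
lookup = {"Anon3406": "217.117.125.72", "Anon5538": "217.117.125.83"}
is_match = same_ip("Anon3406", "Anon5538", lookup)
block_range = smallest_common_range(lookup["Anon3406"], lookup["Anon5538"])
```

In this example the comparison tool reports an "unmatch" even though both addresses sit in the same /27, which is the kind of information a rangeblock decision depends on.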
  • Regarding require a minimum number of edits and days spent editing - what are these values? And before you say communities can pick - please verify that 1 edit and 1 day suffices. — xaosflux Talk 15:04, 15 November 2021 (UTC)[reply]
Xaosflux: We had a suggestion ("least a year old and have at least 500 edits"), but received a lot of pushback on the time limit being too long, so shorter than that. Legal has required that there is some sort of limit, so it won't be one edit and one day. /Johan (WMF) (talk) 13:17, 16 November 2021 (UTC)[reply]
@Johan (WMF): so somewhere between (2 to 499 edits) and (2 to 364 days) -- who is the decision maker on this and when are they expected to make the decision? I'm assuming communities can always be more stringent. Unless this is bundled or made into an autopromote group, users will need to specifically request this like other groups, I'm assuming - that alone should drastically limit it, as most users won't bother to request. -- xaosflux Talk 14:19, 16 November 2021 (UTC)
Xaosflux: I would count on the 500 edits remaining, that was not the part people had issues with in the discussions we've had so far and satisfies the requirements we have from Legal. And while the time period would be shorter, I would still assume we're talking about months, rather than days. Yes, communities can be more stringent if they want to, of course, we have no intention to interfere more in the communities' workflows than we have to.
In practice I, NKohli (WMF) and STei (WMF) will run a new number by Legal. Formally, this is NKohli (WMF)'s decision with approval from Legal.
NKohli (WMF): Do you have a suggestion for a timeline here? /Johan (WMF) (talk) 14:50, 16 November 2021 (UTC)[reply]
  • Thanks, just throwing out information from some extended confirmed settings in use on projects (edits/days) azwiki:500/30, bgwiki:500/120, enwiki:500/30, fawiki:500/30, jawiki:500/120, kowiki:500/30, rowiki:500/30, svwiki:500/30, viwiki:500/30, zhwiki:500/90. I'd suggest a lightweight framework of something like: "no autopromotion", "500 edits + 30 days". Create one new permission and one new group that holds the permission, make the group assignable and removable by sysops - who are expected to follow the minimum requirements rule (noting that there will likely be exceptions - for example for people that operate multiple "accounts"). The new permission could be bundled with existing groups: stewards, sysops, checkusers - and by request possible to other admin-like groups such as eliminators. — xaosflux Talk 15:22, 16 November 2021 (UTC)[reply]
Xaosflux: Our plan is that the groups you mention will be able to opt in with a single click, having already the community's confidence, so there should be no need to burden anyone with a process. /Johan (WMF) (talk) 15:26, 16 November 2021 (UTC)[reply]
Obviously those groups should have it inherently, but just to clarify we'll need the new permission so we can assign it to additional users outside the admin corps. Nosebagbear (talk) 19:02, 16 November 2021 (UTC)[reply]

The very idea of concealing IP addresses appears likely to cause very bad results. In our section there is a vandal who has been doing his bad deeds at least since the end of July. He constantly creates pages with insults directed at a Russian actor. Fortunately, one of the pages with insults was created anonymously, which allowed careful users to determine the range of IP addresses used by the vandal. If the new rules are introduced, then only high-ranked users will still be able to find out the IP addresses and ask the admins to block them. The problems will become a worse version of the ones which we face when dealing with open proxies. I hope that this idea of showing randomly generated pseudonyms instead of IP addresses will not be implemented. 178.70.250.106 19:58, 16 November 2021 (UTC)

178.70.250.106: It will happen. We're doing it because of changing norms and regulations around privacy online – so the Legal department's analysis is that this is necessary – not because we unilaterally decided it was a good idea. /Johan (WMF) (talk) 20:40, 16 November 2021 (UTC)[reply]
Just so long as the tools under construction mean that no part of the CVU/IP-sock workflow ends up having either a higher time taken, lower positive-clearance, or higher false positive rate than the status quo. That would be unacceptable. Though I've not seen anything on how to duplicate things like the Congress or Parliament watches (often on Twitter and so on) that check for problematic edits from those politically-hot button areas. Nosebagbear (talk) 20:58, 16 November 2021 (UTC)
We have some of those, yes. I'm not sure I see a solution for them, to be honest. I'll add it to the list of questions for the Legal department, if there is room to do anything there. /Johan (WMF) (talk) 02:35, 17 November 2021 (UTC)[reply]
@Johan (WMF): - that does not seem to be indicated in the impact text: "In order to provide proper tool support for our administrators’ work, we must be careful to preserve or provide alternatives to the following functions currently fulfilled by IP information". It notes that there may be a transitional drop in effectiveness (going by past changeovers on Wikipedia, an acceptable problematic transition period is usually viewed as 1 month), but that improved tools will resolve the workflow issues that might be caused (or at least compensate for them by saving effort elsewhere, although this is itself dubious as editor time is not the most fungible of resources). Nosebagbear (talk) 16:11, 19 November 2021 (UTC)
I don't think 500 edits is enough. We still see vandalism in extendedconfirmed-locked articles; it is much rarer, but it happens. Also I'd like to not perform a dozen revision deletes every day on my project's AIAB page, just because of all the extendedconfirmed users who would not understand or care about how to deal with this. I think the requirement should essentially be the same as it is to receive rollbacker rights. It could even be the same user group. EstrellaSuecia (talk) 02:11, 20 November 2021 (UTC)
EstrellaSuecia: Just to be clear, someone will make an actual decision to give the user the right, it won't happen automatically. You still think it's not enough? /Johan (WMF) (talk) 14:26, 22 November 2021 (UTC)[reply]
Nosebagbear: Our intention with that text was to point out the specific things we were looking at, e.g. showing "this is from a university library" in IP Info so that would be obvious for anyone looking it up, but yes, I can see how broadcasting that an edit has been done for institution X could be construed as being covered by that text even if it wasn't our intention. I've started a conversation with Legal about this. /Johan (WMF) (talk) 14:26, 22 November 2021 (UTC)[reply]
Hi Johan, I realise 3 weeks is a fairly short period of time by the normal Legal time, but was wondering if they'd responded to this prior to the Christmas break? Nosebagbear (talk) 13:01, 15 December 2021 (UTC)[reply]
I think that on wikis which have 30/500 protection, the partially masked view should be given to extended-confirmed users, and other wikis should get a "partially see IP address" group that is auto-granted to users who meet the 30 days/500 edits threshold; or hold an RfC about universal extended-confirmed protection. For small wikis, the requirements can be lowered to prevent crosswiki vandalism, with 6 months and 5000 global edits needed to see partially masked IPs globally. Thingofme (talk) 15:47, 5 January 2022 (UTC)

Global contributions and other external tools

This is a follow up to Special:Permalink/22188440#Global contributions and other external tools. Is there an update on what we plan to do with the ip_changes table and the Toolforge replicas? I assume we will stop replicating this data, because otherwise IPs will be public. But by doing that, we break tools like XTools Global Contributions and GUC which stewards rely on to check for collateral damage across all wikis before blocking an IP or range. We cannot feasibly check every wiki individually. We could write an on-wiki gadget that uses the API, but this would be orders of magnitude slower than what we're used to now. I suggest the team build a global contributions tool into Extension:GlobalBlocking or something similar. XTools uses the ip_changes table and it is generally very fast, despite having to query up to 900+ wikis, so I imagine getting something like this into production would be acceptable performance-wise. Kind regards, MusikAnimal talk 15:38, 15 November 2021 (UTC)[reply]

Hi @MusikAnimal! We do not yet have a firm answer for what we will do to replace these critical tools but we are thinking about it. We investigated with pulling in global contributions information into the IP Info Feature and it looks like we will be able to do that. It's possible we will expand that tool to include more information in the future. Thank you for calling this out. -- NKohli (WMF) (talk) 04:21, 17 November 2021 (UTC)[reply]
@NKohli (WMF) Thanks, great to hear! I see now the "See global contributions" link in File:IP Info (10 June update).png. As long as that works for ranges, too, we should be in good shape. The data is all there, so it wouldn't seem terribly hard to build a Special:GlobalContributions page that gives a chronological list of edits across all wikis, along with tags, etc. (mw-reverted in particular is helpful). I suppose this special page would only be visible to stewards or whatever other global groups we decide have the right to view IPs.
Possibly unnecessary technical rambling: As I understand it, the production db clusters are now identical to Toolforge's, so the querying strategy could be very similar to what's done in XTools, whereby we query 900+ wikis with just nine individual queries (because there are only 9 database sections: s1, s2, etc). The important point I guess I wanted to make is that this new tool should be built before we break the Toolforge replication of the ip_changes table so we don't break any workflows. If it means anything, this table stores IPs as a hash. They are not discernible to the public without first doing base conversions, so it's already sort of "masked" but probably not in a legal sense :) Thanks and hope all is well, MusikAnimal talk 04:56, 17 November 2021 (UTC)
Better to let tools access a hashed table. This is mostly a self-imposed obligation, w/ flexibility in how to implement it. –SJ talk  17:13, 7 January 2022 (UTC)[reply]

Masking algorithms

With apologies for repeating myself, it would be very helpful for all editors to know whether two contributions come from similar IP addresses, and to find other contributions from the same range. An algorithm such as Crypto-PAn achieves the first of these requirements and may help with the second. Certes (talk) 16:29, 15 November 2021 (UTC)[reply]

Agree. Can you tell if they come from the same VPN? --Wakelamp (talk) 04:32, 21 November 2021 (UTC)[reply]
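
As a rough illustration of the prefix-preserving property mentioned above, here is a minimal Python sketch. It is not Crypto-PAn itself (which is built on AES); it swaps in HMAC-SHA256 as the pseudorandom function, and the key and function names are assumptions made for the example. Each output bit is the input bit XORed with a keyed function of the preceding bits, so two addresses sharing their first k bits get masked forms that also share exactly their first k bits.

```python
import hashlib
import hmac
import ipaddress


def prefix_preserving_mask(ip: str, key: bytes) -> str:
    """Mask an IPv4 address so that shared prefixes stay shared."""
    bits = format(int(ipaddress.IPv4Address(ip)), "032b")
    out = []
    for i in range(32):
        # The flip for bit i depends only on the i bits that precede it.
        digest = hmac.new(key, bits[:i].encode("ascii"), hashlib.sha256).digest()
        flip = digest[0] & 1
        out.append(str(int(bits[i]) ^ flip))
    return str(ipaddress.IPv4Address(int("".join(out), 2)))


def common_prefix_len(ip_a: str, ip_b: str) -> int:
    """Number of leading bits two IPv4 addresses have in common."""
    diff = int(ipaddress.IPv4Address(ip_a)) ^ int(ipaddress.IPv4Address(ip_b))
    return 32 - diff.bit_length()


key = b"hypothetical-secret-key"
a = prefix_preserving_mask("217.117.125.72", key)
b = prefix_preserving_mask("217.117.125.83", key)
shared_bits = common_prefix_len(a, b)  # same as for the unmasked pair
```

A scheme like this lets any reader see that two masked addresses are close, at the cost that every known address also leaks information about its neighbours.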

IP-based versus session-based masking: Why not both?

Consider this sequence of events:

  1. Alice makes an edit from 1.2.3.4, but does not clear cookies afterwards
  2. Alice makes an edit from 1.2.3.5, but remembers to clear cookies afterwards
  3. Alice makes an edit from 1.2.3.5

Now either IP-based or session-based masking alone "breaks the chain" at some point. But what if these edits are publicly logged as:

  1. Masked User 473982743:47982
  2. Masked User 027469812:47982
  3. Masked User 027469812:51023

Now everyone can see that there might be one user behind all these edits, but we haven't given up her IP. Suffusion of Yellow (talk) 20:10, 15 November 2021 (UTC)
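
The hybrid naming above could be generated along these lines. This is a minimal Python sketch under stated assumptions: the first number is derived from the IP and the second from the session cookie, both through a keyed hash; `SECRET`, `token`, `masked_name` and the cookie values are hypothetical and only mirror the "Masked User IP-part:session-part" format of the example.

```python
import hashlib
import hmac

# Hypothetical server-side secret used for both halves of the name.
SECRET = b"hypothetical-server-secret"


def token(value: str, digits: int) -> str:
    """Derive a fixed-width decimal token from a value with a keyed hash."""
    digest = hmac.new(SECRET, value.encode(), hashlib.sha256).digest()
    return str(int.from_bytes(digest[:8], "big") % 10**digits).zfill(digits)


def masked_name(ip: str, session_id: str) -> str:
    """Combine an IP-derived part and a session-derived part into one name."""
    return f"Masked User {token('ip:' + ip, 9)}:{token('session:' + session_id, 5)}"


def parts(name: str):
    """Split a masked name into its (IP part, session part)."""
    ip_part, session_part = name.split(" ")[-1].split(":")
    return ip_part, session_part


# The three edits from the sequence above, with made-up cookie values.
edit1 = masked_name("1.2.3.4", "cookie-A")
edit2 = masked_name("1.2.3.5", "cookie-A")  # new IP, same cookie
edit3 = masked_name("1.2.3.5", "cookie-B")  # same IP, cookie cleared
```

Edits 1 and 2 share the session part and edits 2 and 3 share the IP part, so the chain stays visible without the address itself being shown.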

  • The third edit, "Alice makes an edit from 1.2.3.5", is indistinguishable from "Bob makes an edit from 1.2.3.5", so an assumption that all 3 edits are from one user is incorrect. The correct assumption is that Alice and Bob have the same IP, and then if we somehow know Alice's IP we know Bob's IP as well. This is a minor threat but still a threat. (In my opinion this is not so far from giving the plain IP. The changes are redundant in such a case.) --Igel B TyMaHe (talk) 08:13, 16 November 2021 (UTC)
    Good point. This threat also applies to a pure IP-based masking scheme, too. Combining both schemes doesn't make it any worse, unless I'm not thinking of something. Suffusion of Yellow (talk) 23:23, 16 November 2021 (UTC)[reply]
    This starts to get into a meta-discussion (not the project) IMO about what actually constitutes a unique user. Certainly Alice and Bob are two distinct people. But if they are using the same IP and the same browser on the same device without any additional identifiers (like credentials or a token), are they actually a distinct user? Anyhow, the hybrid approach proposed above does seem like it would eliminate some of the concerns around the limitations of either implementation. SBassett (WMF) (talk) 22:12, 18 November 2021 (UTC)[reply]
    Might I suggest that the "MY LITTLE BROTHER DID IT" essay may hold some relevance to SBassett's question about a unique user. If abuse is coming from an IP, we block it. When blocking an IP, we aren't exactly blocking a person, but rather a connection to the internet. This wouldn't be any different, and Bob still gets blocked for what Alice did. The difference is that, Charlie, who just got Alice's IP, doesn't get blocked, as we could see that Alice definitely changed IP address so it would be pointless to block the one she has moved off. Mako001 (talk) 12:34, 13 January 2022 (UTC)[reply]
  • I like this hybrid approach. ProcrastinatingReader (talk) 19:56, 10 December 2021 (UTC)[reply]
  • I like the hybrid approach. One detail that needs to be worked out is the retention policy for the data. We need to consider scenarios where an authoritarian government (1) grabs a user's computer and checks their cookies, and (2) breaks into a WMF server and steals our logfiles. What policy strikes a reasonable balance between allowing WMF to defend its websites and preventing our data from falling into the wrong hands and causing real life harm to a user or even a mere visitor to the website? This problem is deep and will require ongoing thought and research, regardless of which path is chosen. Jehochman (talk) 18:22, 4 January 2022 (UTC)
I guess that if an authoritarian government actually grabbed someone's computer, the cookies would be fairly irrelevant, browsing history would likely be enough, and we can't do anything about that. The idea of the whole IP masking thing is to avoid the situation where they actually do grab it. Mako001 (talk) 12:34, 13 January 2022 (UTC)[reply]
  • Indeed I see advantages in the hybrid approach too. --.mau. ✉ 18:25, 4 January 2022 (UTC)
  • Yes, using both is certainly an improvement on either. I also don't see why IPs from the same range editing the same page can't be tagged as possible socks (or some less accusatory wording). They both still remain masked. This is not just a vandal fighting issue; ordinary collaborative discussions are affected. Conversations on talk pages will start to become really difficult when unregistered users get involved. At the moment, it is fairly obvious when the same person makes another reply, even from a dynamic IP. With masked IPs this could get very confusing unless there is some kind of marker. SpinningSpark 18:56, 4 January 2022 (UTC)
  • I am also interested in seeing what a hybrid approach would look like. Will there be a trial period with specifics before we decide definitively which approach (or both?) will be used? Spencer (talk) 03:34, 5 January 2022 (UTC)[reply]
  • I also agree with a hybrid approach, but one where the IP is not necessarily transparent. So, besides getting a session handle (e.g. anon-1fe49afc7bc5), for which we can see the user contributions, we can also have a list of anonymised IP identifiers for that user (which might be shared by different anonymous users), and direct access to any contributions made from each (still hidden) IP, and we are still able to (temporarily) block anonymous contributions from a given IP (even without seeing it!), or pass it on to a checker for more information if necessary. MarianoC 10:48, 7 January 2022 (UTC)
  • I like this approach, too. It could even be improved further by including a hash for the IP range (first three numbers) only: then one would see that all edits are from the same range without knowing the range... --Qcomp (talk) 13:47, 7 January 2022 (UTC)[reply]
  • Support, came here to suggest this, along with an explicit nod to basic hashed fingerprinting. That also elevates the feel of the whole project from "switch from one suboptimal solution to a second suboptimal solution, with added transition costs and confusion" to "extend current system to add more useful nuance + make everyone's lives easier" ;) –SJ talk  17:16, 7 January 2022 (UTC)[reply]
  • I also like this approach, but there will be a lot of user names if the IPs or cookies change, and it's hard to hold a discussion with masked usernames that way. However, it would be easier to combat vandalism if we use both cookies and IPs for IP editing. With cookies, a temporary account for an IP user disappears frequently: once the cookie changes, the link to earlier edits is lost. Thingofme (talk) 11:58, 8 January 2022 (UTC)
  • Yeah, this should be the #1 option. It makes it much harder to evade scrutiny: ab-users will inevitably forget to always change both the IP and the cookie. Assign the cookie to all users, registered ones too. That would give CUs another way of tracking down sockpuppeteers. Mako001 (talk) 12:21, 13 January 2022 (UTC)

IPv6 and privacy[edit]

"Users well-versed with IP addresses understand that a single IP address can be used by multiple different users based on how dynamic that IP address is. This is more true for IPv6 IP addresses than IPv4."

Actually, for purposes of privacy, IPv6 can be considered to be essentially static. Each user is assigned many IPv6 addresses, but each address belongs only to a single Internet connection (router), as opposed to a many-to-many relationship between dynamic IPv4 addresses and users. Hence, the IPv6 range assigned to a single household (it seems it's usually /64) is as personally identifiable as a static IPv4 address. Daß Wölf 19:48, 20 November 2021 (UTC)[reply]

  • No, IPv6 is protected against such simple detection of a user by "privacy extensions". Usually an IPv6 address is issued for one day, after which the client receives a new one. Old IPv6 addresses keep responding for one week. Also, a new IPv6 address is usually issued upon reconnection.

You also need to understand that IPv6 segments have a capacity of trillions of numbers. Therefore, mobile operators often create one IPv6 segment for the entire country. Therefore, if a vandal works from such a segment, you can neither block him by IP, nor even understand what city he is in.

An additional point is that it is rather difficult to apply masks to IPv6 addresses. The "interface identifier" portion of the address is actually under the control of the client. As a result, all users within one IPv6 segment are fundamentally indistinguishable.

Vandals will not need open proxies in the near future, because their smartphone is enough for them to "crack" the protection of Wikipedia.

In the near future, IPv6 will replace IPv4. In the US, over 40% of connections are already IPv6.

Everyone who is crying here about blocking vandals' IP addresses needs to understand that in the next few years, IPv6 addresses will make it almost useless to study them, as blocking masks, and it will be impossible to block IPv6 segments. Blocking the IPv6 segment of a mobile operator means cutting off millions or even tens of millions of people.

Instead of agonising over access to someone else's personal data, it is better to think about modern means of identifying anonymous users. These techniques are NOT based on IP addresses, but on fingerprints derived from the characteristics of browsers and hardware. This is similar to the User Agent, but much more selective (identification accuracy is higher than 99.999%). Also, you will not be able to tell that a User Agent is being faked. Modern spoofing addons like "Random User Agent" choose very realistic options from templates. However, browser fingerprints are currently blocked only very crudely, and it is easy to detect that a user is doing this. Such users can be blocked as users of open proxies are now blocked.--2A00:1FA0:C433:8FC2:ACAE:2BAC:619E:8195 16:30, 21 November 2021 (UTC)[reply]

If this really is as technically feasible as claimed, it is a great idea. SpinningSpark 18:57, 4 January 2022 (UTC)[reply]
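The /64 question debated in this thread can be illustrated with a small sketch using Python's standard `ipaddress` module. It only shows how addresses are grouped by their /64 network. Whether one /64 corresponds to one household or to a much larger group of mobile users depends on the provider, which is exactly the point in dispute above. The function name and the sample addresses are illustrative assumptions.

```python
import ipaddress
from collections import defaultdict


def group_by_slash64(addresses):
    """Group IPv6 address strings by the /64 network that contains them."""
    groups = defaultdict(list)
    for text in addresses:
        network = ipaddress.ip_network(f"{text}/64", strict=False)
        groups[str(network)].append(text)
    return dict(groups)


# Documentation-prefix addresses (2001:db8::/32), used purely as examples.
edits = [
    "2001:db8:aaaa:1:1111:2222:3333:4444",
    "2001:db8:aaaa:1:9999:8888:7777:6666",
    "2001:db8:bbbb:2::1",
]
groups = group_by_slash64(edits)
for network, members in groups.items():
    print(network, len(members))
```

The first two addresses differ only in the client-controlled lower 64 bits, so they land in the same group even though the full addresses look unrelated.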

Monitoring what happens in terms of editor behaviour[edit]

I wonder if this will change the behavior of anonymous users in some way that we can't foresee; maybe not having an IP address will make them feel uncomfortable in some way.

Is the team going to monitor before and after, to see whether there are statistically significant changes to - The number of new editors, and the split between unregistered and registered - The behavior of unregistered editors - distribution of the number of subsequent edits, number of edits reverted, vandalism etc. Wakelamp (talk) 01:30, 21 November 2021 (UTC)[reply]

@Wakelamp Thanks for the question. We are already tracking some of the metrics you mentioned, such as blocks, unblocks, reverts, page protection and page deletion, for both registered and unregistered editors. We also track metrics about edits from IP editors - this happens by default thanks to our edit database. To measure community health overall, we are also keeping an eye on the number of admins and the admin-to-content ratio on our projects, as well as the number of checkuser requests that are filed by the community. -- NKohli (WMF) (talk) 03:27, 23 November 2021 (UTC)[reply]

Abuse prevention[edit]

Increased abuse mitigation is unlikely to work, as even a one-second revert leaves enough time for a screen dump. Maybe we could reduce abuse if we targeted the vandal's motivation instead and reduced the benefits. There has been lots of discussion in the past, but if they are just after a screen dump, then even a one-second revert is too slow.

If we moved some NPP functionality to Publish, we could

  • warn the vandal before publishing to a high-vandalism article that the edit will be reverted. Stop this NPP functionality if they repeatedly try it (identified using the new cookies)
  • warn that vandalism may be reported to their VPN/ISP in their IP address country
  • Remove the value of the edit by placing a watermark on the page or blanking it (only for the editor), saying the article is waiting for NPP review
  • Give them an option to send a copy of the preview to friends - Even provide a program to do this so they can draw big arrows (but change the wiki URL)

Yes, it would allow a user to work out how to get around some of the tools, but they can do that by trial and error.

Risks of an increased Abuse Mitigation Approach

The three plots above make a few things apparent:

  • The proportion of desirable newcomers entering Wikipedia has not changed since 2006
  • These desirable good newcomers are more likely than their ancestors to have their first contributions rejected
  • The decline in good newcomers is the result of a decline in desirable newcomers. Undesirable newcomers (not shown) retention rate stays constant.

Does Abuse mitigation work

  • The assumption we work under is that reversion in under 5 minutes reduces benefit and thus motivation for vandals. I think the benefit for nearly all is achieved as soon as they hit publish and take a screen shot.

It's the screenshots many of them want (based on searches on google/instagram/twitter for Wikipedia hack/edit lolz and funny); editors want to make lasting contributions, but many vandals may not care. OR they will send their revision of an article to their friends.

As an example of the mentality, this site [1] shows the needs of many of the vandals - it allows an update on your chosen article on ES:WP with a find and replace of your choice.

Rather than mitigation and an us-versus-them culture (even though it is portrayed humorously in this essay), maybe we should try to understand the vandals' motivations, so we can use nudges that decrease their motivation to vandalize and increase their motivation to become WP editors. Wakelamp (talk) 04:29, 21 November 2021 (UTC)[reply]

@Wakelamp I tweaked your comment to fix some of the links. I hope you don't mind. This is very thoughtful and valuable feedback. We are already thinking about changes for the unregistered users and how we can nudge them towards creating accounts as part of this change. I like your ideas about nudging them towards better behavior. This is very timely. I will discuss this with my team. Thank you very much. -- NKohli (WMF) (talk) 04:47, 23 November 2021 (UTC)[reply]

We need fingerprints ID[edit]

I do not understand why most people cling to the very old protection based on IPv4 addresses. With the introduction of IPv6, IP-based security has become obsolete. Most ordinary users, mixed in with vandals, connect through mobile operators. These have giant country-wide IPv6 segments. IPv6 prohibits an operator from tampering with the "interface identifier" portion of the address. Therefore, all attempts to block vandals using an IP address mask for a huge IPv6 segment are absolutely useless, and we still need to redo the vandal protection.

I think anonymous identification cookies are a good idea. Most of the vandals are schoolchildren and stupid people, so they won't think to clear the cookies. Therefore, all their contributions from dynamic addresses will be automatically combined, making it easier to remove it.

We must understand that in the near future, blocking by IP-addresses will stop working altogether, because users will be operating from giant IPv6 segments and you cannot block the IP range without shutting down millions of people.

Instead of outdated IP address protection, it is better to use user attributes. It's a good idea to display the GeoIP tag that Wikipedia is collecting now. You also need to start using professional fingerprint deanonymization tools, many of which are elementary to implement.

Some of the fingerprint algorithms are very easy to implement, and each such method has a probability of determining the uniqueness of the user above 99.9%:

  1. HTTP_ACCEPT
  2. Font Fingerprint
  3. WebGL Vendor & Renderer
  4. and so on

Typically, each fingerprint has a repeatability among users no more than 5,000-10,000 variations. This allows you to introduce even a global identifier on fingerprints as a combination of numbers from fingerprint variants. This does not disclose the user's personal data to other users, but it allows users to be identified with a probability of 99.999%

This will greatly simplify the work of checkusers. Of course, fingerprints can be blocked by addons, but not all of them, for example HTTP_ACCEPT. A special private browser mode is required, similar to the TOR Browser mode from Firefox. However, only a small number of users will be able to enable this, and it is easy to determine that they have done so. For such users, a checkuser should see a special flag such as "professional anonymization enabled". It would be easy to implement a mode that blocks users with professional anonymization enabled from some articles.--2A00:1FA0:C433:8FC2:ACAE:2BAC:619E:8195 15:34, 21 November 2021 (UTC)[reply]

    • To anyone who wants to know more about fingerprints and the probability of detection, I recommend checking the site with a collection of such JavaScript tests. In the Chrome engine it is impossible to block all these fingerprints. Firefox can do it with the privacy.resistFingerprinting option. However, it is very easy to detect that privacy.resistFingerprinting is enabled: blocked fingerprints return empty or known values. --2A00:1FA0:C433:8FC2:ACAE:2BAC:619E:8195 16:44, 21 November 2021 (UTC)[reply]
Good idea. Or an automatic ID founded on hardware fingerprint (or MAC address). We need something more consistent and persistent than a cookie based id. --Jean-Christophe BENOIST (talk) 15:57, 22 November 2021 (UTC)[reply]
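The uniqueness figures quoted in this thread can be checked with a rough back-of-the-envelope calculation. The sketch below assumes three attributes with 5,000 equally likely variants each, that the attributes are independent, and a population of one million users. All three are simplifying assumptions: real fingerprint attributes are correlated and far from uniformly distributed, so the real figure would be lower.

```python
import math


def combined_space(variants_per_attribute):
    """Number of distinct combinations if the attributes were independent."""
    return math.prod(variants_per_attribute)


def probability_unique(space: int, population: int) -> float:
    """Chance that one user shares a fingerprint with nobody else,
    assuming every combination is equally likely (an idealisation)."""
    return (1 - 1 / space) ** (population - 1)


# Three attributes with 5,000 variants each: the low end of the range quoted above.
space = combined_space([5_000, 5_000, 5_000])
bits = math.log2(space)
p = probability_unique(space, population=1_000_000)  # population is an assumed figure
print(f"{space:,} combinations, about {bits:.1f} bits")
print(f"probability of being unique among 1,000,000 users: {p:.6f}")
```

Under these idealised assumptions the result is in the same region as the 99.999% claimed above, but the calculation says nothing about how stable a fingerprint is over time or how acceptable collecting it would be.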

Archived without response[edit]

Hi Johan,

Thought I'd re-note <https://meta.wikimedia.org/wiki/Talk:IP_Editing:_Privacy_Enhancement_and_Abuse_Mitigation/Archives/2021-10> - it didn't receive an answer, and the issue of whether those in non-"friendly" nations would fall under the same recent restrictions as the more normal NDA-roles is relevant. Nosebagbear (talk) 11:27, 30 November 2021 (UTC)[reply]

Nosebagbear: Huh, I somehow missed this before it was archived. Talking to the relevant teams and will update with an answer as soon as I have it. /Johan (WMF) (talk) 11:52, 30 November 2021 (UTC)[reply]
Nosebagbear: This will be included in an update from Legal. /Johan (WMF) (talk) 20:20, 1 December 2021 (UTC)[reply]

What about global rights holder[edit]

For groups active in SWMT work, the ability to see full IPs is critical, as you can see from the SRG reports. Members are added to the Global Rollback and Global sysop groups after at least 5 days and 14 days of discussion respectively. The discussions are held at SRGP, which IMHO is as rigorous as, or more rigorous than, a sysop RFA on a small wiki (where sysops can be promoted to temporary status without any vote, albeit for at most 3 months and mostly for 1 month). So it makes sense to grant access to full IPs to GR/GS. As for SWMT team members, we could IMHO set up a lesser right to see IPs for active users who are not yet ready for global rollback. Hope this can be addressed by the team, thanks. Camouflaged Mirage (talk) 11:54, 30 November 2021 (UTC)[reply]

We definitely want good global coverage. Thanks for the suggestion, will talk to the team. /Johan (WMF) (talk) 11:56, 30 November 2021 (UTC)[reply]
I've talked to Legal and to the rest of the team and this seems doable. /Johan (WMF) (talk) 02:09, 1 December 2021 (UTC)[reply]
Thank you @Johan (WMF) for the promising replies. Camouflaged Mirage (talk) 09:26, 1 December 2021 (UTC)[reply]
Camouflaged Mirage: We'll have to look into implementation, of course, but this has been added to our plans. As you say, global coverage is necessary and we don't want to mess things up for SWMT. /Johan (WMF) (talk) 10:54, 1 December 2021 (UTC)[reply]
I realised I sound unnecessarily vague above. Both Legal and we in the Product team say "yes, this makes sense". Most of our discussion has been around things like "what if the local wiki has set a higher threshold for access", but I figure that's very unlikely to be the case for the wikis SWMT look after anyway. /Johan (WMF) (talk) 20:22, 1 December 2021 (UTC)[reply]
@Johan (WMF) Yeah, this is a valid point. Global Sysops act only on wikis that have consented to it, so they are not active on all projects. Global Rollback is truly global: although we mainly care for the smaller wikis, our rights apply across all projects. And all GS are GR by default, in that they can just ask for GR. I know some projects are concerned about GR rights being misused. So this is a concern that needs broader global consensus, much as suppressredirect / autopatrol can be issues on some wikis. I suggest we poll the projects to see what their threshold is (e.g. the application process); if all of them seem to match the rigour of the SRGP discussions for GR, I think we are fine. Camouflaged Mirage (talk) 12:59, 4 December 2021 (UTC)[reply]

Questions about cookies and other things[edit]

Why are session cookies being suggested over persistent cookies? For cookies like centralauth_Session, they expire the moment I close the browser. Wouldn't persistent cookies that take a long time to expire be more helpful there?

Secondly, in the cookie implementation, if a person uses two different devices on the same IP address to edit, will they appear as two people or one person? Will it be possible for the software to be able to assist in associating these two pseudo-accounts to one another without using Checkuser tools?

Finally, in the encrypted IP implementation, will static IPs present themselves as the same encrypted address across an indefinite period of time? Couldn't this information be used to associate an encrypted static IP address with their unencrypted state? Also, how will IP ranges be affected by encryption? At the moment, it is generally necessary to review an IPv6 user's /64 address to get any meaningful data on their prior contributions (which is where blocks usually occur as well). The current documentation does not make this explicit, but it does seem to imply that it will mostly not be a thing. –MJLTalk 22:10, 9 December 2021 (UTC)[reply]

MJL we are going to be using a persistent cookie. –– STei (WMF) (talk) 20:58, 14 December 2021 (UTC)[reply]
I haven't seen anyone suggest using HTTP session cookies, rather, sessions based on a cookie (similar to how the "Remember me" checkbox works on login). In a pure session-based implementation, edits from different devices that happen to use the same IP address would appear with different identifiers. The same would apply if, for example, an LTA decided to clear their cookies after every edit they made. AntiCompositeNumber (talk) 22:40, 9 December 2021 (UTC)[reply]
Couldn't device fingerprinting be used to solve the LTA example without much issue? –MJLTalk 20:03, 12 December 2021 (UTC)[reply]
If you consider throwing out any semblance of the WMF caring about user privacy to not be an issue, sure. AntiCompositeNumber (talk) 00:59, 13 December 2021 (UTC)[reply]
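One way to picture the question raised in this thread, whether a static IP would keep the same encrypted form indefinitely, is a keyed hash that also takes a rotation period as input. This is only a sketch under assumptions: the secret key, the choice of a calendar month as the period and the function names are hypothetical, and the project documentation does not say that this is how encryption would work.

```python
import hashlib
import hmac
from datetime import date

SECRET_KEY = b"example-secret-not-for-production"  # hypothetical


def period_label(day: date) -> str:
    """Identify the rotation period; a calendar month is an arbitrary choice."""
    return f"{day.year:04d}-{day.month:02d}"


def rotating_id(ip: str, day: date) -> str:
    """Keyed hash of the period and the IP, truncated to a short label."""
    message = f"{period_label(day)}|{ip}".encode("utf-8")
    return hmac.new(SECRET_KEY, message, hashlib.sha256).hexdigest()[:12]


jan_a = rotating_id("198.51.100.7", date(2022, 1, 3))
jan_b = rotating_id("198.51.100.7", date(2022, 1, 28))
feb = rotating_id("198.51.100.7", date(2022, 2, 1))
print(jan_a == jan_b)  # same period, same label
print(jan_a == feb)    # new period, different label
```

The trade-off is visible in the example: rotation limits how long a masked label can be tied back to one static address, but it also stops patrollers from linking that address's edits across periods.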

Keep going![edit]

Just leaving a message to say that I think that this is going in the right direction. Will both approaches be tested? --Gnom (talk) 23:10, 9 December 2021 (UTC)[reply]

Thanks! Our current plan is to decide on one of the approaches in January.
We'll send out some reminders at the beginning of January, to make sure more people see it and can leave feedback before January 18. /Johan (WMF) (talk) 00:14, 10 December 2021 (UTC)[reply]
@Johan (WMF): to clarify, have you relayed Suffusion's idea above (about a hybrid approach) to legal and/or the devs? I'd rather see that trialed than a one-or-the-other, tbh. ProcrastinatingReader (talk) 04:15, 31 December 2021 (UTC)[reply]
I have. We're not making any decisions right now, because we have said we'll listen to feedback until January 18 before we make any decisions at all on this and don't want to break that promise, but it's part of the feedback we'll be processing. /Johan (WMF) (talk) 00:44, 3 January 2022 (UTC)[reply]
I should note that I'm happy to see the increased activity on this page post the mass message, that was a good call. Nosebagbear (talk) 10:55, 5 January 2022 (UTC)[reply]

Welcome[edit]

So, very soon we'll be sending out a notice to all admins. (This might include some people who very recently were admins but no longer are, and exclude some who very recently became admins.) This is not because we're interested only in their opinion, but because it's one way we've identified to reach out to the communities and people who are likely to be active in vandal-fighting, without spamming every content contributor.

We're specifically interested in feedback on the solutions we have listed in IP Editing: Privacy Enhancement and Abuse Mitigation#IP Masking and how to protect the wikis (9 December 2021 Update). We're of course open to feedback on everything else, too. And I figure we'll get some "why are you doing this?" again, so I'll just point out again that we're doing this because the Wikimedia Foundation Legal department has told us it needs to happen, because of changing regulations and norms around privacy – the regulations are not the same today as they were in 2001. /Johan (WMF) (talk) 13:45, 4 January 2022 (UTC)[reply]

  • @Johan (WMF): I got the message three times (admin on three different wikis). If you can't even tell that an admin is the same user across wikis, I'm not convinced that "Session based identity" will help at all with cross-wiki spam compared to sticking with IP addresses or a direct analogy. Thanks. Mike Peel (talk) 18:27, 4 January 2022 (UTC)[reply]
    The easiest way to make sure all administrators are notified is to notify all administrators on all wikis. Inventing a way to remove duplicates and to prioritize wikis (per edit count? per importance? Did you expect a notification on wikidata or enwiki?) is largely unnecessary. ToBeFree (talk) 18:33, 4 January 2022 (UTC)[reply]
    Ain't that one of the good things of SUL, that you know such stuff and just send it to the Homewiki? It's not even necessary that it's a Wiki where s/he is sysop, it's just home base. Grüße vom Sänger ♫(Reden) 21:47, 4 January 2022 (UTC)[reply]
    I'm sure the complaint would then have been "Why have I received this message on the wrong wiki? I [could] have turned off notifications there." ToBeFree (talk) 23:41, 4 January 2022 (UTC)[reply]
    This —TheDJ (talkcontribs) 10:51, 8 January 2022 (UTC)[reply]
  • I'm an admin on several wikis, and I also administer single sign-on for a major university as my career, so I'm familiar with changing privacy norms around the web. I'm not at all surprised to see IP-based identification going away. While IP masking would work, I'd be in favor of switching to a session-based identity path. I think this is the more solid and user-comprehensible approach, and it opens better opportunities for anonymous editing in the future. I'm happy to see WMF dealing with this issue proactively.– Quadell 18:29, 4 January 2022 (UTC)[reply]
  • That said, I'm not at all sure this measure would protect any contributor from government prosecution: a malicious government might create a user who gains admin privileges, and the IP privacy measure would then be ineffective. -- Blackcat Ar Icon Contact.svg 18:31, 4 January 2022 (UTC)[reply]
    • Outside the US, if I remember my experience working for a supplier for Deutsche Telekom correctly, governments in many countries (except the US) can gain access to phone/internet details very easily -- although it is far more difficult, if not impossible for corporations to get this information. On the other hand, it is difficult for the US government to gain access to those details (who must seek a court order) while it is very easy for corporations to get it. (And thru these corporations, foreign countries can buy the information either directly or thru a strawman intermediary. There's less privacy out there than we know, dammit.) -- Llywrch (talk) 21:23, 4 January 2022 (UTC)[reply]
      • @Llywrch: I'd be less worried by a corporation rather than a malicious government, but my point was that you cannot declare "private" an IP address then let an admin disclose it. We do not have a policy for admins and admins are anonymous. To be consistent, WMF should issue a rule that whoever has sysop upwards privileges must be disclosed and cannot be anonymous. -- Blackcat Ar Icon Contact.svg 23:40, 4 January 2022 (UTC)[reply]
        I think that if people can't be trusted to see the whole IP address, then people should not be trusted to see the partial IP address. Only checkusers can see the IP address made by anons, and block IP ranges (IP range block log should be kept private to only checkusers). Also, the Confidentiality agreement for nonpublic information must be rewritten and existing functionaries should sign again. Thingofme (talk) 09:54, 8 January 2022 (UTC)[reply]
  • I prefer the session-based approach. It provides more value in being able to identify and communicate with legitimate anonymous editors. However, at the same time, we need abuse filter options to be able to identify multiple new sessions from a single IP. These could be legitimate (from a school, for example), but will most likely represent abuse or bot activity. One feature I haven't seen mentioned yet. When a session user wants to create an account, it should default to renaming the existing session ID to the new name of their choice. We need to be able to see and/or associate the new named user with their previous session activity. -- Dave Braunschweig (talk) 18:37, 4 January 2022 (UTC)[reply]
    Hello Dave, when a session user creates an account we are planning to not carry over their edits. This is because we can’t be sure that the device was used by a single person (the one now creating the account) and therefore shouldn’t attribute all the edits to them. This could be common on public or family computers, for example. ––STei (WMF) (talk) 14:46, 13 January 2022 (UTC)[reply]
  • The ability to perform purely session-based blocks in addition to the existing IP+session blocking would be an interesting upgrade. Being able to communicate with IPv6 users through their session instead of their repeatedly changing IP address would also be a benefit. ToBeFree (talk) 18:42, 4 January 2022 (UTC)[reply]
  • @Johan (WMF): Not an admin but saw this on various talkpages I'm watching. I noticed the message said "There will also be a new user right for those who need to see the full IPs of unregistered users to fight vandalism, harassment and spam without being admins. Patrollers will also see part of the IP even without this user right". Who does "patrollers" refer to here? New page patrollers? Recent changes patrollers? Something else? I do think this right would be somewhat useful for the work I do patrolling. Elli (talk) 18:46, 4 January 2022 (UTC)[reply]
    Anyone is equally welcome to comment! ("[N]ot because we're interested only in their opinion", as I wrote above.) Who would need this user right would be largely for the local community to decide. Anyone involved in vandal-fighting (or who needs it for some other reason) who lives up to some simple requirements, and can be granted the right through some sort of simple community process. I would personally find it more useful when patrolling recent changes than when looking at new pages on my home wiki, but I don't imagine we know all potential needs. Johan (WMF) (talk) 18:54, 4 January 2022 (UTC)[reply]
    Including it in dewiki's "Aktiver Sichter" and enwiki's "rollbacker" could be an idea. The former is assigned automatically according to rather strict criteria. Is manual assignment a strict requirement? ToBeFree (talk) 18:58, 4 January 2022 (UTC)[reply]
  • I remain a steadfast IP anti-masker, but I feel that my opinion probably won't change things at this point. I'm grateful as an enwiki admin for the ability to still see IPs to block malicious ones. This will probably speed up the already-inevitable path to mandatory registration on enwiki, I think. John M Wolfson (talk) 18:49, 4 January 2022 (UTC)[reply]
    • @John M Wolfson: indeed, I agree almost completely. In order to fix a minor problem we are creating a bigger one. At this point either a) allow open proxies so that IP addresses cannot be tracked, or b) impose mandatory registration. WMF seemingly chose to hide IP addresses from malicious eyes, even though admins, who can see those IPs, are largely anonymous. -- Blackcat Ar Icon Contact.svg 23:45, 4 January 2022 (UTC)[reply]
  • Anyone seriously interested in vandalizing or evading our controls will be using more than a single computer or single browser. So do many good-faith users (I currently use ≥3 at least occasionally). They are usually but not always working from the same router, and I almost always use the same browser, but not necessarily the same version. If I were an unregistered IP user, trying to link what I do via session-based cookies would seem to be useless. DGG (talk) 18:56, 4 January 2022 (UTC)[reply]
    • There certainly would be an incentive for good-faith unregistered users to use an account. I'm not certain that's a bad thing. LtPowers (talk) 19:00, 4 January 2022 (UTC)[reply]
  • Based on a quick perusal of the issues, session-based IDs seem like the best solution. LtPowers (talk) 19:00, 4 January 2022 (UTC)[reply]
  • In the session based model, will we be able to see all sessions belonging to a given IP? Kusma (talk) 20:11, 4 January 2022 (UTC)[reply]
  • Hi Johan, I (and presumably all the other admins) just got the message. Since most of it I've already discussed with you, I was just wondering why it places "norms" before "regulations" in the reasoning for why. I thought we'd settled that regardless of if they have or haven't changed in this way, Legal have no right to change things on that basis, and using it as cover is not acceptable for a top-down imposed change Nosebagbear (talk) 21:40, 4 January 2022 (UTC)[reply]
    Changing norms lead to changing regulations, is probably how my mind went at the time, but it was weeks ago, so honestly I couldn't tell you. This is me putting together a very brief explanation in a message I wanted short enough for people to feel they had the time to translate it. I wouldn't read too much into it. (: Johan (WMF) (talk) 23:20, 4 January 2022 (UTC)[reply]
  • Considering there has been near-unanimous opposition to this IP masking since day one and "But WMF legal told us to!" is the only reason anyone is pushing this and as more and more Wikimedia projects adopt the required registration policy I think simply implementing that globally would be better than this absolute mess. Don't solve a tiny problem by creating a much larger one. Naleksuh (talk) 23:53, 4 January 2022 (UTC)[reply]
  • I understand why this happens, but I think it is very close to seriously damaging our capacity to administrate wikis:
    • If a user edits on two wikis simultaneously from the same device/browser (e.g. creates an article and adds a link to Wikidata), whatever the scheme, the identifier absolutely must be the same (otherwise it is just useless). If a user adds similar nonsense pages to all wikis, but they are Anonymous1235 on enwiki, Anonyme245 on frwiki and Анонім74 on ukwiki, we will need a steward to find this out, while today any bystander can detect this.
    • For session-based identifiers, I would really like us to use a MAC address rather than a cookie. Cookies are very easy to circumvent, and I don't think cookie restrictions will be at all effective against vandals. It would really be a cat-and-mouse game: an admin blocks by cookie, a vandal deletes it and can edit again immediately. A vandal can literally change their identifier after every single edit, and this change will take just 1-2 sec (restarting a router to get a new IP address from a dynamic range usually takes 1-2 min), which makes circumventing restrictions unreasonably easy. Unless our vandal has zero IT skills, this would be completely useless, as modern browsers make it very easy to purge cookies (or offer quick access to private mode).
    • If using MAC addresses is not possible, keeping blocks by IP addresses is the best option. This approach has lots of disadvantages, but at least we know how to deal with it. However, there are two important conditions.
      • We absolutely need a replacement for range identifiers, notably for things like 3RR, bans and filters. Today we know that three anonymous reverts in a row made from the same range are almost surely by the same person (the rarer the range, the more likely this is). In future we would need to easily find out whether these three anonymous edits come from the same range or from completely unrelated ones. There are significantly more use cases; for instance, we have AbuseFilters targeting some very specific behaviour from specific ranges. The most common tools we need would be <do anons X, Y and Z come from the same range> or <check all edits from X's range>. We can do it in a smart way, e.g. automatically querying WHOIS to get the respective range of the provider, or automatically checking provider IDs to identify a different range of the same provider.
      • We need to prevent extra burden on admins. Anonymous edits are already quite hated by our admins. If these changes will mean that non-admins can do literally nothing with an IP edit (cannot check what other edits on this topic were made from this range, cannot check if they circumvent sanctions etc.), this would put an unreasonable burden on admins. I don't believe we would magically get more admins, so quite likely we will just end up doing what the ptwiki has done and ban IP edits altogether.
  • To sum up: session blocks are OK only if we use MAC addresses instead of cookies, IP bans are OK only if we have a good replacement for range tools for non-admins; both need to keep identifiers consistent cross-wiki. If not, banning IP editing altogether seems just the simplest option — NickK (talk) 00:00, 5 January 2022 (UTC)[reply]
    There is fortunately no way for an internet website provider to obtain the MAC addresses of their clients. This would be a privacy nightmare. IPv6 users can use their MAC address to generate an address in their /64, but this method is usually disabled for privacy reasons. Instead, a random address is generated ("Privacy Extensions", RFC 4941). ToBeFree (talk) 02:17, 5 January 2022 (UTC)[reply]
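The two address-generation methods mentioned in this reply can be contrasted in a short sketch. The first function follows the modified EUI-64 scheme, in which the MAC address is split in half, `ff:fe` is inserted in the middle, and the universal/local bit is flipped. The second simply draws 64 random bits, which is a deliberately simplified stand-in for the temporary addresses of the Privacy Extensions, not the RFC's actual algorithm. The prefix and the MAC address are documentation values chosen for illustration.

```python
import ipaddress
import secrets


def eui64_interface_id(mac: str) -> int:
    """Modified EUI-64 interface identifier derived from a 48-bit MAC."""
    octets = bytes(int(part, 16) for part in mac.split(":"))
    if len(octets) != 6:
        raise ValueError("expected a MAC address of six octets")
    first = bytes([octets[0] ^ 0x02])  # flip the universal/local bit
    return int.from_bytes(first + octets[1:3] + b"\xff\xfe" + octets[3:], "big")


def address_in_prefix(prefix: str, interface_id: int) -> ipaddress.IPv6Address:
    """Combine a /64 prefix with a 64-bit interface identifier."""
    network = ipaddress.ip_network(prefix)
    return ipaddress.IPv6Address(int(network.network_address) + interface_id)


PREFIX = "2001:db8:1:2::/64"  # documentation prefix, example only

# Stable address: the MAC is recoverable from the lower 64 bits.
stable = address_in_prefix(PREFIX, eui64_interface_id("00:00:5e:00:53:01"))

# Temporary address: random lower 64 bits (simplified, see lead-in).
temporary = address_in_prefix(PREFIX, secrets.randbits(64))

print(stable)
print(temporary)
```

In both cases the /64 prefix is unchanged, which is why the range rather than the full address is the stable part that a website sees.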
    @ToBeFree: Thanks for this clarification, I did not do any research into it. Thus our only option to ban a specific device is setting a cookie that a vandal can immediately delete? Or do we have any better option? For instance, pairs <IP+port> might be good in many cases, although we are not using them now AFAIK. It would be good to have a brainstorming on what our options (even if not readily available now) are — NickK (talk) 20:10, 6 January 2022 (UTC)[reply]
    A new source port is used for each connection, to separate them from each other. Connection reuse for multiple requests to the same server is a thing, but there may be multiple connections to the same server to improve performance, and at very least closing the browser and re-opening it throws the connections away. TCP/UDP source ports are thus less useful than IP addresses and cookies for any kind of identification.
    The thing is, if there was something more identifying than cookies (Browser fingerprinting has been mentioned above), then using it for identification would contravene the idea behind the whole action: Improving users' privacy. So even if there is something that could be technically used in place of the IP address, it won't be used for exactly the same reasons. ToBeFree (talk) 23:26, 6 January 2022 (UTC)[reply]
    Ah, and to address your banning concern: We can still see and block IP addresses, they're just not public anymore (#Blocking unregistered editors). ToBeFree (talk) 23:29, 6 January 2022 (UTC)[reply]
    @ToBeFree: Well, my main concern is: we are deciding which identifiers most registered users will see. The problem is that there are two competing needs: we do want to protect the privacy of good-faith casual unregistered contributors, but we need to be able to fight annoying unregistered vandals. There are far more non-admin vandal fighters than admins (perhaps by an order of magnitude). I want a meaningful approach that would allow non-admin vandal fighters to understand which vandal they are dealing with; otherwise we would have a real communication problem between admins (who do see IPs) and non-admins (who would have some weird identifiers). While I would definitely prefer cookie identification for a 60-year-old lady contributing from time to time on the same topic but having no idea how IPs or registration work, I do want IP identification (incl. range and provider) for a 16-year-old geeky vandal who knows how to delete cookies or get a new dynamic IP.
    Regarding ports, of course I mean IP + port pairs as a way of identifying distinct connections, particularly for dynamic ranges. We already have this data somewhere and it is quite an industry standard, thus I think it is an option that should be explored and might be helpful — NickK (talk) 21:48, 10 January 2022 (UTC)[reply]
    Fortunately, there will be a user right we can give to non-admin vandal fighters to let them view the IP address ("The IP address itself will be visible to administrators and patrollers"). Perhaps it can even be given to all members of existing antivandalism groups such as enwiki's "rollbackers", or even to all members of automatically assigned groups such as dewiki's "Aktive Sichter". We'll need details about this (18:58, 4 January 2022 (UTC)).
    IP+port pairs are being used to identify connections all the time; that's their technical purpose. Displaying source ports to users – randomly generated, meaningless numbers up to 65535 – doesn't provide any benefit, though. ToBeFree (talk) 22:57, 10 January 2022 (UTC)[reply]
  • The session identity needs WAY more thought before rolling out. Cookie-based session identities? Come on now! That is way too easy to circumvent, which the deliberate miscreants will do anyway, negating any benefit that might have come from it. NickK has a good point that MAC addresses would be better. There is no harm in rolling out IP masking first. It doesn't change anything from the administration side, causes minimal disruption to existing workflows, and preserves privacy (particularly for editors in countries like China who are wary of repercussions for editing here). Anachronist (talk) 02:04, 5 January 2022 (UTC)
  • As long as IP based blocking remains possible, it seems like the session-based approach has some advantages. I don't think session-based blocking will be useful, almost everyone knows how to clear cookies. -- hgzh 14:58, 5 January 2022 (UTC)[reply]
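
The "<do anons X, Y and Z come from the same range>" check requested earlier in this section can be illustrated with a minimal sketch. It assumes fixed prefix lengths (/24 for IPv4, /64 for IPv6), which are illustrative defaults rather than anything proposed by the team; the WHOIS-based lookup suggested above would be more accurate. The function name common_range is hypothetical.

```python
import ipaddress


def common_range(addresses, v4_prefix=24, v6_prefix=64):
    """Return the one network that contains every address, or None.

    The prefix lengths are assumptions made for this sketch; a provider
    allocation looked up via WHOIS would be a better boundary.
    """
    networks = set()
    for text in addresses:
        ip = ipaddress.ip_address(text)
        prefix = v4_prefix if ip.version == 4 else v6_prefix
        # strict=False accepts a host address and returns its enclosing network.
        networks.add(ipaddress.ip_network(f"{ip}/{prefix}", strict=False))
    return networks.pop() if len(networks) == 1 else None


# Three reverts from one range, then two unrelated addresses.
print(common_range(["192.0.2.10", "192.0.2.200", "192.0.2.77"]))
print(common_range(["192.0.2.10", "198.51.100.7"]))
```

A tool for non-admins could answer only "same range: yes/no" from such a check, without ever displaying the addresses themselves.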

Admins responsibility

Hi. Does this come with any additional responsibility for administrators not to make this hidden information public? In any possible way? Piastu βy język giętki… 18:37, 4 January 2022 (UTC)

They already have that responsibility. 86.14.197.26 18:53, 4 January 2022 (UTC)[reply]

What's going to happen at switchover time?

Hello, I'm an admin on the English Wikipedia. We have countless discussion pages used to track vandals by their IP addresses. Suppose there's an ongoing discussion about the notorious 127.0.0.1 vandal. This new system is switched on; suddenly it's illegal (for lack of a better word, but you get me) to edit that discussion because IP addresses are private. That raises questions:

  • Are we going to have to search and replace every instance of 127.0.0.1 with TemporaryName123 and revision-delete the entire page history up to that point?
  • What about all the archives? A "sock puppet" log may be dormant for a long time until a vandal pops up again, and have a bunch of information about IP addresses that were used before the new system came in - what to do about those?
  • Will the WMF have a cut-off date for how far back we'll have to manually edit IPs out of view?
  • Will there be a tool that administrators can use to turn IP addresses into their masked versions, to facilitate discussions that were IP-based? Edit: I think this is absolutely essential to avoid huge confusion

I don't expect you to have answers to all these right away, and I support this change, but boy. This is going to be hard. — Scott talk 18:55, 4 January 2022 (UTC)[reply]

Hi Scott, I don't have a full answer for you, but we will not implement this retroactively, that would just create too much work and trouble for everyone. The already published IPs stay. Johan (WMF) (talk) 19:07, 4 January 2022 (UTC)[reply]
It does however leave open the discussion about how particular ranges will get discussed and referred to. People who have the new right will see the same wave of vandalism (or what have you) and that it's TempName5 and similar. When they discuss it, are they going to have to just use the new naming? How do we make people who haven't made the jump (or have, but haven't associated it with the new name-set) understand it's the same problematic editor? This seems to carry a major ongoing risk of tying IPs to masked IPs, or of hindering the process. Nosebagbear (talk) 21:37, 4 January 2022 (UTC)
@Nosebagbear: That's exactly what I meant with my last question. For continuity we must have a tool that will "hash" IPs into their new, masked form. — Scott talk 15:19, 5 January 2022 (UTC)[reply]
Hello @Johan (WMF):, considering that sysops have the privilege of seeing the IPs that will soon be "private", are you going to introduce a requirement for sysops and upwards, such as that they must be identifiable and not anonymous, at least to WMF staff? -- Blackcat 23:49, 4 January 2022 (UTC)
That's not needed even for CUs - I believe only BOT (public-known) and MCDC (WMF-known) members fall into this camp currently. Nosebagbear (talk) 10:52, 5 January 2022 (UTC)[reply]
@Blackcat the Wikimedia Foundation will not require identification of admins for this. Find more details about the new user right needed to view IP information here. –– STei (WMF) (talk) 11:41, 6 January 2022 (UTC)[reply]
I think merely seeing IPs should not require identifying to the noticeboard, as this is time-consuming and not the best way, because IPs are sometimes not that powerful in determining someone's privacy. Personally, I don't think that admins and IP viewers must be 18 years old, be experienced (we don't ban minors from being admins, as long as they work in a mature way and do not act like a child), or live outside the countries which ban Wikipedia. Thingofme (talk) 15:33, 5 January 2022 (UTC)

What if not? Preset sanctions

What happens if the people who promise to keep IPs confidential eventually publish this information?

I believe that there must be preset sanctions, so that users can lose their rights if they fail to keep the IP information confidential. --FocalPoint (talk) 18:58, 4 January 2022 (UTC)[reply]

Doubts

What worries me about this is that admins will have easy access to these IPs. I would rather not have it myself. Because what then: the time will come when the police, or some other service, shows up and says "give yourself access, or else..." And will I stand up to them? Forget it. I will give myself access as quickly as I can. So I would prefer there to be some not-entirely-easy procedure for this. --5.60.193.155 19:59, 4 January 2022 (UTC)

Cookie based behaviour

There are two questionable statements in the description of how cookie based identities work:

  • It is likely that many users will end up with a semi-permanent talk page unless they specifically try not to.
  • Another advantage to note is that talk page messages will no longer end up with the wrong recipient in any scenario.

Many web sites (using cookies for sessions) recommend that you clear your cookies afterwards. With normal browser setups, clearing cookies for that site only is difficult, so most people will clear all cookies, meaning they will have a new talk page every time. Of course, many will not, but the wording suggests that the system generally leads to persistent talk pages. Also people using a public computer will often get cookies cleared between sessions, even if they routinely use the same computer.

There are still scenarios with shared talk pages. The simplest is a desktop computer used by a family, with no individual accounts, or with family members occasionally using each other's accounts on the computer. I think this is quite common. Another scenario is a public computer, where cookies are not cleared between uses. People might not log out, so even where cookies are cleared on logout, unrelated people might use the same cookie.

If we want talk pages to remain stable for unregistered users, we might want to provide the cookie in plaintext on their screen, allowing them to save it, and allow restoring the identity by entering that cookie somewhere (it doesn't of course need to be the cookie itself, but I see no reason to use anything else). That would be a login without registration, but the first point suggests that that is what we want.

LPfi (talk) 20:19, 4 January 2022 (UTC)[reply]

Browsing in incognito mode, as many anonymous users do, completely negates any advantage that may have been imagined for this session-identity idea, because the entire session (cookies and all) disappears when the browser window is closed. Definitely not ready for rolling out yet. Needs much more thought. Anachronist (talk) 02:08, 5 January 2022 (UTC)[reply]

Question

Hello! When is this project scheduled to be completed? AlPaD (talk) 20:37, 4 January 2022 (UTC)

Per the message, they will decide on the final move after January 17. —Atcovi (Talk - Contribs) 21:16, 4 January 2022 (UTC)[reply]
OK, thank you. AlPaD (talk) 07:08, 5 January 2022 (UTC)[reply]

The encyclopedia that anyone can edit

It's not a good sign when we are asked to comply with a legal directive whose scope is unclear. By what authority is that directive issued? In whose name? See Instruction creep, and en:Instruction creep.

When an editor is used to jumping in to edit an article on a favorite topic, without logging in, they are currently exposing their IP address, which is of interest, for example, to authoritarian regimes. If WMF were to assign a time/device-based identifier to that editor, an edit session and cookie would get created on the editor's device (phone/tablet/computer) on their behalf, by default and without any mental effort of their own. If the editor were to select their own identifier, that cookie would have their en:username embedded in it. A sock puppet investigation, or vandalism block, would be aided by a database of this kind of information. This would embargo the devices of those malicious users who have not selected a username.

At the very least, the editor is going to have to acknowledge that they have read, and intend to comply with, the encyclopedia's published standards, such as the en: five pillars. --Ancheta Wis (talk) 20:41, 4 January 2022 (UTC)

Can unregistered users use a username without a password?

I think unregistered users must be given a choice to make usernames without passwords. I mean that when an unregistered user clicks on edit, he/she will see a popup showing some alphanumeric usernames, or a text field that allows them to create a username without a password. Such usernames may have their own separate namespace.--Ameen Akbar (talk) 20:50, 4 January 2022 (UTC)

Global groups

Hello, my little contribution here is a mix of the questions from @MusikAnimal and Camouflaged Mirage:. The implementation is still unclear to me. I will surely repeat what has already been said, but it seems essential to me. The interwiki patrol must be recognizable to a GR so that they can provide an acceptable "escalation" to the S or GS. That is to say, a SWMT patroller should not create more work for other user groups (S, GS) because he/she cannot see the damage done by an IP, or even a range of IPs, on all wikis. Cordially, and happy new year 2022. --Eihel (talk) 21:15, 4 January 2022 (UTC)

Kindly, can you write the 'S', 'GS', 'SWMT' in full so we are on the same page? –– STei (WMF) (talk) 12:12, 6 January 2022 (UTC)[reply]
Hi @STei (WMF):, I'm pretty sure that they are: GR = Global Rollbacker, GS = Global Sysop, S = Sysop, SWMT = Small-Wiki Monitoring Team Nosebagbear (talk) 13:04, 6 January 2022 (UTC)[reply]
Courtesy links: Stewards, Global rollback, Global sysops (the global groups) and Small Wiki Monitoring Team. —MarcoAurelio (talk) 13:05, 6 January 2022 (UTC)[reply]

For small wikis, I think the IP based approach is better

For small wikis, I think the IP based approach is better because it is unlikely that two anonymous users will have the same IP, and for a vandal changing their IP is more difficult than erasing cookies --Gat lombard (talk) 22:23, 4 January 2022 (UTC)

In addition, small wikis need a heuristic system to recognize repeated vandalism, since there are few vandals but the same vandal may return repeatedly with similar vandalism. We need an artificial intelligence that recognizes vandalism based on recent past vandalism on that wiki; for small wikis it would most likely block only the vandals and nothing else. A well-known example to solve is the one that keeps happening on the Wikipedia in the Neapolitan language and occasionally on the Lombard one. The vandal is very harmful to the small Neapolitan (nap) Wikipedia. A heuristic system might help. A similar problem could happen to any small wiki --Gat lombard (talk) 22:30, 4 January 2022 (UTC)

I agree with Gat lombard. In my opinion, the IP based approach is better for small wikis. --Starladin (talk) 10:56, 5 January 2022 (UTC)[reply]

Actually it's much easier to change IP than to erase cookies. Of course it depends on the settings, but at least in Finland we use mobile networks a lot, and every time we restart the connection the IP changes automatically. Stryn (talk) 11:31, 5 January 2022 (UTC)

My two cents

I'm just going to dive in here. Maybe it's the wrong place - it's hard to tell.

As an en-wiki admin (but not a check-user) I couldn't give a monkey's cuss about the actual IP address I see. Encrypt it for all I care; it matters not one jot to me. All I need to worry about on a daily basis is seeing the edits made across the full IP range on which one individual is liable to be editing, and doing the occasional geolocate when deciding if they're avoiding a previous block (also based on other editing characteristics).

What I currently absolutely hate is the inability to see IPv6 user contributions across the whole range (usually /64) by default. There is also no button or tool to let me display this - I have to insert "/64" manually, which is so absolutely frustrating. I believe /64 contributions should be shown together by default, even if anonymised, especially as it's totally impossible to add /64 to a url on my mobile phone because it doesn't have a 'Home'/'End' function. So I simply have to let IPv6 vandals get on with it until such time as I'm back on my desktop or laptop.

By default, I need to see all contributions made across an IP address range relevant to one individual. I really don't need to know genuine IP address details. I need a system to recommend/look up the most effective rangeblock for that person which has the least collateral damage. And I need some way to communicate with past and future addresses in that single user's IP address range. For example, if I warn or block a /64 range of IPv6 addresses, I'd like to post, in one go, a warning to all past addresses on that range, as well as ideally having a system that identifies new addresses on that range and automatically repeats my messaging over the next 24-48 hours. What I don't need to know for my routine admin work is their actual IP address. -- The preceding unsigned comment was added by Nick Moyes (talk)

This would be a really interesting set of functionality for the new tools - and a "default to the /64" position for ipv6 would be excellent, and indeed becomes more critical with masking. Nosebagbear (talk) 10:49, 5 January 2022 (UTC)[reply]
Agree. I frequently block persistent disruptive users whose edits are on a /64 range. No need to see the actual addresses, just need to know that they are related. Bagumba (talk) 04:15, 9 January 2022 (UTC)
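
As a rough illustration of the "default to the /64" idea discussed in this section, the sketch below groups IPv6 contributors by their /64 network while leaving IPv4 addresses as they are. The function name group_by_range and the sample addresses are hypothetical, and /64 is only the usual case mentioned above, not a guaranteed allocation size.

```python
import ipaddress
from collections import defaultdict


def group_by_range(addresses, v6_prefix=64):
    """Group addresses so that every IPv6 /64 appears as one contributor."""
    groups = defaultdict(list)
    for text in addresses:
        ip = ipaddress.ip_address(text)
        if ip.version == 6:
            # Collapse the address to its enclosing network, e.g. a /64.
            key = str(ipaddress.ip_network(f"{ip}/{v6_prefix}", strict=False))
        else:
            key = str(ip)
        groups[key].append(text)
    return dict(groups)


edits = ["2001:db8:1:2::1", "2001:db8:1:2:aaaa::5", "2001:db8:1:3::1", "192.0.2.1"]
for key, members in group_by_range(edits).items():
    print(key, members)
```

With masking, the group key could be replaced by a masked identifier, so that an admin sees that the edits are related without seeing the addresses.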

Impact on AbuseFilter

I think it would be best if the impact of IP masking on AbuseFilter were clearly explained on this page on Meta.

Moreover, if the session-based approach is used, it would be ideal if AbuseFilter could access both the IP and the session-based masked username in its rules. For instance, AbuseFilter currently allows throttling by IP and it would make sense for it to be able to both throttle by IP (i.e. across masked usernames from the same IP) and throttle by masked username (even if user changes IP). Huji (talk) 01:29, 5 January 2022 (UTC)[reply]
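
To make the request above concrete, here is a minimal sketch of a throttle that counts hits under two keys at once: the IP and the masked username. It is not AbuseFilter's implementation; the Throttle class, the should_throttle helper, and the limit and period values are all hypothetical.

```python
from collections import defaultdict, deque


class Throttle:
    """Sliding-window counter: trips when a key exceeds `limit` hits in `period` seconds."""

    def __init__(self, limit, period):
        self.limit = limit
        self.period = period
        self._hits = defaultdict(deque)

    def hit(self, key, now):
        hits = self._hits[key]
        # Forget hits that have fallen out of the window.
        while hits and now - hits[0] >= self.period:
            hits.popleft()
        hits.append(now)
        return len(hits) > self.limit


def should_throttle(throttle, ip, masked_name, now):
    # Record the action under both keys before combining, so that neither
    # counter is skipped by short-circuit evaluation.
    by_ip = throttle.hit(("ip", ip), now)
    by_name = throttle.hit(("name", masked_name), now)
    return by_ip or by_name
```

Counting under the IP catches someone who keeps clearing cookies, and counting under the masked username catches someone who keeps changing IP while holding the same cookie.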

Is IP-based approach truly privacy-preserving?

If the same IP always resolves to the same masked username, then once one person reveals the association (let's assume for good reason), it'll be permanently revealed. Not sure how masking will be implemented (same hash for all projects, or different hash per project) but this can have broad impacts.

It would be great if the details of the approach are shared here on Meta (as opposed to just on Phab/code docs). Huji (talk) 01:31, 5 January 2022 (UTC)[reply]

The premier example would be the IP of a company gateway not serving too many users: any colleague would see that somebody else in the office has made those edits, and knowing views, interests and language quirks, it might not be too difficult to guess who. The colleague may also have an account and once forget to log in, then signing their IP comment by their user name. Now everybody who knows them can guess where that masked IP is coming from, compromising the privacy of the other user. –LPfi (talk) 13:00, 13 January 2022 (UTC)[reply]
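
The difference between "same hash for all projects" and "different hash per project" can be sketched with a keyed hash. This is only an illustration under stated assumptions: the actual masking scheme has not been published in this discussion, and the mask_ip function, the "Anon-" prefix, the secret key and the project names are all hypothetical.

```python
import hashlib
import hmac


def mask_ip(ip, secret_key, project=None, length=10):
    """Derive a stable masked name from an IP with a keyed hash (HMAC-SHA256).

    With project=None the same IP maps to the same name everywhere; passing
    a project name gives an unlinkable name per project.
    """
    message = ip if project is None else f"{project}:{ip}"
    digest = hmac.new(secret_key, message.encode("utf-8"), hashlib.sha256).hexdigest()
    return f"Anon-{digest[:length]}"


key = b"example-secret-held-by-the-server"  # hypothetical secret
print(mask_ip("192.0.2.1", key))             # global name
print(mask_ip("192.0.2.1", key, "enwiki"))   # per-project name
print(mask_ip("192.0.2.1", key, "dewiki"))   # differs from the enwiki name
```

In either variant the mapping is deterministic, so the concern raised above still applies: once someone ties a masked name to an IP or a person, the link holds for as long as the key and the IP stay the same.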

Reserving the naming convention of masked usernames

We currently don't allow someone to create an account named User:8.8.8.8 for good reason: that is an IP address.

I think, similarly, we should not allow users to create accounts with names that would look similar to the masked usernames used for IPs (regardless of whether the IP-based or the session-based approach is used). Perhaps, masked usernames should look like prefix-[a-z0-9]{8,128} and once the feature is turned on, no user would be allowed to create an account that matches that pattern. 01:35, 5 January 2022 (UTC)

I suppose that's the intent. It might be useful to block registration of such usernames already when the prefix is suggested. –LPfi (talk) 13:03, 13 January 2022 (UTC)[reply]
Yes, if we go this way, we’ll first be figuring out what the naming convention could be (shouldn’t have existing usernames either) and we’ll also not be allowing those names to be registered in the future. –– STei (WMF) (talk) 14:54, 13 January 2022 (UTC)[reply]
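
A check along the lines suggested in this section could look like the sketch below. It uses the pattern exactly as written above, where "prefix" stands in for a naming convention that has not been decided; the function name is_reserved is hypothetical.

```python
import re

# Pattern taken from the comment above; "prefix" is a placeholder for the
# yet-to-be-chosen naming convention.
RESERVED = re.compile(r"prefix-[a-z0-9]{8,128}")


def is_reserved(username):
    """Return True if the whole username matches the reserved pattern."""
    return RESERVED.fullmatch(username) is not None


for name in ["prefix-abc12345", "prefix-abc", "SomeUser"]:
    print(name, is_reserved(name))
```

fullmatch is used so that a name merely containing the pattern is not rejected; whether the check should also ignore case is a policy decision left open here.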

Anonymous editing restriction experience at fawiki

The Portuguese Wikipedia experience is mentioned here, but the (ongoing) experience at Persian Wikipedia (phab:T292781 and Dashboard) is not. I understand the latter is an ongoing experience, but it has some advantages over the ptwiki experience (namely, restriction is only in the main namespace, and several metrics are being monitored), and I think it is useful for others to be aware of it. Given that I am not impartial in this, I refrain from adding it myself, but suggest that someone else does so. Huji (talk) 01:38, 5 January 2022 (UTC)[reply]

The phab ticket seems to say that 2/3 of IP edits were not reverted, and those were lost (no significant increase in edits by registered users), 18 % of all non-bot edits. Those are the raw statistics; I cannot judge the metrics, and long-term effects were not discussed. I assume that most large contributions are made by registered users, so many of the lost edits would probably have been correcting spelling mistakes and the like. Whether this is a significant way of losing new users (who never start editing and never register) cannot be seen from those figures. –LPfi (talk) 13:17, 13 January 2022 (UTC)

Support

The session-based system does seem better, and would make it easier to communicate with anonymous editors. I'm an admin on English Wikipedia, and my main interaction with IP editors is reverting and warning them against vandalism. In several cases recently I haven't even bothered posting a warning, since it seems unlikely the right person would receive it. In one case I was trying to have a conversation about some proposed change, and I was talking to several different IP addresses, and it was unclear that it was actually the same person, and I had to keep asking them about that.

I do know some people who have corrections for certain articles, and instead of making the change themselves ask me to do it. Maybe not having their IP address exposed will help lower the barrier to participation? Or maybe there are just other factors which are more important?

In the long run, I'd lean toward banning IP editing completely on English Wikipedia. Yes, it might discourage a little casual editing, but the results from the Portuguese Wikipedia seem to show that effect is minimal. Maybe the vast majority of legitimate editors are simply motivated to edit Wikipedia because it's highly visible, enough to sign up for an account if they don't already have one. It would cut out a lot of vandalism and make communication even more reliable. The idea that IP-based editing provides more privacy is weird, considering that exposing one's IP address seems considerably less private, though session-based anonymous editing would improve the privacy situation a lot. I do feel like having less vandalism to deal with frees up legitimate editors to make more substantive contributions, often with a small number of larger or more thoughtful edits, in contrast to the larger number of tiny edits which casual editors seem to make. English Wikipedia is currently suffering huge fact-checking and neutralizing backlogs, which do need those deep, time-consuming edits to fix. -- Beland (talk) 01:42, 5 January 2022 (UTC)[reply]

@Beland: A deliberate vandal will either browse in incognito mode or delete their cookies regularly, which would completely negate any benefit you perceive from the session-based system as currently proposed. I oppose it because it hasn't been well thought out. MAC addresses would work better than cookies. Anachronist (talk) 02:10, 5 January 2022 (UTC)[reply]
@Anachronist: I don't think the session-based system is going to substantially reduce vandalism; I think it just makes it easier to communicate with no-account editors, both vandals and legitimate. Perhaps it will reduce the number of IP-based blocks admins would make (because the first step would be to block the session pseudo-account) though IP-based blocks would still be the way to deal with the sort of vandal you're thinking of. Based on the comments from the Portuguese Wikipedia, it does seem like banning no-account editors would significantly reduce vandalism. But this masking proposal is for privacy, not security; and it does seem to me the team has thought through carefully how to avoid reducing the current level of protection against vandalism, if that's what you're worried about.
With regard to MAC addresses, aren't they only available on the local Ethernet or other data link? I don't think web servers have access to MAC addresses because IPv4 and IPv6 don't transmit them, and JavaScript engines in web browsers don't expose that information. -- Beland (talk) 07:47, 5 January 2022 (UTC)[reply]
Whether the cookie based sessions will allow communication depends heavily on how many of those editors clear the cookies regularly, actively or through browser features. Do we have statistics on that? –LPfi (talk) 13:24, 13 January 2022 (UTC)[reply]
Yes, I think this is something WMF could collect data on. Izno (talk) 04:45, 15 January 2022 (UTC)[reply]

How do we handle IP vandalism, and also shared IPs violations

My questions are ultimately about how users report IP violators at English Wikipedia: Administrator intervention against vandalism, and also how we handle Wikipedia: Template:Shared IP violations. Maile66 (talk) 01:43, 5 January 2022 (UTC)[reply]

Concerns by an IP editor

I have been editing as an IP for a while (there are reasons for it), and I feel that this proposal may be for the worse. Because my IP changes over time, with the first approach it would be completely different each time it changed, and I couldn't be a "range", but a semi-random set of IPs, which would make it hard for non-admins to know all the IPs are me. Similarly, with the second approach, I clear my cookies very frequently, so that'd be worse. What can I do (apart from registering)? --67.183.136.85 03:01, 5 January 2022 (UTC)[reply]

If you had an account, you could add your IPs to your user page – I don't know if the issue is about not registering or not logging in.
There could be an interface for copying your session cookie and reusing it in the next session. Depending on how those cookies are stored and checked at the server side you could do it manually, but support could be included in the web interface, allowing any users to save and reuse the cookies.
LPfi (talk) 13:30, 13 January 2022 (UTC)[reply]

They need to know their IP can be seen by somebody

Hi. In the current situation they get a banner to inform them that their IP can be seen by users. In the new banner they only get an invitation to make an account. The idea of masking is good but they still need to know their IP can be seen by some users (admins etc.) Gharouni 04:03, 5 January 2022 (UTC)[reply]

Gharouni makes a good point. The banner should include:

  • your IP is ... and it is visible to several users of anti-vandalism teams
  • your masked IP which will be used for your contributions will be ... and will be visible to all users

--FocalPoint (talk) 09:53, 5 January 2022 (UTC)[reply]

And users who have the same IP will be able to see that your edits came from the same IP. –LPfi (talk) 13:33, 13 January 2022 (UTC)[reply]

Suggestion

"The path is to create a new identity for unregistered editors based on a cookie placed in their browser. In this approach there is an auto-generated username which their edits and actions are attributed to. For example, User:192.168.1.2 might be given the username: User:Anon3406.

In this approach, the user’s session will persist as long as they have the cookie, even when they change IP addresses."

Ok, notwithstanding the fact that from where I sit there is no good reason for this change whatsoever (and never will be, owing to so-called legal and ethical reasons that consistently make any attempt to understand WMF, T&S, LEGAL, etc. proclamations useless from the start), would it at least be possible to create a session-based identity in a way that allows for some semblance of tracking? For example, a username like "Anon01.05.21.12.59.xxx" could be used to track edits by showing the date and time when the user's ID was generated, with the xxx part being their sequence number among the users created at that time (for example, 001 for the first user for that date and time, 015 for the fifteenth, etc.). In this way we can at least attempt to keep track of who's editing without showing the actual IP address, while still allowing the community to keep tabs on unregistered benevolent/malevolent editors using a basic date/time setup. TomStar81 (talk) 08:07, 5 January 2022 (UTC)
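
The naming scheme suggested above can be sketched as follows. The example name does not say which date fields come first, so this sketch assumes month.day.year.hour.minute; the AnonNamer class is hypothetical, and a real deployment would need a counter shared across servers rather than one held in memory.

```python
from collections import Counter
from datetime import datetime


class AnonNamer:
    """Hand out names of the form Anon<MM.DD.YY.HH.MM>.<sequence number>."""

    def __init__(self):
        # How many names have been issued for each minute so far.
        self._issued = Counter()

    def next_name(self, when):
        stamp = when.strftime("%m.%d.%y.%H.%M")  # assumed field order
        self._issued[stamp] += 1
        return f"Anon{stamp}.{self._issued[stamp]:03d}"


namer = AnonNamer()
print(namer.next_name(datetime(2021, 1, 5, 12, 59)))  # Anon01.05.21.12.59.001
print(namer.next_name(datetime(2021, 1, 5, 12, 59)))  # Anon01.05.21.12.59.002
```

Such a name reveals when the session identity was created but nothing about the address behind it, so it would not by itself show that two names belong to the same range.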

Support for IP based identity

As an admin in German-language Wikipedia, of the two paths described here (IP based identity vs. session-based identity) I clearly prefer the IP based approach. It's just too easy to use a browser's privacy mode or to clear the cookies (I'm doing it myself all the time); changing your IP address at least requires a bit more effort, and we have already a policy against using open proxies in place. I agree with Beland that the session-based identity approach could probably make communication with well-meaning unregistered editors easier, but it just doesn't seem robust enough. Gestumblindi (talk) 08:28, 5 January 2022 (UTC)[reply]

As an admin in German-language Wiktionary I also prefer the IP based approach for the same reasons. --Udo T. (talk) 14:04, 5 January 2022 (UTC)[reply]
The way I understand the proposal blocking by IP remains possible: if you get IP-blocked and clear your cookies, you'll get a new "user name" based on your session but still cannot edit with it because your blocked IP is still the same. Session-based blocking could be additional, but I don't see much advantages with this for the reasons you mentioned. hgzh 15:02, 5 January 2022 (UTC)[reply]

My own idea[edit]

After reading this page and thinking carefully about the IP editing problem, with masked IPs and recognition, we need some questions:

  • 1st: How do we report IPs at en:WP:AIV, where we currently report the full, unmasked IP address? If the address is hidden from everyone, how will others read the reports and the contributions pages on the wiki?
  • 2nd: If we go with IP-based identification, we should make sure that IP block logs remain usable, with visibility depending on the user group:
    • All users, including new accounts: only the masked IP address is shown in the block log.
    • Extended confirmed users: can see the first part of the IP, with the last part hidden and replaced by part of the anonymous identity. For example, 115.68.37.254 becomes 115.68..h4jis for these users, while new users see Anon-5hsjeh4jis. I think this should use a patterned hash, in which two IPs in a small range produce similar results, e.g. 223.0.1.15 --> Anon-bx7t7uv6h2 and 223.0.1.94 --> Anon-bx7t7uvmt5.
    • Administrators and a new user right called "IP viewer": see the complete IP in the contributions and block logs.
  • 3rd: If we allow the session-based approach, it would be a mess for patrollers and for extended confirmed users, who overlap with them. So, in my opinion, it would not be feasible.
  • 4th: If we allow a mix of session-based and IP-based identity, the username would look like Anon-(10maskIP)-(15maskSE), where 10 and 15 are the numbers of characters in each hash. In this case, everyone can view the contributions from the same IP by requesting Anon-(10maskIP) and viewing the contributors. But leaving the contributions page as it is would be a "bad" idea, because we would not know whether the same IP address was used.
  • 5th: Names in the format used for identifying IP addresses must not be registrable as accounts. For this change, we would have to find accounts with the prefix Anon-...(10char) and rename them. Hopefully there are no accounts with such names, as they would have been created in violation of the username policy.

If you would like to discuss this idea, please leave your comments on these five points. Thingofme (talk) 15:28, 5 January 2022 (UTC)
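The "patterned hash" in the second point, where two addresses in a small range get similar masks, could be sketched roughly as follows. This is only an illustration under stated assumptions: the key, the /24 range size, the 7+3 character split and the function names are all hypothetical and not part of any actual proposal.

```python
import hashlib
import hmac
import ipaddress

# Hypothetical secret; a real deployment would keep this key private.
SECRET_KEY = b"example-key-not-for-production"


def _token(data: str, length: int) -> str:
    """Keyed hash of `data`, truncated to `length` hex characters."""
    return hmac.new(SECRET_KEY, data.encode(), hashlib.sha256).hexdigest()[:length]


def patterned_mask(ip: str, prefix_len: int = 24) -> str:
    """Mask an IPv4 address so that addresses in the same range share a prefix."""
    network = ipaddress.ip_network(f"{ip}/{prefix_len}", strict=False)
    range_part = _token(str(network), 7)  # identical for the whole range
    host_part = _token(ip, 3)             # depends on the full address
    return f"Anon-{range_part}{host_part}"


# Both addresses are in 223.0.1.0/24, so the first 7 mask characters match.
print(patterned_mask("223.0.1.15"))
print(patterned_mask("223.0.1.94"))
```

Note that a shared prefix deliberately reveals which masked users sit in the same range, which is exactly the trade-off between privacy and patrolling that this thread is about.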

  • I am strongly opposed to any partial masking of IP addresses. If people cannot be trusted to see the entire IP address, we should not show them any of it. A semi-trusted IP Geolocation (what country/province the ISP is located in) could be interesting if the WMF can implement it. (talk) 01:47, 7 January 2022 (UTC)
    I think partial masking is NOT ok, as it gives no privacy, and in any case we can't implement it. So I think we should hash the IP in an IP-based approach: Anon-(10-20 random letters), and only checkusers can check the IP address used by an unregistered user. The IPs would be hashed with a cryptographic algorithm (RSA/SHA256), and this has some consequences: admins can only block a single hashed-IP user, but checkusers can block an entire range and check range contributions. The rangeblock logs would have to be a private log, which makes rangeblocks hard to handle. Thingofme (talk) 07:53, 7 January 2022 (UTC)

Clearly for IP-based, plus a new tool to identify when a registered user is logged out

Hi User:Johan (WMF), just to share, as an admin, that the cookie option seems to complicate the admin environment without resolving much concerning bad social behaviour on wiki projects. As some have already explained, it will just change the rules of the game of trolls (not Game of Thrones ;-) without changing the situation. Maybe trolls will appreciate this change, as it makes their occupation of disturbing projects a bit more exciting...


On the other hand, masking IP addresses is an excellent idea. It's simple, doesn't disturb the community's habits and gives the technical team plenty of time to think and develop new tools. For instance, with masked IPs, it would be very useful to develop a new tool that shows when an IP is being used by a registered user who is logged out. For a registered user, it's indeed a frequent practice to deliberately log out before writing something unpleasant to another user, so as not to be identified. If they can be identified even when they are logged out, that could limit this practice, while forcing users to communicate in an identified way and therefore with more courtesy.


Best, Lionel Scheepmans Contact (Fr-N, En-3, Pt-3) 15:38, 5 January 2022 (UTC)

Both?

I wonder if it would be possible to use both ways to determine identity:

  • cookie = true, ip = foo > Alice
  • cookie = false, ip = foo > Alice (deleted cookies / incognito mode)
  • cookie = true, ip = bar > Alice (other IP but same browser)
  • cookie = false, ip = bar > Bob

That would make it even better than the current system. It would preserve the identity of people on IPv6, and also of many of those behind CG-NAT (in my country many ISPs use CG-NAT for at least half of their IPv4 ranges). Geraki TL 16:28, 5 January 2022 (UTC)
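The four cases in the list above can be read as "look up the cookie first, fall back to the IP, and only then create a new identity". A minimal sketch of that lookup order follows; the class name, the in-memory dictionaries and the `Anon-N` naming are assumptions made for illustration only.

```python
from typing import Dict, Optional


class IdentityResolver:
    """Resolve an unregistered editor's identity from a cookie and an IP."""

    def __init__(self) -> None:
        self._by_cookie: Dict[str, str] = {}
        self._by_ip: Dict[str, str] = {}
        self._counter = 0

    def resolve(self, cookie: Optional[str], ip: str) -> str:
        identity = None
        if cookie is not None:
            # A known cookie wins, even if the IP has changed.
            identity = self._by_cookie.get(cookie)
        if identity is None:
            # No usable cookie: fall back to the IP (deleted cookies, incognito).
            identity = self._by_ip.get(ip)
        if identity is None:
            # Neither cookie nor IP is known: this is a new person.
            identity = f"Anon-{self._counter}"
            self._counter += 1
        if cookie is not None:
            self._by_cookie[cookie] = identity
        # Keep the first identity seen on an IP; do not overwrite it.
        self._by_ip.setdefault(ip, identity)
        return identity


resolver = IdentityResolver()
alice = resolver.resolve("cookie-1", "foo")          # new identity
assert resolver.resolve(None, "foo") == alice        # cookies deleted, same IP
bob = resolver.resolve(None, "bar")                  # unknown cookie and IP
assert resolver.resolve("cookie-1", "bar") == alice  # other IP, same browser
```

One caveat of this ordering: a cookieless visitor on a shared IP is merged with whoever was first seen there, so it inherits the shared-IP ambiguity of the current system.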

Localization

Whatever you guys decide on is fine, just please don't make those temp-account names English-only; that's a pet peeve of mine. Create a MediaWiki page for localization. Thanks. Seb az86556 (talk) 17:06, 5 January 2022 (UTC)

Very confused

Hi. There are a lot of words on this talk page and on the subject-space page so perhaps these questions are already addressed.

If you're simply hashing IP addresses ("User:192.168.1.2 may appear as User:ca1f46"), isn't this reversible/decipherable? And it seems like the entire trust model is based on hundreds, maybe thousands of users, continuing to have access to the IP address anyway, via some kind of user preference checkbox? What is being improved here, what's the actual benefit?

For session-based identity, the page notes "vandals in privacy mode or who delete their cookies would get a new identity without changing their IP" and then this massive abuse vector doesn't seem to be addressed at all. How will you prevent someone from maliciously editing via dozens of sessions?

Regarding the entire implementation, why not just auto-create accounts for unregistered users? We already have that flow easily established (both user registration and logged-in user sessions) and it wouldn't require all this other work.

Thanks in advance for any guidance you can provide. --MZMcBride (talk) 17:42, 5 January 2022 (UTC)
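The reversibility concern in the first question can be illustrated with a short sketch. It assumes, purely for the sake of argument, that the mask is an unkeyed, truncated SHA-256 of the address; the proposal does not say this, and the function names are hypothetical. Under that assumption, anyone can enumerate candidate addresses and compare hashes.

```python
import hashlib
import ipaddress
from typing import Optional


def naive_mask(ip: str) -> str:
    """Unkeyed hash of an IP address (an assumed, deliberately weak scheme)."""
    return hashlib.sha256(ip.encode()).hexdigest()[:12]


def unmask(masked: str, candidate_range: str) -> Optional[str]:
    """Recover the address by hashing every candidate in a range."""
    for addr in ipaddress.ip_network(candidate_range):
        if naive_mask(str(addr)) == masked:
            return str(addr)
    return None


target = naive_mask("192.168.1.2")
# 65,536 candidates here; the full IPv4 space is only about 4.3 billion.
print(unmask(target, "192.168.0.0/16"))
```

A keyed hash (with a secret held server-side) or a random lookup table avoids this particular attack, because an outsider cannot compute the mask for a candidate address.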

It would be technically possible to create a confidential database table containing a list of IP addresses and their "hash". This could be as simple as:
  • 127.0.0.1 is "0"
  • 192.168.0.1 is "1"
  • 198.51.100.0 is "2"
  • 198.51.100.4 is "3"
  • 127.2.3.10 is "4"
There would be no way to decipher "4" to "127.2.3.10" just from knowledge about other addresses. Classical hashing/encryption algorithms, on the other hand, may not be simply usable due to known-plaintext attacks. ToBeFree (talk) 18:05, 5 January 2022 (UTC)
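The table idea above amounts to assigning each new address the next free identifier and storing the pair privately. A minimal in-memory sketch, with a hypothetical class name and a plain dictionary standing in for the confidential database table:

```python
from typing import Dict


class MaskTable:
    """Private mapping from IP address to an opaque identifier."""

    def __init__(self) -> None:
        self._ids: Dict[str, str] = {}

    def mask(self, ip: str) -> str:
        # The identifier carries no information derived from the address
        # itself; it is simply the next unused number.
        if ip not in self._ids:
            self._ids[ip] = str(len(self._ids))
        return self._ids[ip]


table = MaskTable()
for ip in ["127.0.0.1", "192.168.0.1", "198.51.100.0", "198.51.100.4", "127.2.3.10"]:
    print(ip, "is", table.mask(ip))
```

Sequential numbers do reveal the order in which addresses were first seen; a random token per row would avoid that, at the cost of having to check for collisions.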
This does sound like auto-creation. I'd like to see this whole initiative renamed to something like Nymity: accounts for readers --
  • Readers are automatically assigned accounts (essentially what the session model gives you).
    This lets them set and preserve preferences
    If they want this to persist across machines / browsers, they can convert this to a user/pass
    On conversion, their old prefs carry over, and any old edits can be reattributed
  • This gives us more visibility into usage patterns (as we tune the sites)
  • Edit-analysis tools, from quality-assessment to vandal fighting, now have extra data: cookie in addition to IP + fingerprint
  • Readers can see their nym, and choose to generate a new nym if they want (refreshing the cookie)
    Of course to customize and choose your own name, you are still welcome to make a user/pass account
  • Update common edit-moderation tools to show nym edits clustered by {IP range, nym}
I think combining the above (which would be a tooling + user-experience upgrade) with new NDA-style restrictions on what editors say to whom, is a very counterproductive idea. Don't add a new NDA or clickwrap honor pledge. It's not necessary or helpful for the core privacy concern; but it will cause headaches, confusion, and grief to our committed and often very-literal editors. Simply implementing the above will reduce by almost 100% the # of readers whose IP information is exposed to other casual readers, search-engine spiders, &c. –SJ talk  17:53, 7 January 2022 (UTC)

Idea

Hello. I would suggest that the masked identity be permanent for each anonymous user. AlPaD (talk) 20:29, 5 January 2022 (UTC)

Maybe I'm missing something here. If any unregistered user wanted that, couldn't they just get a permanent masked identity – and even one of their own choice – by registering some user name? ◅ SebastianHelm (talk) 07:18, 6 January 2022 (UTC)
Hello, yes, then it's not needed. AlPaD (talk) 20:59, 7 January 2022 (UTC)

Prefer IP

I am leaning towards the IP-based identities, even if encrypted, as cookies seem more complicated to deal with, and it is very bothersome to keep dismissing their annoying pop-ups (very standard in Europe). I should mention that I like the fact that, to this day, one can use Wikipedia without cookies unless one wants to log in to edit under a username. --Mahmudmasri (talk) 01:36, 6 January 2022 (UTC)

I also support the hashed IP approach. Wayne (talk) 09:52, 6 January 2022 (UTC)

What does WMF need us to read?

I received a notification at w:User talk:SebastianHelm: How we will see unregistered users from WMF that contained a request for feedback, about which I had a question. Since I have not received a reply, I'm reposting it here below.

Thanks for the notice, Johan. For the suggestions for which you would like feedback, you're linking to a page with over 9000 words. That's a reading time of over an hour, and the TOC doesn't list a basic explanation of the “two suggested ways”. Do we have to read it all, or can you narrow down what we have to read (at least the core text; of course it can use terms that are explained elsewhere and are linked to the appropriate places; that's the beauty of hypertexts) in order to be able to give you the feedback you're asking for? ◅ Sebastian 19:02, 4 January 2022 (UTC)

I might add that over at the English Wikipedia, we have nice recommendations such as w:Wikipedia:Writing better articles that would help for such a meta-article, too. ◅ SebastianHelm (talk) 07:48, 6 January 2022 (UTC)

Yes, it would help to just get the current proposals to comment on and to provide a more concise version of the rest (I'll make a start: the WMF legal statement can be cut to "IPs must be masked for legal reasons but we can't tell you what these reasons are" without losing any real information). I am in favour of the still-mentioned proposal 3 (disable IP editing as on the Portuguese Wikipedia), which would free up developer time for more useful issues. Kusma (talk) 19:57, 6 January 2022 (UTC)
@Kusma The legal statement is not just that, but also "If we told you these reasons, we would basically tell everyone about a plan to cause trouble to the projects (Wikimedia Foundation) and the users." I.e., if they disclosed the reasons, the Wikimedia Foundation would likely be sued on those grounds, even if the reasons were false. Techie3 (talk) 02:39, 11 January 2022 (UTC)

Nickname considerations

When displayed, generated user names should contain one of #, @ or / to avoid collisions with registered user names.

I do expect a lengthy code.

  • When displayed, code shall be grouped by three or four characters.
  • anon#5F28-B73C-D218-6AE3
  • This individual might be addressed in conversations as @5F28, this is an interesting aspect.

The word “anonymous” etc. should be displayed in project or even user language.

  • The rendered nick shall be generated from the internal code.
  • Internally it might be: 5F28B73CD2186AE3 or whatever.
  • Visible in English perhaps as: anonymous#5F28-B73C-D218-6AE3
  • Greek readers should see ανωνυμιών#5F28-B73C-D218-6AE3 and in Russian анонимный#5F28-B73C-D218-6AE3 etc.
    • Digits are understood in almost every language and script. The keyword in the local language explains the meaning of this strange thing. Not all languages have a concept of letters, so Latin letters should remain. Users may have learnt a few Latin letters from URLs; better not to transliterate them into Alpha Beta Gamma Delta.
    • Consider right-to-left scripts. This should work.
  • There is a need to resolve generated nicks back into the internal code.
    • If there is a (e.g.) # inside, followed by the expected number of hex digits and perhaps some hyphens, or not, then drop the leading word and use the hex sequence only.

Greetings --PerfektesChaos (talk) 09:01, 6 January 2022 (UTC)
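The rendering and back-resolution rules above could be sketched like this. The function names, the 16-digit code length (taken from the example) and the small translation table are assumptions for illustration; the Greek and Russian words are copied from the comment above, and a real implementation would use the wiki's localisation messages instead.

```python
import re
from typing import Optional

SEPARATOR = "#"      # could equally be "@" or "/"
CODE_LENGTH = 16     # assumed from the example code 5F28B73CD2186AE3

# Hypothetical lookup table standing in for real localisation messages.
LOCAL_WORD = {"en": "anonymous", "el": "ανωνυμιών", "ru": "анонимный"}


def render_nick(code: str, lang: str = "en", group: int = 4) -> str:
    """Render the internal code as a localised, grouped nickname."""
    word = LOCAL_WORD.get(lang, LOCAL_WORD["en"])
    groups = [code[i:i + group] for i in range(0, len(code), group)]
    return f"{word}{SEPARATOR}{'-'.join(groups)}"


def parse_nick(nick: str) -> Optional[str]:
    """Resolve a rendered nickname back into the internal code, if valid."""
    _, sep, tail = nick.partition(SEPARATOR)
    if not sep:
        return None  # no separator, so this is an ordinary user name
    code = tail.replace("-", "").upper()
    if re.fullmatch(rf"[0-9A-F]{{{CODE_LENGTH}}}", code):
        return code
    return None


print(render_nick("5F28B73CD2186AE3"))        # anonymous#5F28-B73C-D218-6AE3
print(render_nick("5F28B73CD2186AE3", "el"))  # ανωνυμιών#5F28-B73C-D218-6AE3
print(parse_nick("анонимный#5F28B73CD2186AE3"))  # 5F28B73CD2186AE3
```

Because the leading word is dropped on parsing, the same internal code is recovered regardless of the language the nickname was displayed in, and the separator is a single constant that can be swapped.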

Language localization is OK, with translations provided for the "Anon" word, but we would have to rename accounts that have the prefix Anon. Also, IP addresses can be recorded as usual, and an IP can only be checked by checkusers, even for unregistered users. The editing banner would say "Not logged in. Create an account to customize your username and get more benefits." Thingofme (talk) 13:09, 6 January 2022 (UTC)
The pound sign (#) cannot be used for obvious reasons. @ seems interesting. Izno (talk) 04:52, 15 January 2022 (UTC)

Page by page revealing

Hello,

This may have been resolved in a discussion that I missed, but last I recall, one issue had not been resolved: the WMF (more accurately, Legal) wants to log each case of an IP being revealed (rather than IPs being revealed by default to those with that setting), and doing that one by one would be a massive pain and a disruption to the workflow.

Several people had mooted the possibility of revealing a page at a time, and I think Johan said they'd consider it and ask Legal (apologies if I am incorrect about that final aspect). Did this get escalated and, if so, what was the outcome? Nosebagbear (talk) 13:13, 6 January 2022 (UTC)

I prefer IP

I prefer IP because a vandal might clear cookies to continue vandalizing.--Simonk (talk) 14:33, 6 January 2022 (UTC)

Account age

The content page mentions that there will be restrictions based on "account age". This is a problematic metric. How many years old is my account? I recommend that you use a metric such as "at least 100 edits in 12 of the last 18 months". (talk) 20:09, 6 January 2022 (UTC)

Why choose one, if we can have both

Moving to a session-based identification for anonymous users seems in line with the 'assume good faith' philosophy, and could potentially expand the number and quality of the contributions from occasional users, perhaps even 'convert them' to logged-in users.

The downside of such a change would be that it would be harder to detect some of the most common types of vandalism.

Now, I understand that moving to a session-based identity for anonymous users will take some work on the platform. If we are already working on supporting cookie-based identities, why not keep some of the benefits that we have from the IP-based identification?

So basically, besides getting a handle (e.g. anon-1fe49afc7bc5) for which we can see the user contributions, we could also have a hashed-IP identifier for that user (which might be shared by different anonymous users), with direct access to any contributions made from that (still hidden) IP, and be able to (temporarily) block anonymous contributions from said IP (even without seeing it!) or pass it on to a checkuser for more information.

I understand there are also other ideas of how to make this and other frequently used tools possible under the new scheme.

In short, I agree with the session-based identification, but let's also work on improving the mechanisms available to mitigate vandalism.

MarianoC 21:57, 6 January 2022 (UTC)

There's a discussion about this up-page: see § IP-based versus session-based masking: Why not both?. You might want to move your comment there. — Scott talk 22:52, 6 January 2022 (UTC)[reply]
Thanks! MarianoC 10:49, 7 January 2022 (UTC)

Signature

If an unregistered user contributes to a talk page, his IP is not only recorded in the edit history, but also in the signature within the page body.

What will happen to older discussion statements with IPs in their signature? Will somebody build a bot, which will replace all old signatures (very, very many) with the new session-based identity?

But even if so, you could still see the old IP signatures when viewing older revisions in the history. To hide all editors' IPs, it would be necessary to delete all older revisions (revision deletion). But this would not conform to our licence policy. --Indoor-Fanatiker (talk) 03:45, 7 January 2022 (UTC)

Expand the constituency

I have been notified by Johan of this proposal and asked to comment because I am an admin on the German Wikipedia. While I appreciate the effort of reaching out, I have a hard time providing comments, because:

  1. The proposal reads more like an essay than a proposal, it lacks a succinct summary and clear questions
  2. The whole thing feels like a foregone conclusion (Whatever the comments, the foundation wants to turn this into a tool for converting unregistered users into registered users, so...)
  3. The feedback mechanism is so unstructured that ignoring it or splitting it into a myriad sub-threads seems a given.
  4. The constituency for notification is both too narrow and too broad, see below.

On the last point, even though I am fairly active in de:WP, I almost exclusively deal with RfDs and DELREV as an admin. I don't do interventions against vandalism, and hardly block anyone. I believe that the notification should also go to non-admins who are actively contributing to fighting vandalism. In de:WP, see de:Wikipedia:WikiProjekt Vandalismusbekämpfung/Ansprechpartner. Also, all past and current checkusers should be notified, not all of whom are admins. See de:Wikipedia:Checkuser#Checkuser-Berechtigte. I have written a notice about this proposal and feedback process in the German version of the Signpost (Kurier). --Minderbinder (talk) 08:30, 7 January 2022 (UTC)

Thanks! This is absolutely not a conversation where we're just looking for feedback from admins, nor do we think all admins are interested. But we sent out a reminder to all admins, because they're a group who have a high likelihood of being affected – both because we wanted them to be aware, and because picking one group who have a high chance of being interested also helps spread awareness to others in the communities without pinging every content contributor. We're equally interested in all feedback here from anyone who feels affected in any way. /Johan (WMF) (talk) 13:44, 7 January 2022 (UTC)
Hej Johan, thanks for the answer. Your approach of spreading awareness to others seems to be working, as following my summary there is a discussion on the German signpost talk page now. Check it out. --Minderbinder (talk) 15:36, 7 January 2022 (UTC)

auto-generated usernames should be distinct from all normal usernames

As an editor or reader I don't care about IP addresses as such but I do care about (1) being able to quickly distinguish edits from unregistered and registered editors and (2) being able to make a reasonable guess at which edits are from the same editor. As I understand the "session-based approach" both aspects would be jeopardized: (1), since autogenerated usernames might be indistinguishable from real ones and (2) because the session cookie may change more often than the IP. A drawback of both schemes is that the IP range (which typically is more stable than the IP) is hidden.
But I think one could combine full masking with maintaining uses (1) and (2) even better than is currently possible, simply by using three hashes (one for the IP range, one for the full IP and one for the cookie (or cookie+IP)). That would obscure all identifiable information, but maintain distinctions that are useful for seeing different edits as "belonging to each other" or communicating with the unregistered editor. --Qcomp (talk) 13:43, 7 January 2022 (UTC)
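The three-hash idea above could look roughly like the following sketch. Everything specific here is an assumption made for illustration: the secret key, the /24 range size, the 10-character truncation and all names are hypothetical.

```python
import hashlib
import hmac
import ipaddress
from typing import NamedTuple

# Hypothetical secret; a real deployment would keep this key private.
SECRET_KEY = b"example-key-not-for-production"


class MaskedIdentity(NamedTuple):
    range_id: str    # same for every address in the range
    ip_id: str       # same for every session on one address
    session_id: str  # specific to one cookie on one address


def _keyed_hash(label: str, value: str) -> str:
    # The label keeps the three hashes independent of each other.
    message = f"{label}:{value}".encode()
    return hmac.new(SECRET_KEY, message, hashlib.sha256).hexdigest()[:10]


def mask(ip: str, cookie: str, prefix_len: int = 24) -> MaskedIdentity:
    network = ipaddress.ip_network(f"{ip}/{prefix_len}", strict=False)
    return MaskedIdentity(
        range_id=_keyed_hash("range", str(network)),
        ip_id=_keyed_hash("ip", ip),
        session_id=_keyed_hash("session", f"{cookie}|{ip}"),
    )


a = mask("198.51.100.4", "cookie-1")
b = mask("198.51.100.77", "cookie-2")
print(a.range_id == b.range_id)  # True: same /24 range
print(a.ip_id == b.ip_id)        # False: different addresses
```

Patrollers could then group edits at whichever level is useful (range, address or session) without any of the three values revealing the address itself.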

Currently, MediaWiki prohibits usernames that look like IP addresses. That should also be implemented for whatever form the obfuscated identifiers take. AntiCompositeNumber (talk) 05:30, 8 January 2022 (UTC)
Auto-generated usernames should have a fixed prefix pattern like Anon-..., so they are easily distinguished from regular usernames. IP masks are pseudo-random, so they look like gibberish and are easily confused with user names. Also, random names are banned from creation because they violate the username policy. Thingofme (talk) 10:31, 8 January 2022 (UTC)

Arguments on the notability on Wikipedia of: Ohio Agricultural Research and Development Center

Questions from 魔琴

Hi. I have 2 questions:

  1. I am from mainland China, so I wonder: since users from mainland China can't sign the Confidentiality Agreement for Nonpublic Information, can they have access to unencrypted IP addresses?
  2. I participate in SWMT, but I don't have the global rollback/sysop right (lacking experience...). How will it affect my work?

--魔琴 (talk) 05:21, 8 January 2022 (UTC)

That's the problem. The Confidentiality agreement for nonpublic information requires that the user not live in a country that censors Wikipedia, like mainland China, so zhwiki is the largest wiki with no local checkusers, even with 65 admins. As for IP addresses, I think such users are not trusted to see them, so personally I think only checkusers should be able to check unregistered users' IPs. Unregistered users would be given a random name, and this would not affect global tasks. However, rangeblocks would become impossible for anyone but checkusers. Thingofme (talk) 14:27, 8 January 2022 (UTC)
"[S]o zhwiki is the largest wiki in which there are no local checkusers." No, the zhwiki don't have local CU because the Foundation removed CheckUser access from all local users due to "security concerns" in Mar 2018, accordingly after some still-unknown CheckUser posted IP-user relationships on the village pump. --魔琴 (talk) 04:48, 9 January 2022 (UTC)[reply]

Compartmentalization

Regardless of which approach is pursued, I think it should be compartmentalized per wiki, so that if you have permission to unmask on one wiki, you can't unmask based on information gathered from another wiki; only people with some global unmask permission could unmask globally. This would prevent someone with nefarious intent who manages to get unmask permission on a smaller wiki from being able to unmask all IPs on all other wikis. AzaToth (talk) 13:17, 8 January 2022 (UTC)

Please don't use the cookie approach

Personally, I generally use session cookies, which are deleted after closing the browser, to avoid tracking while surfing the internet. Many others do too. And I know of several people who delete their cookies manually on a regular basis. In all of these cases it would be harder to identify recurring IP vandalism and attribute it to earlier cases. --Ghilt (talk) 13:52, 8 January 2022 (UTC)

Yes, I'm also one of these habitual "cookie-deleters". I think that the cookie approach wouldn't work well, see also above. By the way, Johan, I suppose you don't want a "vote" on these two approaches, and I understand this, but I think, as this is the main choice we seem to have here, it would be easier to see what people prefer if we had a dedicated section/page where people could state which approach they prefer and why instead of this general feedback page. Gestumblindi (talk) 15:24, 8 January 2022 (UTC)
Yes, we should create an RFC in which we vote on which approach to implement. We would count the votes, assess consensus and determine the result. The talk page only covers what "volunteers" and "users" think. Thingofme (talk) 15:34, 8 January 2022 (UTC)
Yeah, some browsers have that "auto delete cookies when closing" option. --魔琴 (talk) 16:52, 8 January 2022 (UTC)
Including Firefox, not just some niche ones. And it also has the anonymous window feature, and a "clear cookies now" button in the configuration menu. Do we know how widely those are used, generally or when visiting WMF sites? –LPfi (talk) 14:00, 13 January 2022 (UTC)
So cookies are terrible, as they also affect data privacy and can be deleted very often, so we can't block by cookie. Thingofme (talk) 04:21, 14 January 2022 (UTC)

Some thoughts on how the media could perceive this

@Johan (WMF): The change needs to be communicated carefully to the general public, I think. In the past, there have been several articles by investigative journalists (such as here in Switzerland, or in Germany) about manipulation of Wikipedia that relied in part on the IP addresses in the version history. They then were able to show "this article has been edited anonymously from an IP that can be traced back to corporation X", or that it is an IP from the federal government of Y, and so on. Making the IP addresses no longer visible to the public could be perceived as an attempt to sweep attempts to manipulate Wikipedia under the rug, to make the project less transparent and to make the journalists' work harder. To me, the privacy enhancement / legal reasons seem to be clear and convincing enough, but I wouldn't count on the media to see and to depict it that way! Try to avoid "Wikipedia hides manipulation!" headlines... Gestumblindi (talk) 19:02, 8 January 2022 (UTC)

Thank you for this insight. –– STei (WMF) (talk) 06:53, 11 January 2022 (UTC)

Proposal: Keep own IP address unmasked

This discussion page has been brought to my attention by a notice on German Wikipedia’s version of the Signpost.

I do not have the time nor am I in the mood for reading all the discussions on this page, so I do not know whether a remark like mine has already been put forward earlier; apologies if this is the case.

I would like to propose that logged-out users still be able to see their own IP addresses (unmasked) as well as retrieve them via the API (meta=userinfo). In terms of privacy this should be entirely unproblematic (since the users are only shown information about themselves, not about other users). Given that sysops, CheckUsers and others are able to see unmasked IP addresses (and potentially investigate information associated with them, such as geolocations), it would only be fair if (potential) logged-out contributors were able to check beforehand the information exposed about themselves to those users. (As an aside, this also applies to logged-in users, who can at the moment see their IP address before logging in, albeit in their case it would only be exposed to other users in case of a CheckUser investigation against them.)

To rule out any misunderstanding: When talking about “being able to see one’s own IP address” I mean somewhere in the UI, for example in Special:Contribs, not everywhere a masked version of the IP address would be shown instead to other users; not in signatures, for instance, which are obviously generated once an edit is made and not easily available for being altered on-the-fly by the UI.

I would expect this to be a piece of cake technically since the means of displaying an unmasked IP address in Special:Contribs are already there, they would just have to be made conditional based on whether the requested contributions are one’s own (as redirected by Special:MyContributions). The API query meta=userinfo would not even need to change at all since by design it only returns information about the calling user. --2A02:8108:50BF:C694:A1E3:B362:5009:4A0B 20:22, 8 January 2022 (UTC)
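For readers unfamiliar with the API query mentioned above, here is a small sketch of how a client might build the meta=userinfo request and read the user name from the reply. The helper names are hypothetical, and the sample response is an illustrative reconstruction of a logged-out reply (with a documentation-range address), not captured output.

```python
from urllib.parse import urlencode


def userinfo_url(api_base: str) -> str:
    """Build the URL for an action=query&meta=userinfo request."""
    params = {"action": "query", "meta": "userinfo", "format": "json"}
    return f"{api_base}?{urlencode(params)}"


def current_user_name(response: dict) -> str:
    """Extract the calling user's name from a parsed userinfo response."""
    return response["query"]["userinfo"]["name"]


# Illustrative shape of a logged-out response (an assumption, not real output):
sample = {"batchcomplete": "", "query": {"userinfo": {"id": 0, "name": "192.0.2.1", "anon": ""}}}

print(userinfo_url("https://meta.wikimedia.org/w/api.php"))
print(current_user_name(sample))  # 192.0.2.1
```

A client written this way would keep working under the proposal as long as the "name" field continues to return the caller's own unmasked address when logged out.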

The IP address is shown on the page w:Wikipedia:Get my IP address, or we can get it in many other ways, not just on Wikipedia. Thingofme (talk) 01:44, 9 January 2022 (UTC)
Yes; as long as these ways of getting one’s own IP address are still there, there should be no problem. (As far as I can see, w:Wikipedia:Get my IP address is not a Special Page, and I don’t know whether there is an equivalent in every language version; for example, I don’t know of any in the German language Wikipedia, but I haven’t bothered looking for one, since de:Special:MyContributions aka de:Spezial:Meine Beiträge did the trick.) As for the API (meta=userinfo) it’s (probably) mostly a matter of avoiding breaking changes, in case there are any consumers out there relying on it returning an IP address (instead of a masked version thereof) as user name when queried without being logged in. --2A02:8108:50BF:C694:986:2570:11A4:4D82 12:05, 9 January 2022 (UTC)

Cryptographic advice

At this point, the talk page has become too cumbersome for me to parse and make heads or tails out of. But if the Wikimedia Foundation does go with "hashing", feel free to reach out to me if you'd like any cryptographic advice. I used to do cryptography related work for Mozilla and Twitter, now at Dropbox, and would be happy to threat model it if that's the route that y'all go down. — Marumari (talk) 02:20, 9 January 2022 (UTC)[reply]

Good when it comes to privacy protection, bad when it comes to LTA-users

In general, I do think that it is good to have more privacy protection for unregistered users. When it comes to privacy, this is definitely a good idea. However, couldn't that demotivate people from creating accounts? Also, smaller projects that struggle with constant attacks from LTA IP-offenders (e.g. Croatian Wikipedia) could take damage from that, at least in my personal opinion, but I am not making claims here. In the past, the IP helped us sysops to identify banned users who kept returning as IP users. If the IP is no longer visible, it might be a problem to identify banned users who stopped creating accounts and decided to edit/troll as an IP (often using a VPN). --Koreanovsky (Ča–Kaj–Što?!) 13:23, 12 January 2022 (UTC)

Access for community tools

Hi, I see the two approaches to IP masking (IP vs. session). I personally would choose the session-based approach, as it offers new opportunities for analyzing user activities (and keeps the option open for blocking or deleting cookies). However, I see that many editors will be against it. What we should ensure is that statistical and similar tools developed by community members (for example on toolforge.org) will not be broken after this move. Will we provide automatic access for these tools to the IP addresses (or session data), or have you thought about this issue? Samat (talk) 22:31, 15 January 2022 (UTC)