Talk:Privacy policy/Archives/2018

From Meta, a Wikimedia project coordination wiki

Reporting privacy violation - IP to real person mapping

Where can one report mappings, true or not, that have been made on Wikimedia sites from an IP address to a living person? 09:19, 18 May 2018 (UTC)

On the English Wikipedia, please contact oversight. On other sites, contact their local oversight process. For meta, and other sites without an oversight process, contact a Steward. TheDragonFire (talk) 11:33, 18 May 2018 (UTC)
Meta has its own oversighters. Meta:OS — regards, Revi 09:34, 22 May 2018 (UTC)

Consent and other changes

I'm glad that the new text removed the sentence «you consent to the collection, transfer, storage, processing, disclosure, and other uses of your information in the U.S.», which was quite user-hostile because users don't quite consent to anything. Saying «We will access, use, preserve, and/or disclose» is definitely more honest than «We may access, preserve, or disclose»; same for all the other removed "may" and removed «Information available through public logs will not include personal information about you».

The sentence «Depending on your jurisdiction, you also may have the right to lodge a complaint with a supervisory authority competent for your country or region» is also a welcome admission. The WMF often sounded tone-deaf on this matter in the past. --Nemo 17:56, 21 May 2018 (UTC)

Push for EU standards

There seems to be a wide push for USA companies to apply the standards of the GDPR worldwide, for all users:

The implications of GDPR may be unclear for our free software, but it would still be helpful if Wikimedia Foundation were ready to answer similar questions. --Nemo 06:06, 4 April 2018 (UTC)

Strongly agree. Unjustified expansion of the GDPR's application is a very dangerous and toxic initiative for the free knowledge world (and primarily for Wikimedia Commons). It is very important that the WMF Legal Team investigate this topic, publish a special appeal, and lead a public campaign against this expansion. @EHershenov (WMF) and TSebro (WMF): what is your opinion on this issue? Maybe someone is already looking into this? --Kaganer (talk) 15:35, 4 April 2018 (UTC)
GDPR is particularly dangerous for machine learning and data analytics, but we had to pay this price at some point. Erkin Alp Güney (talk) 07:30, 22 May 2018 (UTC)
Hi Kaganer. Since privacy is one of the key values of the Wikimedia movement and the Wikimedia Foundation, the Foundation legal team works hard to monitor related developments all around the world, not just in law, but also in users' expectations and understanding of what constitutes good privacy practices. Privacy is much-discussed these days for a variety of reasons, and we are using this current global conversation as impetus to strengthen our own practices. That's why we reexamined our Privacy Policy and decided to improve it with some minor changes, and why we want to hear from users about what they'd like to see as our privacy practices continue to evolve. We will continue to support and protect the projects and the movement, including user privacy, and we welcome everyone’s feedback on how we can best achieve this. TSebro (WMF) (talk) 21:40, 23 May 2018 (UTC)
@TSebro (WMF): Dear Tony! Are you sure that your reply actually answers the questions asked above? Could we see your replies in a less "abstract" form, so that we do not get the impression that these are a robot's answers? --Kaganer (talk) 23:07, 23 May 2018 (UTC)


It's troubling that the updated FAQ section and the link on "Wikipedia:Courtesy_vanishing" make specific reference to a guideline specific to the English Wikipedia. What does "for further reference" entail? --Nemo 17:32, 21 May 2018 (UTC)

Hi, could you clarify where you’re seeing that problematic link to enwp’s Courtesy vanishing policy? I can’t find it in the FAQ, but wherever it is, I think we can replace a link to enwp vanishing policy with the Meta version of the page I posted yesterday, so if I can get a pointer I think I can fix this. Kbrown (WMF) (talk) 13:54, 23 May 2018 (UTC)
Sure. Here you go. Hope this helps. All links to the English Wikipedia anywhere must be removed and the text rephrased so that it's actually valid for all users. Thanks, Nemo 06:35, 24 May 2018 (UTC)

Hidden text

Why move the paragraph about "Your username will be publicly visible" in the collapsed box? This also breaks the translation unit and forces new translation for all languages. Please just revert (the text is identical).

The paragraphs on "Publicly Visible Information" were hidden too. --Nemo 17:56, 21 May 2018 (UTC)

Hi Nemo_bis. Our goal in moving some text to collapsed boxes was to make the policy more readable overall. This makes it easier for people to read the main points of the privacy policy, while giving them the opportunity to expand out particular sections if they wish to read further details. We welcome community perspectives on how we can best present this information. If others have opinions about the collapsible box, please let us know. TSebro (WMF) (talk) 21:48, 23 May 2018 (UTC)
I welcome any effort to improve readability. Hiding the dust under the rug is no cleanup though. If you're serious about this, I suggest that you set yourself a goal of reducing the privacy policy to half of its current size (let's say 20k characters less), to bring it closer to what it used to be.
Unnecessary fluff can be removed, then you won't have a need of collapsed text. Of course the information about IP addresses being made public is not among the information which would be removed. --Nemo 06:39, 24 May 2018 (UTC)

Comparison with Wikia

I think makes for an instructive reading. --Nemo 12:12, 23 May 2018 (UTC)

Navigation and clarity

Would it be possible to add some sort of headings or anchors or TOC for the convenience of those trying to communicate with volunteers who are trying to revdelete or oversight, particularly the section defining "personal information"?

According to, the "Privacy policy" document page is more than 5000 words long, is written on a college graduate level, and takes 19 minutes to read.

The section I am referring to in particular is in the first table under the subheading "definitions" under what looks like a Level 2 heading "welcome". It is in the 6th cell down from the top. If that description is confusing, this is why it needs navigation aids. I once exchanged about a dozen emails with a Berkeley-educated admin trying to describe a particular PII violation before they could understand what I was talking about. My conversations with non-native English speaking oversighters have been even more of an adventure. Can this language be made more user friendly, for instance "(c) any of the items in subsections (a) or (b) when associated with your user account"? 'My' user account? "a" and "b" and "c" all together plus the user name? I have to read this several times to get it; at least I think I get the intent of the paragraph.

There is also no mention of geolocation. I have seen someone take an IP and use it to approximate a user's location and associate that city or neighborhood with the user name; this has happened off-wiki in particular. Is it worth mentioning location separately? Did you know that 87% of users in the U.S. could be matched to a database with nothing more than their zip code, sex, and birth date?

I see also the definition of "medical conditions or disabilities" has been changed. There used to be a link to a fuller explanation, that someone could be on a dialysis list or organ transplant list, and revealing this information could lead to them being identified IRL. The way it has been changed to read makes it look like you cannot say someone has a cold. This might be particularly chilling for someone trying to report a potential medical emergency.

Has anyone asked the people who actually use this to request oversight or process the requests what they think? Some of these people probably do not want to comment publicly, considering the subject matter, but they are the ones who can tell you more about how to make it more usable. —Neotarf (talk) 00:02, 25 May 2018 (UTC)

Checklist for digital security

This is a more general/abstract/geeky comment. I have just run across this checklist for digital security that might provide a tool for thinking about the privacy policy.

Checklists have been used in aviation and in the hospital industry; this is about using them for other applications. For anyone who wants to go down this rabbit hole, here is a list of associated links:

  • Digital Security and Privacy Protection UX Checklist (DSPPUX-Checklist) "This checklist provides suggestions to promote digital security and privacy for people who are designing and developing tools for targeted communities."
  • "The Checklist: If something so simple can transform intensive care, what else can it do?" by Atul Gawande in The New Yorker
  • Link to tweet about github checklist from @geminiimatt
  • "The Secret to Ensuring Follow-Through" by Peter Bregman, Harvard Business Review (more on checklists to keep the ball from dropping during handoffs)

It seems like some of these ideas could be used to evaluate Wikipedia institutions and processes.

Just for starters, looking at the github checklist, where is WMF sensitive data stored? The arbitration committee has a notoriously leaky mailing list, and emails containing PII are retained by the individual arbitrators, even if they have intentionally disclosed PII in the past. Information remains on the mailing list in perpetuity, and is available to each new tranche of arbitrators. Yet, only a few weeks ago, I was asked by the committee to disclose information to the mailing list that could be used to identify me in the future.

From the New Yorker piece: "Pronovost [corporation] also insisted that each participating hospital assign to each unit a senior hospital executive, who would visit the unit at least once a month, hear people’s complaints, and help them solve problems...The executives were reluctant. They normally lived in meetings worrying about strategy and budgets...In some places, they encountered hostility. But their involvement proved crucial. In the first month,[description of a specific problem]... This was a problem only an executive could solve."

With Wikipedia's decentralized problem-solving structure, when something isn't working, who do you call? —Neotarf (talk) 01:12, 25 May 2018 (UTC)


I am curious about whether the current policy is compliant with GDPR when it comes to European people in its databases. For example, has every donor currently receiving solicitations (by email) to make a donation explicitly given consent for it? Or is WMF boldly deciding not to follow the requirements in that policy? Thank you for the answer. Anthere (talk) 21:20, 21 May 2018 (UTC)

It goes without saying that the current change of policy seems to fit particularly well with the GDPR calendar... Anthere (talk) 21:21, 21 May 2018 (UTC)

Just to avoid confusion: The changes do not make the privacy policy GDPR compliant: A lot of the mandatory information required by art. 13 GDPR is missing. This is a little confusing in light of the timing of the current changes, which would suggest otherwise. —Gnom (talk) Let's make Wikipedia green! 21:47, 21 May 2018 (UTC)

ok. Thank you for the clarification. Anthere (talk) 17:25, 22 May 2018 (UTC)
Official summary on article 13 etc.: --Nemo 20:47, 22 May 2018 (UTC)

WMF policy is not compliant with the GDPR on at least the following points, and probably more. It is not clear:

  • for how long the data will be kept;
  • who else might receive it;

You're much too late to make changes IMNSHO. Perhaps you are following USA laws and ignoring those valid in the EU? Klaas `Z4␟` V 17:33, 23 May 2018 (UTC)

See also topic about Data Protection Officer Taraseq (talk) 13:26, 24 May 2018 (UTC)
I am not convinced that the WMF needs a Data Protection Officer. But ianal. Not worried about this one.
However, I know that the WMF currently collects quite a bit of money from donations coming from European donors. And I know for a fact that it uses a database of previous donors, and that the WMF sends emails every year to solicit further donations. This database contains information about European donors. Hence my curiosity with regard to compliance: as a previous donor and a recipient of those targeted emails, I would somewhat expect the topic to be raised. Anthere (talk)
@Anthere: this privacy policy does not deal with that I think. By going to you will find a link to the FAQ, which links to the Donor privacy policy. I agree that 1, that link should be directly advertised on, and 2 some crosslinking might be worthwhile. —TheDJ (talkcontribs) 11:05, 25 May 2018 (UTC)
There is a direct link in the fine print: "By donating, you agree to share your personal information with the Wikimedia Foundation, the nonprofit organization that hosts Wikipedia and other Wikimedia projects, and its service providers pursuant to our donor policy." Jeroen N (talk) 11:50, 25 May 2018 (UTC)

Displaying IP addresses of anon-users

One of the things that always surprised me about mediawiki is that we publicly expose the IP address of editors who we claim to be "anonymous". In reality, through both our software and policies, registered editors are far more anonymous (or pseudonymous) than unregistered editors. Exposing IP addresses was, I believe, never an active, specific choice, yet it is something upon which we have built many tools to help vandal-fighting and sockpuppet-fighting. However, it seems to me to be quite contrary to our general culture of being extremely privacy-conscious. It is also counter to the practice of most (all?) other websites which allow unregistered contributions, where newbies are automatically assigned a random username, e.g. 'Newbie123456789'.

Even if it is not legally required of us by GDPR or other regulations, it feels to me to be the right thing to do to NOT display IP addresses, and instead display an auto-generated ID number.... Admins, or Checkusers or some other level of user-right should still be able to query for the IP address to do vandalfighting of course. Whether the auto-generated ID should be persistent to the IP address, and whether such a system should be retroactive are questions of software implementation (I would argue for "no" in both cases, personally).

Thoughts? Wittylama (talk) 12:49, 22 May 2018 (UTC)

Note, "we" (as in Wikimedia and MediaWiki) don't claim they're anonymous. The official term is unregistered user. --Nemo 20:44, 22 May 2018 (UTC)
I think I agree with Wittylama. I bet this has already been discussed elsewhere, though. --Gnom (talk) Let's make Wikipedia green! 04:43, 23 May 2018 (UTC)
The main reference is mw:Requests for comment/Exposure of user IP addresses. --Nemo 06:23, 23 May 2018 (UTC)
Thanks for the suggestion, Wittylama. We've passed it along to our technical teams for consideration. TSebro (WMF) (talk) 21:55, 23 May 2018 (UTC)
I think your reply here is incredibly lame and disappointing. --MZMcBride (talk) 03:04, 24 May 2018 (UTC)
Hi Wittylama. I would certainly be interested if you gave these ideas more thought and probed the nuance here further. How and whether we handle users who have not logged in is a pretty complex subject. For example, we could pretty trivially require that all users log in, which pretty neatly solves the problem of exposing IP addresses. Do we want to do that today? Do we ever want to do that? How much of the wiki's strength and identity is tied to the ability of drive-by, casual contributors to make edits? Do we want to retain that capability?
Brion has suggested eliminating the use of IP addresses entirely. Your position seems to be more of a compromise, where IP addresses are still retained, but less exposed. This is not a novel idea, but it quickly raises difficult implementation questions. Namely, how would this actually work? Would every edit that's not logged in get a new unique user ID? If so, how do you prevent abuse? How would we track a single computer user editing across many articles? If you try to persist the identity for more than one edit, how long do you do this? Let's say you can auto-create an account for the first edit and assign it a user name such as "Newbie123". What password would this user account have? What e-mail address would be associated with it? How long would it stay logged in and once it's logged out, how would anyone get back in?
Consider a bit deeper what you're asking for regarding the privacy of accounts. You acknowledge that knowing which IP address "Newbie123" is using could be important and we may need to expand the pool of users with access to this information to include administrators. This stands pretty directly in contrast with the people who think storing IP addresses at all is risky and problematic. And even to people like you who are seeking a compromise solution, you'd be expanding the pool of people with access to what previously would've been considered private and confidential info (IP address info of registered users). Is it reasonable to expand the pool of users who can view the IP addresses of users? Would such an expansion be better than the status quo of exposing IP addresses to the general public? Are we okay with destroying the ability of projects such as <> to function?
As Nemo notes, there's a draft document about this topic. You're more than welcome to help expand it. --MZMcBride (talk) 03:25, 24 May 2018 (UTC)
I have the same thoughts as Wittylama. Maybe it should just say "Unregistered User" rather than showing the IP for years. IPs, at least IPv4 ones, can be shared. On articles it's not important who is who, and on discussions it's often clear, or someone can give himself a nickname, or maybe different colors could be used. Everyone who wants to be anonymous had better register (under this or that name). --Kungfuman (talk) 19:36, 24 May 2018 (UTC)
I fully agree with Wittylama. The danger of IP is something a user may not be aware of. I think it is very wrong to display them in clear. I would support having them converted to some random ID per IP, so contributions from a single IP can be tracked but other editors will not be able to tell where they came from. It should be possible for a highly privileged person to find the corresponding IP for purposes like finding trolls. Possibly a new identity should be generated if the last access was some time ago, say three months, as that possibly indicates a new user is using a dynamic IP. I believe displaying IPs is by far the greatest privacy problem; it is a real problem that can cause danger to people in repressive regimes, and it should be fixed as soon as possible. Dmcq (talk) 23:07, 24 May 2018 (UTC)
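For illustration only, the "random ID per IP, renewed after a period of inactivity" idea in the comment above could be sketched roughly as follows. This is a minimal in-memory sketch, not anything MediaWiki implements: the class name `PseudonymRegistry`, the `Unregistered-` prefix, and the 90-day window are all assumptions made up for the example, and a real system would need persistent storage and access control around the reverse lookup.

```python
import secrets
from datetime import datetime, timedelta

# Hypothetical inactivity window ("say three months" in the comment above).
INACTIVITY_LIMIT = timedelta(days=90)


class PseudonymRegistry:
    """In-memory sketch mapping an IP address to a random display ID."""

    def __init__(self):
        # ip -> (display_id, last_seen); meant to be visible only to privileged users.
        self._by_ip = {}
        # display_id -> ip, for privileged reverse lookups.
        self._by_id = {}

    def display_id(self, ip, now):
        """Return the public ID to show for an edit made from `ip` at time `now`."""
        entry = self._by_ip.get(ip)
        if entry is not None and now - entry[1] <= INACTIVITY_LIMIT:
            ident = entry[0]
        else:
            # First edit, or the IP has been idle too long: issue a fresh random ID.
            ident = "Unregistered-" + secrets.token_hex(4)
            while ident in self._by_id:  # avoid reusing an ID already handed out
                ident = "Unregistered-" + secrets.token_hex(4)
            self._by_id[ident] = ip
        self._by_ip[ip] = (ident, now)
        return ident

    def reveal_ip(self, ident):
        """Privileged lookup: return the IP behind a display ID, or None."""
        return self._by_id.get(ident)
```

In this sketch two edits from the same address within the window share one ID, an edit after a longer gap gets a new one, and only callers of `reveal_ip` can go from an ID back to the address.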

If one were to keep the status quo, i.e. the display of unregistered users' IP addresses, but apply a universal one-to-one mapping from every possible IP address to a different ID, it should be simple to implement. The set of people that are able to view IP addresses would be reduced dramatically. All the current tools for dealing with trolls and defacing of encyclopaedia entries would be retained. It would be a good improvement in privacy with little effort, and it provides time to consider whether there are better options. Persistence of the ID number would be identical to the existing persistence of unregistered users' IP addresses. The worst case is that someone decrypts the mapping, and you end up with the status quo, i.e. unregistered users' IP addresses being available. — The preceding unsigned comment was added by (talk)
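As a rough sketch of such a fixed mapping, one option is a keyed hash (HMAC) of the address. This is an assumption for illustration, not what the comment above specifies: the function name `pseudonym_for_ip`, the `Unregistered-` prefix, and the 16-character truncation are invented here. A keyed hash is one-way rather than a reversible encryption, so a privileged lookup would need a separately stored table, and the truncated output is one-to-one only in practice, not by construction. The secret key matters because the IPv4 space is small enough that an unkeyed hash could be reversed by trying every address.

```python
import hashlib
import hmac
import ipaddress


def pseudonym_for_ip(ip, secret_key):
    """Derive a stable display ID from an IP address and a server-side secret.

    `secret_key` must be bytes and is assumed to be known only to the server.
    """
    # Normalise first so that equivalent spellings of one address give one ID.
    packed = ipaddress.ip_address(ip).packed
    digest = hmac.new(secret_key, packed, hashlib.sha256).hexdigest()
    return "Unregistered-" + digest[:16]
```

The same address always yields the same ID while the key is unchanged, so existing per-IP contribution tracking would keep working on the IDs. If the key leaked, anyone could recompute the IDs for all addresses, which corresponds to the worst case described above.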

It's more complicated than just GDPR. Remember that people are legally licensing the content they add to the page (as mentioned in the line just above the Publish Changes button). They are signing a contract of sorts. And contracts are an important exception to GDPR. Now I agree that our implementation is somewhat balancing on the edges of both contract law requirements and GDPR requirements, and I think that is exactly why you won't hear the lawyers comment on this. I think this can only be really settled with court cases. —TheDJ (talkcontribs) 08:52, 25 May 2018 (UTC)

Comment from unregistered user

80% of constructive edits come from IPs. However, when an IP removes an edit by a tendentious registered editor, the editor adds it back, claiming that 80% of vandalism comes from IPs. He then goes to RfPP or a friendly administrator's talk page and requests semi-protection. He then opens the IP's contribution record and reverts all the other edits the IP made. The administrator who does the protection adds the page to his watchlist, accesses the IP's contribution record, reverts any edits by the IP which have not been reverted and protects all the other pages the IP has edited. Every time a page on the watchlist is edited by an IP (should protection have expired) the same thing happens. The Foundation is presumably unaware this is happening because in Privacy policy/FAQ it says

A local community of editors or contributors (for example, the English Wikipedia community or the Malay Wiktionary community) or the Wikimedia Foundation itself may decide to place temporary or permanent restrictions on what you can change.

These permanent protections are made on the authority of a single administrator who gives no reasons and notifies his action to nobody. The protections are unchallengeable because other administrators are not permitted to remove them. At the risk of trivialising the scale of the problem here is the revision chronology of three articles on three days in the summer of 2015:

en:Hemen Majumdar

22 August

  • 13:48 IP corrects the birthplace from Bongaigaon to Kishoregani, adds the death date (22 July 1948), corrects the occupation from "Philosoper" [sic] to "Painting" and corrects the birthdate in the persondata from 19 September 1871 to 14 April 1894 (the correct date is already in the infobox)
  • 14:12 Jc3s5h reverts
  • 14:17 IP reverts, pointing out that the corrections are sourced
  • 14:21 Jc3s5h reverts
  • 15:25 IP restores correct information
  • 15:26 Favonian reverts
  • 17:51 IP restores the correct information, pointing out that "Philosoper" is not a word
  • 17:55 Favonian reverts
  • 17:56 Favonian protects the page

en:Greenwich Mean Time

21 August

  • 20:31 IP adds the essential information that the Greenwich Time Signal is wrong (by many seconds) when broadcast over digital radio
  • 20:46 change is accepted by a senior editor who holds a science degree from Oxford University

22 August

  • 12:57 Jc3s5h reverts
  • 13:41 IP restores the change, pointing out that there is consensus for it
  • 14:07 Jc3s5h reverts
  • 14:20 Jc3s5h requests page protection
  • 14:26 NeilN protects the page

en:Prime meridian (Greenwich)

23 August

  • 11:40 IP removes a link to a disambiguation page and replaces it with a direct link to the relevant article
  • 12:03 Jc3s5h reverts
  • 12:30 IP restores the link and says (s)he has raised the matter at ANI. The ANI filing ends with this statement:

The examples of edits that he has provided appear to have little to with the case as given. In addition, the case page has been semi-protected to ensure that none of the IP addresses accused can respond to the case (a request for removal has been filed). It is worth noting that there has been a flurry of editing activity at Prime meridian (Greenwich) within which two editor4s are attempting to WP:OWN the article by objecting to what others are editing in. I have to note that Jc3s5h who is also attempting to own the article has also chimed in and reverting the same good faith (and correct) edits.

- 12:24, 23 August 2015

A concomitant filing at RfPP notes:

This is a case page raised against a sockmaster, but is alleging that a number of dynamic IP address are the sock puppets of the sockmaster. The IP's are vaguely related in that they all geolocate to arround the London area (not a reliable guide because dynamic IP address will always geolocate to the server when they are not assigned).

In spite of the allegation being predominantly against IP address editors, the page has been semi-protected to make sure that none of those IP address editors can actually respond. It is believed that the case has been raised in an attempt to prevent at least one IP editor from editing in the same article as the person who raised it. If the semi-protection is not going to be removed, then the cae should be closed on the basis that response to the allegations has been blocked. - 12:05, 23 August 2015

The ANI complaint continues:

Since I have been deliberately prevented from responding at the SPI. According to the evidence put up by Jc3s5h, everyone who edits Wikipedia and lives in London must be sockpuppets of each other. That is his sole evidence. With a population of 8.6 million people, I await all the future SPI cases whenever he dislikes anyone's editing.

- 12:41, 23 August 2015

  • 12:45 Jc3s5h requests semi-protection of Prime meridian (Greenwich)
  • The IP responds at ANI:

That is nothing more than your determined attempt to make sure that I cannot respond to your malicious allegations where you made them. I note that you continue to add irelevances secure in the knowledge that they cannot be answered... And both users, are intent on proving sockpuppetry relying solely on the fact that the accused happens to geolocate to London. Problem is: that the case falls down because this IP address does not geolocate to London, but around 10 miles or so east of where I am (though it probably will do if my IP address changes - something over which I have no control).

- 13:04, 23 August 2015

  • At RfPP the IP responds:

See next section. This is an attempt by Jc3s5h to prevent a user from editing his article. He has also taken to raising an SPI making surious allegations based on evidence no stronger than that I live in London (which I don't by the way) and protecting the SPI so that the accused cannot answer. I also submit that this is also in relaliation to the recent ANI case raised.

  • 14:30 NeilN reverts the correction at Prime meridian (Greenwich)
  • 14:31 NeilN protects Prime meridian (Greenwich)

NeilN has the final word at RfPP:

If infestation continues, next duration will be longer.

- NeilN 14:32, 23 August 2015

Nobody determined the unprotection request, however NeilN made this comment:

Note I have blocked the IP as a sock. -- NeilN 14:57, 23 August 2015

A registered editor attempted to enforce fair play on the SPI page:

None of the IP addresses are anything other than what they appear to be. Legitimate IP addresses used by the UK's largest telecom provider that geolocate more or less exactly where they say they are.

- DieSwartzPunkt 17:26, 23 August 2015

but with no luck:

This tells me you didn't take my suggestion and/or have not done any analysis (or are ignoring it).

-- NeilN 17:31, 23 August 2015

Exemption for "academic, artistic or literary expression"

Art. 4(1) of the GDPR defines 'personal data' as:

"[A]ny information relating to an identified or identifiable natural person; an identifiable natural person is one who can be identified, directly or indirectly, in particular by reference to an identifier such as a name, an identification number, location data, an online identifier or to one or more factors specific to the physical, physiological, genetic, mental, economic, cultural or social identity of that natural person."

It seems clear to me that IP addresses would likely be considered 'personal information' that is protected by the GDPR. However, there is an important exception to the GDPR regulation that is found in Art. 85, which says:

(1) "Member States shall by law reconcile the right to the protection of personal data pursuant to this Regulation with the right to freedom of expression and information, including processing for journalistic purposes and the purposes of academic, artistic or literary expression."
(2) "For processing carried out for journalistic purposes or the purpose of academic artistic or literary expression, Member States shall provide for exemptions or derogations from Chapter II (principles), Chapter III (rights of the data subject), Chapter IV (controller and processor), Chapter V (transfer of personal data to third countries or international organisations), Chapter VI (independent supervisory authorities), Chapter VII (cooperation and consistency) and Chapter IX (specific data processing situations) if they are necessary to reconcile the right to the protection of personal data with the freedom of expression and information."
(3) "Each Member State shall notify to the Commission the provisions of its law which it has adopted pursuant to paragraph 2 and, without delay, any subsequent amendment law or amendment affecting them."

The legal status of Article 85 of the GDPR is that it requires Member States to enact certain laws on the subject. Unfortunately, however, I have not found any guide to how Member States have enacted this provision into their national laws. The right to freedom of expression in Europe is far narrower than the right under the First Amendment in the United States. In addition to protection under various national laws/constitutions, w:Article 10 of the European Convention on Human Rights (links to en-WP) says in relevant part:

"Everyone has the right to freedom of expression. This right shall include freedom to hold opinions and to receive and impart information and ideas without interference by public authority and regardless of frontiers. ... The exercise of these freedoms, since it carries with it duties and responsibilities, may be subject to such formalities, conditions, restrictions or penalties as are prescribed by law and are necessary in a democratic society ... for the protection of the ... rights of others."

In my opinion, it is likely that the use of IP addresses to identify Wikipedia contributors would be protected freedom of expression and not subject to GDPR's restrictions for several reasons: (1) it is necessary to attribute content to editors for the purpose of copyright; (2) the contributors knowingly agree to irrevocably publish their content under the Creative Commons license, which requires attribution; (3) the corollary of #2, which is that since the content of most Wikimedia projects is licensed under CC BY-SA, the other editors need to know the identity of contributors to properly attribute content; (4) the contributors knowingly agree to publish their content knowing that other editors will make adjustments to their content; and (5) the needs of Wikimedia projects to prevent vandalism. I updated the content of w:Wikipedia:Data mining on English Wikipedia in March to warn data miners of the problems GDPR presents and am going to clarify the information even more after posting this comment. AHeneen (talk) 03:00, 26 May 2018 (UTC)

Confidential data

I see that a sentence was removed: «If you choose to provide your email address, we will keep it confidential, except as provided in this Policy». Other sections kept similar sentences, for instance «We keep IP addresses confidential» and «We keep information obtained by these technologies confidential», in addition to «In the extremely unlikely event that ownership of all or substantially all of the Foundation changes, or we go through a reorganization (such as a merger, consolidation, or acquisition), we will continue to keep your personal information confidential».

What does this mean? Does it mean that other parts of the policy can allow such data to be shared even without saying it explicitly? --Nemo 17:56, 21 May 2018 (UTC)

  • It looks to me like the biggest change is to how email addresses will be handled. The changes around it imply that the foundation will now use email addresses to solicit funds. Also, WMF will possibly share email addresses with other entities that further its "charitable mission." Why was the language that protected email addresses removed? It looks a little like a sleight of hand maneuver to allow the giving of information (not "sell") to other organizations that may have donated or contributed to WMF "charitable mission" with no definition of what the mission is or who may use the information to further it. — The preceding unsigned comment was added by 2600:8800:1300:16E:F15F:D980:8971:23A0 (talk)
    Really? I've read the changes in the exact opposite way, making the policy more permissive about emails (though arguably nothing extraordinary): see #Confidential data. --Nemo 20:43, 22 May 2018 (UTC)
    I think you two have the same point. --Gnom (talk) Let's make Wikipedia green! 04:46, 23 May 2018 (UTC)
  • Hi Nemo_bis. We are not changing our email handling practices. We just removed this sentence for the sake of clarity and readability. As the policy provides elsewhere, we are committed to keeping Personal Information, including email addresses, confidential as described in the policy. When you use the "Email this user" feature, your email address may become visible, as disclosed in the interface. TSebro (WMF) (talk) 21:45, 23 May 2018 (UTC)
    If you think that sentence was redundant, why did you keep it in two other paragraphs? I know how Special:EmailUser works, but your sentence here may be misunderstood: MediaWiki never makes the email address visible. The users' email clients do. --Nemo 06:43, 24 May 2018 (UTC)
    TSebro (WMF), I wasn't specific enough. In the section regarding emails/data, a new phrase appears: "charitable mission." It appears in different ways; "We and our service providers use your information for the legitimate purpose of pursuing our charitable mission, including:" is one such instance. This appears to broaden how WMF may use information. It isn't just relocation; it's a new phrase with "charitable mission" (or in another place in all instances, in keeping with our legitimate charitable purpose of pursuing our mission). This seems like an expansion of how the data will be used and who may use it. If it's not, why the new phrasing? 2600:8800:1300:16E:44A8:FE7:35B4:CB3D 09:28, 27 May 2018 (UTC)


What motivations are behind this change? I assume there are financial benefits - perhaps to offset server costs?

I assume the implementation of the EU's General Data Protection Regulation on 25 May 2018. Alexpl (talk) 08:37, 24 May 2018 (UTC)
As User:Gnom already pointed out, these changes do not make the Foundation GDPR compliant. They also don't mention anywhere that that is the intention behind these changes, as far as I can see. The timing looks like a strange coincidence. Jeroen N (talk) 09:21, 24 May 2018 (UTC)
See also the forum about Data Protection Officer. Klaas `Z4␟` V 20:53, 24 May 2018 (UTC)
We made these minor changes to the text to improve readability and clarify what the policy means by the phrase "personal information." We are also collecting any additional comments on how our privacy practices can be improved. TSebro (WMF) (talk) 04:56, 5 June 2018 (UTC)


So - can Google somehow get my IP address from Wikimedia/Wikidata/Wikipedia despite me editing there under a username only - yes or no? Alexpl (talk) 08:40, 24 May 2018 (UTC)

If you don't visit Google, only Wikimedia projects, then the answer should be no. Stryn (talk) 14:03, 24 May 2018 (UTC)
To confirm, Stryn is correct: no, Google does not get your IP address or any nonpublic personal information. TSebro (WMF) (talk) 04:58, 5 June 2018 (UTC)
I thought they were somehow engaged with Wikidata... Alexpl (talk) 15:51, 24 May 2018 (UTC)
They aren't. --Nemo 15:55, 24 May 2018 (UTC)

Diff quality

The quite unreadable diff doesn't help the conversation. I recommend that you revert it and apply things like translation unit changes and uppercase changes in separate diffs. Also, some translation units don't follow best practices for translatability. --Nemo 17:28, 21 May 2018 (UTC)

Is there a reason a link to the diff wasn't included in the blog post? Or some kind of summary of the changes? I read a few references to "minor edits" without a description of what was actually changing. I eventually found <> myself, and I agree with Nemo that this diff is not enjoyable to read, even for long-time editors. It's not immediately clear which paragraphs were removed, which were added, which were reformatted, and why. --MZMcBride (talk) 23:40, 21 May 2018 (UTC)

I tried to do my own quick diff, but it's still nasty. Attempt 1: <>. Attempt 2: <>. Woof. --MZMcBride (talk) 02:05, 23 May 2018 (UTC)
  • Agreed, it's pretty disappointing that there's no plain English summary of the changes that I can see, let alone no easy-to-access full diff. Come on WMF; y'all can do better than this. — OwenBlacker (Talk) 06:00, 22 May 2018 (UTC)
  • I too came here looking for the actual changes being made. I saw the banner announcement, read the blogpost, read the message on the mailing list... but other than saying that there are some minor changes, nowhere does it actually tell you what these changes actually are. If they're that minor it should be easy to identify them. Since the timing is specifically the same as GDPR, and yet the comments here on this talkpage indicate that these changes do not actually address GDPR issues, is this just a conspicuous coincidence? Wittylama (talk) 12:41, 22 May 2018 (UTC)
  • +1 - Just had the privacy banner appear hence the lateness - Like everyone above I too came here to see what had actually changed ..... I didn't really expect diffs .... just a "this has been added" and "this has been removed" ..... Without sounding disrespectful I'm not going to spend all my life reading the Privacy policy (FWIW I don't read any of that on other sites either), I guess I would just have liked to know what those minor changes were –Davey2010Talk 01:21, 24 May 2018 (UTC)

As the person who did most of this wikification, I guess I’m the best person to answer this. The answer is, unfortunately, not terribly satisfying: a combination of limitations on the way the content was built up and updated and constraints on version control between different formats as the text wandered through various processes meant that we ended up having to choose between getting the content up in a timely manner or getting the diff viewability and translation markup perfect. The team estimated that the latter would be a considerable additional time investment and we chose the former in this instance and, well, here we are.

So, the bad news is that we don’t really have any feasible way to go back and re-do all the changes in a more diff-able manner, because the changes don’t exist, even on our end, in that format. The good news is that my team has been working with Legal this week on a better way to address the version control issue going forward. Kbrown (WMF) (talk) 13:53, 23 May 2018 (UTC)

@Kbrown (WMF): a mark-up page with stricken and inserted text seems like it would be the simplest option for something like this. I agree with everyone on this page that trying to determine what was changed is challenging even for us seasoned editors. — xaosflux Talk 01:49, 24 May 2018 (UTC)
Kbrown, nobody has asked that you rewrite history. Just read your own diff, separate it into different parts, revert your edit and apply all the different edits separately. You can save the wikitext in local files and test what the diff looks like in a sandbox, don't worry. --Nemo 06:31, 24 May 2018 (UTC)
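
The marked-up page with stricken and inserted text suggested above can be sketched in a few lines of Python. This is only a minimal illustration using the standard library's difflib, not the tool anyone in this thread used; the function name markup_diff and the choice of <del>/<ins> tags are assumptions made for the example.

```python
import difflib
import html


def markup_diff(old: str, new: str) -> str:
    """Return a word-level diff: removed words in <del>, added words in <ins>."""
    old_words = old.split()
    new_words = new.split()
    # autojunk=False keeps frequent words from being treated as noise.
    matcher = difflib.SequenceMatcher(a=old_words, b=new_words, autojunk=False)
    parts = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        removed = html.escape(" ".join(old_words[i1:i2]))
        added = html.escape(" ".join(new_words[j1:j2]))
        if tag == "equal":
            parts.append(added)
            continue
        if removed:
            parts.append(f"<del>{removed}</del>")
        if added:
            parts.append(f"<ins>{added}</ins>")
    return " ".join(parts)


if __name__ == "__main__":
    # Two wordings quoted earlier on this page.
    print(markup_diff(
        "We may access, preserve, or disclose",
        "We will access, use, preserve, and/or disclose",
    ))
```

A word-level comparison like this only shows in-place rewording; paragraphs that were moved between sections would still show up as a deletion in one place and an insertion in another.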

@Nemo bis, MZMcBride, OwenBlacker, Waldir, Wittylama, Davey2010, and Xaosflux: I created a hopefully more easily-readable diff over at Privacy policy/2018 update. Hopefully that's more useful. :) I should probably also note that I made this myself and it's not totally impossible I missed some stuff where the changes involved section moves (which are more difficult to display in a format like this). Joe Sutherland (Wikimedia Foundation) (talk) 21:52, 29 May 2018 (UTC)

@JSutherland (WMF): Thank you; good work! — OwenBlacker (Talk) 22:23, 29 May 2018 (UTC)
Thanks so much JS :), –Davey2010Talk 22:40, 30 May 2018 (UTC)

What are the changes???

Even after spending 1/2 hr chasing down links, I have no friggin' idea what the changes are. Whatever you are doing is not, and I repeat not, transparent. G41rn8 (talk) 02:29, 23 May 2018 (UTC)

Hi G41rn8. Yes, a few of us have been wondering the same in the #Diff quality section of this talk page. It's confusing why it's so difficult to discern what changed. --MZMcBride (talk) 01:41, 23 May 2018 (UTC)
I strongly agree, MZMcBride! --G41rn8 (talk) 02:29, 23 May 2018 (UTC)

Do we know what the changes are yet? Jason Quinn (talk) 14:57, 27 May 2018 (UTC)

@Jason Quinn: I had a go at creating a more easily-readable diff at Privacy policy/2018 update. Perhaps that's useful. Joe Sutherland (Wikimedia Foundation) (talk) 21:48, 29 May 2018 (UTC)
@JSutherland (WMF): Brilliant. Belated big thanks for that. That's a great way to display the info and this should be done for all changes in the future. Jason Quinn (talk) 13:15, 10 June 2018 (UTC)

Template issue on policy page

If you scroll down the page and to the collapse box containing items that the policy does not cover (titled "More on what this Privacy Policy doesn’t cover"), a </noinclude> closing wiki markup tag can be seen. It looks like it's coming from Template:Anchor. I would have just fixed it, but only accounts can edit on and I don't have one... Who has an account there that can fix this? ~Oshwah~(talk) (contribs) 22:53, 6 July 2018 (UTC)

Hey Oshwah - just fixed it, thanks. Joe Sutherland (Wikimedia Foundation) (talk) 22:07, 7 July 2018 (UTC)

Allow people contributing pictures to conceal camera make and model for privacy reasons

When I uploaded some photos taken using my smartphone I didn’t realise that the Wikimedia website would display all the EXIF metadata from the camera. Please can you add a privacy feature to the user account to hide camera make and model information for my contributions? Adrian816 (talk) 14:23, 27 February 2018 (UTC)

I think that's a technical question about removal of the EXIF metadata from your existing uploads, and about removing EXIF during your future uploads. Perhaps ask Commons:Village pump? --Gryllida 22:30, 27 February 2018 (UTC)
If your photo is displayed somewhere and you download it from there, the downloaded file has no EXIF metadata. The problem though is that without the metadata, there is no evidence that you're the photographer. Guido den Broeder (talk) 23:47, 1 March 2018 (UTC)
But metadata can easily be edited, so it is not evidence in the general case. --Kaganer (talk) 14:49, 10 April 2018 (UTC)
In my view, showing a file's metadata on Commons improves the transparency and openness of the project. But many people, like Adrian, may not be aware of this. We could suggest a notice or banner in the upload form so that people are informed of this technical characteristic. ProtoplasmaKid (WM-MX) (talk) 20:17, 21 May 2018 (UTC)
ProtoplasmaKid, Good idea. Rutheni (talk) 10:15, 24 May 2018 (UTC)
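
For readers wondering what "removing EXIF" before an upload involves, the following is a rough sketch in standard-library Python. It is not how Commons or MediaWiki handles files; the function name strip_exif is hypothetical. It assumes a baseline JPEG in which the EXIF data sits in an APP1 segment before the image data, and it drops the whole EXIF block (camera make and model, but also orientation, GPS and everything else in it) rather than individual fields.

```python
import struct


def strip_exif(jpeg: bytes) -> bytes:
    """Return a copy of a JPEG byte string without its EXIF (APP1) segments."""
    if jpeg[:2] != b"\xff\xd8":
        raise ValueError("not a JPEG stream (missing SOI marker)")
    out = bytearray(b"\xff\xd8")
    pos = 2
    while pos + 4 <= len(jpeg):
        if jpeg[pos] != 0xFF:
            raise ValueError(f"expected a marker at offset {pos}")
        marker = jpeg[pos + 1]
        if marker == 0xDA:
            # Start of scan: compressed image data follows, copy it untouched.
            break
        # Segment length is big-endian and includes its own two bytes.
        (length,) = struct.unpack(">H", jpeg[pos + 2:pos + 4])
        if length < 2:
            raise ValueError(f"invalid segment length at offset {pos}")
        segment = jpeg[pos:pos + 2 + length]
        is_exif = marker == 0xE1 and segment[4:10] == b"Exif\x00\x00"
        if not is_exif:
            out += segment
        pos += 2 + length
    out += jpeg[pos:]
    return bytes(out)
```

Usage would be along the lines of reading the photo with open(path, "rb"), passing the bytes through strip_exif, and writing the result to a new file before uploading. Other metadata containers, such as XMP, are left in place by this sketch.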

I suggest the commons help me —The preceding unsigned comment was added by (talk) 16:27, 7 August 2018 (UTC)

UI after ‘describe the edit you made’

Many people here bring up good ideas about hiding camera metadata, IP addresses and so forth. My thought is that a UI window could be added after the ‘describe the edit you made’ window that could give the user privacy options. The purpose of this would be 1.) to let new or uninformed users make their edits without too much effort up front, thereby not ‘scaring them away’, and 2.) to have transparent options to publish or delete, and an explanation of what publishing metadata or IP info (or other similar info) can mean for the user. Victorgrigas (talk) 13:53, 28 May 2018 (UTC)

Hi Victorgrigas. Thanks for the suggestion; we've passed it along to our technical teams for consideration. So that I understand your suggestion correctly: are you suggesting the creation of an extra dialogue box that explains the privacy implications of their submission? Some Wikipedias already have a warning for people editing while not logged in, encouraging them to create an account to hide their IP address. We can investigate what kind of privacy information users want to see. TSebro (WMF) (talk) 05:03, 5 June 2018 (UTC)
Yes that's more or less what I'm suggesting, basically add a way AFTER someone has pressed 'edit' and made changes but BEFORE they press save and publish to inform them about the privacy options and implications of saving and publishing. Victorgrigas (talk) 14:00, 5 June 2018 (UTC)


Policy is always nice, but without enforcement it is meaningless. There have been multiple breaches of privacy of more than one individual, and by the very people who are supposed to be enforcing it. In some cases, volunteers entrusted with PII have even released this information intentionally. There is still no standard for vetting and training oversighters, or reporting mechanism for breaches. The NDAs signed by oversighters are meaningless, because they are volunteers, not employees or contractors. In spite of multiple failures of the system, there has been no public investigation or examination of the system itself, or even an acknowledgment of the problem. Perhaps it is time for an outside evaluation? —Neotarf (talk) 23:13, 24 May 2018 (UTC)

Hi Neotarf. Concerns about users with Oversight or Checkuser Permissions should be raised with the local Arbitration Committee or with the Ombuds Committee, and the Foundation will work closely with them to resolve any issues. TSebro (WMF) (talk) 04:59, 5 June 2018 (UTC)
So, TSebro (WMF), concerns about the arbitration committee or the ombuds committee should be taken to the arbitration committee or the ombuds committee, and everything will be just fine? The problem with that is that the WMF doesn't *know* whether everything will be just fine, because they aren't tracking it. It may be that instead of "working closely with the Foundation", the committee members will choose to block the user so they can't object, block their talk page so they cannot ping someone to unblock them, publish further dox about the individual, remove their email access so they cannot contact someone else to remove the dox, and send them bounce notifications if they try to email the committee directly. I would also note that any private information that someone sends the arbitration committee will then end up being stored in the personal email accounts of the individual arbitrators, even after they are no longer participating with Wikipedia. And the arbcom mailing list has historically been very leaky. Also, if the user is in a region with heavy internet censorship or surveillance, or a repressive government, expecting them to send more and more emails to try to get something removed may actually be dangerous advice and bring additional unwanted attention. It would be better just to have arbitrators who didn't publish personal information in the first place. Neotarf (talk) 01:30, 15 June 2018 (UTC)

Additional privacy and safety issues

I hope this is not too far off topic, but I don't see a better place to post these concerns.

My first concern is with potential medical and child safety situations and their reporting and followup. I was 17 years old the first time I had to handle a suicide attempt, and within minutes was able to get professional and medical attention to the situation, and was kept informed of the situation throughout the night. The individual survived and later sent me a thank you note. This was in the context of a small non-profit organization with almost no resources. In contrast, with the situations I have been involved with in the Wikimedia organization, no one knows what to do or how to report or how to know whether the situation has been turned over to an appropriate professional. It is known that some people do carry a special phone at times, but they may not be fluent in English or they may just cancel calls without evaluating them. In addition they may not have enough medical skills to make an evaluation or they may have a personal friendship or history of conflict with the individual in question that makes it awkward to report to them or discuss and have taken seriously any information you may have about their medical history that may bear on the situation. In addition, the nature of Wikipedia's pseudo-court/police model of governance makes it less likely that any situation will make it out of the talk pages and get appropriate attention. Administrators and arbitrators are more likely to try to sanction anyone who tries to report something, instead of turning it over to someone who knows how to handle it. In the real world you do not want security handling these situations, they simply escalate and get out of hand; you need someone with management skills. In the context of Wikimedia, these ad hoc solutions that individuals try to put together when they are faced with a situation are simply not working.

The other concern is with PII (Personally Identifying Information) and the refusal of volunteer oversighters and arbitrators to remove it. In December 2014, while serving as an arbitrator with checkuser privileges, user:AGK sent an email refusing to remove Personally Identifying Information, as defined by the Wikimedia Privacy Policy. This year he is a candidate for the 2018 arbitration committee election and I have asked him about this on his talk page. His answer seems to indicate that arbitrators believe they have the authority to suspend the privacy policy retroactively on an individual by banning them, and at that point they do not have to answer for their actions. This to me underscores the necessity of using paid staff who have signed a meaningful NDA for these positions. Trying to get something revdeleted or oversighted is very, very difficult under any circumstances unless you know someone personally. Once you have been named to an arbitration case, you automatically become high profile, whether you want to or not, and there is an almost certain probability you will be stalked on Wikipedia and beyond. This makes it even more urgent that any information, accidental or otherwise, be removed as quickly as possible. Even if you do know someone, and they speak fluent English, it can take dozens of emails to get them to understand and agree. The format of the policy page itself is not much help since it does not incorporate a TOC for navigation or paragraph numbering.

I don't want to just point out issues without suggesting solutions, so I am proposing a button for flagging potential safety situations. Every social media site from YouTube to Facebook to Twitter has a button like this; it is extraordinary that Wikipedia does not. Since administrators and arbitrators are usually the first to be made aware of a situation, they should be trained on how to respond to dox and where to request a medical or child safety evaluation. It would take a self-paced test of less than 10 minutes, and they could be asked to complete it the first time they pass an RFA and every year thereafter, in order to stay current on the latest safety procedures. The proposal is on the community wishlist talk page; I'm sure it can be improved on. Pinging TSebro (WMF) and Mdennis (WMF). Also Doc James might know where to refer any medical related questions.

Thank you for your attention to this. —Neotarf (talk) 22:31, 29 November 2018 (UTC)

Protected edit request on 16 November 2018

Two elements from this page are untranslatable:

  • {{Hidden |header=Publicly Visible Information |content=<translate><!--T:94-->
  • {{Hidden |header=More on Usernames |content=<translate><!--T:250-->

Please add translate markup on both:

  • {{Hidden |header=<translate>Publicly Visible Information</translate> |content=<translate><!--T:94-->
  • {{Hidden |header=<translate>More on Usernames</translate> |content=<translate><!--T:250-->

Trizek (WMF) (talk) 09:41, 16 November 2018 (UTC)

Done and pushed to translation. @Trizek (WMF): thanks for the corrections  — billinghurst sDrewth 09:54, 16 November 2018 (UTC)
Thank you billinghurst! Trizek (WMF) (talk) 10:18, 16 November 2018 (UTC)
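
The change requested above was made by hand, but the same wrapping can be sketched as a small Python helper for anyone checking other {{Hidden}} headers. The function name wrap_hidden_headers and the regular expression are assumptions for illustration only; the sketch handles just the simple form shown in the request (a plain-text header= value followed by another parameter).

```python
import re

# Matches "{{Hidden |header=<plain text> |" where the header value is not
# already wrapped in <translate> tags.
HEADER_PARAM = re.compile(
    r"(\{\{Hidden\s*\|header=)(?!<translate>)([^|}]+?)(\s*\|)"
)


def wrap_hidden_headers(wikitext: str) -> str:
    """Wrap untranslated {{Hidden}} header values in <translate> tags."""
    return HEADER_PARAM.sub(r"\1<translate>\2</translate>\3", wikitext)


if __name__ == "__main__":
    line = "{{Hidden |header=More on Usernames |content=<translate><!--T:250-->"
    print(wrap_hidden_headers(line))
```

Running the helper a second time leaves already-wrapped headers unchanged, so it can be applied to a whole page without doubling the tags.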

Error in translation: tags T:109/T110 -> English text is never translated in the local pages

Hi all, wrong tagging detected in the original English page around tags T:109/T110. The text "More on Locally Stored Data" remains in English; why?

Thank you — The preceding unsigned comment was added by Wladek92 (talk) 06:39, 28 July 2018 (UTC)