Talk:Access to nonpublic information policy


Internal policy on ID collection[edit]

This was posted by Geoff on the Privacy Policy talk page, but I think it would be interesting to those here as well.

Wikimedia Foundation - Internal Policy[edit]

Purpose[edit]

The Wikimedia Foundation (“WMF”) may sometimes need to collect copies of identification documents (“IDs”) from community members pursuant to established policies of WMF or the community. Examples where community members may need to identify themselves include the following:

  • Candidates for the WMF Board of Trustees
  • Candidates for the Funds Dissemination Committee
  • Recipients of WMF grants
  • Representatives and agents of user groups and thematic organizations
  • Community members with access to nonpublic user data information [GRB Note: we are currently not keeping such IDs on file.]

This internal policy summarizes the approach to be taken by WMF employees and contractors when handling and storing such community member IDs. The required ID depends on the criteria of the particular policy or practice, but may include copies of passports, driver’s licenses, and other government-issued documents showing real name and age.

Collection, Storage, and Access[edit]

Copies of IDs provided to WMF by community members will be kept confidential, consistent with any applicable requirements of the WMF privacy policy. Physical copies of IDs will be kept in locked cabinets designated for this purpose. Electronic copies of IDs will be protected by passwords or other electronic protections in files designated for this purpose.

Access to IDs will be limited to a “need to know” protocol determined by the program administrator. Usually that means only the principal administrators of a program will have access to those IDs. WMF will not share the IDs with outside third parties, unless required by law, covered by a non-disclosure agreement approved by Legal, or necessary to protect the rights, property, or safety of WMF and its employees and contractors.

Destruction[edit]

IDs will be kept as long as necessary to satisfy the need of the applicable policy and practice requiring the IDs. Such IDs will be destroyed as soon as the need for the ID has expired. Depending on the program, some IDs may need to be retained for a period of time for legal and financial purposes beyond the immediate purpose of the policy and practice. For example, IDs may need to be retained after the life of a grant to prove expenditure responsibility to government officials in the case of an audit. Check with Legal and Finance for any legal or finance record retention requirements.

V.1.1 (2013-03-14)

Lowering age to 16[edit]

The following discussion is closed: closing as this discussion appears to have run its course as far as the current policy discussion is concerned. Since there does not appear to be consensus, the legal team has decided to keep the current policy (16 for OTRS, 18 for other positions under the policy). Jalexander--WMF 11:18, 12 February 2014 (UTC)

Hi all, one of the WMF staffers above wanted to know what other people thought of reducing the age requirement to 16, so please share your thoughts!

The legal implications could probably be solved by some sort of parental permission.

The reasoning for doing this is two-fold:

  1. There have been many competent admins who are under 18. People should be judged on their abilities, not automatically disallowed due to age. I trust the community (used generally) to continue to elect only the most mature and best candidates.
  2. OTRS agents have access to more private information than many functionaries. I see more private info on the info-en queue than I do as an oversighter at Wikidata. If they can be trusted at 16, why not others?

I look forward to hearing some other perspectives on the issue. Ajraddatz (Talk) 22:36, 3 December 2013 (UTC)

Except 16 year olds cannot be trusted with private information. The minimum age for OTRS access must be raised to 18. What we currently have is a glaring breach of security which the WMF should rectify as soon as possible. DanielTom (talk) 23:26, 3 December 2013 (UTC)
Why can't 16 year olds be trusted with private information? There are already 16 year old OTRS volunteers who haven't gone posting it everywhere. If you are going to make a blanket statement that all 16 year olds can't be trusted with private information, you'd better be ready to back it up with proof that every 16 year old can't be trusted. Ajraddatz (Talk) 23:46, 3 December 2013 (UTC)

I would be concerned about a blanket lowering of the age to 16; I would not want to have a member of en.wikipedia's arbcom be under 18. --Rschen7754 02:16, 4 December 2013 (UTC)

Why? It seems like a lot of people have problems with it, but none are able to give any sort of logical reasoning (yet, anyway). Does a person's judgement really turn from bad to good the day they turn 18? Also, this is about identifying and having access to non-public data. Enwiki could continue to have an 18+ restriction for arbcom if it wanted to. Ajraddatz (Talk) 03:04, 4 December 2013 (UTC)
Some permissions and accesses to non-public data are more sensitive than others; also, I think there are legal reasons as well (note that they also must be of age in their locality). --Rschen7754 03:18, 4 December 2013 (UTC)
The legal reasons can be easily removed (if they exist in the first place). I was granted top secret security clearance in Canada when I was 17. My parents might have signed off on it, but either way, the legal implications of that are significantly higher than any information that might cross my path on Wikimedia. Could be different in the US, I suppose. Ajraddatz (Talk) 03:29, 4 December 2013 (UTC)
I don't think OS on Wikidata is a very good comparison; we have less than 10 suppressions a month. OS on en.wikipedia, or CU, or steward, or en.wikipedia ArbCom (note w:en:User:AGK/ACE2012) are all very high-profile positions that do regularly expose those who hold the permissions to real-life harassment. en.wikipedia ArbCom does regularly do investigations into editors alleged to be pedophiles, too. --Rschen7754 03:44, 4 December 2013 (UTC)
You forget that the smaller-wiki OS like what is found on Wikidata is the norm, not the enwiki situation. Projects could definitely make it more restrictive if there was consensus for that, but for the majority of projects, I'm still not seeing an argument against 16 year olds having access to the tools. OTRS volunteers are also harassed (as I myself have been in that capacity), yet they can be 16. Also, there are many people 18+ who could not handle the stress from harassment that can happen with these or other permissions (like admin), and likewise many 16 and 17 year olds who probably could. This proposal is about removing a barrier and putting the focus on a case-by-case basis instead. Ajraddatz (Talk) 03:49, 4 December 2013 (UTC)
Because of checkuser-l and the CU wiki though, as well as #wikimedia-checkuser/privacy, the lowest barrier to entry will be effective everywhere. --Rschen7754 04:02, 4 December 2013 (UTC)
That's a very good point, and does separate OTRS from functionary roles to some extent. The OTRS mailing list does deal with hard cases and harassment, but not to the extent that those roles would, I suppose. However, you haven't addressed my point about some 16/17 year olds being able to handle the harassment/stress and some 18+ clearly not being able to (see some of the enwiki arbcom resignations) - surely there are some people in that age range who could do the job just as well as you or I? I'm probably fighting my usual losing battle of case-by-case selection over fixed rules here, but hopefully it's at least food for thought. Ajraddatz (Talk) 04:09, 4 December 2013 (UTC)
At least in my opinion, not being of age would make it harder to use the legal process if necessary, to get relief from harassment. --Rschen7754 10:46, 4 December 2013 (UTC)
There should be only one rule: being of legal age in one's own country; for example, being 16 in most EU countries is meaningless. --Vituzzu (talk) 13:38, 13 December 2013 (UTC)
I don't believe that harassment should be an argument in this matter for OTRS agents. If a case is really hard you can just leave it to someone else or ask someone to help you. I know an OTRS agent who is younger than I am but who handled a difficult case far better than I would have done. And believe me, that case was really nasty. Vituzzu here has a valid argument. What if a 16 year old person abuses the information? Who is legally responsible? I think that those questions are more interesting than deciding if 16 year olds are capable of dealing with really difficult cases. Natuur12 (talk) 14:46, 17 December 2013 (UTC)
Thank you all for your discussion on whether the minimum age for access to nonpublic information should be lowered to 16. The discussion raises some very important points, like the fact that some 16 year olds are incredibly resourceful and responsible enough to handle such access, and that OTRS agents may be 16 years old despite being able to see certain private information. We would just like to note that we did not set the minimum age for OTRS agents; it is just a reflection of standing practice and has no legal significance. We are curious to hear the thoughts of other members of the community on this issue, so please feel free to contribute further to this discussion. :) MBrar (WMF) (talk) 00:28, 19 December 2013 (UTC)

MBrar (WMF): One of the issues that Wikipedia in general struggles with, and that this policy seems to imply but not directly tackle, is how to handle contributions from people younger than the age at which they can be an independent party to a legally binding contract/agreement (henceforth: "age of majority"). If I recall correctly, we've had images deleted from Commons because the uploader, who was not of the age of majority during the time of the upload, got upset with Commons over some petty fight, realized that because he was too young to be party to the licensing agreements they weren't valid, meaning that the release under a free license was not irrevocable, and was therefore able to take his images with him when he left. Had that person been over the age of majority when they uploaded their photos, they wouldn't have been able to force the deletion. When dealing with a policy that requires certain groups to become signatories to confidentiality agreements, the age of majority problem must be explicitly addressed. I don't have an issue with a 16 year old OTRS agent if they are a citizen of one of the small number of territories that consider 16 to be the age of majority, and at the same time, I would strongly object to an OTRS agent that was 18 in one of the small number of territories that consider the age of majority to be 19, 20, or 21. The salient point is that they must be able to enter into a legally binding contract - the confidentiality agreement - independently and in their home country. The Access to nonpublic information policy must, in my opinion, make that clear in the wording of the policy. Sven Manguard (talk) 17:44, 28 December 2013 (UTC)

Thank you Sven for your comment and I sincerely apologize for the delay in responding. You raise a very interesting point! The policy’s current minimum age requirements represent a cut-off point to simply ensure that younger community members do not apply for access rights. When an individual requests additional responsibility, their ability to enter into a confidentiality agreement is assessed before their request is approved. However, the "age of majority" is ambiguous in a lot of countries. Many allow minors to be considered an "adult" in relation to contract accountability, marriage, crimes, etc. at very different ages. We chose to include a specific age in the policy because age 18 was our prior standard, most countries will hold someone legally accountable at that age, and it is unambiguous. The second sentence of the “(a) minimum age” paragraph attempts to convey that the age requirement is needed to ensure community members can be legally held to the confidentiality agreement. Do you feel this sentence needs further clarification within this paragraph? MBrar (WMF) (talk) 21:54, 6 January 2014 (UTC)
MBrar (WMF) - To me "access to nonpublic information requires legal accountability in part because of the need to ensure confidentiality with respect to others’ nonpublic information" is not the same as "the confidentiality agreement is a legally binding contract, and the ability to enter into a legally binding contract in your home jurisdiction is a requirement". It needs to be unambiguous. Personally, I think that there are several sections of the policy where the authors got caught up in trying to make the policy seem nice, and as a result, wound up with diluted or unclear meaning (see the comments in the "Change in disclosure criteria" and "Comments on the current draft" sections). While being nice is good, a policy with legal considerations has to first and foremost be unambiguous. To accept anything less would open the door to years of fighting over the boundaries. Sven Manguard (talk) 22:09, 6 January 2014 (UTC)
I understand your point on making the policy language as clear as possible. Like I mentioned above, the age of legal accountability is often ambiguous within certain jurisdictions. For this reason, we set a concrete minimum age in the policy. We felt it would introduce more ambiguity to require community members to simply reach the age of majority required in their home jurisdiction, since this will be different in different contexts. Also, it would be difficult to police such a requirement and would likely introduce additional legal complications. MBrar (WMF) (talk) 01:22, 8 January 2014 (UTC)
Only adults should be trusted with private information. --Sue Rangell (talk) 20:42, 13 January 2014 (UTC)
Also, the determination should be made according to the age of consent wherever the servers and offices are located (I'm not sure where that is), because that is where the lawsuit will take place if, say, somebody suffers identity theft or other harm because we allowed a minor to have access to somebody's personal info. --Sue Rangell (talk) 19:27, 8 February 2014 (UTC)

Suggestion about adding some hash algorithms for verification of private data, such as Checkuser data[edit]

The following discussion is closed: closing as the discussion appears over, please reopen if necessary. Will archive shortly if not. Jalexander--WMF 00:22, 13 February 2014 (UTC)

Is there already a hash algorithm or something that helps verify a piece of private data? If not, I believe a hash algorithm could greatly reduce the possibility that a user with access falsifies data or makes typos when using private data, and increase the verifiability of private data.

For those who need more clarification, let me outline what the hash algorithm should look like:

The hash algorithm itself should be stored by Wikimedia privately, and would calculate a hash from a given piece of private data. The hash generator only allows input from existing, legitimate private data--in other words, you can't input your own data and get a hash.

As with a common hash algorithm, it would be easy to verify whether a piece of data matches its hash, and it would be difficult to deduce the data from the hash or to create two pieces of data with the same hash.

It should be difficult to deduce this hash algorithm from a series of known input/output pairs and/or the verification algorithm (if that is made public). The hash algorithm may also be changed over time to prevent accidental cracks.--朝鲜的轮子 (talk) 13:23, 5 January 2014 (UTC)
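As a rough illustration of the kind of keyed hash described above (a sketch only; the function names, the key, and the use of Python's standard hmac module are my own assumptions, not an existing Wikimedia mechanism):

  import hashlib
  import hmac

  # Hypothetical secret key held only on Wikimedia servers; rotating it over
  # time would invalidate old hashes, as the proposal suggests.
  SERVER_SECRET = b"long-random-secret-known-only-to-the-servers"

  def commit(private_value: str) -> str:
      # Keyed hash (HMAC-SHA256) of a piece of private data. Because the key
      # is secret, outsiders cannot compute hashes of candidate values; because
      # the function is deterministic, a later claim can still be checked.
      return hmac.new(SERVER_SECRET, private_value.encode(),
                      hashlib.sha256).hexdigest()

  def verify(private_value: str, claimed_hash: str) -> bool:
      # True only if the data matches a previously issued hash.
      return hmac.compare_digest(commit(private_value), claimed_hash)

For example, a CheckUser could record commit("192.0.2.1") alongside a finding; months later, verify("192.0.2.1", recorded_hash) would confirm whether a quoted value still matches that record.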

朝鲜的轮子 - I know our Tech folks are reading this thread and will opine if they feel this is feasible operationally. That said, this is not our present practice, so our present version of the privacy policy cannot promise this. Interesting idea and thanks for your time on this. Geoffbrigham (talk) 23:30, 7 January 2014 (UTC)
朝鲜的轮子, I'm not sure I understand your proposal, so it would be nice to get some clarification. We couldn't make the hashes public since that would correlate private data, even if the values weren't known. If the hashes are secret, then for example, a checkuser says multiple accounts were editing from the same IP address, and the hash of that IP was ABCDXXX. Then the users could generate a hash of their IP and verify it was indeed ABCDXXX, or if not, somehow protest the claim. But we would have to rate limit the hashing mechanism to something very low, so any random user can't hash random IPs and attempt to find one that hashes to ABCDXXX. As Geoff says, nothing like this exists currently. Maybe elaborate with an example, and we can evaluate if something like that should be implemented? CSteipp (talk) 01:51, 9 January 2014 (UTC)
I'd clarify that the hash-related activities would be limited to people with access to private data, to verify data and ensure it is not false--they would not be available to ordinary users. There are cases where the original data is lost so that it can no longer be verified (for example, CU data becomes stale after 3 months). Regarding the question you asked about a random-generation attack, I think first we can have a hash verification page which only privileged users can access and which returns "yes" or "no" for a given pair of data and hash, while the algorithm is kept private. We can also put some kind of logging and anti-flooding in place to prevent abuse (for example, let's say you wouldn't normally query the hash verification page for one specific hash 100 times a day). Also, the hash is not required for every piece of CU data; it should only be used if a CU thinks they need verification when they want to use data in a discussion.--朝鲜的轮子 (talk) 03:07, 9 January 2014 (UTC)
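A sketch of how the logged, rate-limited yes/no verification page described above might look (the 100-per-day figure mirrors the example in the comment; everything else, including the reuse of the verify() helper from the earlier sketch, is an assumption):

  import time
  from collections import defaultdict

  MAX_CHECKS_PER_HASH_PER_DAY = 100   # illustrative anti-flooding limit
  _attempts = defaultdict(list)       # claimed_hash -> timestamps of recent checks
  _audit_log = []                     # (time, user, hash, result) entries for later review

  def rate_limited_verify(user: str, private_value: str, claimed_hash: str) -> bool:
      # Answers "yes"/"no" for a (data, hash) pair, with logging and flood
      # control; in practice only privileged users could reach this page.
      now = time.time()
      recent = [t for t in _attempts[claimed_hash] if now - t < 86400]
      if len(recent) >= MAX_CHECKS_PER_HASH_PER_DAY:
          raise RuntimeError("verification limit reached for this hash")
      recent.append(now)
      _attempts[claimed_hash] = recent
      result = verify(private_value, claimed_hash)   # from the earlier sketch
      _audit_log.append((now, user, claimed_hash, result))
      return result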
Hi, 朝鲜的轮子 - thanks for thinking about this problem. It is a problem we take seriously - if there was a solution that allowed CUs to do their job without access to IPs, that would both protect users, and protect CUs from pressure from governments and other sources.
Because of those questions, earlier this year we hired a consultant to investigate how Checkusers use IPs, and whether or not we could use hashes (or some other approach) in order to reduce the exposure of IPs to Checkusers. Unfortunately, their conclusion was that the problem would be very difficult to solve, for many reasons. Chris pointed out one (that we'd have to build a rate-limited hashing service so CUs could have conversations with each other); there are several others as well - for example, we would have to build internal replicas of all the third-party tools Checkusers use for things like whois, geolocation, etc. Because of the difficulty, I don't think that we can mandate such a radical change - it could only be done after there is extremely detailed research into the requirements.
However, the privacy policy does not prevent improvements! The code CUs use is open source; I suspect that proposed solutions that analyzed the many tasks Checkusers do with IP addresses, were scalable/performant, and actually improved security would be taken seriously. Sorry we do not have such a solution at this time.-LVilla (WMF) (talk) 04:20, 9 January 2014 (UTC)
Letting CUs see the hashes instead of actual IPs could be a possible variant of hash planning, but this is not what my current proposal means. I just mean a method to verify whether a piece of data is correct. For example, if one Checkuser wants to make a claim about data he saw more than 3 months ago, or some users ask for their CU results to be reviewed after 3 months have passed, then hashes can help verify the data and make the problem clearer.--朝鲜的轮子 (talk) 08:59, 11 January 2014 (UTC)
Which begs the question why that problem needs to be solved through the Access Policy. As far as I can tell, what you describe is a proposed solution to a trust issue between the community and checkusers. It doesn't contribute to the protection of users' privacy. — Pajz (talk) 11:08, 11 January 2014 (UTC)
If there is a way to ensure a piece of data is true, not falsified or typoed, and that way is better than mere trust, why shouldn't we do it? And better trust between the community and checkusers will benefit users, of course.--朝鲜的轮子 (talk) 01:34, 13 January 2014 (UTC)
Implementation would be as simple as attaching a PGP signature to the data, if we wish to go this way. Platonides (talk) 21:15, 13 January 2014 (UTC)
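A sketch of that signature-based variant; for brevity it uses Ed25519 signatures from the Python cryptography package rather than PGP, and the key handling and record format are assumptions, not an existing workflow:

  from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey

  signing_key = Ed25519PrivateKey.generate()   # would be held by the CU system
  verify_key = signing_key.public_key()        # could be shared with reviewers

  record = b"2014-01-09 account=Example ip=192.0.2.1"
  signature = signing_key.sign(record)

  # verify() raises InvalidSignature if the record was altered; a clean return
  # confirms the record is exactly what was originally signed.
  verify_key.verify(signature, record)

The idea is the same as Platonides describes: the checkuser system signs the data once, and anyone holding the public key can later confirm that a quoted record has not been altered.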

Data Retention Guidelines posted[edit]

We're happy to announce that the first draft of the new data retention guidelines is now available for your review, feedback, and translation. This draft is the result of a collaboration between many teams within the Foundation, including Analytics, Operations, Platform, Product, and Legal.

As with the other privacy documents, this draft is just that: a draft. We want to hear from you about how we can make it better. As suggested in the discussion about timelines above, we plan to hold the community consultation period for this draft open until 14 February 2014.

Thanks - looking forward to the discussion. —LVilla (WMF) (talk) 21:30, 9 January 2014 (UTC)

Thank you, Luis! SJ talk  01:44, 11 January 2014 (UTC)

Lack of specifics for respecting the right to be forgotten[edit]

The following discussion is closed: closing as this discussion appears done, please reopen if necessary. Will archive shortly if not. Jalexander--WMF 11:19, 12 February 2014 (UTC)

In a nutshell, the current policy proposal lacks any specifics regarding the time factor, which is at the core of the daily routine of Check User operations. Thus, more specifics are needed to strike a balance between protecting users' privacy and fighting vandalism.

Note that in EU's en:General_Data_Protection_Regulation, the "Right to be forgotten" exists.

Note that in the Wikimedia Foundation's current use of nonpublic information, Check User operations generally rely on the most recent data (within three months), which is akin to an implementation of the "Right to be forgotten".

Nonetheless, for Check User operations, some data that is more than 90 days old is kept in the Checkuser Wiki, but it remains opaque when and what kind of data can be kept longer than 90 days.

It is also unclear whether and how the Check User administrators can download, exchange, and actively store such data on their own devices outside the Checkuser Wiki.

It is also unclear whether and how such data can be used as reliable evidence to fight vandalism.

There is also no technical and legal risk assessment of how the Check User administrators can keep the data safe on their own devices.

Hence, I have the following policy proposal for consideration.

policy proposal
  1. Regarding Check User data that is more than three months old, in order to ensure the integrity of en:digital evidence, while balancing the need between data protection and fighting vandalism,
    • Check User administrators can only base their Check User judgements on (a) the current user activity data/server log data (less than three months old), and (b) the data stored and properly registered in the Checkuser Wiki.
    • At any given time, all other Check User working data that cannot be validated by the said dataset is not admissible for Check User judgements.
    • The community of global Check User administrators must set, review and implement standard procedures to "selectively" keep some of the about-to-expire IP data in the Checkuser Wiki for future use. Any single record must be signed off by two Check User administrators with reasons why such a record needs to be kept and for how long.
    • To protect users' privacy against excessive storage, an oversight mechanism is needed to regularly review whether the community of global Check User administrators stores too much unnecessary data.
  2. Regarding Check User data being stored by individuals outside the Checkuser Wiki
    • Proactively storing such data is strongly discouraged.
    • Using secure official channels for communication among Check User administrators is strongly encouraged.
    • To decrease the frequency and risks of data leaking, it is advised to share only coded anonymized references that can lead to the records stored in the Checkuser Wiki and the current user activity, instead of the raw data themselves.
    • To decrease the legal and political risks of Check User administrators, they are encouraged to seek technical support from the Wikimedia Foundation to limit their footprints in their own devices when conducting Check User operations.
  3. Regarding Users' Right to their own information
    • Normal users can request the records of who performed Check User operations on their own registered accounts, when, and why.
    • Normal users cannot have access to the records based on their IP addresses.
    • Under special circumstances, normal users may request the relevant and/or derivative records of Check User operations on their own registered accounts, but the IP addresses must be redacted.
practical concerns

After consulting the Chinese Wikipedia community, I have the following to report:

One Check User administrator (an active one) expressed that he does not need to store data on his own devices to be a productive Check User administrator.

One Check User administrator (the recently elected one) expressed his concerns that the proposal is impractical or impossible to implement. He argued that he is confident he can take technical measures, including data encryption, to secure the locally stored records.

Several editors have expressed the general concerns that the Check User administrators may have to respond to local authorities. (Currently all five Check User administrators are from the People's Republic of China.)

As a past recipient of a Yahoo Fellowship myself, I feel compelled to express my concerns based on the precedent of en:Shi_Tao, a mainland Chinese journalist who was jailed because Yahoo! China cooperated with Chinese authorities by providing private information about his account. Yahoo!'s role in this case has been examined and studied.

Hence, although the above policy proposal is in no way tailored or designed for a certain jurisdiction, I personally believe that the policy proposal can limit the legal and political risks, not only to normal users who participate in Wikimedia projects, but also to Check User administrators themselves. By keeping identifiable data within the Checkuser Wiki with proper administrative and technical measures to ensure its data integrity, by replacing necessary data exchange with coded anonymized references, and by giving normal users the ability to check whether, when, and by whom their accounts were checked, this proposed policy should fill some of the gaps in the current Access_to_nonpublic_information_policy proposal. --Hanteng (talk) 18:59, 14 January 2014 (UTC)

Thanks for putting this up at last. I'd wait for further responses at this time.--朝鲜的轮子 (talk) 03:07, 15 January 2014 (UTC)
Hi Hanteng. Thank you for these suggestions. While I think these are topics worthy of discussion, I do not think they fall within the purview of the Access to Nonpublic Information Policy draft discussion. These types of decisions are generally handled by local communities (and in some cases, the global community) and thus, if these ideas were to be adopted at all, they would be adopted as community policies rather than in a broader Board policy like the Access policy. Mpaulson (WMF) (talk) 22:31, 22 January 2014 (UTC)
Do you think any part of this proposal may relate to any policy we are currently discussing? For example, I am quite interested in the question about whether and how a user can know by whom and when they have been checkusered, as this is not addressed in the current policies. (Not to say I don't care about the rest of the proposal, but I think it is better to discuss it in parts)--朝鲜的轮子 (talk) 09:23, 23 January 2014 (UTC)
About nine days have passed, and there have not been any comments on the proposal itself, so I'll just leave some of the comments I made when I discussed this issue with Hanteng.
Though Hanteng himself considered most of my comments "not in scope of this proposal", I think that part of my comments relates to the likely consequences of enforcing this proposal. Given that he also said he will consider discussing this at Wikimania in London later this year, I'd like to express my view first.
"Check User administators can only base their Check User judgements on (a) the current user activity data/server log data (within three-month old), and *(b) the data stored and properly registered in the Checkuser Wiki.""Proactively storing such data is strongly discouraged." To what extent should we enforce those two idea? Does it mean"If a piece of data is expired and it is not stored in CUwiki, you should STRICTLY NEVER USE IT EVEN IF USING IT WILL SAVE THE WORLD"? Should we punish someone for saying or implying "I remember from a piece of stale data(which is not stored in CUwiki) that..." or someone who is suspected of making use of any stale data not stored in CUwiki? And also, checkusers can make some other reasons not related to stale data to aviod controversy if they are tempted to use data stale data not stored in CUwiki.
Secondly, the wiki system as it stands now is not the best option for storing such data. Assuming the checkuser wiki works the same as Wikipedia sites except that only checkusers can view the pages, where there is no log of each visit, and the data does not expire until people agree to remove it, it means anyone with checkuser access can see everything stored on CUwiki. Also, given that people need to discuss retaining and destroying data, it is completely reasonable for anyone to view a piece of data even if they do not have anything to do with that data at all. This greatly increases the exposure of private data, which I believe is excessive. In the past, there has been at least one CUer who had his/her privilege removed for doing more checking than necessary, so I believe there is more than a negligible chance that some checkusers do gather data for their own interests. Actually, given that people have the freedom to remember and forget anything as they wish, the best way to protect privacy is not to have a complicated rule which we have no means to enforce and which causes even more problems, but simply to reduce the exposure of data to the minimum.
If this part were to be adopted, I would advise having a system which has an access log (like the CU tool) and separation between local and global data (for example, if a piece of data only relates to someone who has only been active on Chinese WP, there is no point in CUers of other languages knowing it), at least.
The parts of this proposal I'd currently support are using some hash mechanism to ensure data integrity and allowing users to get the records of by whom and when they have been checkusered (but not the details of each check).--朝鲜的轮子 (talk) 07:40, 24 January 2014 (UTC)

Also mention actions to improve positive aspects[edit]

The following discussion is closed: closing as this discussion appears done, please reopen if necessary. Will archive shortly if not. Jalexander--WMF 11:19, 12 February 2014 (UTC)

The "Use and disclosure of nonpublic information" enumerates types of work and specific activites solely for preventing problems (vandalism, socking, etc.). However, some roles also access this info to promote positive aspects, for example, OTRS to accept or confirm license permission or simple helpdesk emails, and developers to improve and debug the software. At least one positive aspect might be a nice addition to the intro of this policy section. More importantly, the first sentence of part (a) implicitly excludes all such things, setting the scope as "help prevent, stop, or minimize damage to the Sites and its users." The next sentence could allow positive-impact activities ("only use their access rights and the subsequent information they access in accordance with the policies that govern the tools they use to gain such access"), but if so it contradicts the first sentence. And I think it really is up to those who design the tools and write their policies what activities are allowable with them. Therefore, that second sentence should be first and the focus, rather than a secondary position. DMacks (talk) 04:40, 15 January 2014 (UTC)

Hi DMacks, thank you for the suggestions! What do you think of the following language?
Community members with access rights provide valuable services to the Sites and its users -- they fight vandalism, respond to helpdesk emails, ensure that improperly disclosed private data is removed from public view, confirm license permissions, investigate sockpuppets, improve and debug software, and much more. But community members’ use of access rights is limited to certain circumstances and contexts.
(a) Use of access rights & nonpublic information. All community members with access to nonpublic information may only use their access rights and the subsequent information they access in accordance with the policies that govern the tools they use to gain such access. For example, community members with access to the CheckUser Tool must comply with the global CheckUser Policy, and, unless they are performing a cross-wiki check, they must also comply with the more restrictive local policies applicable to the relevant Site. Similarly, community members with access to a suppression tool may only use the tool in accordance with the Suppression Policy. When a community member’s access is revoked, for any reason, that member must destroy all nonpublic information that they have.
RPatel (WMF) (talk) 23:21, 22 January 2014 (UTC)
That new wording definitely resolves my concerns. Thanks! One bit of clarification for the "all...they have" at the end of (a). There are lots of independently available (and revocable) tools that allow access to independent and/or overlapping sets of nonpublic information. We need to avoid saying that if someone with Tool A and Tool B loses Tool A, he would need to destroy the info he has from Tool B. DMacks (talk) 19:18, 25 January 2014 (UTC)
Hi DMacks, the language has been changed to: "When a community member’s access to a certain tool is revoked, for any reason, that member must destroy all nonpublic information that they have as a result of that tool." Does that make sense? Thanks! RPatel (WMF) (talk) 23:23, 3 February 2014 (UTC)
Looks all good to me now. DMacks (talk) 05:04, 9 February 2014 (UTC)

Recommendation to the board regarding access policy[edit]

Summary


The community consultation for the Access to Nonpublic Information Policy draft is coming to a close, so, in this announcement, we would like to outline the following:

  • A recap of the consultation process
  • The impact of the consultation on the draft
  • Our findings as a result of this consultation
  • Our recommendation to the Board to repeal the current Access to nonpublic data policy and adopt a modified version of the Access to Nonpublic Information Policy draft that does not include an identification requirement



Introduction

The Access to Nonpublic Information Policy draft (“draft”) has been under community consultation since 03 September 2013. As the closing of the consultation period (currently set for 14 February 2014) approaches, I think it would be helpful to recap what happened during the consultation, the changes the draft underwent, and what the Wikimedia legal department has learned as a result. We sincerely thank the community members who took the time to read the draft constructively, share their legitimate concerns, and contribute useful suggestions.

Background

The first draft was originally created to replace the current Access to nonpublic data policy (“Policy”), which requires that community members who have access to nonpublic information be “known” to the Wikimedia Foundation (“Foundation”). As I noted in the discussion, the reason why we believe that the Policy needs to be updated is that - both historically and currently - Wikimedia Foundation staff have found it difficult to balance privacy of identification with the Policy's requirement to assure accountability, meaning that current identification practices do not comply with the Policy. Current identification practices largely consist of: (1) community members who have been selected for particular access rights providing an identification document of their choice (or a copy of such identification document) to a designated member of the Foundation’s staff; (2) the staff member “checking” it; and (3) the staff member either returning the identification document or destroying the copy of the identification document without retaining any information contained in the identification document. Because identification is only required once in most cases and staff at the Foundation changes over time, the identities of many community members with access rights have become unknown to the Foundation over the years.

In an attempt to address this problem, we released our first draft of a new policy, which specifically addressed how copies of identification documents were to be submitted by community members with access rights and how they were to be retained by the Foundation. The draft was also intended to provide guidance to community members with access rights as to when they should access or share nonpublic information and to explain to the greater community why certain community members have access rights and the obligations individuals in those roles have with regard to that nonpublic information.

There were parts of the draft the community liked; there were parts that the community thought could be improved and indeed helped us improve; and there were parts that caused great controversy. Submission of copies of identification documents and retention of those copies by the Foundation was perhaps the most controversial topic for a variety of reasons: concerns over local restrictions on copying or transferring government-issued identification documents; discomfort with the Foundation retaining copies of the identification documents; disagreement over the circumstances in which the Foundation could disclose the contents of the identification documents; questions as to whether submission of identification documents would be sufficient to verify someone’s identity; and many more. What we learned was that verification of identity is a complex problem with no simple solution for our community and the Wikimedia Foundation.

With these issues in mind, we asked the community whether it made sense to simply revoke the current Access to nonpublic data policy. Although no consensus developed as a result of this discussion (as was the case with the other proposed alternatives), it seemed that many community members continued to want a policy that: (1) addressed access to nonpublic information; and (2) required community members with access rights to be known to the Wikimedia Foundation. But those participating also continued to have serious reservations about the submission and retention of identification documents.

In response to this feedback, we changed the draft to how it currently reads:

  • Explaining which community members were covered by the draft – those that have access to a tool that permits them to view nonpublic information about other users or members of the public; those with access to content or user information which has been removed from administrator view; and developers with access to nonpublic information;
  • Requiring community members with access rights to be at least 18 (or 16 in the case of email response team members) in accordance with current practices;
  • Requiring community members with access rights to provide their name, date of birth, current valid email address, and mailing address to the Wikimedia Foundation – without providing any identification documents or copies thereof – which would be retained by the Foundation for the amount of time the community member had access rights plus an additional six months; and

  • Outlining when community members with access rights may use their rights, how they may use the nonpublic information they access, when and to whom they may share the nonpublic information, and what their confidentiality obligations are.

Recommendation to the Board

We believe that there is value in having a global policy that sets the minimum requirements that community members with access to the nonpublic information of others or restricted content must meet. We also believe that there is great value in providing guidance to the individuals in these important roles about how such access rights and the resulting information should be used. The establishment of such a policy will not only serve the community members in these roles, but the greater community as well by providing better understanding about how access rights and nonpublic information are used.

Local communities are, of course, free to create and adopt more restrictive guidelines and requirements, but a global policy will ensure the establishment of a consistent foundation upon which local rules may be built.

That said, we do not believe that requiring identification in the manner that is currently laid out in the draft is an effective way to ensure accountability. Without proper identification and verification procedures in place, it would be disingenuous for the Wikimedia Foundation to claim complete and accurate knowledge of the identities of all community members with access rights. And, as noted by the community, there are various logistical and cultural obstacles in creating and maintaining proper identification and verification procedures.

Therefore, we presently plan to recommend to the Board that it repeal the current Access to nonpublic data policy and replace it with a modified version of the Access to Nonpublic Information Policy draft, which would not include any identification requirements. This means the present draft would be modified to remove sections (b) “Identification” and (d) “Submission & retention of submitted documents” of the “Minimum requirements for community members applying for access to nonpublic information rights” section.

This modified draft would still provide for minimum age and confidentiality requirements (which community members currently in possession of access rights and community members applying for access rights will have to certify and agree to, respectively). It would also provide rules surrounding the use of access rights and the resulting nonpublic information. However, it eliminates the identification requirement currently present in both the Access to nonpublic data policy and the draft.

We recognize that this is a controversial issue and people will have understandably varying and conflicting positions on the matter. We make this recommendation to the Board with the understanding that it may reasonably choose a different approach or solution to this unique challenge. However, after analysis of the relevant discussions throughout the community consultation period and considerable research into alternatives, we believe that this recommendation provides a sensible and honest approach to how access rights are handled.

Michelle Paulson, Mpaulson (WMF) (talk) 21:18, 28 January 2014 (UTC)
Legal Counsel, Wikimedia Foundation

Replace IP addresses of unregistered edits with unique guest names[edit]

The following discussion is closed: closing as this discussion appears done, please reopen if necessary. Will archive shortly if not. Jalexander--WMF 11:20, 12 February 2014 (UTC)

While looking into privacy matters, it would be worth considering changing the current situation in which an unregistered or logged-out user reveals their IP address. Many users are not aware of the consequences of revealing their IP address. Replacing each individual IP address with a unique guest name would preserve the privacy of unregistered editors, yet allow the tracking of their edits. It would also protect those registered users who sometimes accidentally edit while logged out. Each unique guest name could be identified with the tag User:UGN, followed by a random sequence of letters and numbers to be applied to that IP address: User:UGN005abd77-43a. By using such a system, privacy and accountability are covered, and we are treating everyone with the same level of respect. SilkTork (talk) 17:30, 30 January 2014 (UTC)
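As a sketch of how such guest names could be generated deterministically (assuming a keyed hash and a server-side secret; the prefix format simply follows the example above, and none of this is an existing MediaWiki feature):

  import hashlib
  import hmac

  SECRET_KEY = b"hypothetical-server-side-secret"

  def guest_name(ip_address: str) -> str:
      # Same IP -> same pseudonym, so contributions stay trackable,
      # but readers never see the underlying address.
      digest = hmac.new(SECRET_KEY, ip_address.encode(), hashlib.sha256).hexdigest()
      return "User:UGN" + digest[:12]

  # e.g. guest_name("203.0.113.5") returns "User:UGN" followed by 12 hex characters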

However, that reduces our ability to handle open proxies, and to conduct rangeblocks. --Rschen7754 18:55, 30 January 2014 (UTC)
In addition to what Rschen said, some services like ACC require IPs to be viewable to us to make close calls about what to do. So unless a CheckUser type tool can be developed where users can find 'UGN' sequences, it is going to do more harm than good. John F. Lewis (talk) 23:59, 3 February 2014 (UTC)
  • You also forget that knowing the IP address is important for marking vandalism sources such as schools or public hotspots, which helps admins in making blocks.--Jasper Deng (talk) 05:20, 4 February 2014 (UTC)
Also, IP information is very relevant when choosing the length of the block. Blocking an IP without being able to see it would inevitably lead to a lot of mistakes, and the proposal would create an unnecessary extra workload for checkusers and stewards alike. Snowolf How can I help? 15:51, 8 February 2014 (UTC)
Hi SilkTork, thank you for your comment. I just wanted to refer you to a discussion earlier in the consultation period about hiding the IP addresses of users. That discussion also includes a response from Erik (Eloquence), our head of engineering. Hope that addresses your concern. RPatel (WMF) (talk) 21:11, 10 February 2014 (UTC)

Closing of the Community Consultation Period[edit]

The community consultation for this Access to Nonpublic Information Policy draft has closed as of 14 February 2014. We thank the many community members who have participated in this discussion since the opening of the consultation 03 September 2013. Your input has provided valuable insight about this sensitive and complicated topic. You can read more about the consultation and the next steps for the Policy draft on the Wikimedia blog. Mpaulson (WMF) (talk) 20:28, 14 February 2014 (UTC)

  • Wow, no public notice for a draft proposal that will repeal the identification requirement for those who have users' most sensitive information--and the only protection against that information being available to immature 12-year-olds or other malefactors. Something like this definitely deserved more intense notification than a stupid ArbCom or Stewardship election or RfC on paid editing. --ColonelHenry (talk) 02:25, 5 March 2014 (UTC)

Late comment[edit]

Obviously the period for community consultation has long ended—frankly, the final proposal has somewhat slipped out of my attention. Having noticed the posting of the updated draft document today, I decided to nevertheless share the following comments (written back in February). Feel free to consider it a waste of time.

Writing solely in my individual capacity as a user of the Wikimedia Sites, I would like to share a few observations on this matter. To summarize, I respectfully disagree with the proposal of the Legal Counsel and her interpretation of the community discussion as well as the LCA team's overall approach toward the objections by community members brought forward during the consultation period. The Access Policy should include a requirement to submit identifying information.

(1) The obvious irony between the wide-reaching consensus on increasing the privacy-protection level as dictated by the Privacy Policy and the simultaneous resistance expressed by some of the very same users on this discussion page against implementing any identification requirements at all for volunteers with access to private information is striking. It seems to go unnoticed by these commentators that vast parts of the Privacy Policy are, in the absence of identification requirements, no more than a giant loophole. Specifically, the “Sharing” section of the (draft) Privacy Policy stipulates that it is users who

“writ[e] most of the policies and select[] from amongst themselves people to hold certain administrative rights. These rights may include access to limited amounts of otherwise nonpublic information […] They use this access to help protect against vandalism and abuse, fight harassment of other users, and generally try to minimize disruptive behavior on the Wikimedia Sites […] These user-selected administrative groups are accountable to other users through checks and balances: users are selected through a community-driven process and overseen by their peers through a logged history of their actions.”

While sections on the use of private information by third-party service providers and contractors underwent tough scrutiny by commentators, the above section seems to have everyone nodding in agreement. And indeed, it reflects our common understanding of how the Wikimedia Sites work. Yet if those users who violate the conditions set forth in this passage (and specified in the Access Policy) do not have to fear any consequences in case of wrongdoing, all of the Privacy Policy’s fancy ideas on how the Foundation supposedly works to protect user information are proven pointless.

The Foundation ensures that my IP address is kept strictly confidential? Cool! Any tweaks? — Well, some community members may access it of course. — Oh? Who would that be? — Actually, we have no clue. But they have to abide by our tough rules. — Great, I figure there will be serious consequences if they don't! — Oh yes, there are very powerful bodies that enforce it. — Puh. And what can they do if someone messed up? — They can report it to the Foundation. — Wow. And what can the Foundation do? — Well, of course they can't do real-life stuff but they absolutely could ban him from Wikimedia! — Hurray.

If you have a policy that you are not able to enforce because you lack the information required to do so, you actually don't have a policy at all, you have a guideline. The blunt truth is that the Foundation has a Privacy Policy that contains a passage which says that unknown people residing somewhere in the world can access the very information the policy is set to protect.
A common objection is that this ignores the fact that checkusers and OTRS agents are volunteers. That is correct, but ignoring this is entirely on purpose, for it bears no relevance whatsoever to the question: Users expect their private information to be kept confidential and expect that promises on careful treatment of such information can be enforced. It just doesn't matter to that end whether the qualified users are volunteers or paid staff: People want their information to be safe (and rightly so).
Finally, some suggest that we never had a substantial case of privacy breach in the past. That may be correct, but the argument carries little weight. We are happy (and arguably lucky) that nothing has happened in the past, and it is certainly evidence of the professionalism of the relevant communities in granting access rights. But the risk that something may go wrong is obvious. I feel reminded of some folks with a very similar argument, by the way, namely OTC traders in pre-subprime-crisis times; when you asked them about the need for viable reporting requirements, that was precisely the answer they gave: What? Reporting requirements? But nothing has gone wrong in the last decade!

(2) The debate on this page is fundamentally biased, and it is my impression that the Legal Counsel does not adequately recognize that in her recommendation. An estimated 90 percent of those who comment here represent just one of the stakeholders: Those with access to nonpublic information. Of course they have an inclination to resist identification requirements. Just as I do. It actually seems downright stupid to be eager to provide the Foundation with your real name and your address if there's only a downside but never ever any upside to it. A typical example of other stakeholders are customers that write to info-en@wikimedia.org. You cannot claim with honesty that they have had any voice in this discussion. However, in 2012 alone, more than 20,000 tickets have been created in the info-en queue, and putting the number of unique customers at at least 15,000 should be a rather conservative estimate. OTRS agents can easily retrieve real-name information (as provided in the emails), sometimes IP addresses, email addresses etc. from these people. Checkusers can potentially access inordinate amounts of IP address information from registered users on Wikimedia Wikis. But neither OTRS customers nor “ordinary” Wikipedians are significantly represented in this debate. That, of course, is no surprise and I don’t think one can really change it. But when Michelle writes that “after analysis of the relevant discussions throughout the community consultation period and considerable research into alternatives, we believe that this recommendation provides a sensible and honest approach to how access rights are handled”, I doubt that is the right way of looking at it if you strive to balance the conflicting interests. And that also brings me to the next point:

(3) I am unhappy about the way the Foundation has handled some substantial criticism on this page. The truth is that, yes, there were those who expressed their resistance to identification requirements per se. I do not agree with them, and I've outlined why above. But there were, from my impression, even more who accepted the general idea of identification but criticized the extent to which the LCA team reserves the right to use it under the current draft Access Policy. I will give two examples of that.

i) Requirements, (d)(i)(A) ("permitted by a non-disclosure agreement that: (1) has been approved by the Wikimedia Foundation’s legal department; (2) allows for use of the submitted materials only in a manner compliant with the Wikimedia Foundation’s Privacy Policy;")
To date, no convincing reason has been given why this must necessarily be part of this Policy. It was mentioned that this is necessary to be able to handle cases via outside counsel (which is a legitimate interest). Nonetheless, it could be limited in several ways. You could, for instance, limit the provision to apply only to disclosures in the context of legal action in response to (d)(i)(B), or at least to actions in the context of (d)(i)(B)-(D). None of this has been implemented.
ii) Requirements, (d)(i)(A), (d)(i)(C), (d)(i)(D):
Another suggestion brought forward by users was to limit the information sharing to disclosures requested by law-enforcement authorities. Under this proposal, the Foundation would generally keep the information for the sole purpose of being able to help with legal investigations. Under a somewhat less restrictive proposal, it was suggested to include an exception that would additionally allow for disclosure of the identifying information for the purpose of protecting the interests of the Wikimedia Foundation provided that other members of the relevant Community body agree with that. No matter which of the alternatives you prefer, they would all be significantly more restrictive than the clauses in the current draft.

In both examples, a more restrictive approach was, in my opinion, feasible and, most importantly, it would still have been in the spirit of the purpose of the Access Policy (see 1.). That is why I am unhappy about how this was handled: It just stands to reason to me that despite existing possibilities to create an Access Policy that does instate some accountability, the LCA team has chosen to insist on more wide-ranging disclosure scenarios and now effectively, in regard to user accountability, has left everyone with a take it or leave it proposition. If the LCA team were really interested in addressing the interest of the Community, it should have presented to the Board at least a version of the draft Access Policy with identification requirements but with less comprehensive possibilities for disclosure. (Even more so if you bear in mind that, as explained above, the Community is inadequately represented on this page in the first place.)

(4) My favorite argument, as always, is that people can fake their submitted IDs. This is not just some argument, it is the ultimate, unchallenged favorite argument of the “net community.” My current working hypothesis is that this is the great argumentative undertaking of the internet age, and its ultimate purpose is the replacement of common-sense logic with the authority of the techies. The structure is well-known and always identical: Measure A can be circumvented. => You should not implement A. This is usually “proven” by providing great examples of how easy it is to circumvent A; at conventions, this is the point where the internet activist audience bursts into laughter. Gotcha! Needless to say, this is nonsense. You would either have to show that A will (necessarily) be circumvented, or (as usual) that the cost of implementing A outweighs the benefit of implementing A. Neither is the case here. Even if a criminal volunteer who already has access might provide a counterfeit statement of identity, identification requirements at least act as a deterrent. OTRS is a good example here: Many volunteers have access over a long period of time. I, for instance, got access in 2007. If I had been forced to submit identifying information in 2007, got angry about a fellow Wikipedian last week and now had malicious intentions, the submission from 2007 would still constrain me. Also, less responsible people might be deterred altogether from requesting access in the first place. Finally, a requirement to submit identifying information underscores that the Foundation and the Wikimedia movement in general take the privacy of users seriously. Not having any identification requirement for anyone (not even Arbcom members dealing with highly sensitive cases, stewards and OTRS admins) sends a clear message that the Foundation does not really care about user privacy (as for the user it is certainly irrelevant if a volunteer or a WMF employee is responsible for a privacy breach).

(5) If the LCA team had shown some more flexibility in setting the means of identification, concerns by Dutch and German users could have been addressed. For instance, it would be possible to simply include an exemption clause to the effect that if the scanning/copying and/or transmission of governmental identification documents is prohibited by law, WMF staff will provide other means of identification that are customary in the respective country. Of course these do exist (for instance, German volunteers could identify via PostIdent, which can also be made use of by foreign organizations directly). You could also offer to provide different means of identification on request (at the discretion of Foundation staff) in case someone does not want to submit a copy of their ID.

In light of the above, it is my opinion that the Board should reject the Recommendation. — Pajz (talk) 09:38, 18 March 2014 (UTC)