Talk:Draft Privacy Policy June 2008

From Meta, a Wikimedia project coordination wiki

A proposal that is brevified and updated for a number of important matters, and described further below (developed by FT2, Pathoschild and others) can be found at Draft Privacy Policy June 2008/Collaboration. An annotated version can be found at /Collaboration (annotated) explaining the thoughts behind some of the wording. Other users have produced drafts of a similar style, also below.


more emphasis and clarity on what is meant by free

from the Preamble:

  • (...) free-content educational and informational resources that may be created, used, and reused by the entire human community, should run with a bit more emphasis and clarity on what is meant by this "freedom", something like this perhaps:
  • (...) free educational and informational content-resources that may be freely created, freely used, and freely reused by the entire human community

all the best, oscar 09:54, 14 June 2008 (UTC)

III, editing histories

it reads:

  • For example, only registered users can create a new page on the Wikipedia.

but imho it is not (yet) true that for all projects you have to be registered to create a new page, so this should run something like:

  • For example, only registered users can create a new page on the English-language Wikipedia.

all the best, oscar 10:00, 14 June 2008 (UTC)

It's also not true for talk pages. Since this is just a project-based decision, perhaps it shouldn't be made permanent in this policy. The decision to allow page creation on en.wp by unregistered users could be reversed at any time. A better example would be image uploading since there isn't even an option in MediaWiki to allow unregistered users to upload. Angela 10:09, 14 June 2008 (UTC)
i.e. "For example, only registered users can upload media files." Cbrown1023 talk 10:16, 14 June 2008 (UTC)
excellent idea imho. oscar 11:32, 14 June 2008 (UTC)

Last section

Is there a reason for the first sentence of the last section to lack a verb? Ucucha 10:31, 14 June 2008 (UTC)

It now reads "The Wikimedia Foundation holds that maintaining and preserving the privacy of user data is an important value." But perhaps "holds" is not the best word to use there. – Mike.lifeguard | @en.wb 10:46, 14 June 2008 (UTC)
imho "holds" is fine here (it means something like "firmly believes" doesn't it?), and it is also consistent with the usual "wmf-policy-language". oscar 11:31, 14 June 2008 (UTC)

tl;dr

I don't like the new version. Of course, it's much more thorough and legally correct, but it lacks the most important quality: unlike the current version, it takes much more effort to read and actually understand, especially for a newbie. MaxSem(Han shot first!) 12:24, 14 June 2008 (UTC)

Agreed. Nobody will ever read this, which makes it kind of pointless. --Tango 15:16, 14 June 2008 (UTC)
From what I've seen, it's a fairly good compromise between legal mumbo-jumbo and something that normal people can actually read. As such, it's at least quite clear, although a summary section somewhere might be a good idea. Circeus 15:36, 14 June 2008 (UTC)
Not exactly: summary section is a must have. See CreativeCommons for example: they have the deed and the separate complete legal shebang nobody reads unless they're sitting in the courtroom. --grin 08:11, 16 June 2008 (UTC)

Comments/suggestions

IV As a Wikipedia user, for example, your level of access to Wikipedia is determined by your presence in various 'user groups', which can be found on http://en.wikipedia.org/wiki/Wikipedia:User_access_levels.

Can you make this less English-wikipedia-centric? For example, " 'user groups', which can be found on the page "Special:User group rights" on each project." Or leave the enwiki example as it is cleaner but note in parentheses that user groups for any project can be found by clicking on the "Special pages" link and then "User group rights" in whatever language.

VI (suggested rewording, with "a username to be changed" replaced by "an account to be renamed") It may be possible for an account to be renamed, depending on the policies of the project to which you contribute. The Wikimedia Foundation does not guarantee that accounts will be renamed on request.

X B(1): Traditionally we have avoided giving people the exact time frame that checkuser information is retained, as knowing the exact time frame could enable banned users to avoid detection. It is easy for an attentive user to guess the approximate time frame, but do the benefits of being precise outweigh the possible risks?


XI
5. Where the user has been vandalizing articles or persistently behaving in a disruptive way, data may be released to a service provider, carrier, or other third-party entity to assist in the targeting of IP blocks, or to assist in the formulation of a complaint to relevant Internet Service Providers,
6. Where it is reasonably necessary to protect the rights, property or safety of the Wikimedia Foundation, its users or the public.

I wonder if, in order to provide full disclosure, the policy should describe that any editor may request CheckUser investigation of allegedly disruptive editors, via email or other private communication or via a page for public requests (en:Wikipedia:Requests for checkuser and its equivalents); that CheckUsers are expected to use their judgement and discretion in deciding whether to conduct an investigation and how much information to release; and that "third parties" may include Wikipedia editors and sysops/administrators, especially as regards "persistent disruption" and "targeting IP blocks". A perennial issue with checkuser requests is, if an editor is editing while logged in and logged out, and the combined effect is disruptive (edit warring, pretending to be two people, etc.), what is the threshold for formally confirming that the IP and the editor are the same, and has the editor in fact waived his own privacy expectations by editing while logged out in a way that could be linked to his account (as opposed to editing with multiple pseudonymous sockpuppets, where a release of the IP would rarely be necessary)? Maybe this is too much for a general policy statement.

You may also wish to note somewhere that "release of information derived from CheckUser logs is governed by this privacy policy and the m:CheckUser policy. Individual projects may have additional policies regarding CheckUser but they are subordinate to the Foundation level policies."

The privacy policy should also link to the ombudsman commission. Thatcher 12:39, 14 June 2008 (UTC)

Re X/B(1): this phrasing leaves open the possibility of changing the time window of retained data. This is a good thing. --grin 08:15, 16 June 2008 (UTC)

Can be significantly improved

Some parts explain better, but as the comment above says, "too long, didn't read". This needs to be brevified (less "chatty"), and in some parts better targeted, as a higher priority.

I have concerns over pseudonymity, anonymity, and serious practical weaknesses which arise in the existing policy's disclosure criteria, that aren't addressed. They need to be. Users are not told enough about when and how privacy information may be released, and this is a constant problem that I had hoped to see addressed in any rewrite, as it hits checkuser usage and dispute/abuse handling regularly:

1. Pseudonymity

I am uncomfortable encouraging users to use a real name as of 2008. There has been a disturbing trend towards co-ordinated abuse and harassment of that openness (so-called "outing" of contributors) this year. Users who edit under a real name but fail to meet normal standards have also complained deeply on finding that their actions are discussed and that their names appear on a wide range of pages together with others' comments on their editing. This is a common problem. I would seriously consider adding this important information in a very visible way:

"All edits to the encyclopedia are permanently recorded, and publicly visible in the history of any page you edit, as well as on discussion pages. If you use your real name, or a username that you go by elsewhere, people looking you up on the internet may see your username and others' comments on your editing. If your editing happens to cause concern, there may be discussion linked to your username."
That is exactly my biggest concern: I do not seem to remember the suggested change of policy from using aliases to strongly urging people to use real names for registration. I do not seem to notice this tendency when looking at WMF members, officials and administrators either. (We have a saying in Hungary which basically translates to "hitting the nettle with someone else's dick" ("Más farkával vered a csalánt"), I guess you got the picture.) And I do not agree either. --grin 10:47, 16 June 2008 (UTC)
2. Disclosure criteria

Current wording states: "Where the user has been vandalising articles or persistently behaving in a disruptive way, data may be released to assist in the targeting of IP blocks, or to assist in the formulation of a complaint to relevant Internet Service Providers.", and also "Where it is reasonably necessary to protect the rights, property or safety of the Wikimedia Foundation, its users or the public."

In fact, privacy based issues arise in the following areas:

  • We may want to contact other bodies, such as schools, police, or others with reasonable jurisdiction and authority in the matter, not just ISPs.
  • If the community seeks explanation of a block, a blocking admin/checkuser needs the ability to do so, if appropriate.
  • Users who use an identifiable name in their edits or username, may have that name linked with discussion of their conduct, which in turn may be searchable in perpetuity.

I would like to see the disclosure criteria amended to make these unambiguous (amendments in bold):

5) "Where the user has been vandalising articles or persistently behaving in a disruptive way, data may be released to assist in the targeting or explanation of IP blocks, to reduce further disruptive activity, or to assist in the formulation of a complaint to relevant Internet Service Providers or others likely to have authority in the matter."

and the addition of this:

==Disclosure of IP information in cases of disruption and misuse==
Users who engage in problematic conduct, especially to the point that Checkuser action arises, must expect that the protection of the project is given a higher priority compared to the protection of those who breach its policies on editorial conduct, if the two conflict or there is a problematic editing history. This can also affect users whose logged-in edits are visibly identified with an IP editor, in their editing history.
In some cases a user who is disruptive may have to accept that disruption may lead to their IP becoming linked to their account.
3. Use of term "anonymous"

I feel strongly we should not describe unlogged-in users as "anonymous" users in our privacy policy, even if that is a term used internally. We should call them "unlogged-in users" or "IP editors". Depending on their IP and chosen account name, they may in fact be far more, or far less, anonymous than a logged-in user. Describing them as "anonymous" is purely misleading, in this context, even if historically called "anonymous editors" on-wiki.

Eh, I totally agree with you. Anons are actually less anonymous than logged-in users. I suspect "unlogged-in users" is probably the most accurate, factual and easy to understand (IP editor sounds weird to a newbie). I support that change. Anthere
I agree too. We should stop using the word "Anonymous" for these users. I prefer the phrase "Unregistered users", though almost anything is less misleading than "Anonymous". Perhaps it has come time to reconsider how we treat these unregistered users: On English WP at least, unregistered users account for roughly half the people who make edits to the site... We are so careful about giving out checkuser so why do we effectively give checkuser to the whole world for half our users? --Gmaxwell 21:02, 15 June 2008 (UTC)
4. Misc. observations and improvements

Finally, a few smaller comments:

  • We often say "a user who is disruptive". Better for policy wording would be, "a user who is deemed disruptive or who appears to be associated with disruptive activity..." This wording is more in line with actual usage. Often we don't literally know a user is disruptive (we don't have a video camera in their room), and "is" implies an objective legal standard we may only rarely reach. In fact we block because technically, they appear to be strongly tied in with disruptive activity (eg same account, IP, editing area, plausible emails, etc).
  • A missing purpose for collecting information is to resolve problems and confirm users are not abusing the community, as well. Needs adding.
  • Factual correction - a user's edits "will be publicly identified with your (etc)". The word "publicly" is missing, as is the information that privately they may be identified with their IP and other server information as well, as described.

FT2 (Talk | email) 13:49, 14 June 2008 (UTC)

Update - as noted by another user, the time duration isn't part of this; minor edit to remove. FT2 (Talk | email) 15:43, 14 June 2008 (UTC)

Names of cookies

As far as I can tell, the software does not set a cookie named PHPSESSID anymore. Erik Warmelink 15:02, 14 June 2008 (UTC)

This is fixed in the proposed rewrite. —{admin} Pathoschild 08:44:47, 15 June 2008 (UTC)

copyedit-level comments/suggestions

I agree with other commenters that the draft could be finely edited/pruned for brevity, but I won't detail all those suggestions here and now, but would be happy to do so when it's the appropriate time. Just some broader copyedit-level suggestions here:

  • A more consistent wording format for section heads would make the policy's structure easier to scan and understand.
  • The head for section II, "Contributing To A Project -- Identity Issues", does not seem to match the section's contents. Suggest something like "General Principles: Open Contributions and Editing".
  • The paragraphs at the end of Section VII that describe the board's April 2008 resolution do not seem to fit with that section. Perhaps separated by another section head or moved?
  • Given the length of the policy and the readership drop-off that may occur as a result (as mentioned by other commenters above), would it be worthwhile to consider positioning the disclaimer closer to the beginning of the policy, rather than at the end?

--Sfmammamia 16:50, 14 June 2008 (UTC)

right to disappear

  • "Another of the guiding principles is that the history of contributions by editors (whether self-identified, pseudonymous, or anonymous) is itself preserved indefinitely. This is a choice we have made in order to enable our communities to build reputation systems among editors and to enable editors to qualify over time for administrative positions and privileges."

Deleting the history of contributions after a given amount of time (e.g. 2-3 years) would not harm the advantages of the reputation system. For GNU reasons, the list of all contributors could be kept for the main namespace (although I don't think pseudonyms have any rights in the GNU licence system). Nothing should have to be kept for all other pages.
The big advantage is that it would really preserve the long-term personal integrity of people. Some events that occur on the projects, or some comments written about people, cannot legitimately be kept on record forever.
There is a considerable ethical issue with this, and this warning page is not enough. Contributors do not realize the risk they take in contributing to the projects, and technical solutions to limit this "time period" should be found.
Ceedjee 20:25, 14 June 2008 (UTC)

Mediawiki can't distinguish between a genuine unimportant pseudonym, a real name, or a pseudonym that is well known to be a given person's by-line on the internet and which they use and edit under widely. So removing attribution after 2-3 years is problematic. GFDL and WMF wikis do not distinguish between mainspace and other namespaces, or decide contributions to some are "more important" than others. And it's not really for us to decide who may wish to retain long term attribution and who may not, or what name they may write under. I don't think this - as proposed anyhow - can work. FT2 (Talk | email) 06:12, 15 June 2008 (UTC)
Hello, you are right about the impossibility of the distinction.
But I think you missed two points anyway:
  • after 2-3 years we just list the contributors, but don't link the diff to its author.
  • we can delete all the pages that do not concern the main namespace (i.e. talk pages, etc).
Ceedjee 10:03, 15 June 2008 (UTC)

2 concerns

I understand this is a draft. But there's a definite bias towards both 1) the English-speaking world and 2) the English Wikipedia.

Excerpt:

  • Whether you register with your real name or with a pseudonym, you should note that registered users may gain more access to the Wikimedia projects. For example, only registered users can create a new page on the Wikipedia. (See: http://en.wikipedia.org/wiki/Wikipedia:User_access_levels.)
  1. THE Wikipedia is NOT the english wikipedia
  2. That statement is false in most wikis.

I just want people working on this to be extra careful to write a good policy taking into account all projects and languages, not focusing on the needs of the English Wikipedia. es:Drini 22:05, 14 June 2008 (UTC)


Another excerpt:

X. Details of Retention of Private Information – Browsing and Editing in WMF Projects.
(...)
B. Editing.
When editing a page on Wikipedia projects, ...

What about non-Wikipedias? Again, I understand this is a draft, but it shouldn't be written with only the English Wikipedia in mind; these are just some red flags. es:Drini 22:10, 14 June 2008 (UTC)

Support. The document should use some nomenclature like WikiX to stress that it also covers Wikiquote, Wikisource, etc. IPs can create new articles at both EN:WQ and EN:WS, for example.--Cato 22:44, 14 June 2008 (UTC)
These concerns are addressed in the rewritten draft below. —{admin} Pathoschild 22:58:39, 14 June 2008 (UTC)

Rewrite

I think the draft is unnecessarily verbose, reads like an essay or opinion piece, makes incorrect assumptions (like "everyone can contribute", "history [...] is preserved indefinitely", or "you are encouraged but not required to register with your real name" (some specifically discourage that due to stalking, etc)), significantly addresses non-privacy subjects (like community values, copyright, or user access hierarchy), and uses redundant section numbering (sections are numbered automatically in the table of contents). I propose moving the explanatory material to a separate essay, so that the policy only contains policy.

Upon addressing these points (without intentionally changing any specific policy points) and reorganizing around the specific information, the draft looks something like this:

The purpose of this document is to outline the privacy policies of the Wikimedia Foundation (see also explanatory material).

Collection of information

Consistent with the Data Retention Policy, the Wikimedia Foundation collects and retains the least amount of personally identifiable information needed to fulfill the operational needs and legal obligations of the Foundation and counter abuse.

authorship information
Each edit is accompanied with authorship information in an edit history, including user name (or IP address if not logged in), timestamp, and what was changed. This information is also aggregated, such as by user (see user contributions) or date (see recent changes). You may contribute to public projects without logging in. However, edits by unregistered users are credited in edit histories to their IP address, a series of four numbers that identifies their computer or network. This information may be retained indefinitely, unless deliberately removed such as in response to a privacy violation or court order.
email address
You may optionally provide a working email address in your user preferences, which allows other users to send email to you through the wiki. When you receive an email from another logged-in user, your email address will not be revealed to them unless you respond, or possibly if the email bounces. However, your email address will be displayed when you email another user through the wiki, and will be publicly available if you email a public mailing list.
non-public information
Every time you visit a web page you automatically send technical information to the server, including software profile, request headers, and the IP address of your Internet service provider. Most servers routinely maintain access logs with a portion of this information, which can be used to generate usage statistics. The Wikimedia Foundation may keep the raw logs, but these will not be published or used to track legitimate users.
When you edit (either logged in or not), the server confidentially stores this information for a limited period of time. This information is automatically deleted after a set period.
When you edit without logging in, the IP address of your Internet service provider is publicly credited as the author of the edit. Depending on your connection, this address may be traceable only to a large Internet service provider, or specifically to your school, place of business, or home. It may be possible for a third party to identify you from this IP address in conjunction with any other information you provide. Logging in allows you to better preserve your privacy in this situation.
user accounts
Once created, user accounts will not be removed. It may be possible for a user name to be changed, depending on the policies of the wiki to which you contribute. The Wikimedia Foundation does not guarantee that a user name will be changed on request.
cookies
The sites will set a temporary session cookie on your computer when you visit the site. If you do not intend to log in or edit, you may deny this cookie. It will be deleted when you close your browser session.
More cookies may be set when you log in to maintain your logged-in status. If you choose to save your user name and password on your terminal, that information will be saved for up to 30 days, and this information will be resent to the server every time you visit the same wiki. If you are using a public machine and do not wish to expose your user name to future users of the machine, you may clear these cookies after use.
Access to and publication of information

The Wikimedia Foundation will not sell or share private information such as email addresses with third parties, unless you agree to release this information or it is required by applicable law.

public authorship information
Authorship information (detailed in the previous section) is publicly available.
account passwords and email addresses
Users' passwords and email addresses are confidential, and no person should knowingly expose another user's password. However, emails sent using Wikimedia's email-user feature are sent from the user's email address.
non-public information
Certain users have access to non-public data, including IP addresses and software profiles and connected user names, usually in order to counter abuse. Access to and publication of this information is governed by the Access to nonpublic data policy. Limited information may be released in any of the following situations:
  1. In response to a valid subpoena or other compulsory request from law enforcement. In the event of such a legally compulsory request, the Foundation will attempt to notify the affected user within three business days after the arrival of such subpoena by sending a notice by email to the email address (if any) that the affected user has listed in his or her user preferences.

    If you receive such notification, the Foundation cannot advise you regarding the law or an appropriate response to a subpoena. However, you may have the legal right to resist or limit that information in court by filing a motion to quash the subpoena. Should you wish to oppose a subpoena or other compulsory requests, you should seek legal advice concerning applicable rights and procedures that may be available. If the Foundation receives a court-filed motion to quash or otherwise limit the subpoena as a result of action by you or your lawyer, the Foundation will not disclose the requested information until Wikimedia receives an order from the court to do so.

  2. With the permission of the affected user.
  3. To the chair of Wikimedia Foundation, the Foundation's legal counsel, or the chair's designee, when necessary for investigation of abuse complaints.
  4. Where the information pertains to page views generated by a spider or bot and its dissemination is necessary to illustrate or resolve technical issues.
  5. Where the user has been vandalizing articles or persistently behaving in a disruptive way, data may be released to a service provider, carrier, or other third-party entity to assist in the targeting of IP blocks, or to assist in the formulation of a complaint to relevant Internet Service Providers.
  6. Where it is reasonably necessary to protect the rights, property or safety of the Wikimedia Foundation, its users or the public.
Wikimedia policy does not permit public distribution of such information under any other circumstances.
Disclaimer

The Wikimedia Foundation holds that maintaining and preserving the privacy of user data is an important value. This privacy policy, together with other policies, resolutions, and actions by the Foundation, represents a committed effort to safeguard the security of the limited user information that is collected and retained on our servers. Nevertheless, the Foundation cannot guarantee that your user information will necessarily remain private. We acknowledge that, in spite of our committed effort to protect private user information, determined individuals may still develop various data-mining and other methods to uncover such information and disclose it. For this reason, the Foundation can make no guarantee against unauthorized access to any information you may provide in the course of participating in Wikimedia Foundation projects or related communities.

{admin} Pathoschild 22:13:37, 14 June 2008 (UTC)

Completely agree. Good call, Pathos. I was just revisiting here to add a post-script note, that as drafted it has too much extraneous matter, and this needs to be a privacy policy and not drift into being an instruction manual. I've also been busy drafting a version, but Pathos has probably beaten me to it. I'll post it on a sub-page anyway for eyeballs on its approach, as Pathos has posted above. FT2 (Talk | email) 23:07, 14 June 2008 (UTC)
This is to the point. The content page draft is .... sorry, did I fall asleep? --Brian McNeil / talk 23:33, 14 June 2008 (UTC)

Good start, but it removed some information which SHOULD be kept absolutely. Please allow me to go point by point to try to identify what is really mandatory (but could be rewritten and shortened for clarification)

  • 1. preamble: nice to have, but could be moved to another document
  • 2. contributing to a project: nice to have, but could be moved
  • 3. editing history. There are several problems with this paragraph. It refers far too much to the case of the english wikipedia to be correct (misses the general situation); it refers far too much to wikipedia, to the detriment of other projects; it seems to imply that the choice to let editors be anon, pseudon, or real was made by the Foundation (not only is that incorrect, but it may increase wmf liability). Last, it implies we favor identification under real name (we do not). I think the summary you made of this paragraph is good
  • 4. Reputation systems: this paragraph deals with two points. One is the recommendation to use a strong password (nice to have, but does not belong to a privacy policy perhaps). The second is the issue of user-level access. You have entirely removed from your new proposition all points related to special user access, and this is not okay. We absolutely must explain to editors that some editors actually have access to their private data (eg checkuser) and we must bind checkusers to a privacy policy.
  • 5. How long the information is stored. Remind editors that editing information will be kept by default (no matter how noisily they ask for removal). You removed that part. Given the number of times editors request that info be removed, I think it should be kept.
  • 6. removal of user account. Info should be kept and you kept it
  • 7. General privacy expectation. This paragraph is quite long and should probably be tightened. But you removed it entirely. I do not support that. Actually, your version is entirely written as if participation was only editing projects, whilst it is much more than that. WMF hosts the OTRS system, so this should be dealt with in the privacy policy. WMF hosts mailing lists, so this should be dealt with in the privacy policy. etc... In one of your paragraphs, you wrote "email addresses are confidential". This is true on the projects. This is absolutely not true on mailing lists. All cases related to WMF activity should be dealt with.
  • 8. why is user data collected. Well, I happen to think this is important to keep such information
  • 9. Details of retention. Again, I think this is interesting to keep. I wonder if we could not very clearly separate the document to host such data...
  • 10. Details of retention. Same, I think we need to keep such information.
  • 11. Policy on data retention. Unless I am wrong, it seems that you simply picked up the last policy to summarize the new policy. Except that there are some significant differences between the two (eg, notification to an editor when his private data has been given to a third party). If there is ONE thing to keep in the new policy, it is precisely this paragraph.

In short, whilst I am annoyed that Mike's version is a little bit too verbose, a bit too English/Wikipedia centred, your version is on the other hand too much project oriented. It simply removes references to the other activities of the Foundation (mailing lists, otrs etc...) and removes references to special access users (checkuser, oversight, developers). I wonder if a solution might not be to more clearly identify project-related privacy issues on a single document, but later propose on the projects a "light version" to readers (with a link to the complete version). If we were to do that, we need a doc providing us with the ability to "remove" some parts. OR, each project could draft a simplified version, with clear indication that it is a light version, with a link to a complete version. But fact is, we can not REMOVE from the global privacy policy the issue of OTRS-related privacy just because the global policy would be too long to read for Wikipedia editors. Anthere 10:39, 15 June 2008 (UTC)

Thanks Anthere, that's extremely helpful. Pathoschild and I ended up last night collaborating on a joint draft that we hoped might merge the best of all three of our views. The above provides some very helpful guidance, and several of the points in it were already anticipated (eg OTRS/wiki-centric/special access/removal of account/email address on mailing lists) as well as some likely important loopholes that were not noticed before, suggesting we're heading in the right direction. I hope we can bounce the draft we're working on off you, for your views as well, as you have given above, when the points you make are thought through in more detail. Thanks again for them. FT2 (Talk | email) 10:57, 15 June 2008 (UTC)
Update - all covered now. 1 - 6 was already fully dealt with, 7 and 11 were dealt with but have been improved per your comment, 8 - 10 look appropriately and adequately dealt with. Issues such as wp-/en-/specific-project-centricity, and special access users including OTRS, were already fully fixed. Waiting for review by Pathoschild. A couple of minor definition and other points came up that will need referral to Mike to redraft if this version finds favor (and probably would apply for any privacy policy that is approved); they've been tagged as such. FT2 (Talk | email) 11:41, 15 June 2008 (UTC)
Okay. Waiting for the rewrite then :-) Anthere 21:52, 15 June 2008 (UTC)

In the course of collaboration, three issues have come up that are not "obvious" or cover unexpected new ground, and would be useful to consider:

  1. An actual succinct definition of "private/non-public information" of the kind covered (a whole policy referring to it but we never say anything of what it is). First stab at a draft of this (IANAL): - "This policy primarily covers certain personally identifying information collected or stored by the Foundation in relation to the wikis and communities hosted on Wikimedia servers, that is not public upon creation and not intended by the Foundation to be made public, or which (if posted publicly) has been removed to a point where ordinary users and administrators are unable to view it. Examples include IP and other technical information derived from server logs, most OTRS and certain other emails, password and email address settings for a user account on a hosted wiki, personal identification provided in compliance with the 'Access to nonpublic data' policy, and oversighted (but not ordinarily deleted) content."
  2. Wikimedia mailing list administrators - a person who subscribes to a mailing list may give an email address even if they do not post and their intention is to merely read. Is the list of email addresses on the server (as opposed to actual emails sent) "private" information in the sense of this policy? Specifically, is a list administrator permitted to publicize the list of subscribers and their email addresses for any mailing list at will, or state "X [optionally: with an email address of Y] is a subscriber to this list"? What if under "subscription" they gave a real name, on the basis they would never send an email to the list and the subscription information was non-public? If so, we need to clarify; if not then we need to consider if list admins are bound by non-public data policy and this hasn't been considered...?
  3. Disclosure exemption #2 ("With permission of the affected user") modified to add "... (In the case of a user who is a minor, with permission of the user or a parent or legal guardian.)" If a minor corresponds with OTRS or makes a post on the wiki, or we email a user and it turns out to be a minor and intercepted by a parent or guardian. I'd hate to see us telling a parent that we cannot inform or discuss any aspect of their minor child's suicide email, vandalism case, or household IP global block, with them. Whilst minors need privacy protection too, are there exceptional circumstances where (as for legal matters) we wish to allow ourselves to discuss a matter with a legal parent or guardian? Perhaps one whose net connection was used?

FT2 (Talk | email) 12:26, 15 June 2008 (UTC)

Thanks to Pathoschild for this redraft -- as others have mentioned, I'd also prefer to keep explanations and philosophy out of the policy proper. Some quick responses to FT2's points:
  1. "but not ordinarily deleted," the distinction there is a good one to make, between deletion and oversight, but I'm not sure the average outsider will understand it; perhaps that should be elaborated on, at some point?
  2. Regarding mailing lists, the current privacy policy indicates that "If you subscribe to one of the project mailing lists, your address will be exposed to any other subscriber." I do try to treat this sort of data with care, but we should make clear that the expectation of privacy is significantly lower, here, if anything. The highest expectation there seems to be with the email as set in Special:Preferences, and anything received by OTRS.
  3. Legal guardian sounds like a reasonable case for release, yeah. I've seen "parent or legal guardian" on countless forms and permission slips, so perhaps the phrase has a very specific legal definition I'm not aware of, but I nevertheless imagine that just "legal guardian" will cover it with less ambiguity.
Other random thoughts:
  1. I'd just as soon move the bit about subpoena notification out of the numbered list, for balance; perhaps immediately below it (with or without its own heading).
  2. As Anthere also pointed out, this seems very project-oriented. Prior sections on mailing lists, OTRS, and IRC seem to have been removed. ML and OTRS are probably still worth a mention. If IRC is worth a mention, probably just to clarify it's not covered by this policy.
That's all for now. Luna Santin 21:28, 15 June 2008 (UTC)

Raw log data[edit]

Why should raw log data of page accesses be kept indefinitely? Are the old logs ever used?--Cato 22:57, 14 June 2008 (UTC)

My understanding/guess of the primary reason why these raw logs are kept indefinitely is that none of the SysAdmins want to bother establishing a policy about when the old ones are deleted. They think of them as something between the scum that builds up on your bathtub & the monthly statement you receive from your bank on your savings/checking accounts: sometimes useful, often an inconvenience, & cleaned out when they have nothing better to do. These logs may be deleted after a week or a year -- or may be kept for a couple of years while specific performance or access problems are being investigated. Using them to track what specific users are reading is not their primary interest. -- Llywrch 00:43, 16 June 2008 (UTC)

Excess verbiage[edit]

This proposal looks more like an essay full of excess verbiage. There is much in there that has absolutely nothing to do with privacy; it may be valid policy, but it belongs in a different document.

The detailed explanatory portions of the document should not be treated as policy. They can be useful, but where they conflict with the actual policy the policy itself should prevail.

In my view any policy document should be succinct and to the point.

The very first sentence states:

The purpose of this document is to outline the privacy policies of the Wikimedia Foundation (WMF) and the philosophy that underlies those policies.

How are we to distinguish between policies and philosophy? Eclecticology 23:39, 14 June 2008 (UTC)

It surprises me that the distinction between policy and philosophy needs explaining, but I'll give you a non-wiki example. The prohibition of murder is a policy. Valuing individual human life is philosophy. But if you want to merge the WP entries on "policy" and "philosophy," be my guest. MGodwin 00:43, 17 June 2008 (UTC)
What do you think of the rewritten draft I proposed to address that problem? —{admin} Pathoschild 01:13:05, 15 June 2008 (UTC)
After the long-winded original anything is an improvement. It could still be trimmed a bit more.
  1. I would exclude the links to help pages. Having links to lower status pages has the effect of suggesting that they are part of the policy.
  2. The document uses the second person in the first part, but reverts to the third person in the later stages. If we are talking about policy the third person should be used throughout. The second person may be helpful when explaining a policy to users, but this is not the same as the policy itself.
  3. The policy should begin with a definition of what we mean by privacy.
  4. When initials are introduced the full term should be given, thus we should begin with "Internet Protocol (IP) address". Simply using "IP" later in the document is just fine.
  5. The two major parts of this policy should be consolidated to avoid redundancy.
  6. Possible revisions in part:
    1. Privacy is the right of an individual not to have his personal information made public.
    2. User name means
      (a) The unique name or names by which the user has chosen to be recognized on any site operated by the Wikimedia Foundation (WMF), or
      (b) Where the user has not chosen a name, or has failed to log in using that name, the Internet Protocol (IP) address of the site from which he is editing.
    3. Public information includes all edits conforming with this policy with associated edit names and time signatures. This information helps in identifying copyright owners and is kept indefinitely. It is available and may be indexed or used in other organizational schemes by any user.
    4. Private information is any information which has not been designated as public.
I could continue in this way, but I'll wait until I have a response before doing so. Eclecticology 04:54, 16 June 2008 (UTC)
See #Collaborative proposal below. Comments? FT2 (Talk | email) 11:18, 16 June 2008 (UTC)

Licenses other than the GFDL[edit]

It may be relevant that, in addition to the CC licensing of Wikinews content, nl.wikibooks content is dual-licensed under a CC and the GFDL license. (There may be other projects with special licensing considerations, but I am not aware of any.) --Iamunknown 05:43, 15 June 2008 (UTC)

This is fixed in the proposed rewrite, which doesn't delve into copyright. Updating a wiki's license shouldn't require an update to the Foundation privacy policy. —{admin} Pathoschild 08:47:04, 15 June 2008 (UTC)

Reading vs Editing[edit]

Our position with respect to data on reading vs editing should be addressed separately: In the case of editing there are many complications which result in far more expansive information collection and dissemination while at the same time the overwhelming majority of site users are readers who do not edit. I think the functional part of policy should be separated into two major sections: "Retention and dissemination of information related to reading activities", followed by, "Retention and dissemination of information related to editing activities". This ordering should improve clarity even if it creates some duplication. --Gmaxwell 20:53, 15 June 2008 (UTC)

User page licencing a privacy matter?[edit]

What does the licensing of a user page and other userspace pages have to do with privacy?

And furthermore... I find it a very, very, very bad idea to hold that userspace is automatically under a precisely identical licence to the project. Come on, surely we can do better!

-- Cimon Avaro 21:19, 15 June 2008 (UTC)

_Why_ is it a bad idea? Wiki(m|p)edia is not a webhost. Firefoxman 21:32, 15 June 2008 (UTC)
Well, the practical case I was thinking of in particular was when the stormfront guys started the nazipedia fork, and copied all userpages to their site too, giving the appearance that every user of wikipedia was also a nazipedia user. -- Cimon Avaro 22:13, 15 June 2008 (UTC)
If their goal was to perpetrate that sort of fraud they could have done so without any licensing grant. They could have simply made dummy userpages using factual info extracted from the real ones, for example. Licenses aren't the right tool to prevent misrepresentation. --Gmaxwell 22:16, 15 June 2008 (UTC)
The discussion has only been going on a few days and already we have an example of Godwin's Law... :P Cbrown1023 talk 23:56, 15 June 2008 (UTC)
Except in the case Cimon is referring to, the individuals running the website were honest-to-god, self-identified Neo-Nazis. Their political & racial beliefs are incidental to their inappropriate use of material in this specific event. (Look in the wikipediaEN-l mailing list archives in 2004 or 2005 for "nazipedia" for details about this short-lived Wikipedia mirror site.) -- Llywrch 01:39, 16 June 2008 (UTC)
It was still a comparison to Hitler/the nazis, ergo Godwin's Law. :-) It was a joke though, hence the tongue smiley. Cbrown1023 talk 04:33, 16 June 2008 (UTC)

Another draft[edit]

Collection of information

Consistent with the Data Retention Policy, the Wikimedia Foundation collects and retains the least amount of personally identifiable information needed to fulfill the operational needs and legal obligations of the Foundation and counter abuse.

The Wikimedia Foundation will not sell or share private information such as email addresses with third parties, except as governed in this policy.

Why is User Information Ever Collected?

In general, the Foundation limits the collection of personally identifiable data to fulfill the purposes that serve the well-being of WMF projects, including but not limited to the following:

To provide site statistics
The Foundation statistically samples raw log data from users' visits. These logs are used to produce the site statistics pages; the raw log data is not made public.
To solve technical problems
Log data may be examined by developers in the course of solving technical problems and in tracking down badly-behaved web spiders that overwhelm the site.
To enhance the accountability of WMF projects
The Foundation and the Wikimedia communities have established a number of mechanisms to prevent or remedy abusive activities in WMF projects. For example, when investigating abuse of a wiki, including the suspected use of malicious “sockpuppets” (duplicate accounts), vandalism, harassment of other users, or disruption of the wiki, the IP addresses of users, derived from stored data, may be used to identify the source(s) of the abusive behavior. This information may be shared by users who are charged by their communities with protecting the projects.
Retention and dissemination of information related to reading
access logs
Every time you visit a web page you automatically send technical information to the server, including software profile, request headers, and the IP address of your Internet service provider. Most servers routinely maintain access logs with a portion of this information, which can be used to generate usage statistics. The Wikimedia Foundation may keep the raw logs indefinitely, but these will not be published publicly.

Here's a sample of a user's raw log data for one page view:

64.164.82.142 - - [21/Oct/2003:02:03:19 +0000]
"GET /wiki/draft_privacy_policy HTTP/1.1" 200 18084
"http://en.wikipedia.org/wiki/Wikimedia_projects:Village_pump"

"Mozilla/5.0 (Macintosh; U; PPC Mac OS X; en-us) AppleWebKit/85.7 (KHTML, like Gecko) Safari/85.5"
Statistical aggregates and other anonymized datasets may be generated from this normally private data and released publicly or shared with researchers.
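As a rough illustration of how aggregates can be produced without republishing the private fields, the following sketch parses lines shaped like the sample above and counts successful page views per path, discarding the IP address and browser details. The field layout is assumed from the sample (in a real log each entry would sit on one line), and the names LOG_PATTERN, parse_line and page_view_counts are hypothetical, not part of any Wikimedia tooling.

```python
import re
from collections import Counter

# Pattern for a combined-log-style entry; the layout is an assumption
# based on the sample entry quoted in the draft.
LOG_PATTERN = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) (?P<protocol>[^"]+)" '
    r'(?P<status>\d{3}) (?P<size>\d+|-) '
    r'"(?P<referrer>[^"]*)" "(?P<agent>[^"]*)"'
)


def parse_line(line):
    """Return the fields of one log entry as a dict, or None if it does not match."""
    match = LOG_PATTERN.match(line)
    return match.groupdict() if match else None


def page_view_counts(lines):
    """Count successful (status 200) requests per path.

    Only the path is kept; IP address, referrer and user agent are dropped,
    so the result is an aggregate rather than per-user data.
    """
    counts = Counter()
    for line in lines:
        entry = parse_line(line)
        if entry and entry["status"] == "200":
            counts[entry["path"]] += 1
    return counts


# The sample entry from the draft, joined onto a single line.
SAMPLE = (
    '64.164.82.142 - - [21/Oct/2003:02:03:19 +0000] '
    '"GET /wiki/draft_privacy_policy HTTP/1.1" 200 18084 '
    '"http://en.wikipedia.org/wiki/Wikimedia_projects:Village_pump" '
    '"Mozilla/5.0 (Macintosh; U; PPC Mac OS X; en-us) '
    'AppleWebKit/85.7 (KHTML, like Gecko) Safari/85.5"'
)

if __name__ == "__main__":
    print(page_view_counts([SAMPLE]))
```

Whether per-path counts are anonymous enough in practice depends on how rarely a page is viewed; the sketch only shows the mechanical step of dropping the identifying fields.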
Certain persons have access to non-public data, including IP addresses and software profiles and connected user names, usually in order to counter abuse. Access to and publication of this information is governed by the Access to nonpublic data policy. Limited information related to readership activity may be released in any of the following situations:
  1. In response to a valid subpoena or other compulsory request from law enforcement.
  2. With the permission of the affected user.
  3. To the chair of Wikimedia Foundation, the Foundation's legal counsel, or the chair's designee, when necessary for investigation of abuse complaints.
  4. Where the information pertains to page views generated by a spider or bot and its dissemination is necessary to illustrate or resolve technical issues.
  5. Where it is reasonably necessary to protect the rights, property or safety of the Wikimedia Foundation, its users or the public.
Wikimedia policy does not permit public distribution of such information under any other circumstances.
cookies
The sites will set a temporary session cookie on your computer if you log into the site. Additional cookies may be set to store preference information. If you do not intend to edit or log in, these cookies may be denied.
Retention and dissemination of information related to editing

It's best to assume that any edits or other contributions you make to a WMF project will be retained forever.

authorship information
Each edit is accompanied by authorship information in an edit history, including user name (or IP address if not logged in), timestamp, and what was changed. This information is also aggregated, such as by user (see user contributions) or date (see recent changes). You may contribute to public projects without logging in. However, edits by unregistered users are credited in edit histories to their IP address, a series of four numbers that identifies their computer or network. This information is publicly available and may be retained indefinitely, unless deliberately removed such as in response to a privacy violation or court order.
email address
You may optionally provide a working email address in your user preferences, which allows other users to send email to you through the wiki. When you receive an email from another logged-in user, your email address will not be revealed to them unless you respond, or possibly if the email bounces. However, your email address will be displayed when you email another user through the wiki, and will be publicly available if you email a public mailing list.
user accounts
Once created, user accounts will not be removed. It may be possible for a user name to be changed, depending on the policies of the wiki to which you contribute. The Wikimedia Foundation does not guarantee that a user name will be changed on request.
Users' passwords and email addresses are confidential, and no person should knowingly expose another user's password. However, emails sent using Wikimedia's email-user feature are sent from the user's email address.
cookies
The sites will set a temporary session cookie on your computer when you log into the site or make an edit. Additional cookies may be set to store preference information. If you do not remain logged in or edit, you may deny these cookies. Unless you request otherwise, the session cookie will be deleted when you close your browser session.
More cookies may be set when you log in to maintain your logged-in status. If you choose to save your user name and password on your terminal, that information will be saved for up to 30 days, and this information will be resent to the server every time you visit the same wiki. If you are using a public machine and do not wish to expose your user name to future users of the machine, you may clear these cookies after use.
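The difference between the two kinds of cookie described above can be sketched with Python's standard http.cookies module. This is only an illustration of the general mechanism: the cookie names wiki_session and wiki_remember_user are hypothetical, and nothing here is taken from the MediaWiki code. A cookie sent without an expiry is discarded by the browser when the session ends, while one with a Max-Age of 30 days persists and is resent on later visits.

```python
from http.cookies import SimpleCookie

THIRTY_DAYS = 30 * 24 * 60 * 60  # lifetime in seconds


def session_cookie(token):
    """Build a Set-Cookie header with no expiry (a temporary session cookie)."""
    cookie = SimpleCookie()
    cookie["wiki_session"] = token          # hypothetical cookie name
    cookie["wiki_session"]["path"] = "/"
    return cookie.output()


def remember_me_cookie(user_name):
    """Build a Set-Cookie header that the browser keeps for up to 30 days."""
    cookie = SimpleCookie()
    cookie["wiki_remember_user"] = user_name  # hypothetical cookie name
    cookie["wiki_remember_user"]["path"] = "/"
    cookie["wiki_remember_user"]["max-age"] = THIRTY_DAYS
    return cookie.output()


if __name__ == "__main__":
    print(session_cookie("abc123"))
    print(remember_me_cookie("ExampleUser"))
```

On a shared machine it is the second kind that matters: it stays on disk after the browser is closed, which is why the draft suggests clearing cookies after use.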
non-public authorship information
When you edit (either logged in or not), the server confidentially stores this information.
When you edit without logging in, the IP address of your Internet service provider is publicly credited as the author of the edit. Depending on your connection, this address may be traceable only to a large Internet service provider, or specifically to your school, place of business, or home. It may be possible for a third party to identify you from this IP address alone or in conjunction with any other information you provide. Logging in allows you to better preserve your privacy in this situation.
Certain users have access to non-public data, including IP addresses and software profiles and connected user names, usually in order to counter abuse. Access to and publication of this information is governed by the Access to nonpublic data policy. Limited information may be released in any of the following situations:
  1. In response to a valid subpoena or other compulsory request from law enforcement.
  2. With the permission of the affected user.
  3. To the chair of Wikimedia Foundation, the Foundation's legal counsel, or the chair's designee, when necessary for investigation of abuse complaints.
  4. Where the information pertains to page views generated by a spider or bot and its dissemination is necessary to illustrate or resolve technical issues.
  5. Where the user has been vandalizing articles or persistently behaving in a disruptive way, data may be released to a service provider, carrier, or other third-party entity to assist in the targeting of IP blocks, or to assist in the formulation of a complaint to relevant Internet Service Providers.
  6. Where it is reasonably necessary to protect the rights, property or safety of the Wikimedia Foundation, its users or the public.
Wikimedia policy does not permit public distribution of such information under any other circumstances.
Mailing lists, IRC, and other modes of communication
mailing lists
If you subscribe to one of the project mailing lists, the email address you use to subscribe to that list will be exposed to any other subscriber.
The list archives of most of Wikimedia's mailing lists are public, so your email address may be searchable on the Web, and your address also may find itself quoted in messages. The list archives are also archived by Gmane and other services.
You should consider that any email addresses you use, as well as any messages you send to a mailing list, may be archived and may remain available to the public permanently.
IRC (Internet Relay Chat)
IRC channels are not officially part of Wikimedia proper and are not operated on Wikimedia controlled servers.
By participating in an IRC service, your IP address may be exposed to other participants. Your privacy on each channel can only be protected according to the policies of the respective service and channel, which may differ from one service to another and channel to another.
Different channels have different policies on whether logs may be published.
Wikimedia Email addresses
Some email addresses (such as info-en@wikimedia.org) forward mail to a team of volunteers trusted by the Foundation to use a ticket system, such as OTRS, to view them and answer them. Mail sent to the system is not generally publicly visible, but is visible to a select group of Wikimedia volunteers.
By sending a mail to one of these addresses, your address may become "public" within this group.
The ticket system team may discuss the contents of your mail with other contributors in order to best respond to your message.
Mail to private addresses of members of Board of Trustees and the staff of the Foundation may also be forwarded to the OTRS team.
Your messages and email address may be saved by members of the respective OTRS team and any email service they use and may remain available to them before they are deleted.
Response to a valid subpoena or other compulsory request from law enforcement

In the event of such a legally compulsory request, the Foundation will attempt to notify the affected user within three business days after the arrival of such subpoena by sending a notice by email to the email address (if any) that the affected user has listed in his or her user preferences. Certain legal obligations may prevent the Foundation from providing notice even when it would be otherwise able to do so. Because the Foundation does not collect any contact information for users without accounts, such notice is usually impossible for readers and unregistered editors.

If you receive such notification, the Foundation cannot advise you regarding the law or an appropriate response to a subpoena. However, you may have the legal right to resist or limit that information in court by filing a motion to quash the subpoena. Should you wish to oppose a subpoena or other compulsory requests, you should seek legal advice concerning applicable rights and procedures that may be available. If the Foundation receives a court-filed motion to quash or otherwise limit the subpoena as a result of action by you or your lawyer, the Foundation will not disclose the requested information until Wikimedia receives an order from the court to do so.

Disclaimer

The Wikimedia Foundation holds that maintaining and preserving the privacy of user data is an important value. This privacy policy, together with other policies, resolutions, and actions by the Foundation, represents a committed effort to safeguard the security of the limited user information that is collected and retained on our servers. Nevertheless, the Foundation cannot guarantee that your user information will necessarily remain private. We acknowledge that, in spite of our committed effort to protect private user information, determined individuals may still develop various data-mining and other methods to uncover such information and disclose it. For this reason, the Foundation can make no guarantee against unauthorized access to any information you may provide in the course of participating in Wikimedia Foundation projects or related communities.

I've restructured per my above suggestion and made a number of other changes. In particular the prior language with regard to retention does not, to the best of my knowledge, reflect the current behavior. I also noted some situations in which the foundation could not provide notice of legal request for the release of private information. --Gmaxwell 22:13, 15 June 2008 (UTC)

One thing I like about this version is that it avoids the issue brought up by section 8 -- "Why is User Information Ever Collected?" The title implies that an explanation for collecting this information will be provided here -- & doubtless there are some respected Wikimedians who would eagerly argue whether there is a need to collect any user information. The primary reason for this is that collecting this information cannot be avoided without forgoing the collection of many kinds of performance & health data about both the servers & the networked connections. Perhaps what is needed here is a simple statement along the lines that "non-public information on users & editors -- specifically server logs -- is primarily kept for technical reasons, and may be deleted for reasons of performance & system health at any time. Using them to identify individuals for whatever reason, legal, ethical or otherwise, is secondary to these purposes". -- Llywrch 01:57, 16 June 2008 (UTC)
A good idea. (And, the raw statement may need modifying since obviously other non-public information is not primarily kept for technical reasons.) FT2 (Talk | email) 12:43, 16 June 2008 (UTC)

Collaboration[edit]

As mentioned above, Pathoschild and I did some work on a collaborative version, which also takes in Anthere's comments above. It is based on Pathoschild's succinct and to-the-point draft and has been very carefully reviewed (almost word by word); the review turned up a wide range of small but significant edits, which close loopholes and improve the quality of the text as a privacy policy.

Link: Draft Privacy Policy June 2008/Collaboration.
(An annotated version can be found at Draft Privacy Policy June 2008/Collaboration (annotated) explaining the thoughts behind some of the wording)

FT2 (Talk | email) 11:16, 16 June 2008 (UTC)

Ermmm, is this a rival proposal? You expect people to review it in the same depth as this one? Is yours expected to hold this one from going further in the decision chain? --grin 11:59, 16 June 2008 (UTC)
No, we're not much doing "rivalry" here. A number of people have suggested ways to improve the great wordiness of the original, and Anthere has noted, in a comment on one of them, features from the original that are important. The question of the best wording possible that: 1/ meets updated WMF board and legal expectations, but 2/ does not get "tl;dr" responses from the great majority of users, and 3/ catches key details that may be omitted or problematic, is something that affects all of us, and every member of the public who is a WMF project user.
So far at least two people have posted suggestions of possibly better approaches for communal review. This is a collaboration of one of them, drawing upon Anthere's feedback. FT2 (Talk | email) 12:31, 16 June 2008 (UTC)

English Wikipedia specific[edit]

Still, it's quite English- and Wikipedia-oriented. For example, I do not know about others, but Hungarian Wikipedia does possess some role accounts which are not handled by WMF and not by their OTRS (hopefully ;)), so this should not be there as the only possible means of group email handling. Generally local projects are pretty free to handle their affairs, and this generally does not relate to privacy policy in any way, so the phrasing should be careful not to mention English Wikipedia-specific facts in a way that suggests they are general and exclusive cases. --grin 11:53, 16 June 2008 (UTC)

Can you describe how those work and in what ways they are not covered? Is it simply that these are not mentioned (eg, in the "including but not limited to" examples in some places), or is it that they are a whole different way of interacting that is omitted? (Notice English Wikipedia's group emails are not specifically named by anyone commenting, either.) More information needed here. FT2 (Talk | email) 12:37, 16 June 2008 (UTC)

With pleasure. ;-)

D.Information email addresses.; Some email addresses (see below) may forward mail to a team of volunteers trusted by the Foundation to use a ticket system, such as OTRS, to view them and answer them. Mail sent to the system is not generally publicly visible, but is visible to a select group of Wikimedia volunteers. By sending a mail to one of these addresses, your address may become "public" within this group. The ticket system team may discuss the contents of your mail with other contributors in order to best answer your query.

Bold parts are enwp specific; truth is most probably best achieved by cutting them completely. WMF and enwp specific details may be mentioned separately. --grin 13:12, 16 June 2008 (UTC)

But OTRS is not enwiki, or even Wikipedia-centered. It handles Wikiquotes, Wikinews, Commons, general inquiries, copyright for media used on all wikis, all languages inquiries, and so on.
I can see that references to "tickets" are not essential: "Information email addresses: Some email addresses may forward mail to a team of volunteers to view them and answer them. These emails are not generally publicly visible, but are visible to that team. By sending a mail to one of these addresses, your address may become "public" within this group. Team members may also discuss the contents of your mail with other contributors in order to best answer your query." FT2 (Talk | email) 13:35, 16 June 2008 (UTC)

Since you've got the point, just a sidenote: OTRS handles only English inquiries, for English-only projects. Or maybe some larger language (German?), but definitely not all of them. But your text perfectly catches my drift. ;) --grin 09:47, 18 June 2008 (UTC)

There is a German OTRS team, too, which mainly deals with Wikipedia but also offers support for other German-language Wikimedia projects, such as Wikisource or Wikiquote. --Hei ber 22:25, 20 June 2008 (UTC)
There are at least 30 different languages with an info queue on OTRS. --Elitre 11:45, 26 June 2008 (UTC)

Emailuser footer[edit]

https://bugzilla.wikimedia.org/show_bug.cgi?id=14558

Identified as a result of the work on this page, patch added by demon. FT2 (Talk | email) 04:04, 17 June 2008 (UTC)

Comment[edit]

As Ombudsman I have worked with the privacy policy on a regular basis. I found some points somewhat unclear and I saw that in practice, the privacy policy left room for interpretation that was extended too much. Therefore I do encourage a re-write of the policy.

I find it quite difficult to give a profound statement on the rather short notice given seven days ago, but after all it's a wiki and I would like to contribute at least some thoughts and give a short comment before the board deals with it today:

I do not mind that the policy is quite verbose and gives a broader background to the topic. This background information is important to convince the users of the importance of privacy issues and to show how these issues are embedded in the core values of our community. I do expect any volunteer who is supporting the project by becoming a CheckUser or Steward to take the responsibility and read the privacy policy carefully, be it short or verbose. For the common user, we should find ways to convey the basic issues on privacy in another form. This was also addressed on the mailing list: there are many IP contributors who do not know what information is saved for the public, despite a clear warning at the top of each edit window for IPs.

Some issues have been addressed, and I will not get into details concerning localisation and the focus on the large variety of projects. These points can be transferred without any problems. Some major remarks follow, sorted by importance.

Section XI

Personally, I have always had some concerns about point 5 in section XI, which states one case in which personal information is "collected or released":

The old text is:

"5. Where the user has been vandalising articles or persistently behaving in a disruptive way, data may be released to assist in the targeting of IP blocks, or to assist in the formulation of a complaint to relevant Internet Service Providers"

Now formulated as:

"5. Where the user has been vandalizing articles or persistently behaving in a disruptive way, data may be released to a service provider, carrier, or other third-party entity to assist in the targeting of IP blocks, or to assist in the formulation of a complaint to relevant Internet Service Providers."

I find it a strong improvement that it is specified more clearly to whom data may be released in this case. However, I find "third-party entity" somewhat fuzzy. I read this as official or semi-official bodies outside the Wikimedia Foundation projects, such as schools or universities. I would prefer to name more directly who is meant by "third-party entity" and to specify more clearly that this paragraph refers to entities outside the wiki, not to wiki administrators without CheckUser privileges or comparable status.

There have been many discussions about what constitutes "vandalism". I find it somewhat problematic that in the paragraph mentioned this question remains open. It should be noted that the vandalism or project disruption must be of a certain severity before a release of data can be considered.

Section VIII
"For example, when investigating abuse of a wiki, including the suspected use of malicious “sockpuppets” (duplicate accounts), vandalism, harassment of other users, or disruption of the wiki, the IP addresses of users, derived either from those logs or from records in the database may be used to identify the source(s) of the abusive behavior. This information may be shared by users with administrative authority who are charged by their communities with protecting the projects."

I assume that "users with administrative authority who are charged by their communities with protecting the projects" refers to ArbCom members, CheckUsers or Stewards performing CheckUser, and not to the "standard" administrators, who have not identified themselves to the Wikimedia Foundation. This should be worded more clearly:

This information may be shared by users with administrative authority who are charged by their communities with protecting the projects and are authorized by the Wikimedia Foundation (for example, CheckUsers or Stewards).

If the intention of this paragraph is that IP information should also be given out to other administrators, the policy should take into consideration the individual amount of private information that is represented by the IP address (open proxy, large provider, company, or even identifiable person) and the extent of the disruption. IP information can also be abused by trusted members of the communities. Therefore, they should be liable (i.e. at least identifiable by the Foundation), and the circle of persons with whom the data is shared should be restricted.

Section IX

Current wording:

"The raw log data is kept indefinitely, but is not made public."

New text:

"the raw log data is not made public, and is normally discarded after about two weeks."

It has been noted elsewhere that the new wording reflects the actual practice more accurately. However, in a policy we should state the ideal and the goals. When we are talking about data retention, a statement such as "kept indefinitely" should not be made. The old text gives at least a time goal without preventing longer storage of logs for special cases like bug tracking or statistics on a larger scale. The time frame should be adapted to a practical value (1 month), and the developers should be asked to regularly check the log length and remove data that is no longer needed. This is especially important for servers that are hosted outside the US. In some European countries, there have been court decisions about the retention period for web server log files.

Besides the specific points I named, some practical aspects are not addressed in the policy:

  • In many projects, CheckUser results are published as far as usernames are concerned; often the usernames that belong to an IP or an IP range are published. Sometimes such a result will link an account that contains personally identifiable information (such as a real name used as the account name) to a pseudonymous account or an IP. That account or IP might have made edits that are considered harmful to the editor when linked to their real name. In such cases, care should be taken to protect real names without preventing the project from protecting itself against vandalism.
  • Not much is written about appropriateness. As I wrote above, the release of personal information or the linking of an identifiable account to an IP or another username might have severe implications for a person. Therefore, it should always be checked whether access to logged user data is appropriate. Such access should be permitted only if the disruption and/or vandalism to be fought is sufficiently large and only if the access to the data will potentially allow action against the disruptor.
  • Meta:Right to leave also deals with privacy considerations. I ask the board to consider increasing the support for users who want to have personally identifiable information removed that they themselves provided in the past.
  • We do have a severe problem with individual users who release private information against the will of its owner. This is not a direct issue of the privacy policy, which focuses on data that is collected by the servers of the Foundation. However, I do see a need for action. As ombudsman, I received several requests from users who had been identified by others (because they had sent e-mail to them, because they were not careful with their IP edits, or because of personal information they had provided on user pages) and needed help. It would be desirable to have an anonymity policy that gives administrators guidelines for dealing with "outings" of personal information by other users.

All in all, I do encourage going ahead with the new draft with the proposed changes, although I do see important issues that still need to be addressed. --Hei ber 01:39, 21 June 2008 (UTC)

New proposition from Mike

Draft Privacy Policy June 19 2008

Translations

Such an important topic, but where are the translations into other languages? Like this, it is impossible! Marcus Cyron 22:14, 22 June 2008 (UTC)

That is right: I can read it, but I do not really comprehend it. Where is the platform to discuss i18n? Talk:Draft Privacy Policy/de.. W!B: 15:23, 27 June 2008 (UTC)