Talk:Privacy policy/Archive

From Meta, a Wikimedia project coordination wiki
Latest comment: 19 years ago by Tobu in topic Database dumps

The IP addresses of logged-in users may occasionally be reviewed from the server logs while investigating cases of vandalism.

Who will investigate these logs? Maintenance people only, or any sysop who asks for them?

So far, mainly me or Jimbo, though excerpts may sometimes get posted publicly. --Brion VIBBER 13:36 Feb 9, 2003 (UTC)
What defines when excerpts are posted publicly? Who can ask for them, and why?

Would it be possible to make a link to the Draft privacy policy page on the page where you can log in or make a new user account? It took me a very long time to find any mention of any privacy information, and I found the draft privacy policy very useful.

Thank you

Hey, someone else who finds privacy policies useful! This draft is looking pretty good to me so far. I don't know how the process works, but I imagine a link to a more polished version of this document will be appearing on all Wikipedia pages soon (?).

Does the term "aggregate statistics" refer to statistics in which all or most personally identifiable information has been stripped? Can someone define this term or elaborate on it?

What user information falls under the GFDL? If I get a SQL database dump, are there email addresses, IP addresses, etc. in the dump? -- EvanProdromou



I've noticed that MeatballWiki gives no IPs for anon users, but (what I presume to be) reverse DNS lookups. Has this been proposed/discussed/rejected here? Martin

The old usemod wikipedias also show the hostname. It would be nice to have an option to select between IP address and hostname. Giskart 18:39 Mar 15, 2003 (UTC)
Except in rare cases (dynamic IPs), IPs and hostnames are equivalent, but hostnames are sometimes considered more privacy-invasive, as they often explicitly specify a person's university, workplace, or local ISP by name in text for all to see, which information would require a separate lookup with an IP. That, and we'd have to do reverse lookups on every visitor in order to obtain the information -- that'll slow things down a little. --Brion
This would be useful information, though, in helping to judge such a user's contribution. For example, if the BBCi article was modified by someone with a bbc.co.uk hostname, one might expect it to be accurate, but potentially biased. If the Java programming language was modified by someone with a university hostname, one might expect a certain, more theoretical, slant. If someone with a French-based hostname posted to US plan to invade Iraq, one might want to check for an anti-US slant - and also copyedit the spelling+grammar of someone who may not be a fluent English speaker. Martin

A few points:

  • Shouldn't IP addresses of viewers be deleted periodically, say once a month? That way it wouldn't matter if any authorities asked for them because they would not be available.
    • They already are.
      • OK. It should be noted along with the frequency. I am assuming that any backups done do not contain the purged information (i.e. purge, then backup). Dori
  • Would it be possible to associate temp IDs with IP addresses for edits made while users are not logged in? This way the public and/or authorities could not exact revenge on, hunt down, or do other evil things to the editors, yet such IPs could still be banned by administrators if need be. These matchings would also need to be flushed periodically.
    • If you don't wish to be identified by other visitors, log in and use a pseudonym instead of your real name or your real network address, and screen your language, writing style, and the domain of things you write about very carefully to avoid tipping people off. If you're writing things that are likely to get people to "exact revenge" upon you, you're either not following NPOV or you're in an unhealthy environment which is a bigger problem for you than anything Wikipedia does. --Brion VIBBER 23:39, 20 Oct 2003 (UTC)
      • I already use my login. It was intended for others (those who don't know any better, don't want to, don't care, etc). By protecting others for their own good we can make the environment safer, which in turn will convince more people to contribute. You do not have to write in POV style to get someone to be angry at you. Sometimes, all it takes is a factual statement. I think we should foster an environment where people are not afraid to speak out as long as they are telling the truth. This may seem theoretical, but it could happen. Dori
  • Shouldn't the IP's of editors that are logged in be deleted periodically? If you have the login name, you may not need the IP address. I could see where this would come in handy for long-term troublemakers, but we must weigh privacy issues more.
    • They aren't stored other than in the webserver log which is already cleared periodically, see above.
      • OK. Again, it should be noted. Perhaps it wasn't clear what exactly is meant by "logs" Dori 01:35, 21 Oct 2003 (UTC)
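
The temp ID idea in the list above could be sketched roughly as follows. This is only an illustration under assumptions: the class name `TempIdIssuer`, the `anon-` prefix and the use of a keyed hash are invented here, not anything the wiki software is known to do. The server keeps a secret key, shows a keyed hash in place of the IP, and throws the key away when the matchings are flushed.

```python
import hashlib
import hmac
import secrets


class TempIdIssuer:
    """Maps IP addresses to pseudonyms that change whenever the key is flushed."""

    def __init__(self) -> None:
        # Secret known only to the server; without it a temp ID cannot be
        # recomputed from a guessed IP address.
        self._key = secrets.token_bytes(32)

    def temp_id(self, ip: str) -> str:
        """Return the public pseudonym for ip under the current key."""
        digest = hmac.new(self._key, ip.encode("ascii"), hashlib.sha256).hexdigest()
        return "anon-" + digest[:12]

    def flush(self) -> None:
        """Discard the key so earlier temp IDs can no longer be tied to an IP."""
        self._key = secrets.token_bytes(32)
```

Between flushes the same IP always gets the same temp ID, so an administrator could still block by temp ID: the server recomputes the ID for each incoming request and compares it with the block list.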

Most people might see this as having to do with the US government and terrorism, but that, while still a valid concern, may not be the worst case scenario. Say someone edits an article having to do with a dictator, mafia member, government official in a way that angers the latter entity. Since the IP's are public, the editor could be tracked down rather easily.

just a few things to consider, Dori 23:08, 20 Oct 2003 (UTC)

I think the draft is in such a state that it could be presented to the public as is (well, almost; we must answer and remove the notes). I think it is better to have this unfinished draft than none at all. Also, this policy would probably have to be presented to all the 'pedias. Dori 04:28, 21 Oct 2003 (UTC)

RDNS lookups can be misleading because there is no requirement that they be factual. ISP-run DNS servers are likely to run fairly true, but there is nothing to prevent someone with a block of 255 or more IPs from running their own RDNS and setting the hostname of an IP to, say, www.drudgereport.com. There is no requirement that it match a forward lookup. Thus, hostnames-only may not provide as effective an audit trail. UninvitedCompany 19:50, 24 Dec 2003 (UTC)
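
The point about forward lookups can be checked mechanically. A minimal sketch, assuming nothing about how the wiki itself resolves names: look up the hostname for an IP, then resolve that hostname forwards and accept it only if the original IP is among the results. The function names are made up for this example; the `socket` calls are from the Python standard library.

```python
import socket
from typing import Callable, List, Optional


def reverse_lookup(ip: str) -> Optional[str]:
    """Return the reverse-DNS hostname for ip, or None if there is none."""
    try:
        return socket.gethostbyaddr(ip)[0]
    except OSError:
        return None


def forward_lookup(hostname: str) -> List[str]:
    """Return the IPv4 addresses that hostname resolves to."""
    try:
        return socket.gethostbyname_ex(hostname)[2]
    except OSError:
        return []


def confirmed_hostname(
    ip: str,
    reverse: Callable[[str], Optional[str]] = reverse_lookup,
    forward: Callable[[str], List[str]] = forward_lookup,
) -> Optional[str]:
    """Return the hostname only if it also resolves back to ip."""
    hostname = reverse(ip)
    if hostname is None:
        return None
    return hostname if ip in forward(hostname) else None
```

A self-assigned reverse record pointing at someone else's domain would fail this check, because the owner of that domain controls the forward records. It costs two lookups per visitor instead of one, which adds to the slowdown mentioned earlier.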



While serious contributors are encouraged to back their contributions with their real names, this is not required; a user account may be created using a pseudonym (but please see Wikipedia:No offensive usernames), or you may continue to edit anonymously (but see note about IP addresses below).

Many of the project's major contributors are pseudonymous, and some choose not to reveal their real name.



You may optionally provide your e-mail address when creating a user account, or update it in your preferences. This is not required, but if you do choose to submit your e-mail address:

  • Other signed-in contributors may send you direct e-mail via your user page and vice-versa; your address will be revealed only to those to whom you send e-mail, and to those who send e-mail to you only if you respond
  • If you lose your password, you can have it reassigned through the login page and a temporary password sent to you.

Users' e-mail addresses are stored only on the main Wikipedia server, accessible only to the site maintainers, and are not included in the publicly available article database dumps.

They will be passed on to the government... upon request under the relevant legislation?

Wikipedia has to comply with the laws of the United States and the state of California, where the server is located. (And possibly the state of Florida, where the Wikimedia Foundation is incorporated, but I'm not sure how that comes into things.) Hypothetically, if ordered by a court of law, records that exist may have to be turned over.

-My guess is Wikimedia foundation might provide information of users when asked by foreign law enforcement officers as well. Police of a country may request Wikimedia to submit some IP address information for reasons like defamation, obscenity, invasion of other's privacy, etc., and Wikipedia might cooperate, I guess. Tomos 03:53, 22 Oct 2003 (UTC)

They will be passed on to third parties... never? They are stored... indefinitely?

They're stored indefinitely unless you log in and change the e-mail address listed in your account, at which point it's gone. The SMTP mail server logs may include references to any mail sent to or from you through Wikipedia, but without any connection to your account name, the subject or the content of the mail sent.

What about the mailing lists?

Like the wiki, the mailing lists are public. The archives are public. Anything you send to them is being published publicly. Don't publish things you'll regret.
The subscriber list for the mailing lists is as far as I know limited to list admins only, though obviously if you send anything to a list the whole world now knows your email address. A number of different people admin the various lists. If you don't like it, subscribe with a throwaway hotmail account.

Can you subsequently remove your email address if you change your mind? If you do, is the old address stored anywhere?

You can change the email address in your wiki preferences at any time, and the old address is not stored in association with your account. I'm not sure how the mailing list stores its subscriber lists, so I don't know if an address would stick anywhere (besides the mail server logs) in connection with a mailing list after being unsubscribed. Check with the GNU Mailman people for info.

IP addresses


Depending on how you are connected to the Internet, your IP address may generally identify your Internet service provider, or uniquely identify the computer from which you are connecting.

The IP addresses of anonymous contributors are permanently recorded in the publicly viewable page histories. If you do not wish to publicly reveal your IP address, you should create a user account; your user name (which may be a pseudonym) will then be recorded in place of the IP address.

Internal server logs include the IP address of every viewer; this information is used for aggregate statistics.

The IP addresses of logged-in users may occasionally be reviewed from the server logs while investigating cases of vandalism. Excerpts may be published... when? where? by whom? What about other reasons to review logs - eg bug-tracking?

IP addresses will likely be seen by developers during bug hunting if this involves looking at the logs to get information on diagnosing the bug, but will not be published. Publishing of IP addresses may happen when tracking vandalism, as blocking a logged-in vandal requires also blocking the IP address to prevent the same person from simply creating another account immediately, and the IP block list is public for accountability reasons.

IP addresses may become more widely viewable in the future, pending discussions.

These logs are stored... indefinitely??

Logs are rotated daily; the archives are cleared out every couple weeks (deletion schedule not automated yet).
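
Since the deletion schedule is described as not automated yet, here is a rough sketch of what an automated purge might look like. The directory layout, the function name and the 14-day figure (read from "every couple weeks") are assumptions for illustration, not a description of the actual server setup.

```python
import time
from pathlib import Path
from typing import List, Optional

MAX_AGE_DAYS = 14  # assumed reading of "every couple weeks"


def purge_old_logs(
    archive_dir: Path,
    max_age_days: int = MAX_AGE_DAYS,
    now: Optional[float] = None,
) -> List[Path]:
    """Delete regular files in archive_dir older than max_age_days.

    Returns the list of paths that were removed.
    """
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400
    removed = []
    for path in sorted(archive_dir.iterdir()):
        # Compare the last-modified time of each archived log with the cutoff.
        if path.is_file() and path.stat().st_mtime < cutoff:
            path.unlink()
            removed.append(path)
    return removed
```

Run daily from a scheduler after the rotation step, something like this would make the retention period a fixed, documentable number.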

The logs will be passed on to the US government... upon request under the appropriate legislation? Which means what in practice?

In practice, this means never. Hypothetically law enforcement could serve a warrant asking for log data relating to some investigation, and we'd probably have to comply to the extent that the requested information exists (eg Patriot Act).

Other personal information


Server logs may include the operating system and browser version that viewers use; this information is used for aggregate statistics.

If you or someone else adds personal information to a Wikipedia page, such as your user page, it will be stored indefinitely, even if you subsequently edit it to remove it. Do not publish information that you don't want published! Wikipedia is not a private chat room.

If Wikipedia passes into the control of a third party...


... then we have no control over what they may choose to do with IP addresses and/or emails.

True enough of all websites.
But it never hurts to say it!



Seems relevant here...

Cookies are required to log in. If you choose 'remember my password', a cookie will be stored with a hash of your password. This may be a bad idea if your computer isn't very much yours and you're paranoid.
The main functioning of the login system uses a session cookie which expires at the end of your browser session.
Also set on login are cookies storing your user id number and name, which are used to fill in the last-used name in the login box when you next visit. If you don't like this, clear your cookies after logging out. These cookies last 30 days IIRC.
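
To make the difference between the session cookie and the 30-day cookies concrete, here is a small sketch using Python's standard `http.cookies` module. The cookie names are invented for the example and are not the names the wiki software uses; the "remember my password" cookie is left out.

```python
from http.cookies import SimpleCookie

THIRTY_DAYS = 30 * 24 * 60 * 60  # lifetime quoted from memory in the reply above


def login_cookies(session_id: str, user_id: int, user_name: str) -> SimpleCookie:
    """Build the cookies set at login (hypothetical names)."""
    cookies = SimpleCookie()
    # No Max-Age or Expires attribute: the browser drops this cookie
    # when the browser session ends.
    cookies["wiki_session"] = session_id
    # Persistent cookies, used only to pre-fill the login box next time.
    cookies["wiki_user_id"] = str(user_id)
    cookies["wiki_user_name"] = user_name
    for name in ("wiki_user_id", "wiki_user_name"):
        cookies[name]["max-age"] = THIRTY_DAYS
    return cookies
```

Calling `login_cookies("abc123", 42, "Dori").output()` produces the corresponding `Set-Cookie` header lines; only the two user cookies carry a `Max-Age` attribute.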


This draft does not seem to be getting much more attention. Maybe the notes should be removed and the draft linked from the main page (better this version than none, plus it would get more attention from editors). Maybe I am the only one who feels that Wikipedia should have a privacy policy, so I'll shut up after this. Dori 00:35, 16 Nov 2003 (UTC)

Wikitravel privacy policy


The Wikitravel privacy policy might be worth comparing and contrasting. --Evan 22:21, 17 Dec 2003 (UTC)

User, and user_talk pages


This has been discussed elsewhere, but I think it deserves a mention in the privacy policy.

Personally I think a logged in user should have the right to control the content in their own page. I've seen instances of people reverting blanking of another user's (a "bad" user) talk page, and there are talks at personal subpages saying that they should not be deleted.

The user pages are not part of the encyclopedia, they should be deleted upon request. Keeping them viewable by everyone against the user's will is, in my opinion, a misuse of the GFDL.

tristanb (not logged in) 00:27, 21 Dec 2003 (UTC)

I disagree. User talk pages are there to support the development of the encyclopedia, and as such include information that is relevant to particular articles. Perhaps that should have gone on the article talk page, but often it doesn't, and the talk pages provide a very useful history of how particular articles and issues were developed. The user talk page is not supposed to be something private. If you want a private discussion with someone, you can do that by e-mail, so I see no reason why these pages should be made part of the privacy policy. The same might not apply to user pages. Angela 01:28, 21 Dec 2003 (UTC) (see below)

I strongly agree with tristanb, to the point that I'm considering using an off-Wikipedia wiki to post my replies to talk comments, then pointing people there, avoiding releasing simple discussions under the GFDL. If something contains text intended for the encyclopedia, of course, I would deliberately place that in an article-related area, rather than a personal area. I'm here to make an encyclopedia, not to have simple workplace discussions recorded forever by my "employer" here. Jamesday 09:01, 21 Dec 2003 (UTC)

Often the concerns relating to user pages et al are of people seeking to continue to contribute to Wikipedia, while trying to remove criticism from their user talk page (etc). Insofar as Wikipedia is in some sense a deliberative democracy, stifling criticism can have some side effects that make people rightly cautious. However, where people have decided to leave Wikipedia, I agree with JamesDay and Tristan that it makes sense to grant the right to vanish, and such users should have pretty much free rein. The only exception is where someone has been banned, where we want to have a record of why we banned them, and how long the ban is for.
I don't think you would have to remove the comments completely though in order for someone to vanish. This could be done through a name change. Also, agreeing to delete a user talk page doesn't really solve anything if comments they would rather vanish from also appear on article talk pages, which is quite likely to be the case. Article talk pages are obviously not going to be deleted, so there needs to be a solution that can apply to both these and to user talk pages. I can't see any strong reason to treat these differently. I'm also not sure you can state different privacy rules for banned users. It's possible that they might be the ones most wanting to hide their past on Wikipedia after they are made to leave. Angela 23:07, 29 Dec 2003 (PST) (see below)

My thoughts on this have changed now following an experience on another wiki where I did leave and requested my talk page be deleted. A talk page and user page is something more personal than what you write on article pages. User and article talk pages already follow different rules. For example, a user is, in nearly all cases, allowed to refactor and delete comments on their own user/talk page in a way that would not be regarded as acceptable on article talk pages. Therefore, it makes sense for those differences to apply to deletion of the pages as well. People are more attached to their pages than to their comments on article pages, and I think it is this level of attachment that would cause someone to feel uncomfortable about leaving an undeleted user page behind when they exercise their right to leave. It doesn't solve the problem of not vanishing from article talk, but if the user feels separated from these in a way they don't from their own pages, then there is reason to treat the pages differently. Deletion of your user/talk page may also be a way of psychologically breaking away from a wiki, which has a stronger effect than just walking away. Perhaps when people leave they need this as some sort of final statement that they have left, and not only that, but a statement that they no longer wish to be associated with it at all. The history of user talk pages can be fascinating and offer huge insights into the working of the wiki, but this isn't what they are there for. The aim is to build an encyclopædia, not to provide insights into how the community works or to document how individuals played a part in that. So, I now feel that the privacy policy should state that a user/talk page will be deleted on request after someone leaves. Angela 14:50, 2 Jan 2004 (UTC)

Having seen how differently I write on IRC you might also consider whether the user talk pages should be crawlable by search engines. As you've seen, people can act very differently when everything they say is being recorded compared to how they are when that is not the case. Jamesday 20:11, 25 Jan 2004 (PST)

Agreed: I think article talk, user pages and user talk should be excluded by robots.txt as well as database dumps (there is a license requirement as well tackled below, the point is not to facilitate automated processing). --Tobu 20:11, 29 May 2005 (UTC)
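
For what the robots.txt exclusion might look like, here is a sketch that writes out some rules and checks them with Python's standard `urllib.robotparser`. The URL paths are hypothetical, since the real URL layout is not given in this thread, and the rules are deliberately the simplest possible.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical paths: the wiki's real URL scheme may differ.
ROBOTS_TXT = """\
User-agent: *
Disallow: /wiki/Talk:
Disallow: /wiki/User:
Disallow: /wiki/User_talk:
"""


def crawlable(url_path: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Return True if a generic crawler may fetch url_path under robots_txt."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch("*", url_path)
```

With these rules `crawlable("/wiki/Main_Page")` is true while `crawlable("/wiki/User_talk:Example")` is false. robots.txt is only a request to well-behaved crawlers; it does nothing against mirrors that fetch pages directly or read database dumps.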

Database dumps


I too am concerned with the retention of personal data. Wiki editing generates lots of personal information, including bickering, personal trivia and such. Keeping it forever in an indexed database means it can be searched, giving an in-depth knowledge of someone's interests and opinions (even from normal edits by well-behaved people). Having a serial identity, which is common when first signing up, makes this worse since google often connects this with a name. The solution of renaming a user can help, but it means leaving the wiki, isn't automated, and isn't used without a special reason.

The current data is accessible via a search engine, but it isn't too bad since these keep to the current text - a removed comment will be forgotten, and engines don't understand the history.

Database dumps are worse, and I think they should be tackled by the privacy policy. The options I see are removing talk pages; anonymizing user names from history data (sigs (non-template) can't be removed easily however); making dumps of only the current version (could be already the case).

--Tobu 20:11, 29 May 2005 (UTC)

Personal pages


The purpose of the project is to produce an encyclopedia. To facilitate this, user and user talk pages are provided, in a different namespace from the encyclopedia. Since those pages are not part of the encyclopedia, the Wikipedia will use whatever technical means are reasonably convenient to inhibit to whatever degree reasonably convenient the wide disclosure and searchability of those pages. Except to the extent that they contain text clearly intended for a Wikipedia article, these personal pages are not released under the GFDL but are instead released solely for internal use within the project.

I've added a further section based on discussion here and over at Wikinfo, where one Wikinfo technical person indicated that they were not happy to have users prevent the display of their user pages there, even though that has been requested and so far has been accepted by them. Since it doesn't seem to inhibit our ability to build an encyclopedia, I've changed user and user talk (but not article talk!) pages from GFDL release to non-GFDL, for use here only, leaving under the GFDL only items intended for the encyclopedia. This will let us better assist our contributors if there's a desire to remove their personal information from mirrors, which currently could claim a GFDL right to distribute information we've removed. I'm not envisioning any immediate or rapid technical change - I'm aware that databases are combined and a variety of other technical issues mean that it is currently convenient to distribute everything as a package, and that multiple licensing is currently most conveniently done via user pages. I'm also aware that we use Google as a fallback search engine and that limiting it would be problematic, so I do not propose any immediate change to search engine crawling while we need this capability. This is mainly to eliminate the "you can't stop us" argument which seems to make some of our contributors unhappy. With regrets to other sites, I want happy Wikipedia contributors, not those who don't feel free to discuss freely because of fear that their discussion will be mirrored and searchable forever, everywhere. Jamesday 12:42, 16 Feb 2004 (UTC)

I've moved the above text to the talk page since it's most certainly not the present policy (though whether it should be is open for discussion). --Brion VIBBER 15:36, 16 Feb 2004 (UTC)
Thanks - I'd forgotten that header saying current. Jamesday 14:14, 25 Feb 2004 (UTC)
I would support having User and User talk: pages not be under the GFDL, but I don't know what the ramifications would be. Would everyone have to write new pages (the old ones are already under the GFDL)? How would this affect being able to hold temporary articles on user pages while they're being worked on? What about copying and pasting comments between other namespaces? There are many issues that would need to be resolved first. Dori | Talk 15:53, 16 Feb 2004 (UTC)
I guess that's a good point. For a different reason, there is a discussion at Japanese wikipedia regarding introducing a second license (called something like in-site public domain license). It permits copying, modification and translation of any posted contents within wikipedia (of any language) and other projects. That kind of solution may work to an extent, maybe?
In addition, such a re-licensing would take agreements from the copyright owners, I suppose. Still, the past versions are released under the GFDL already, and that cannot be revoked, I suppose.
If the purpose is to prevent others from copying those pages based on GFDL terms, maybe it is easier to remove these pages from the database dump. Tomos 23:35, 16 Feb 2004 (UTC)
One issue which caused me to make this change is wikinfo, which takes them from the site when the page is requested, rather than from a database dump. Web crawlers (except Google, which we use as a backup search engine) also really need to be blocked, just so things like the internet archive don't save them forever. I like the sound of that Japanese move. Please let me (or all here) know what happens with that idea - I like it very much. Jamesday 14:14, 25 Feb 2004 (UTC)
Within the Wikipedia and except for text intended for articles seems to cover the moving things around needs, since it allows those things between namespaces. If you don't think it does, please suggest clarifications. I don't think that the old pages are under the GFDL - the edit page the last time I looked specified that items for the Wikipedia are under the Wikipedia license and the Wikipedia is clearly defined as the free encyclopedia, which personal pages aren't part of. However, it's arguable enough (and was argued at wikinfo) that clearly saying they aren't is worth doing, which is why I added this paragraph to make it completely clear that they aren't under the GFDL, so we can speak more freely.
One advantage for this split is that it makes it much easier to argue that source material we may discuss is not intended for republication. I'd like some way to do that for article talk as well, but I'm not sure that I want to go so far as suggesting that article talk pages should also be clearly not under the GFDL, hence not for publication. I don't actually think that article talk needs to be GFDL either but it's got a much better case than user or user talk pages. Views on whether article talk pages really do need to be GFDL or whether saying that text for the wikipedia can be placed in them on the way to going into an article is sufficient are welcome.
Barring objections and in a week or two (on the usual slow is good schedule:)) I'll put this back into the proposal and indicate that the proposal isn't intended to be the current practice only but is intended to be future practice (and I'll also include a note requesting that possible changes be clearly indicated, so they can be discussed). Jamesday 14:14, 25 Feb 2004 (UTC)
I'm not sure what I think about this. The logic of this idea seems reasonable, but I am uneasy with the idea of user and user_talk pages being different in terms of licensing. I have some vague sense that there would be some unintended consequences about this -- can I stop someone from releasing what they write on my talk page? Can they alter my comments on their page because they own it more than other pages? Etcetera, etcetera, etcetera... -- BCorr|Брайен 01:28, 6 Apr 2004 (UTC)

Filtering out web bugs and viruses in forwarded email


Should we try to prevent people sending web bugs in email that we forward? E.g. by requiring only plain text, or safe html or something, like some mailing lists do? Or is HTML email important for some correspondents? I'd prefer to only get plain-text mail without potentially dangerous or virus-infested attachments or web bugs. But of course there is some development effort. The GPL'd Mailman software can do this. Nealmcb 18:11, 14 May 2004 (UTC)

Are you referring to the mailing lists (which run on GNU mailman), or to the 'Email this user' form in the wiki? In the case of the former, it should be set to strip HTML mail as it is. If you see a list misconfigured, please say which. For the latter, it should only be possible to send plaintext. If you can show otherwise, this is a bug which should be fixed immediately. Please let us know. --Brion VIBBER 22:02, 14 May 2004 (UTC)

The idea came up on IRC last night that it might be a good thing to ask the developers to set up a way for people to send private messages to each other directly through the wiki software as an alternative to communication through talk pages. The idea is that it is a way to increase communication, reduce the level of public conflict, keep conversations from polarizing quickly, and allow more frank discussions, etc. One concern raised about this is that there is a certain "check" involved in discussions being searchable and archived, i.e. people should feel more accountable for what they say.

Opinions? Thoughts? Alternatives? -- BCorr|Брайен 14:25, 29 May 2004 (UTC)

Is this going to offer any advantage over email? I also wonder if it might encourage people to use our wikis as chat rooms if there is no check on whether what people are writing is wikimedia-relevant. Angela 21:10, 31 May 2004 (UTC)
I don't see the point. Everything on the wiki itself should be open; if you want to keep it private, take it outside (i.e. e-mail). Dori | Talk 04:07, 1 Jun 2004 (UTC)
For an opposing viewpoint, see MeatBall:GetARoom. Wikis need not be entirely open, and private conversation has benefits to the community. UninvitedCompany 21:32, 15 Jun 2004 (UTC)
I generally dislike things going to email. I much prefer private messages on the site concerned when that feature is available, in part because it keeps all things related to the site neatly in one place and in part because I try to keep mail only for personal and high priority (my priority) things. If someone started to routinely use the email link instead of my talk page and declined to stop on request, I might well add them to my spam filter to prevent it from continuing. Jamesday 08:28, 14 Jul 2004 (UTC)

Multiple accounts and Password protection


The discussion arose from a concern about a recent leak of shared passwords that is documented at Wikipedia:User talk:Tim Starling/Password matches and at Wikipedia:Votes for deletion/User:Tim Starling/Password matches. In that case, a well-intentioned admin created a list of probable sock puppets by searching for matching passwords. My concern (and that of more than a few other users) is that an innocent person who happened to choose the same (probably weak) password will get swept up in the accusation and that, further, by releasing the list their password will be exposed to the real vandal who will now be able to hijack the account.

Some consider this a remote possibility, but I worry that it is a real threat. It might go something like this:

  • User A is a vandal. User A creates sock puppet identities B through G, all using the same password so he can keep them straight.
  • User H happens to choose the same password.
  • Admin Z matches passwords and publishes a list of A-H as probable sock puppets.
  • User A, being a vandal and probably knowing that he is being targeted, looks for and finds Admin Z's page. User A recognizes H as a new ID - not a sock puppet he created.
  • User A knows that H's password must be the same as his. A can now sign in as H, commit vandalism that will be attributed to H and even change H's password thereby reserving a new account with a respected edit history to his own illicit use.
  • User H can no longer even sign in to his own account to complain about the theft of his identity.
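
The scenario above can be reproduced in a few lines. This sketch assumes the stored values are unsalted digests of the password, which is what makes matching possible at all; how the real matching was done is not stated here, and the account names and passwords below are invented.

```python
import hashlib
from collections import defaultdict
from typing import Dict, List


def digest(password: str) -> str:
    """Unsalted digest: equal passwords always give equal stored values."""
    return hashlib.sha256(password.encode("utf-8")).hexdigest()


def accounts_sharing_password(stored: Dict[str, str]) -> List[List[str]]:
    """Group account names whose stored digests are identical."""
    groups = defaultdict(list)
    for account, value in stored.items():
        groups[value].append(account)
    return [sorted(names) for names in groups.values() if len(names) > 1]


# Invented data: A, B and C belong to one vandal, H is an unrelated user
# who happened to pick the same password, Q is a bystander.
stored = {
    "A": digest("hunter2"),
    "B": digest("hunter2"),
    "C": digest("hunter2"),
    "H": digest("hunter2"),
    "Q": digest("correct horse"),
}
matches = accounts_sharing_password(stored)
```

Here `matches` comes out as one group containing A, B, C and H. The matching cannot tell H apart from the sock puppets, and anyone who knows the password of one account in a published group knows the password of all of them.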

I support the hunt for vandals. I agree with the exposure of sock puppets for what they are. I can even support the use of password matching as one tactic to support the accusation of sock puppetry. However, I think that publication of the list with the statement that it was based on password matching creates an unacceptable risk that a valid user's password will be exposed. Passwords deserve extraordinary protection.

I added the section on Multiple Accounts (sock puppets) as a lead in to the section on password disclosures. If there is a better place for either of these clauses, please move them. Rossami 16:51, 13 Jul 2004 (UTC)

I wholeheartedly agree with Rossami's concerns on this issue. Because of the potential security risk to innocent users, this blatant disregard for privacy is the sort of thing that will very quickly lose me (and many others, I would imagine) as contributors to the Wiki projects if it is not addressed in a timely fashion.

Though it has been stated that the "ends justify the means" in this instance, I do not believe that vandalism warrants this breach of privacy. Vandalism can already be quickly detected and easily fixed by any Wiki contributor so long as people remain diligent in patrolling the recent changes lists.

Sockpuppetry is, of course, also a concern in matters unrelated to vandalism, such as voting. In such cases, password matching is a powerful and useful tool available to the admins, but I feel it is unnecessary for the results of such matching to be made public. It is just as effective for admins to patrol the more crucial polls amongst the Wiki pages (polls concerning policy changes and requests for adminship particularly) and run password matching and/or check IP logs should they suspect that a user and his/her sockpuppets are slanting the poll.

In summary, revealing password matches is unnecessary for the management of vandals, for vandalism (whether by sockpuppets, single users, or anonymous users) can already be readily monitored by the recent changes pages. In addition, serial vandals can be monitored and their actions corrected by the vandalism in progress page. In cases of other sockpuppetry issues, I welcome admins to use password matching in conjunction with IP logs, observation of editing style, and other evidence to build up a case against sockpuppet accounts. However, such information need not and should not become public until such a time as admins have satisfactorily established that the accounts in question are vandalous or otherwise harmful sockpuppet accounts and have been banned by necessity.

Anyone wishing to discuss the points I have made here is welcome to do so either here or on my talk page. Spectatrix 21:59, 13 Jul 2004 (UTC)

Tim didn't list all of the examples. He hand selected those where a number of accounts with the same password appeared to have been problematic. 7044 passwords on en have more than one account using them (one is one of mine) and one password has 823 users on en. Admins cannot check password matches or IP addresses - only developers can do it. Identifying sock puppets and trolls quickly has been a major and growing concern because, unlike vandals, they can waste significant amounts of the time of productive contributors. It's likely that the community will accept more measures like this one to assist with the task. Jamesday 09:43, 14 Jul 2004 (UTC)

While Jamesday has good points, I think that publicly listing any password information is likely to cause more harm than good. Perhaps only admins should have any access to it, but then that creates a power differential and attendant issues. I still don't believe in trolls - what does "troll" even mean in a Wikipedia context anyway? Jeeves 01:43, 17 Jul 2004 (UTC)
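
For readers unfamiliar with the mechanics discussed in this thread, here is a minimal sketch of how password matching could work. It assumes that accounts are stored with unsalted password hashes, so that equal passwords produce equal digests; the thread does not describe the actual storage scheme, and the account names, the hash choice, and the helper `group_by_password_hash` are all hypothetical.

```python
import hashlib
from collections import defaultdict


def hash_password(password: str) -> str:
    """Unsalted hash (assumption for illustration): equal passwords give equal digests."""
    return hashlib.sha256(password.encode("utf-8")).hexdigest()


def group_by_password_hash(accounts):
    """Return only those digests that are shared by more than one account."""
    groups = defaultdict(list)
    for name, digest in accounts.items():
        groups[digest].append(name)
    return {d: sorted(names) for d, names in groups.items() if len(names) > 1}


# Hypothetical accounts: UserA and UserH happen to use the same password.
accounts = {
    "UserA": hash_password("hunter2"),
    "UserH": hash_password("hunter2"),
    "UserZ": hash_password("correct horse"),
}

shared = group_by_password_hash(accounts)
for names in shared.values():
    print("same password:", ", ".join(names))
```

Publishing a group such as the one printed here is exactly the risk Rossami describes: every member of the group learns that their own password also opens the other accounts in it.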

Wikimedia and child protection


A few days ago I posted a query on the Village Pump of Wikipedia asking about the ages of Wikipedians, sparked off by reading about an 82-year-old on a talk page. In the response there was discussion about a 7-year-old who regularly takes part. Various Wikipedians had been in touch with him on his talk page putting him right where he went wrong - as we all do of course! Somebody asked about his parents' involvement. There are evidently a number of young teenagers as well. This discussion has now been removed, by the way, as part of Angela's latest weeding.

In view of the events of the past few years with on-line chat rooms being abused by paedophiles, I pose the question of how Wikimedia and the Foundation stand on this. Does the way we have set things up, allowing people to list their ages on Wikipedia and allowing a direct email link between editors, make us vulnerable? The Talk page link seems fine, since it can be read by anyone, but the direct email link is private and potentially open to abuse, and I am uneasy that we are not following best practice here. I would hate us to be hammered for being sloppy - the press could well have a field day.

If we get more school-age participants, and I hope we do, not only for research but also for helping, this is likely to be a growing problem - there was a link put up recently on the home page addressed to educationalists on how to make use of Wikipedia. Since Wikipedia will interest intellectually-inclined youngsters whom we need to be fostering, I urge that we address this with some urgency and see how we might be more secure. Any thoughts? Apwoolrich 18:52, 4 Sep 2004 (UTC)

Do you have any recommendations? --Brion VIBBER 21:50, 5 Sep 2004 (UTC)
Look at safe-surfing practice and see where we might be out of line; think about whether we really need a page where Wikipedian ages can be posted; add something to the 'Welcome/beginners' pages aimed at youngsters and point out safe-surfing practice. It might be superficially attractive to have a kind of well-monitored 'Junior Wikipedia' to which their activity might be restricted, but I feel this would be counterproductive. So far as I am aware we have not had complaints from the 'moral majority' out there about the suitability for youngsters of some of the content of Wikipedia on, e.g., pornography, but perhaps that will come.
My children are both adult and have families of their own, so I have never been faced with supervising web use by a youngster. I am aware of the danger of the possibility of hysteria in this: in the UK organisations like schools and churches have instituted child protection rules which have the effect of making the administration of organisations difficult - eg a parent not being allowed to dole out cups of orange squash at a school fete until they have been Police vetted; a proposal that all Anglican church bell ringers must be police vetted before they can ring. Apwoolrich 07:27, 6 Sep 2004 (UTC)
"A parent not being allowed to dole out cups of orange squash at a school fete until they have been Police vetted" sounds a little extreme! But anyway, posting your age (I assume you are referring to Wikimedians by age) is voluntary, and most people wouldn't find that page anyway. Of course, that doesn't stop someone from revealing their age, so nothing much can be done. Same with email: it is optional, but children could still fill it in (and nowadays they probably all have email addresses). Perhaps it would be necessary to remove the email feature. As for other suggestions, perhaps it could be possible to include parental controls which could be activated for a particular account: these would allow for the blocking of the email address, control over any content-blocking features (as proposed elsewhere to allow users to block potentially offensive images, text, or articles), etc. A youngsters' beginners page sounds like a good idea also. Are there any professional or government guidelines or resources available regarding online child protection? TPK 07:30, 8 Sep 2004 (UTC)



I'd like this draft to be finalised for translation fairly soon and have at least the English version ready to go live by the end of the year. Some thoughts on what needs to happen before then:

  • Needs to be linked from every page
  • Needs to be translated (translations could be "unofficial" in the same way as the GFDL)
  • Should be protected? (have one official protected one on the Foundation site and others unprotected maybe?)
  • "Right to Vanish" section - do we actually give this as a right? Since developers choose whether or not to respond to username changes, perhaps it needs to be clear that there is no guarantee this will be done. It's a nice option, but not something we should be offering on any official level. (Now moved to another page)
  • E-mail: Is this really never going to be used for anything other than user-to-user communication? There has been talk of sending out emails to all users. Can we state this will never happen? (Now clarified)
  • User contributions. Every one of these and the time it was made is stored. Should this be mentioned? Are the graphs of edit counts over time a problem/ever likely to be a problem? (Now added a "User data" section)
  • I'm very unsure about "The Wikipedia will delete personal information about contributors (most likely on user and user talk pages) at their request". I don't think this is official, and whether it happens is largely up to the admins of the wiki concerned. If we can't guarantee this will happen, it shouldn't be mentioned here. If it does stay in, it needs to mention that we have no control over the mirrors which might still contain that information. (Removed this section now)
  • The "Multiple accounts" section seems misplaced. Does this need to be part of the privacy policy? It seems a separate issue to me. (Removed this section now)
  • The following points still need to be added or clarified:
    • Needs clarification on how often logs are deleted
    • Search is not currently mentioned. Will the terms people search for ever be shared? Will that only be available as aggregated data?
  • Two very clear policies are Wikitravel and Google. It might be worth comparing these to Wikimedia's policy and seeing if it can be made clearer.

Angela 08:02, 7 Nov 2004 (UTC)/ 02:51, 13 Nov 2004 (UTC)
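
On the question above of whether search terms would only be available as aggregated data, the following is a minimal sketch of what such aggregation could look like. It is an illustration under stated assumptions, not a description of any existing Wikimedia practice: the log format (pairs of user or IP and search term), the helper `aggregate_search_terms`, and the `min_count` threshold are all hypothetical.

```python
from collections import Counter


def aggregate_search_terms(log_rows, min_count=2):
    """Count search terms while discarding who searched for them.

    Terms seen fewer than `min_count` times are suppressed, since a rare
    query (a surname, say) can itself identify the person who typed it.
    """
    counts = Counter(term.strip().lower() for _user, term in log_rows)
    return {term: n for term, n in counts.items() if n >= min_count}


# Hypothetical raw log rows: (user or IP, search term).
rows = [
    ("203.0.113.5", "Privacy policy"),
    ("UserA", "privacy policy"),
    ("UserH", "my own rare surname"),
]

report = aggregate_search_terms(rows)
print(report)
```

The resulting report contains no user names or IP addresses, and the single rare query is dropped; whether a threshold like this is enough is a policy decision the sketch does not settle.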

Well, I am for having unofficial translations. Every local project links the document in their own language if possible, and the translated document says it is unofficial and provides a link to the original (English?) policy.
As for the right to vanish, I doubt whether it is good to have it as a right. Developers are volunteers, so how can we say there is a right to make demands of them? If vanishing is a right, it would bring something mandatory into the developers' tasks. Can we assure users who want to use the right to vanish that their vanishing would be carried out by those volunteer developers? (Although I don't doubt they are willing to help us.)
"The Wikipedia will delete personal information about contributors (most likely on user and user talk pages) at their request". This sounds like a guarantee; in my opinion we could only say "most Wikimedia projects will delete personal information about contributors according to their own criteria". I am afraid of cases where the requesting person and a project have different opinions about what personal information is, and cases where such a request takes time to be carried out. In my view we could only say that most projects would consider such requests individually. Sorry for my random thoughts.--Aphaia 06:23, 13 Nov 2004 (UTC)

If this is going to be that much of an official policy document, it should address the following points:

  • The policy may be changed without prior notice at the discretion of the board. (Or something else that addresses how the policy could be changed).
  • Wikimedians and other site visitors are considered to have agreed with the policy after the change. (Or something else regarding the retroactive nature of the policy).
  • If your personal information obtained through the WMF's servers is treated by others in violation of this policy, you can appeal the treatment to the board by sending email to ... (Or something else regarding the enforcement issue).
  • WMF has no plan to start selling or sharing your information with third parties. Commercialism is not something unanimously supported by the WMF projects' members. If this happens, it will be only after a community-wide discussion.
  • Whether and to what extent private information is "protected," and whether there is any possibility that information could be "stolen."

Also, I think consulting especially with some developers is a good idea since they are in the position to get some information from the server log.

If this is not a "contract-like promise of what we do" but more of an "explanation of what are the existing practices," then maybe we do not need some of the above stuff. Tomos 20:40, 13 Nov 2004 (UTC)

Release of data by developers


The policy used to say that data will be released:

To all curious/interested parties, sometimes except the user concerned, when it is believed that an IP address may be associated with a banned or otherwise abusive contributor, as part of the process of trying to determine whether this is in fact the case. (because this has been done in the past - not sure if it will be in the future...)

I don't think we should release the IP address of a user on suspicion that it may be associated with a banned or otherwise abusive contributor. We should release the IP addresses associated with the vandalism or trolling itself. We should also be able to release any private data associated with a username which has been persistently used for vandalism or trolling.

Someone said:

The last motive for IP display certainly needs more description. There is a recurrent concern that this is illegal and not acceptable. I would support having a clearer policy on this.

I'd like to know exactly what law it is breaking, if any. I'm not being sarcastic. As far as I'm concerned, if someone vandalises Wikipedia, I'll deal with them to the best of my ability, within the limits of applicable law. I'd also argue against any voluntary policy which prevents me from doing so. I'll respect the privacy of users who contribute in good faith, but I'm not going to spare any strategy when dealing with people who attempt to damage the site. So far the difference between good and bad users has been left to my discretion, and the discretion of any other developer who wishes to get involved in this stuff. I don't think a privacy policy should be the place to change that, it should be an internal matter. -- Tim Starling 01:53, 17 Nov 2004 (UTC)



[Policy regarding log retention]:

It is the policy of Wikimedia that server log information, including any backup copies, is destroyed within one year of its creation. Will this be retained longer for abuse/security issues or at the request of law enforcement?
More information is needed here, in particular with regard to log publishing. Even though IRC is not officially part of Wikimedia, the question of the logs is regularly mentioned, and IMHO should be included in that document, if only to inform people.

[comment under 'Deletion of content']:

or, possibly, in cases of harassment, which may be by a sysop toward another sysop?
this comment is unclear to me. What did you want to say exactly ? Anthere


What happens to the data about what users are searching for?

I have removed the above unclear points and comments. Angela 23:27, 25 Jan 2005 (UTC)