Talk:Controversial content/Brainstorming/personal private filters

Question

Thanks, an interesting read. One question: Given that the Wikimedia software on the Wikimedia site would have to recognise the user's filter preferences somehow, do you think that a censorious country or ISP could use the same lists to implement a solution that would block access to all these images permanently, for all their users? --JN466 22:27, 11 October 2011 (UTC)

Glad you liked it. The short answer is I'm pretty sure they couldn't easily do that, and what they could get would have some pretty odd results. The actual filters would not be publicly accessible. However, a potential censor would be able to create an account and then see what message came up with each image, but if they started reading hundreds of thousands of images a day then we'd be entitled to block them as a denial of service attack. If they were clever they'd also click on a few things to set their preferences - otherwise they'd risk insulting their own bosses. But I wouldn't expect a censor to be trying to use our filtering system - if anyone wanted to use Wikimedia resources for censorship there are freely available lists such as various Commons categories and en:MediaWiki:Bad_image_list (Not safe for work). WereSpielChequers 23:22, 11 October 2011 (UTC)

Well, assume I set up a user account with a maximally wide-ranging filter. That user preference would be stored in a cookie (or something similar) on my computer, wouldn't it? It would create a particular personality, identified by my cookie data. Couldn't that be exploited, by assigning the same type of personality to all users of a particular ISP? A second step a censor might then undertake would be to disable the filter override. If my assumptions are wrong here, what exactly would be stored where, and why would it not be accessible? --JN466 23:49, 11 October 2011 (UTC)

The user preference would be stored on the WMF servers and only accessible by yourself, a developer or someone who knew your password. In that sense it would be similar to your watchlist: whether you log in at home, at work or in an Internet cafe, when you log in you can access your watchlist, but once you log off other users of the same PC can't access it, and if you are on a multiuser system then different people can log in at the same time without sharing or disclosing user preferences. By contrast, on both the PCs I use the operating system prompts me with possible edit summaries, and offers the same ones whether or not I'm logged in. But my two PCs clearly don't talk to each other, as the summaries offered are different. WereSpielChequers 07:09, 12 October 2011 (UTC)

New images

Shouldn't there be a "Show all unfiltered images so I can help identify offensive images" option? Without it, new images would be entirely dependent on "Only hide images that I've chosen not to see again"-choosers to do all the filtering. I suspect there might not be very many of those. Thparkth 13:37, 12 October 2011 (UTC)

I think that "Show all unfiltered images so I can help identify offensive images" would be a way to market the option to only hide images that you've chosen to filter.

I have no idea what proportion of our readership would use this system or which option they'd choose, except that a lot of editors with slow connections would jump for the first option, though that isn't really a censorship issue. You may be right about there being a shortage of active filterers, though personally I expect we will find there are plenty of people out there who have strong opinions as to what is suitable for others to see. However, I've added a new option which should increase the flow of information. WereSpielChequers 22:15, 12 October 2011 (UTC)

User initial preferences

When a new user starts to use the filter, they will not have any preferences saved and it will not be possible to accurately match other users with similar preferences. Yet this is likely to be the most popular option filter users would select.

I can see two possible ways to deal with that, neither of them particularly satisfactory.

1. Since the user has expressed no opinions, the algorithm uses the average of the entire filter user base as a starting point. This may satisfy the user, but it is more likely to confuse and frustrate them - "the filter is broken! I'm still seeing pictures of breasts in medical articles about breast cancer!" Of course, at least some users can be expected to actually express opinions about images, but I suspect there would be an intrinsic bias at work - users are far more likely to "thumbs down" an unfiltered image that they do not like, than to "thumbs up" a filtered image that they do like (because they would not see it!). Thus over time the "average user" starting point becomes more and more censorious.

2. Since the user has expressed no opinions, the algorithm does no filtering until the user has expressed an opinion about a statistically useful number of images. Confusion and frustration - "the filter is broken! I'm still seeing penises and prophets!". Many (most?) casual readers will never express an opinion about a statistically useful number of images.

I do have a possible solution. Unfortunately it involves making things yet more complicated:

In the initial configuration, ask the user a few (maybe five or so?) questions about their preferences. These questions would be textual. Something like:

Do you find depictions of Mohammed offensive?
Do you find images of human nudity offensive?
Do you find images which are disrespectful to God offensive?

...etc etc. The answers to these questions would not be used to filter out any specific image, and there is no attempt to classify images here. The answers would be used to achieve one thing only: to allow the filtering algorithm to find other users with similar answers to establish a more appropriate initial set of preferences for the user. If you find images of human nudity offensive, but are OK with everything else, then a clever algorithm can confidently decide what to filter on your behalf based on the image opinions of other readers who have the same preference.

Thparkth 14:52, 12 October 2011 (UTC)
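
The questionnaire idea above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration only: the function name `initial_hidden_set`, the exact-match rule for answers, the `min_share` threshold and the sample file names are hypothetical and are not part of the proposal. It only shows how answers could be used to pick peers, with the initial hidden set then coming from those peers' own image choices rather than from any classification of images.

```python
from collections import Counter


def initial_hidden_set(new_answers, users, min_share=0.5):
    """Seed a new filterer's hidden set from peers who gave the same answers.

    new_answers: tuple of booleans, one per questionnaire item.
    users: list of (answers, hidden_images) pairs for existing filterers,
           where hidden_images is a set of image names.
    min_share: fraction of matching peers that must hide an image
               before it is hidden for the new user (hypothetical rule).
    """
    peers = [hidden for answers, hidden in users if answers == new_answers]
    if not peers:
        return set()  # no matching peers yet: start unfiltered
    counts = Counter(image for hidden in peers for image in hidden)
    return {image for image, n in counts.items() if n / len(peers) >= min_share}


# Hypothetical data: answers are (Mohammed, nudity, disrespect to God).
existing = [
    ((False, True, False), {"nude_1.jpg", "nude_2.jpg"}),
    ((False, True, False), {"nude_1.jpg"}),
    ((True, False, False), {"depiction_1.jpg"}),
]
seed = initial_hidden_set((False, True, False), existing)
```

A real system would presumably match on similar rather than identical answers, and would hand over to the user's own image choices as soon as there were enough of them.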

Yes, getting a new user's preferences is one of the difficult points about the system. In a POV application that needed tuning to a user's preferences you would have a starting default, or you'd give new filterers the option to calibrate the system, either as you say with a bunch of questions, with some borderline images or in other ways. An image filter is a little more difficult to calibrate than other systems because obviously you can't have a filter that starts off by showing people a bunch of images that would offend most people who subscribe to filters. I did consider having male and female stick figures and sliding bars so you could set the two at any combination: "mankini and hijab", "tracksuit and dental floss" and so forth. But that would then give you the difficult task of integrating the calibration information with the filter data, it would give vandals an opportunity to game the system, and it would breach NPOV. The last point is perhaps the most serious: lots of people have made it very clear that they won't support a filter system in which we are making choices, including what to include or exclude from a menu. So that's why this system starts off uncalibrated, but gives people a safe mode if they want to be cautious.

I'm not convinced that this would become more censorious over time. Though if it succeeds we will attract increased readership from people who are currently too censorious to use an "uncensored" site like this. If anything my expectation is that it would become more precise over time, as the more data it had to work with, the more confident it would be as to where an individual's preferences lay. Of course people's tastes and tolerances may change, so we may need to date stamp the data and include a recency weighting in the preferencing. WereSpielChequers 23:49, 12 October 2011 (UTC)
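
One way the date stamping and recency weighting mentioned above might work is sketched below. The exponential decay, the one-year half-life and the names `recency_weight` and `weighted_hide_score` are all assumptions chosen for illustration; the proposal does not specify a weighting scheme.

```python
def recency_weight(age_days, half_life_days=365.0):
    """Weight of a choice made age_days ago; halves every half_life_days."""
    return 0.5 ** (age_days / half_life_days)


def weighted_hide_score(votes, half_life_days=365.0):
    """Combine date-stamped hide/show choices into a score in [-1, 1].

    votes: list of (hide, age_days) pairs, hide being True or False.
    Positive scores lean towards hiding, negative towards showing.
    """
    total = 0.0
    signed = 0.0
    for hide, age_days in votes:
        weight = recency_weight(age_days, half_life_days)
        total += weight
        signed += weight if hide else -weight
    return signed / total if total else 0.0


# A hide choice made today outweighs a show choice made a year ago.
score = weighted_hide_score([(True, 0), (False, 365)])
```

With this scheme old choices never disappear entirely; they simply count for less, so a change of taste shows up gradually rather than all at once.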

Commons

It has been asked whether this proposal would work for Commons. Personally, I think that filtering Commons should be excluded from the discussion because of its special function as a repository, but have you thought about this aspect? --Sir48 01:43, 13 October 2011 (UTC)

I don't see why Commons should be any different, other than the much higher proportion of images. I do categorisation on both commons and EN Wikipedia, and I can see myself ignoring all images on EN wiki while doing gnomish edits there if that speeded up what I was doing. But I'd be surprised if one could do much on Commons whilst blocking all images. However if you are browsing the commons category for the place you are going to for your next holiday then yes I can see that someone who has opted in to the filter would want the filter to hide some images that happened to be in that category.

One difference with commons is that as a repository it contains millions of images that are unlikely to be used outside it, and therefore the filter would be less effective in its quietest backwaters. But I see that as a problem for the filterers not for Commons.

The filter relies on image names, descriptions and alt text to give the filterer context in deciding whether or not to unhide an image. So on a practical level, as Commons is a multilingual wiki with a paucity of multilingual descriptions, especially non-English descriptions, filterers, especially non-English speakers, are likely to find the filter less helpful on Commons than on other projects if they have set it to default to hide. If an image is in your search but you can't understand any of its metadata then it may be a frustrating experience.

On a broader note, that raises the issue of whether this should be available to all Wikimedia users wherever they are browsing, in which case it would need Wikimedia-wide consensus for me to be happy seeing it implemented, or whether it should be a local decision for each project, as cultural attitudes vary sharply by language. My view is that provided your filter choices are as personal to your Wikimedia experience as the skin you choose or your other display options in your user preferences, then it would be better to make this a Wikimedia-wide decision, as it wouldn't alter the experience of other editors or readers of a project unless they also opted in to the filter. Also it is the nature of the filter system that the more participants the more effectively it should work. However we may need to be pragmatic: if some projects are overwhelmingly for or against having this feature then we may need a local option to avoid upsetting Wikimedians. But I would hope that people would accept that if this had no effect on those people in their project who didn't opt in to the feature then it should be a Wikimedia-wide decision. WereSpielChequers 07:45, 15 October 2011 (UTC)

Thanks for this proposal, Werespielchequers

Hi Werespielchequers, thanks for prodding me to look at your proposal. I appreciate what you're trying to do here: to come to a workable solution that doesn't trigger the objections that were made to the initial Pokemon proposal here. I've just read your proposal, and the talk page as well. So here's some feedback. It's going to be long: forgive me :-)

I do agree that a large amount of opposition to the Pokemon design was triggered by its reliance on categories. As I understand it, there are two main points of opposition:

1) A category-based system would create significant new work. As you say, our categories system is currently designed to label content from an informational standpoint. There are currently no labels or categories that attempt to assess whether an image is objectionable. And as I understand it, even if a determination were made that eg "asparagus" is potentially objectionable, we do not currently have a category system that can claim to capture all images of asparagus: many images are uncategorized or incompletely categorized. And therefore, even a filter that assumes asparagus is objectionable and labels it as such, wouldn't necessarily capture all instances of it. Hence the argument that a category-based system would create new and not inconsiderable work for editors: first, determining that asparagus is objectionable, and second, categorizing all current and future images of asparagus as asparagus. That's the workload argument. There's also an ideological component to that argument, which is the notion that editors' time is better spent on activities that directly improve the quality of material in the projects, rather than working to support people in NOT seeing material they don't want to see.

2) The second objection to a category-based system is that creating and maintaining categories of objectionable material makes it easy for external entities to censor the projects. That's a serious moral objection, and it can't be ignored.

So I would say that you are correct that a category-based solution is a non-starter, for both those reasons.

What I like about your proposed solution is that you are aiming to support user choice and to make it easy for users. I think it's critical that the system be easy for people to use: I think it would be unhelpful to force users to wade through many, many images they find objectionable, in order to screen them from their own view. So I like the idea that you are aiming to provide people with a small number of simple choices.

I can imagine people using your first 'hide all images' option when, as you say, they've got a slow internet connection – or when they are reading about topics that make them uncomfortable, or they are with a small child, or on a train, or whatever. I can also imagine people wanting ii) only to hide specific images that bother them, or iii) images that e.g. fellow observant Muslims have chosen to hide, or iv) that anyone has chosen to hide for any reason. Those four choices seem like a reasonable array for me: it's user-friendly and simple. I also believe that it supports readers making their own decisions, either by themselves or (supported by technology) in easy collaboration with other people who share their preferences. Removing the burden of anticipating those choices from editors is I think also a good thing. I've made a number of comparisons to journalism in talking about the controversial content issue, and I'll make another one here: in order to be a good journalist, you need yourself to be able to look unflinchingly at things that some other people find uncomfortable – you might cover car crashes and war, read hate material, interview people who say appalling things. Our editors could be considered to be like journalists in that regard, and the system you're proposing would distinguish (correctly in my view) between that role, which is exposed to all kinds of materials, and the role of readers, who should be able to make their own choices about what they want to be exposed to.

So the “what” of what you're trying to do here, I agree with. I'm not sure about the how. I want to be careful here not to make commitments on behalf of the Wikimedia Foundation's tech team: I do not know how complex it would be to enable this system, particularly to give people the ability to hide images that fellow filterers with similar preferences have hidden. And I do know that we want to put minimal technical effort towards this project, because the tech team has lots of work on its plate, and its priority is to develop functionality that will support editor recruitment and retention, like the new page patrol triage work, the Visual Editor, the MoodBar, and so forth. So, I'm going to refrain from talking about whether your database notion is feasible given resource constraints.

But, I know that your purpose in asking me to comment here was to help you figure out whether to put more effort into further developing this line of thinking. And I think you should: I think it's promising. I want to be cautious in saying that, because I haven't been involved in all the conversations the tech staff have been having on this topic, so it's possible something I've said here may horrify them (and if so, I hope they will notice this and say it), but in general, I think your thinking is on the right track, at least in terms of the 'what.'

Let me tell you a little about the Wikimedia Foundation's timeline from here.

The Wikimedia Foundation's not going to dedicate engineering effort to this issue until at least January. That's because we've got other work to do, which is already scheduled, and we want to get some of it rolled out before we switch gears and begin on this. So, you won't see any activity from eg Brandon or other tech staff until then. That leaves a window of two or three months for conversations to develop like the one you're having here.

I'll also tell you that our initial thinking at the Wikimedia Foundation is that the first step in all this ought to be enabling readers to flag images they don't want to see, in a way that enables editors to see reader input. There might be a period in which we do that purely for information-gathering reasons, before anything much happens with the data (e.g., maybe it would screen the image from the individual's view thereafter, but it wouldn't have any other implications). The purpose of doing that would be to help us better understand what kind of images people currently are seeing that they want to hide. I think that would be good useful information to support editors in making editorial decisions (e.g., imagine there is an image that is clearly inappropriate in its context: user feedback could help editors identify those kinds of problems), and I think it would help us have information in the event that later, for example, we wanted to ask a few questions to help people configure their preferences as you say on the talk page. (So for example you currently say “Mohammed/nudity/God” – we might discover that the most-flagged types of material are for example “Mohammed/nudity/gore.”) A flagging period would also give us a better sense of the scale and size of the problem, overall, which might help eg with decisions about prioritization, roll-out, etc.

One last note: I'll also say that I agree with you that ideally the system would apply to all projects. This is for the purposes of user-friendliness. If the goal of the system is to support user preferences, it makes sense to assume their preferences would be consistent across multiple projects, and so the preferences should move with them. This is something we are not currently terrifically good at supporting. For example, I am “female” on a couple of projects where I've bothered to set that preference, but I am not globally “female” – which is obviously kind of silly :-) So, I appreciate that you are aiming to develop something that would work for all projects, and I would love to see non-enWP editors participating in your discussions here. And lastly let me thank you for your work on this. I do appreciate what you're doing – aiming to build consensus among people who have very different starting points for discussion.

Can I also say that I've been following the conversation between you and jayen466 on foundation-l. I gather jayen466's got a different proposal, which I have to say I haven't yet read. You both strike me as people who are willing to work towards consensus on this, so I wonder if it would be possible for the two of you to collaborate on a shared proposal? Just a thought :-)

Thanks. Sue Gardner 23:51, 16 October 2011 (UTC)

Dear Sue, Thanks for the feedback.

I appreciate that the developers have other things to do, and think that a timeout on this for the rest of the year might allow things to calm down a little. However it would be useful to have someone cast an eye over this and other proposals at least to say whether it would be a trivial or major development. I know enough about speccing programs to be confident that my proposal is doable, but I'm not sufficiently familiar with the capabilities of the Software that our devs use to know whether they have 90% or 9% of the functionality already reusable from other things, and there may be some people who would be reassured if the devs were to confirm that it was feasible.

I'm nervous about the idea of publicly revealing people's filter lists. Partly this is because one of the common objections is that we don't want to create a system that others can access and use for censorship. It isn't one of my objections, as we already release data that is far easier for a censor to use in a filter system, but I am aware of this concern and have tried to accommodate it. Partly there is the argument that people's POV re this should not impact the pedia: people could have all sorts of reasons why they don't personally want to see an image, but if they try to stop others from seeing something then some people will consider this censorship. So maintaining a clear divide between the filter system and editing will make the proposal more acceptable to the community. Most seriously, this filter data would be deeply intimate data delineating our personal boundaries of disgust and disquiet. That makes it something we should keep very confidential: profiling people can be contentious at the best of times, but releasing this sort of data would be way more intrusive than a marketing profile as to what sort of shopper you were. Obviously if this goes ahead we are going to need to test the software, but we need to do this in ways that don't link people to their phobias and dislikes.

Anonymised information such as the "twenty most controversial images in use in this Wikipedia" would be interesting, but I fear that selection bias would quickly set in, especially for EN wiki. The images we use in certain potentially contentious articles will be filtered by more people than similar images on Commons that are not so frequently seen. So at best this would lead to continual change and the annoyance of those who had put careful thought into the current choice of images. But it could lead to the truly perverse situation whereby the EN and AR editors who focus on a particular sexuality topic find themselves changing to what they had previously thought was an image that more people would find distasteful, only to find that an image that got a relatively low score when it was mainly seen by people browsing DE wiki suddenly becomes more offensive when used on EN and really quite contentious on AR. There's also the issue of an anonymised editor feedback tool being gameable by sophisticated vandals or the more clueful spammers. Last but not least, we should remember that some of the most controversial images will feature current and former Wikimedians and their current and former lovers. If we start publishing offensiveness scores and offensiveness league tables re our images of women's breasts then it won't be long before someone tells the press that she'd never have agreed to have her breasts photographed if she'd known this would happen.

I've looked at Jayen466's proposal, and I will consider how to possibly collaborate. However my initial reaction is that the two approaches are very different, so different that we might want to use both to solve separate problems. It's like having two attempts to set new records for circumnavigating the globe: one by hot air balloon and the other by bicycle. Both might succeed, but hybridisation would be counterproductive. WereSpielChequers 00:41, 19 October 2011 (UTC)

[1] :) --JN466 16:36, 23 October 2011 (UTC)

Thanks, well yes it can be a challenge in its own right. To be more specific: those projects that want the option to declare particular images in particular pages controversial can already do so if they have consensus within their project that this option should be available and also consensus re that particular editorial decision. I'm told that both the Hebrew and Arabic Wikipedias have done this. I suspect that more might do so if they had ready access to an appropriate template. Is there any reason to go beyond that as far as editorial decisions about hiding images? The only real alternatives that I see on the table are either that local or project consensus isn't necessary but merely a project-wide majority, or that we should rule such editorial censorship wrong in principle and ban it Wikimedia-wide. I'm not entirely comfortable with either of those options even before considering the implications. The other issue would be to put more details on the implications of "the law of least surprise". On EN wiki we have had incidents where editors have decorated their userpage with a stash of porn linked from Commons, and we have clear precedent that that is an unacceptable use of a userpage, as even an editor who primarily edits in wiki project pornography could have another editor visit their userpage on their wikibirthday. WereSpielChequers 09:09, 25 October 2011 (UTC)

Dead horse,

This horse was dead ages ago, why do you want to keep beating the bones about? -- Cimon Avaro 12:52, 20 October 2011 (UTC)

Hi Cimon, that's an interesting opinion, though not one that I share. Would you mind telling me what your concerns are about this proposal? If there has been a previous discussion that rejected this particular design then I'd appreciate a link, not least because my understanding was that this was a new design which to my mind meets many of the objections raised against other proposals. WereSpielChequers 13:58, 20 October 2011 (UTC)

Potential problem

I like the design, maybe even as a third-party resource (although it doesn't address some of the requirements of some pro-filterers, these are mainly things seen as objectionable by anti-filterers).

However there is a weakness (well two actually).

Create an account. Tag all images in (say) Category:Not work safe as offensive.
Scan the 10,000 most visited articles, and compile a list of blocked images. [Exploit 1]
Create a whole bunch more accounts.
Tag all the images, and all images in Category:Falun Gong [exploit 2]

Rich Farmbrough 21:48 16 November 2011 (GMT).

Hi Rich, weakness 2 I think we can cater for - it shouldn't take too many people tagging Falun Gong as acceptable for the system to pick up that porn and Falun Gong are different preferences with some people objecting to both. Weakness 1 I'm pondering, but to some extent this is a trade-off between functionality and purity. What strikes me as the obvious solution is to allow people to import a preference list from off-wiki and then to encourage censor groups to create off-wiki stop lists. This should work and be much more user friendly, but I'm not sure whether the anti-censorship lobby would be comfortable even with such an opt-in system. WereSpellCheckers 18:29, 20 November 2011 (UTC)
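
The reply to weakness 2 can be sketched roughly as follows. This is not the proposed algorithm, which has not been specified at that level; the agreement measure, the signed voting rule, the function names `agreement` and `predict_hide`, and the image names are all hypothetical. The point it illustrates is that accounts which tag both porn and Falun Gong only influence readers whose own choices match that pattern.

```python
def agreement(mine, theirs):
    """Agreement between two filterers on images both have rated, in [-1, 1].

    mine, theirs: dicts mapping image name to True (hide) or False (show).
    """
    shared = mine.keys() & theirs.keys()
    if not shared:
        return 0.0  # nothing in common: the other filterer carries no weight
    same = sum(1 for image in shared if mine[image] == theirs[image])
    return 2 * same / len(shared) - 1


def predict_hide(me, others, image):
    """Hide an unrated image if like-minded filterers, on balance, hide it."""
    score = 0.0
    for other in others:
        if image in other:
            weight = agreement(me, other)
            score += weight if other[image] else -weight
    return score > 0


# Hypothetical reader who hides porn but wants to see Falun Gong images.
me = {"porn_1.jpg": True, "fg_1.jpg": False}
# Exploit-style accounts that tag both kinds of image as offensive.
censor = {"porn_1.jpg": True, "fg_1.jpg": True, "porn_2.jpg": True, "fg_2.jpg": True}
# An ordinary filterer with the same tastes as the reader.
peer = {"porn_1.jpg": True, "fg_1.jpg": False, "porn_2.jpg": True, "fg_2.jpg": False}
others = [censor, censor, censor, peer]
```

In this toy data the censor accounts agree with the reader on one shared image and disagree on the other, so their weight is zero however many of them are created, and the single like-minded peer decides the outcome.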

Reasonable proposal

Hey there.

While I see a number of difficult technical hurdles with your proposal, I cannot help but applaud it as going in the right direction (or, at the very least to actively take into account the numerous problems with the pokémon proposal).

There is, however, a serious problem about one of its foundations: the capability of finding out what someone else's filter settings are. Any system must ensure that what a reader chooses to filter (or, more to the point, not filter) is as private as their password. Therefore, any filtering based on what other people are filtering must be done through anonymizing peer groups: it should not be possible for me to say "filter like User:X" and then browse around to see what User:X filters or not.

A second concern, albeit not a critical one, is that there should be at least two sets of filtering preferences for everyone, and a trivially easy way to switch between them. I'm thinking "work/public" vs. "home/in private". People might wish to conform to social expectations in one context without crippling their reading experience in others, and this is (I think) a common enough scenario that we should strive to make it easy for readers.

Otherwise, carry on. Good show, and all that. -- Coren (talk) / (en-wiki) 19:43, 11 November 2011 (UTC)

Thanks Coren. re your first objection, the idea is that only the computer and God will know whose preferences are similar and dissimilar, and hopefully she will be too amused at the odd combinations to use her invisible pink horn to point things out to us mortals.

As for the second point, that's such a great idea I've nabbed it as something for advanced options (the existing system does not need more complexity for newbies).

WereSpellCheckers 18:38, 20 November 2011 (UTC)
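
As an illustration of the two-profile idea discussed above, here is a minimal sketch. The class name `FilterAccount`, the profile names "work" and "home" and the method names are assumptions made for the example; nothing here reflects an existing MediaWiki feature.

```python
from dataclasses import dataclass, field


@dataclass
class FilterAccount:
    """Per-account filter state with separately stored, switchable profiles."""

    profiles: dict = field(default_factory=lambda: {"work": set(), "home": set()})
    active: str = "work"

    def switch(self, name):
        """Make another profile active; unknown names are rejected."""
        if name not in self.profiles:
            raise KeyError(name)
        self.active = name

    def hide(self, image):
        """Record a hide choice in the active profile only."""
        self.profiles[self.active].add(image)

    def is_hidden(self, image):
        return image in self.profiles[self.active]


account = FilterAccount()
account.hide("example.jpg")      # hidden while the "work" profile is active
hidden_at_work = account.is_hidden("example.jpg")
account.switch("home")
hidden_at_home = account.is_hidden("example.jpg")
```

Keeping the profiles as separate sets means a choice made in one context never leaks into the other, which seems to be the behaviour the suggestion is after.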

Time poll

WereSpielChequers, further to our conversation at the recent meet-up, here are the sources I remembered reading on the Time Magazine poll: [2], [3]. Regards, --JN466 11:23, 19 November 2011 (UTC)

By the way, having just read your remarks on how reliable sources would likely illustrate an article, there is currently a discussion touching on that at w:Wikipedia_talk:What_Wikipedia_is_not#Objectionable_content. --JN466 11:33, 19 November 2011 (UTC)

Also note Controversial_content/Brainstorming#Page-specific_.22images_off.22_button. This proposal could possibly be combined with your idea (though I am still worried about potential precision hacking and marblecaking). --JN466 01:07, 22 November 2011 (UTC)

Proposal is not category free and doing wrong statistics; images are informative

Hello,

your page proposes a filter system and calls it a "Category free filtering".

1. In fact, this proposal is a category based filter system. The user page says:

"...in essence it would score your support for other people's filter lists by looking at the images you've chosen to hide, the images you've chosen to see and compared that pattern with fellow filterers."

So basically, the fellow filterers, with their own choices of what to filter, build up categories for filtering images for others.

2. The proposal is aiming for control of the readers. The user page reads:

"... Statistical results including numbers of filterers will be made available,"

But it's the private affair of every individual user which page they read. If they want to tell their opinion, they may choose to do so, and are invited to. In the age of paper publishing, when say a newspaper or magazine wanted to know the opinion of its readers, they would attach postcards with a questionnaire which the reader could send back if they wanted. What the publisher wouldn't do is look directly into readers' living rooms to observe which pages of the magazine are read and for how long.

The opinions of the readers are very important for Wikimedia, and if a Wikimedia project wishes to learn about the opinions of the readers, they can talk with them, or ask them for their opinion. But compiling anonymous statistics on private reading behaviour can't ever improve the content. Any statistic can only be used to create the image of an "average reader". But the mission of Wikimedia is not to reach an average reader, but each individual reader.

3. The proposal is aiming for negotiations about content. It says:

"Offering people a personal opt in preference system would be different, it would be hard to argue that it conflicted with our mission, if it succeeded in getting our information accepted in parts of the world that currently shun us then arguably it would actually promote our mission."

Images as well as text do carry information. Posting a particular image on a particular content page is always an editorial decision. Every reader can choose to take notice of some parts of the content, and not to look at other parts. They can do so even without any content filter or image filter: either by choosing to open certain webpages and not to open other pages, or by changing the preference settings of their web browser not to display any image files, or by reading Wikimedia's projects with mobile view, etc. However, this choice is the affair of the individual reader, not the affair of the publishing Wikimedia projects.

If "parts of the world", e.g. institutions like certain countries, schools or libraries etc., only accept Wikimedia content with personal image filters, and don't accept it without a personal image filter, then I think it is obvious that these institutions don't really accept the entire information.

Greetings, --Rosenkohl 13:11, 21 November 2011 (UTC)[reply]

Hi Rosenkohl, this proposal is category free in the sense that it is independent of the existing category structure, so it avoids several major problems that would arise if we tried to use our existing categories in an image filter. I agree that the net result is at least as similar to categories as watchlists are, but it would only cause confusion if we used the same word for such very different things. As for your point about statistical information and spying on readers, I'm not sure why anyone would object to knowing how many people were using this option. But the section "Statistical results including numbers of filterers will be made available, but lists of images with an offensive quotient or frequency of being blocked will not be. The system would calculate which filterers had similar or dissimilar filtering preferences, but that information would not be divulged other than on an anonymised basis." could probably benefit from expansion; are there any specific changes you would suggest to it? As for your point about printed publications not measuring readership by page, Wikimedia already collects information on page views (http://stats.grok.se/); the difference between this proposal for an image filter and reading articles is that the number of people filtering a particular image would be private. If you don't think that Wikimedia should collect information on the number of people reading particular articles then I would suggest you raise that as a separate discussion elsewhere on Meta. WereSpielChequers 21:27, 22 November 2011 (UTC)[reply]

Well, the proposal from Image filter referendum/en is obviously using the word "category" for a thing different from the existing Wikimedia categories, since it says that at least 5-10 new categories would be created. And for example see http://www.mediawiki.org/wiki/Personal_image_filter#Category_Equivalence_Localization, where a "Category Equivalence Mapping" is proposed, which clearly would go beyond the existing system of categories, which exist only locally for each single project.

The notion of "category" of course is older than Wikimedia, and much more general than the existing Wikimedia category systems are. Probably the modern concept of category goes back to Kant (e.g. see en:Category (Kant)).

Each user maintains their own watchlist, often keeping it secret without sharing it with others. With the filter system proposed on this user page however, the server would automatically compare the filter lists of different users, and use other users' filter lists to support a reader's filtering. But there is no similar automatic system which would compare the watchlists of different users. So watchlists are not as similar to categories as the filter lists from the proposal on this user page would be.

The page to which you link says:

"What do Wikipedia's readers care about? Is Britney Spears more popular than Brittany? Is Asia Carrera more popular than Asia? How many people looked at the article on Santa Claus in December? How many looked at the article on Ron Paul?"

Perhaps the answers to these questions are of interest to the people named (Britney Spears, Brittany, Asia Carrera, Ron Paul), their adherents or their managements, or the toy industry, but I don't see how such answers would help the Wikimedia projects.

However, statistics on the use of a filter system would go even a step further, in that they say not only which pages are opened how often, but also how often the group of readers is using a filter and how similarly or dissimilarly they are filtering.

I don't call these statistics "spying" on the readers, and don't think that reader or filterer statistics would usually constitute a privacy problem, if conducted anonymously and with care and control. But the problem I see is that by compiling these statistics, Wikimedia is not interested in the individual opinions of the readers, what they think about the projects and the content, but in their attendance and behaviour as a mass on Wikimedia pages.

But probably doing statistics on the use of a filter system would be necessary in any case, in order to convince the "other parts of the world" that the filter system is applicable and working, and that it is used and accepted by a large group of readers.

--Rosenkohl 16:16, 23 November 2011 (UTC)[reply]

Hi Rosenkohl, yes the idea behind the statistics is to see how it is working - whether that is "there are 112 opted in users but insufficient overlaps to predict any 2 as having similar preferences" or "cluster analysis has identified 112 clusters and sub clusters amongst opted in filterers with 90% of new filterers being allocated to a cluster within 24 hours of opting in to the filter" and "Of the 36,073 images hidden by the software in the last month 1023 were unhidden by readers as overkill and 638 were added to filters by people clicking 'never show me that image again'". Not "The most offensive images according to Wikipedia are politicianA, PoliticianB, PoliticianC, the Goatse, irritating young popstar1, and shock surprise (victim of concerted lulz motivated campaign). The most offensive penis picture was no 112 and the most offensive scatological (other than the Goatse) image was number 468".

Are you OK with that approach? If so any suggestions as to how one embeds that in policy would be very welcome. WereSpielChequers 22:03, 6 December 2011 (UTC)[reply]
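
As an illustration of the aggregate-only reporting discussed above, here is a minimal sketch, assuming each opted-in filterer's preferences can be treated as a set of hidden image identifiers. The Jaccard overlap measure, the 0.5 threshold, and all names (`jaccard`, `aggregate_report`, the sample data) are assumptions made for the example and are not part of the proposal. The point is only that the report contains counts, never image identifiers or per-user lists.

```python
from itertools import combinations


def jaccard(a, b):
    """Overlap of two filter sets: |a & b| / |a | b| (0.0 if both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def aggregate_report(filters, threshold=0.5):
    """Summarise filter usage without exposing any image or user.

    filters: dict mapping an opaque filterer id to the set of image ids
    that filterer has chosen to hide (hypothetical data layout).
    """
    pairs = list(combinations(sorted(filters), 2))
    similar = sum(
        1 for u, v in pairs if jaccard(filters[u], filters[v]) >= threshold
    )
    # Only counts are returned; no image ids and no filterer ids.
    return {
        "filterers": len(filters),
        "pairs_compared": len(pairs),
        "similar_pairs": similar,
    }


# Hypothetical sample data for three opted-in filterers.
example = {
    "u1": {"img1", "img2", "img3"},
    "u2": {"img2", "img3"},
    "u3": {"img9"},
}
report = aggregate_report(example)
```

A real system would presumably use proper cluster analysis rather than pairwise comparison, which does not scale to many filterers; the sketch only shows what "anonymised" output could look like.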

Content neutral implementation[edit]

I would like the foundation to provide the facility but I would very much prefer for it not to get involved in selecting the content. It is better to let editors provide different ones and let readers choose which one they prefer.

I think what should be provided are: a way for the community to set up filters and provide a searchable description, a way to select a filter preference, a quick link to the most popular ones, and search support for finding other filters. The most popular filters would probably have a high level of protection so only admins could update them. Which filter is used by a user would be private. Any excluded content should have a clearly visible marker showing it has been excluded and a way of overriding the filter in that particular case. The default should be to use JavaScript in the user's browser to implement the facility, but if JavaScript is not available the facility should still be usable with server support.

There could be a source filter and a generated filter. The source filter would be an easy way of specifying filters, e.g. a list of categories or media files or articles or other filters or specific inclusion. I suppose an automatic regeneration could be done when a filter is used if a dependency changes but just having them regenerated if over a day old would probably be fine.

I would allow the filter to be used to stop any file being displayed, even articles or external links. I would allow a school to mark a range of IPs as used by the school and having a filter preference, but still allow readers to override the filter. If a school wants to fully prevent readers from bypassing the filter, it should need to use some other facility or some version of Wikipedia for schools. Dmcq (talk) 00:22, 16 July 2012 (UTC)[reply]
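
The source filter / generated filter split suggested in the comment above might be sketched as follows. This is a rough illustration under stated assumptions, not a design: the class and function names (`SourceFilter`, `expand`, `GeneratedCache`), the data layout, and the reading of "specific inclusion" as a list of items that are always shown are all hypothetical. A source filter lists categories, files and other filters; the generated filter is the flat set of blocked items, rebuilt only when the cached copy is more than a day old.

```python
import time
from dataclasses import dataclass, field

DAY = 24 * 60 * 60  # seconds


@dataclass
class SourceFilter:
    categories: set = field(default_factory=set)  # category names
    files: set = field(default_factory=set)       # individual items
    includes: set = field(default_factory=set)    # names of other filters
    exceptions: set = field(default_factory=set)  # items always shown (assumed meaning)


def expand(name, sources, category_members, path=frozenset()):
    """Flatten a source filter into the set of blocked items."""
    if name in path:  # guard against filters that include each other
        return set()
    path = path | {name}
    src = sources[name]
    blocked = set(src.files)
    for cat in src.categories:
        blocked |= category_members.get(cat, set())
    for other in src.includes:
        blocked |= expand(other, sources, category_members, path)
    return blocked - src.exceptions


class GeneratedCache:
    """Keeps generated filters and rebuilds them once older than max_age."""

    def __init__(self, sources, category_members, max_age=DAY):
        self.sources = sources
        self.category_members = category_members
        self.max_age = max_age
        self._cache = {}  # name -> (generated_at, blocked set)

    def get(self, name, now=None):
        now = time.time() if now is None else now
        entry = self._cache.get(name)
        if entry is None or now - entry[0] > self.max_age:
            entry = (now, expand(name, self.sources, self.category_members))
            self._cache[name] = entry
        return entry[1]


# Hypothetical sample data.
category_members = {"Cat:A": {"a1.jpg", "a2.jpg"}}
sources = {
    "base": SourceFilter(categories={"Cat:A"}, files={"x.jpg"}),
    "school": SourceFilter(includes={"base"}, files={"y.jpg"},
                           exceptions={"a1.jpg"}),
}
cache = GeneratedCache(sources, category_members)
school_blocked = cache.get("school", now=0)
```

With time-based regeneration a change to a category only takes effect once the cached copy expires, which is the trade-off the comment accepts in place of tracking every dependency.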

Hi Dmcq, this talkpage is a subpage for discussing one specific proposal; you might find your thread gets more feedback if you post it on the main page Talk:Controversial content/Brainstorming. Though I'll grant that this proposal does meet your criterion of not putting the foundation in charge of deciding which images will or will not be filtered. WereSpielChequers (talk) 07:52, 16 July 2012 (UTC)[reply]