En validation topics

This page is for planning topics and point ranges for the upcoming article validation feature.

Please add suggested topics freely; we can cut the list down later. For Pro: and Con:, add comments, not votes as yet (we can vote later as needed). Feel free to improve the descriptions.

Note that the present implementation (on http://test.leuksman.com/ ) does not include the descriptions!

This list needs finalising fairly soon - please discuss any concerns at all ASAP - David Gerard 13:15, 26 May 2005 (UTC)

I would like to point out that it will (technically speaking) be very easy to add new topics later, but it will be very hard to delete topics (needs direct database access at the moment), and virtually impossible to change the point range while keeping existing ratings for a topic. --Magnus Manske

Well, we were probably going to trash the data after seeing what the results of 1.5 look like ... or you could rename the relevant table so the data's still there for the researchers ... or is even just trashing the data and starting over too much like pain?
(When 1.5 is ready, I'll add a pile of these topics, probably almost exactly as set out by Beland below - I've alerted wikien-l and wikipedia-l so people can fight over them at the last minute.) - David Gerard 23:25, 30 May 2005 (UTC)

Point ranges

Jimbo has suggested 1-4 will keep it usably simple. 1 = bad, 2 = middling-bad, 3 = middling-okay, 4 = excellent. The idea is that with 1-10 it is unlikely to be clear which number is appropriate, and a mere 1-2 yes/no option doesn't give enough range.

Do we want to force people to choose between positive and negative, or would allowing them to give a "3" if the article is somewhere in the middle be better? Angela 22:34, 20 May 2005 (UTC)
I suggest not, or we'll get a lot of "3" for "no opinion" instead of people leaving it blank, which they can presently do - David Gerard 09:59, 21 May 2005 (UTC)
I concur with David. My experience of designing and capturing survey information is that a central point is disproportionately selected. Curiously, eliminating the centre also increases use of the extremes; it seems that the middle votes are not simply divided between the middle pair. There are national differences here, too: US opinions tend to favour the extremes, with a bias towards the positive; UK opinions tend to favour the middle, with a slight bias towards the negative. The four-point scale is particularly effective at mitigating these differences. --TheoClarke 14:11, 27 May 2005 (UTC)
Three or four levels at most. Kuro5hin used to have ratings between 0-5 on each post, it was silly. Lot of overlap in the 2-4 range. Now Kuro5hin uses 0-3, I believe. Rhobite 03:05, 7 Jun 2005 (UTC)

It helps, in terms of human psychology, to keep things in odd numbers for rating systems. 1-5 is much better than 1-4 as it allows a "no opinion" option with a rating of 3, but still allows a horrible (1) rating, bad (2), good (4) and excellent (5). When you have an even number (like 1-10) you tend to have quite a bit of overlap on what constitutes each quality range. In short, keep it 3, 5, or 7 rating points, with 3 being the most simple to describe. It doesn't matter if the rating starts at 1 or 0, just be consistent. --Roberth 20:12, 23 July 2005 (UTC)

This is deliberate; we don't want "no opinion" options to be selected, so we just won't supply them. Not filling out the question is entirely possible, if one doesn't want to give an opinion. -- James F. (talk) 09:42, 24 July 2005 (UTC)
In terms of human psychology? Would you care to elaborate extensively, providing sources? --Brian0918 19:17, 25 November 2005 (UTC)

Any rating scale involves a compromise: you have to pick between ease of use and accuracy. I can understand Jimbo's concern about the former, but as I have outlined in the Review vs. Validate section below, there are a multitude of ways to identify vandalism and innocent rating errors. That said, a broader range will do wonders for any data mining efforts. I strongly recommend a scale of at least 1-6, for the reasons above. The1physicist 17:27, 24 November 2005 (UTC)

The 1-4 scale seems to be the best option. --Brian0918 19:17, 25 November 2005 (UTC)

Please explain in detail why you think it must be 1-4, or at the very least why 1-6 (or more) is bad. The1physicist 01:13, 30 November 2005 (UTC)
The problem is that people who disingenuously vote in the extremes will have more vote weight than people who honestly give their opinion. The question is a matter of overall vote weight per user - one person one vote. To do that the weight of all of a user's votes must add up to the same as another's. In this way, a multipoint system such as has been suggested is problematic for fairness, because more extreme votes will have higher overall vote weight. I suggest people have approve/disapprove votes, a moderate number, such as 5, per article. That way each user has a maximum overall vote weight of "5", for each article.
Alternatively, if one still wants a multiple point system, one could let people put more than one approve/disapprove vote on any given revision, but still restrict their total vote weight to a maximum, say 10 in this case. This could act as anything between a (-10) to (10) rating system on one revision and 10 approve/disapprove votes, one per revision, depending on how the user wants to distribute their votes. And there is no way a user can distribute their votes to give themselves an unfair voting advantage.
With this, whether one is extreme or moderate, generally positive or generally negative, one still has the same amount of effect on the relative ratings of revisions as any other character type.
Another advantage of fixed vote weight per user per article is that when someone is at the maximum, and they make a new vote, their oldest vote is automatically removed. Without this, old versions with good ratings have to be rated down before new versions with good ratings could replace them (see Article_validation_possible_problems#Newer_edits_with_fewer_votes). With this, one de-nominates old versions at the same time as they nominate (or anti-nominate) new ones. Kevin Baastalk 18:15, 8 April 2006 (UTC)
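
The fixed-budget scheme described above could be sketched roughly as follows. This is only an illustration under stated assumptions, not the validation feature's actual code: the class name `ArticleVoteBudget`, the `max_weight` parameter and the use of integer revision ids are all hypothetical. Each vote is a single approve (+1) or disapprove (-1), a user may stack several on one revision, and once the budget is full the oldest vote is dropped automatically.

```python
from collections import deque


class ArticleVoteBudget:
    """One user's votes on one article, capped at a fixed total weight.

    Hypothetical sketch: names and defaults are assumptions, not an existing API.
    """

    def __init__(self, max_weight=5):
        self.max_weight = max_weight
        self.votes = deque()  # (revision_id, +1 or -1), oldest first

    def cast(self, revision_id, value):
        if value not in (1, -1):
            raise ValueError("value must be +1 (approve) or -1 (disapprove)")
        self.votes.append((revision_id, value))
        # Over budget: the oldest vote is removed automatically.
        while len(self.votes) > self.max_weight:
            self.votes.popleft()

    def tally(self):
        """Net approve/disapprove weight this user currently gives each revision."""
        totals = {}
        for revision_id, value in self.votes:
            totals[revision_id] = totals.get(revision_id, 0) + value
        return totals


budget = ArticleVoteBudget(max_weight=3)
budget.cast(101, 1)
budget.cast(101, 1)
budget.cast(102, -1)
budget.cast(103, 1)  # budget is full, so the oldest vote on revision 101 is dropped
```

Because the cap applies to the number of votes held, a user who piles every vote onto one revision and a user who spreads them out have the same total weight.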

See also

Consolidated plan

Please rate this article on the criteria below on the following scale:

1 - Bad/Not at all
2 - Needs work/Marginally
3 - Good
4 - Excellent

If a section does not apply to this article, just leave it empty.

If you have a specific suggestion, such as a better title, or an idea for an illustration, or a specific complaint, please feel free to write it in the comment field next to the appropriate line.

Overall quality No opinion - 1 - 2 - 3 - 4 [text field]
  <submit>
Neutral Point Of View No opinion - 1 - 2 - 3 - 4 [text field]
Factual accuracy No opinion - 1 - 2 - 3 - 4 [text field]
Quality of references No opinion - 1 - 2 - 3 - 4 [text field]
Completeness/conciseness No opinion - Very short - Slightly short - Just right - Too long [text field]
Quality of lead section No opinion - 1 - 2 - 3 - 4 [text field]
Images and illustrations No opinion - Missing or bad - Need some work - Just right - Too many [text field]
Grammar, spelling, word choice, wikification and layout No opinion - 1 - 2 - 3 - 4 [text field]
  <submit>
Did you find what you were looking for? What were you looking for? No opinion - 1 - 2 - 3 - 4 [text field]
Is this the article you expected at this title? No opinion - No - Yes [text field]
How relevant is the topic to a general encyclopedia? No opinion - 1 - 2 - 3 - 4 [text field]
  <submit>

<submit><clear>

Privacy note: Your ratings will be viewable by any Wikipedia visitor, and will be associated with your Wikipedia account and/or your IP address (just as article edits are). See Wikipedia:Privacy policy for more information.

(and the next page...)

Thank you for your evaluation!

If you have found this article in need of improvement in any way, we welcome your contributions. You can edit or discuss the article right now, or go back to reading it.

Comments

Two important things I tried to do:

  • Standardize.

I've used consistent terminology, consistent polarity, and the same number of options for each criterion. This reduces confusion and mistakes, and allows a terse universal legend to serve as the primary instruction to survey takers. The more text there is to read, the less people will want to participate, and many people won't pay attention to complicated instructions or explanations.

I take it that when people are actually doing the rating, they will just see "()1 ()2 ()3 ()4" instead of our one-liners (but the one-liners could be on an "explanation" page). Given that, there's pretty much no room for creative variation. People should be able to intuitively and correctly guess a reasonable approximation of what we've written for the one-liner without ever seeing it.

  • Consolidate similar metrics.

The longer the survey is, the fewer people will start (or finish) taking it.

Overall perceptions of a subject are known to affect specific-criteria ratings of the kind we are taking. So for instance, people probably won't form an independent opinion of "writing clarity" so much as superimpose their general impression of the article. So I've folded "writing clarity" and "suitable for publication" into "overall quality", since many people will not make clear distinctions between these criteria, anyway. "Balancing coverage" was clearly redundant with the question on encyclopedic appropriateness.

I don't think we're going to get a clear signal out of "copyediting, wikification, layout, etc.". There are too many ways in which an article could be good at one and bad at another...this will probably get amalgamated into something resembling "overall quality". If we really care about measuring these things, I suggest doing so one at a time.

These attributes are fairly superficial. If you see bad English grammar, it's very easy for pretty much anyone to fix, and it's easy to add a {{copyedit}} tag if you don't have the time or language skills to fix widespread problems. Layout and linking can be fixed without being familiar with the subject matter, and it might be better to simply encourage people to make specific changes (or suggestions on the talk page, if they lack the expertise) for these things instead of simply making a low rating. On the side of inclusion, it might be useful to be able to attribute a cause to each article's low overall quality. If an article has good information but bad grammar (or vice versa), that's useful to know.

With regard to images, it may be advisable to direct specific suggestions to the talk page, but that may disrupt the opinion-taking process, and it doesn't really flag the article as specifically needing an image.

I don't think "appropriate context" is all that distinct from "comprehensiveness" plus the "overall quality" metric, which I think will depend a lot on how well an article explains its subject. I'm also not sure many people new to the encyclopedia will have a clear idea of what we mean by "appropriate context." I've listed it as "additional" and would actually recommend not including it, to reduce total length. If it's kept, I recommend the standardized language proposed above.

You may also want to take note of the proposed ordering and glue copy.

Your comments and improvements are of course welcome. -- Beland 05:46, 28 May 2005 (UTC)

Don't call NPOV Neutrality, don't mix in fairness

NPOV has a very specific meaning that isn't the same as what the layman probably thinks when he hears 'neutrality', unless he's been working on building wiki encyclopedias in his spare time. :) We should call it NPOV and have a 'What the heck is NPOV?' link. We should also remove fairness since it's not clear what 'fairness' might mean here beyond NPOV. --Gmaxwell 04:10, 31 May 2005 (UTC)

I agree. We can't really get away from Wikipedia jargon here, because the concept we're trying to get a rating on is in fact a Wikipedia thing - David Gerard 21:53, 31 May 2005 (UTC)
I'm confused. Right now it says "Neutral point of view" in the survey. How could "NPOV" be more clear than "Neutral point of view"? Kaldari 22:13, 9 Jun 2005 (UTC)

Use "excellent", not "professional"

The word professional, as used in phrases such as "professional writer" or "professional computer programmer", means precisely one thing -- a person who gets paid to do the activity in question. The opposite is an amateur -- a person who does the activity for the love of it. ("Amateur" is from Latin "amor", meaning "love".) The terms are often incorrectly (and somewhat offensively) used to suggest that only paid professionals are capable of doing top-notch work; the word "amateurish" has thus an unfortunate connotation of shoddiness rather than one of devotion.

Wikipedia does not have any professional writers. Well, we may have contributors who are professional writers in their daily lives, but as far as I have ever heard, they do not professionally write for Wikipedia. Indeed, the whole point of the project is that volunteers (that is, people doing amateur work here) are capable of producing high-quality work.

Why use a word that means "paid for" when what we mean is "of high quality"? There are plenty of perfectly good English words to express superlative standards of quality without erroneously stating that the work in question was done for pay, or implying that only paid professionals can do high-quality work. --Fubar Obfusco 12:51, 2005 May 28 (UTC)

I can see that. I made the proposed change, though I left it in the "overall quality" description to help people calibrate their scale. I want to prod people to think of commercially available encyclopedias. Is this article as good as or better than that, as opposed to some ideal article which may have far higher (or lower) standards? But feel free to make further improvements as you see fit. -- Beland 03:36, 29 May 2005 (UTC)

Fantastic work!

This is very nicely thought through :-) Magnus, can something like this be put in relatively simply? Is it possible for the rating topic names to have HTML or wikitext in them? If this is basically possible, I'll start tweaking the above - David Gerard 22:55, 30 May 2005 (UTC)

I agree. Absolutely brilliant. Well done.
James F. (talk) 05:39, 31 May 2005 (UTC)

OK, here's what I (Magnus Manske) think:

  • First, I like it too! That said...
  • Based on this:
  • I can have an (editable) header/footer on the page
  • The detailed explanation would better go onto some page like [[Wikipedia:Validation topics]]
  • There could automatically be a link to [[Wikipedia:Validation topics#foobar]] for each topic "foobar"
  • I think I could implement the subheadings thing ("Utility" etc.)
  • The special textbox fields should go as well; there can be a comment to each rating anyway

On a more personal note, even the consolidated version seems very ... detailed. That will scare many people away IMHO, as it looks like a survey ("and finally, your credit card number...";-)

  • External sources? 1-4 points, specifics in the comment. No need to have two options here.
  • Completeness and conciseness? 1-5: Way too short, too short, little short, just right, too long.
  • Images? 1-4: Too few/bad ones, could use some more/better ones, just right, drowning in images. Special requests in the comment.
  • "Layout" and "Links to other articles" could become "wikification" or "Layout/integration". Usually, there's either good markup and links, or there's neither. I've rarely seen an in-between.
  • "Utility" can be one item: "Did this answer your question?" 1-4 plus comment, in case you want to write which question it didn't answer. Remember, you can always rate it with "No opinion", which is the default.

But, as said, this last part is just my opinion; which topics we choose is not up to me, I only do the engine ;-) --Magnus Manske 21:36, 31 May 2005 (UTC)

I like your suggested shortenings. It's really too long for people to bother with at present, but we can shorten it quite a bit. (I'm sure there's some human factors designer reading this who can tell us precisely how many questions is too many. TheoClarke?) I'll leave it to cook a little longer. I just asked on #wikimedia-tech how long to 1.5 and Kate said "a few weeks", so we have some breathing space - David Gerard 21:53, 31 May 2005 (UTC)
The key issue is not the number of questions. Readers are most likely to respond if they can see at first glance the extent of their commitment. In broad terms this means that the optimum survey length is a single page (where, in this context, a page is the amount that is viewed at one time on the screen). The way to get more data is to have sub-pages that drop down from the main survey page rather than have a long list down which the reader scrolls. So the 10 (say) core questions can each link to a page of more granular questions. This structure maximises the response to the core questions but reduces response to the subsidiary questions. Putting everything at the same level will garner more answers to the 'subsidiary questions' but will reduce the overall level of response. Where the survey goes over the page length, response frequency increases if the number of questions is cited at the beginning. --TheoClarke 08:26, 1 Jun 2005 (UTC)

I suggest we change "Did this answer your question..etc" to "Did you find what you were looking for?" This is more general, yet still leaves room for comment. The1physicist 17:27, 24 November 2005 (UTC)

Well spotted! - David Gerard 15:46, 28 November 2005 (UTC)

Update for usability

I updated the table, above, incorporating some comments and my own thoughts about brevity. I've compressed a few of the options together, and added submit buttons after each detail-level in the series of questions, so that people can just fill out a single review, or a whole fleet of them, depending on how much time they have. I also expanded the options to 5, because that just felt right, and added the comment that if a question doesn't apply, just don't fill out anything (so no need for yes/no "does this apply?" questions). There is still room for another question or two in the final "Usability" section. +sj | Translate the Quarto | + 22:34, 31 May 2005 (UTC)

It's very good to have a <submit> right after 'Overall quality' - people really need to get the idea that they don't have to fill the whole damn thing out. The only reason I don't say dump everything but 2 questions (Overall & encyclopedic), is that there may be a few readers who find a survey easier to use than a talk page.
An additional improvement would be to group together the questions using the standard 1-2-3-4 meanings versus those using question-specific choice meanings. To this end, I would move the 'Quality of lead' and 'Grammar' questions up in front of the 'Completeness' question, and also switch 'article you expected' with the preceding question. -R. S. Shaw 19:54, 11 Jun 2005 (UTC)

For 'Completeness/conciseness' I think just three choices should be on it: too short - just right - too long. I would also suggest to rename it to something simpler like "Article length". — ChaTo 15:53, 25 July 2005 (UTC)

More table edits

So I managed to make the table even more concise by consolidating some more questions. Putting a text field next to every question is a great idea, and it makes things flow a lot better. We've broken somewhat with the bad/needs work/good/excellent standard by combining "too much" and "too little" metrics on one spectrum, so we need to do non-numerical selections. If that's a problem, we'll need to come up with a different intro to explain how to use those lines. It also steers people more toward rating quantity instead of quality, but maybe that's an acceptable compromise for brevity.

Two things I think were inappropriately consolidated were the "title" and "lead section" metrics. These are really completely different things, and conflating them will give a rather fuzzy picture about which one people are complaining about or praising in a given article.

We could perhaps consolidate "conciseness" with "Did you have a specific question in mind" by suggesting that people comment on specific content they were looking for but did not find, or that which they think should be stricken, on the completeness/conciseness line.

We're floating in the 9-10 questions range. I'm no survey expert, but I'm not sure there's really a magic number of questions. There's always a tradeoff between length and participation. Simply put, we don't want to ask any more questions than strictly necessary, and we don't want to leave anything out that we're interested to see how people answer. I'm not sure that cutting down to say, 7 questions, would make that much of a difference to participation, but you know, if this is just a test run, we can run a little long and see if people are turned off (or if the results are uninteresting), or a little short and see if there are unanswered questions. User:Beland

Accuracy and references

These are not the same thing. For example, I know that en:push poll is accurate, because I happen to know about push polls, but it has no references. I suggest separating these two metrics to avoid this confusion. Meelar 19:19, 2 Jun 2005 (UTC)

I've separated them again. I've also wikilinked a few terms (all should be, really) and expanded the wording where I think it's needed - David Gerard 14:28, 4 Jun 2005 (UTC)

I don't like having "Factual accuracy" in the validation. If a user knows for sure that an article has some errors in its facts, then he should be encouraged to correct them. While it might be difficult to correct NPOV or the other stuff, correcting a factual error is easy. And if a user can't tell which is the corrected fact, then he shouldn't be voting that the article is factually wrong. I suggest that instead of "Quality of references", we ask either "Facts are supported by references" or "Factual data is supported by references" or "References are provided to support facts" — ChaTo 15:48, 25 July 2005 (UTC)

I agree. "Facts supported by references" makes the most sense. The1physicist 17:27, 24 November 2005 (UTC)

Validation going the wrong way

We don't want to do a survey on "how much work do we still need to do".

COME ON! Sheesh!

Instead of filling in this survey, people should be editing the dang wiki and using talk pages instead! :-)

I get the impression that people will fill in the survey and leave the wiki untouched.

If this is the implementation used in 1.5 it should be left turned off. Unless/until we have some system that will actually help us validate articles.

80.126.238.189 22:14, 5 Jun 2005 (UTC) aka en:Kim Bruning

I tend to agree. Why not just a simple vote on the article? I envision a little pop-out toolbar with numbers 1-4 on it, or whatever we decide upon as a good range, as well as the article's current ranking. Anon votes count less. We don't want a torturous survey-type thing that needs to be filled out upon every revision. Also, are accounts limited to only one vote per revision? We don't want people voting up their edits. --Slowking Man 22:20, Jun 5, 2005 (UTC)
Various people are interested in various aspects. And casual readers of Wikipedia tremendously outnumber editors. I doubt anything will slow down Wikipedia editors from editing ;-) We can see if it does slow down edits, of course - David Gerard 10:52, 6 Jun 2005 (UTC)
Agree with David on this one. The problem with a simple survey is that it won't give us as much information. For example, do users really really want more references, or are images more important? It will provide interesting and useful guidance, far more than a simple 1-4. Meelar 16:46, 6 Jun 2005 (UTC)
I also agree. I almost certainly would not have become involved as a casual Wikipedia editor if I felt there was another avenue to appeal to the "powers that be". The fact that anyone who has an opinion is directly encouraged to edit articles is at the heart of Wikipedia. This not only accelerates the rate at which our articles asymptotically approach perfection, but it empowers our readers. --24.21.101.75 06:39, 12 Jun 2005 (UTC) aka en:Peter Farago

This feature is definitely necessary. It won't slow down editors that much, but it will enable unimaginable gains in editor productivity. For example, with this data, we could generate lists of articles that are deficient in specific areas. Furthermore, the possibilities opened up by data mining are just too good to pass up. The1physicist 17:27, 24 November 2005 (UTC)
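
As a rough illustration of the "lists of articles deficient in specific areas" idea, here is a small sketch. The data layout (a dict of article title to topic to list of 1-4 scores), the function name `deficient_articles` and the threshold of 2.0 are all assumptions made for the example; the real feature stores ratings in its own database tables.

```python
def deficient_articles(ratings, topic, threshold=2.0):
    """Return titles whose mean score on `topic` falls below `threshold`.

    `ratings` is a hypothetical in-memory layout:
    {article_title: {topic_name: [scores on the 1-4 scale]}}
    """
    flagged = []
    for title, topics in ratings.items():
        scores = topics.get(topic, [])
        # Articles with no ratings on this topic are skipped, not flagged.
        if scores and sum(scores) / len(scores) < threshold:
            flagged.append(title)
    return sorted(flagged)


sample = {
    "Push poll": {"Quality of references": [1, 2, 1], "Factual accuracy": [4, 3]},
    "Encyclopedia": {"Quality of references": [4, 3]},
    "Obscure band": {"Images and illustrations": [1]},
}
needs_references = deficient_articles(sample, "Quality of references")
```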

I think it's a good idea, just don't put the whole survey in! Put max 3 multiple-choice questions with a 4th question left for comments. Or at least then offer 2 ratings: one "survey" type and another very quick one: just click on a number and you are done. Because surveys REALLY badly discourage people from participating. I worked as an interviewer for a while... People don't give a &^%&*^ about your questions, if you ask more than 2 or 3. I would prefer just one question and 5 ratings from 1 to 5. Simple, easy, effortless. The problems (layout, typos, too technical, etc.) could be fairly easily identified by an experienced user. en:User:Renata3 02:48, 8 December 2005 (UTC)

Rate vs. Validate (MediaZilla:4117)

If we're really going to have such an elaborate survey, shouldn't the tab be called "review" instead of "validate"? To me, "validate" means a simple up or down vote. What we're talking about seems more like reviewing. Plus, "review" is shorter and more easily understood, IMO. Any thoughts on this? Kaldari 22:22, 9 Jun 2005 (UTC)

Or, how about just "rate article"? The word "validate", to me, seems rather haughty, and to imply a finality of judgment for/against an article, based on a level of expertise on the part of raters over the ratee which may not in fact be there. 12.73.201.108 23:45, 9 Jun 2005 (UTC)
'Validate' is terrible. 'Review' is not too bad, but 'Rate' seems decidedly better. The casual reader will understand 'Rate', and unless we're trying to keep them out of the survey, this is good. -R. S. Shaw 19:54, 11 Jun 2005 (UTC)
Agree, 'Validate' is terrible. While "validation" could still be the internal name, I think it is better to name it 'Rate' for users. — ChaTo 15:58, 25 July 2005 (UTC)
It should be in a tab called "Rate this article". Everyone will immediately understand what that's for - David Gerard 20:50, 2 October 2005 (UTC)
Agree. That will definitely be understood by all. --134.214.79.170 13:00, 2 December 2005 (UTC)
I second "Rate this article." The term "Validate", at least to me, means ensuring an article is free from vandalism/copyvio. The1physicist 17:27, 24 November 2005 (UTC)

Bias-Based Negative Ratings

This was somewhat discussed on the article validation possible problems page, but not that I could see completely addressed. That is, the possibility that negative ratings (1, and possibly 2, in the present scheme) may be given because the article contents do not suit the prejudice of the rater - e.g., the article doesn't stress homosexuality is a sin against God; the article doesn't stress that Bush lied to start the War in Iraq because he had a personal desire to 'get Saddam' - or because one or more raters simply don't like the person responsible for the most recent work. Another thing to consider is, the rater may just plain not know what the facts of the matter are, and is relying on misinformation from other sources to form his/her rating.

So, would it be possible to automate 1, and maybe 2, ratings to require a text explanation for the rating, otherwise pressing the "submit button" would result in a rejection? This seems to me to be an ideal way to expose ill-intended or ill-conceived critiques, so that they can be weeded out. 12.73.201.108 23:36, 9 Jun 2005 (UTC)
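
The check proposed above (reject a 1 or 2 unless the comment field is filled in) might look something like this. It is a sketch only: `validate_rating` and its parameters are hypothetical names, and the real form handling lives in the MediaWiki extension, which is not shown here.

```python
def validate_rating(score, comment="", needs_comment=(1, 2)):
    """Return (accepted, message) for one rating line of the form.

    Hypothetical helper: low scores are rejected unless a non-blank
    comment explains them.
    """
    if score not in (1, 2, 3, 4):
        return False, "score must be between 1 and 4"
    if score in needs_comment and not comment.strip():
        return False, "please explain a rating of 1 or 2"
    return True, ""
```

Passing `needs_comment=(1, 2, 3, 4)` would cover the stricter variant in which every rating needs an explanation.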

What about bias based positive ratings?
Those don't concern me quite as much, because there is no invitation in positive ratings to massive revisionism or pulling an article completely off the site without good cause. Positive bias, of course, might result in the opposite effect, of discouraging critical re-evaluation or adding further information, so perhaps explanations should be required from all raters for all ratings. Especially since we have no way of knowing the knowledge-base qualifications of *anyone* to rate someone else's work, even with the best NPOV intentions.
Definitely, I see no benefit to simply tallying up numbers that define which articles are "good" or "bad" without knowing why each rater felt that way, especially since there is nothing mandatory about rating anything and the whole thing will likely come down to the opinions of people prone to going around filling out rating scales. It's rather scary to view the "Should we keep this article" board and observe that opinions are being offered by about 6-12 users tops, from a site which boasts over 13,000 contributing editors and who knows how many non-contributing users, and these little cliques will determine the fate of someone else's hard work.
Of concern, also, is that no one but you and I seem to have any thoughts on this problem whatsoever...

12.73.201.131 18:35, 11 Jun 2005 (UTC)

This is called Rating Vandalism, and it is very easily dealt with. All you have to do is take the last X votes, average them, and then eliminate Y% of the outliers. The1physicist 17:27, 24 November 2005 (UTC)
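
One possible reading of the "last X votes, drop Y% of outliers" recipe is sketched below. The comment does not say how an outlier is defined, so this example assumes it means the votes farthest from the provisional mean of the recent window; the function name `trimmed_rating` and the defaults for `last_x` and `outlier_pct` are likewise assumptions.

```python
from statistics import mean


def trimmed_rating(votes, last_x=20, outlier_pct=10):
    """Average the most recent `last_x` votes after dropping `outlier_pct`
    percent of them, chosen as those farthest from the provisional mean.

    `votes` is a list of scores in chronological order (oldest first).
    """
    if last_x < 1:
        raise ValueError("last_x must be at least 1")
    recent = votes[-last_x:]
    if not recent:
        return None
    centre = mean(recent)
    # Never drop every vote, even if outlier_pct is 100.
    n_drop = min(int(len(recent) * outlier_pct / 100), len(recent) - 1)
    # Sort by distance from the provisional mean; the farthest votes are the outliers.
    by_distance = sorted(recent, key=lambda v: abs(v - centre))
    kept = by_distance[:len(recent) - n_drop]
    return mean(kept)


# Nine ordinary votes and one stray "1": the stray vote is dropped.
score = trimmed_rating([3, 3, 4, 3, 3, 4, 3, 3, 3, 1], last_x=10, outlier_pct=10)
```

Note that this only removes isolated stray votes. If most of the recent votes are malicious, they define the mean and the honest votes become the "outliers".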

Relevance

I'm just wondering what we're going to do with "how relevant is the topic to a general encyclopedia?" Do we really need our readers to tell us where the fancruft is, and are we going to do something with that information? (I.e. "not general enough, let's not include this in the print version because that will make it look bad"). Also, it's pretty unlikely for a casual reader to arrive at an obscure article he/she wasn't really looking for and actually vote on it, or for a reader who was looking for the article to enter a negative rating on relevance.

If we are indeed aiming to find out suitable topics for a "generalized" Wikipedia, are surveys a better idea than creating a page to list the most important articles and edit this by wiki? (I know we have just such a page on Meta, but the title eludes me.) JRM 11:46, 12 Jun 2005 (UTC)

IMO, there is indeed a danger of people looking at articles they really don't know anything about, and voting, especially negatively, based on the fact that they have nothing better to do with their time & this system has no safeguards against that. There are certainly enough "editors" who fall in this category, so I have no reason to believe it can't happen with raters. Furthermore, what is the business with "print version": at this juncture, based on my admittedly limited browsing around the site, I suspect that by the time this 'pedia is up to speed with the current printed types, paper will have little utility outside of a bathroom. Better, I think, to concentrate on having the rating system as an entry point for useful feedback to writers - useful, which can be distinguished from useless, which will also be contributed by the bored and ditzy - and develop Wikipedia as an online resource which, due to the relative limitlessness of cyberspace (vs. bound paper volumes), supersedes any of those as a one-stop-shopping, in-home uberpedia which is of at least equal, if not superior, reliability and completeness vis-a-vis any competition. In that context, setting up a rating system is probably the easiest of first steps towards making this site what people want, or should want, it to be. 12.73.195.158 16:30, 14 Jun 2005 (UTC)

This isn't just about paper versions, it's about all static versions, e.g. to be downloaded and burnt onto DVDs. A rating system would allow all the top-rated articles to be included in "volume one", and all the fancruft to be placed at the other end. A list like List of articles all languages should have does not perform that function unless it's actually ordered by importance, and getting consensus on such a list would be virtually impossible. Incidentally plenty of people will encounter pages about obscure topics they have little interest in, by using the Random Page function. Kappa 04:59, 16 Jun 2005 (UTC)

People voting on things they know nothing about is not a problem. We can identify their votes by the process I've described in the previous section (Y will be slightly different, however). More importantly, if you compute the standard deviation of an article's ratings and it is large (the ratings are all over the map), that implies there is something wrong with the article. Even noise is data. The1physicist 17:27, 24 November 2005 (UTC)
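As a rough illustration of the standard-deviation idea above, the following Python sketch flags an article whose ratings are widely spread. The function names, the minimum number of ratings, and the threshold of 1.0 on the 1-4 scale are assumptions made for this example, not part of the validation feature.

```python
from statistics import pstdev

# Hypothetical settings, chosen only for illustration.
MIN_RATINGS = 3          # below this, the spread is not considered meaningful
SPREAD_THRESHOLD = 1.0   # assumed cut-off on the 1-4 rating scale


def rating_spread(ratings):
    """Return the population standard deviation of a list of ratings."""
    return pstdev(ratings)


def is_contested(ratings, threshold=SPREAD_THRESHOLD):
    """Return True when the ratings are 'all over the map'.

    Articles with too few ratings are never flagged.
    """
    if len(ratings) < MIN_RATINGS:
        return False
    return rating_spread(ratings) > threshold
```

For example, `is_contested([1, 4, 1, 4, 2, 3])` flags the article, while `is_contested([3, 3, 4, 3])` does not. A real cut-off would have to be tuned against actual rating data.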

Captions

I've noticed recently that the captions on many Wikipedia articles are, well, pretty crappy. Should captions be included in image ratings, or should they get their own line? -- Beland 02:36, 2 Jul 2005 (UTC)

In a better world than ours a separate line would be good, but I am not convinced that it is practical to ask readers to analyse down to this level of granularity. -- TheoClarke 22:14, 4 Jul 2005 (UTC)
I think they are important, and might be included with the images, something like "Images, illustrations and their captions" — ChaTo 15:44, 25 July 2005 (UTC)
I think ChaTo's idea would be too confusing for us. If an image got a bad rating, how would we know if it was the image or the caption that was bad? Just "Photos and diagrams" is ok. 69.158.174.42 06:12, 5 December 2005 (UTC)

Time limits between ratings

Since there is currently nothing that can be done to revert vandalism by bots or other vandals, I think it would be a good idea to require a username or IP to wait some period of time between voting on each article. This would also prevent people from quickly voting negative on every article topic they hate. Something like: vote on an article, then wait 5 or 10 minutes before being able to vote on another one. This would also give people time to actually read the article on which they are voting. --Brian0918 19:31, 25 November 2005 (UTC)

Did you read my comments in the "Bias-Based Negative Ratings" and "Validation going the wrong way" sections? Identifying and excluding Rating Vandalism will not be a problem at all (except maybe initially). The1physicist 01:13, 30 November 2005 (UTC)
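The waiting period proposed at the top of this section could be sketched roughly as follows. This is a minimal in-memory illustration in Python; the class name, the method name, and the 5-minute default are assumptions for the example and do not describe the actual implementation.

```python
class RatingCooldown:
    """Hypothetical per-rater cooldown between ratings of different articles."""

    def __init__(self, cooldown_seconds=300):
        # 300 seconds = the 5-minute figure suggested above (an assumption).
        self.cooldown_seconds = cooldown_seconds
        # Maps a rater (username or IP) to the time of their last accepted rating.
        self._last_rating = {}

    def try_rate(self, rater, now):
        """Return True and record the rating if the rater has waited long enough.

        `now` is a timestamp in seconds, passed in explicitly so the
        behaviour is easy to test.
        """
        last = self._last_rating.get(rater)
        if last is not None and now - last < self.cooldown_seconds:
            return False  # still inside the waiting period
        self._last_rating[rater] = now
        return True
```

In use, a caller would pass `time.time()` as `now`. A rejected attempt does not reset the timer, so a rater who tries again too early is not penalised further.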

Aggregate rating visible on page?

Will some sort of aggregate rating be visible on each article? By itself, this is probably not that useful but combined with a visible "change vetting" feature I think this could be extraordinarily useful. An obvious problem with displaying the rating is that it might not be reflective of recent edits (I'm thinking in the bad direction - a "perfect" article was just vandalized and now looks like crap). If the rating is displayed, by itself it can't be trusted since there might have been edits subsequent to the rating. On the other hand, if there were also a feature where a simple "unvetted" flag is kept in the DB, set on any edit (unchanged by bots), cleared by some set of users (at least sysops, but likely a new class of user would be useful), and also displayed on the article - the combination of rating and change vetting would lead to an overall trustworthiness indicator. An article with a high rating but with unvetted changes may not be trustworthy (and if it looks like crap it was probably vandalized). An article with a high rating without any unvetted changes is almost certainly trustworthy. The action to clear the "unvetted" flag could be added as a check box on the edit form (like minor edit or watch this page), visible only to the permitted users and allowing these users to choose to edit an article without clearing the flag if they want. -- Rick Block 17:35, 16 December 2005 (UTC)
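One way to read the rating-plus-vetting proposal above is sketched below in Python. The `Article` class, its field and method names, the labels, and the cut-off of 3.0 for a "high" rating on the 1-4 scale are all assumptions for illustration. The handling of low-rated articles is also an assumption, since the proposal only discusses highly rated ones.

```python
from dataclasses import dataclass

HIGH_RATING = 3.0  # assumed cut-off for a "high" rating on the 1-4 scale


@dataclass
class Article:
    average_rating: float
    unvetted: bool = False  # the proposed "unvetted" flag

    def record_edit(self, by_bot=False, may_vet=False, clear_flag=False):
        """Update the flag after an edit.

        Bot edits leave the flag unchanged. A permitted user who ticks the
        (hypothetical) check box clears it. Any other edit sets it.
        """
        if by_bot:
            return
        if may_vet and clear_flag:
            self.unvetted = False
        else:
            self.unvetted = True

    def trust_label(self):
        """Combine rating and vetting state into a rough indicator."""
        if self.average_rating < HIGH_RATING:
            return "low rating"
        if self.unvetted:
            return "high rating, unvetted changes"
        return "high rating, vetted"
```

Under this reading, a highly rated article shows "high rating, unvetted changes" after any ordinary edit and returns to "high rating, vetted" only once a permitted user clears the flag.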

In Place

A validated template is announced by User:Eequor at External Peer Review. This beats out conjectural popularity contests (with dialogue boxes demanding you fill out skipped items?) in my opinion. If Wikimedia intends to publish, its Internet provider protection against libel suits would get far-fetched. A tort attorney whose client wanted vengeance could kill the foundation. Metarhyme 18 Dec 05