This is a personal essay that explores the question of what quality is and how projects have assessed it in the past. In November 2009, I was asked to join the task force for quality at Wikimedia's strategic planning project. The subject this task force deals with, quality, is somewhat vague. This essay tries to explore what quality means at Wikimedia projects, in its broadest sense.
Input is always welcome! All Wikimedia users are invited to respond, comment or make suggestions on the talk page.
This page isn't meant to be static; I may change my thoughts and edit it now and then.
The example of the Dutch writing contest
As a regular visitor of the Dutch review and featured article candidates pages, I have often mused on what makes a good article at a Wikipedia project. I have been involved in the Dutch writing contest for two years now: in 2008 as a jury member, in 2009 as an organizer. The Dutch Wikipedia was actually the first to hold a writing contest, and the contest's standards for quality and cooperation are high. The jury of 2008 published a long and detailed list of requirements, which was shortened and clarified a bit this year. The current text gives a list of criteria which the jury can use for judging articles:
- The lede. This should be neither too long nor too short. It should contain an understandable and complete definition and a short summary of the article's subject.
- The structure of the article, including the choice of section titles and paragraph subdivisions. It should be clear and logical and/or follow a natural order. The main subjects are separated from the periphery.
- The lay-out. The lay-out should not differ too much from what is considered normal at this wiki. If possible, relevant images/figures are used. Figures, tables and graphs are, where possible, explained with a short text. The lay-out shouldn't distract from the content: images, templates and other elements shouldn't be used in an inconsistent and/or restless way.
- The content is balanced, neutral, relevant for the subject and especially as complete as possible. It should be sufficient to describe the subject thoroughly, but there shouldn't be more than necessary to do so. The article shouldn't contain information that isn't encyclopaedically relevant. Images, tables, templates and/or graphs are only used when they illustrate the information in the text (exception: elements used for navigation with related articles).
- The style, including 'readability', grammar, spelling, word choice and word order. The language of the article should be understandable for the target audience of readers: readers with a general interest who don't have specialist knowledge.
- The verifiability. Controversial statements should be based on reliable sources. The sources used should be as relevant as possible to the subject of the article. They should also be as trustworthy as possible.
- The findability. The article is well linked from related articles, if necessary by a short summary and a main article/subarticle construction. Where necessary, redirects have been created and links have been added to disambiguation pages.
A general set of quality factors for Wikipedias
Some of these requirements can be found in the manuals of style and/or guidelines of various Wikipedia projects. They are seen as contributing to the quality of an article, and thus are quality factors. I presented this list here as an example, not to say that these are the only or the best ways to judge quality. Below, I have tried to list all quality factors that have been recognized in Wikimedia projects. The list can be roughly split into four groups:
- 1. requirements of content:
  - 1.1 encyclopaedicity of content: the E-thing. Is the content at home in the format of an encyclopaedia?
  - 1.2 verifiability: is the content truly tertiary, that is, traceable to sources? Are these sources trusted? Primary/secondary? This requirement excludes original research.
  - 1.3 neutrality: is the content NPOV?
  - 1.4 balance: are all subjects covered with weight due to their relative importance? Are sources used in proportion to their importance?
- 2. requirements of demand (from the reader's POV):
  - 2.1 interestingness: this requirement (not present among the contest's rules) is about what people are searching for. In other words: what is the information demand and is it met?
  - 2.2 article completeness: are all important aspects of the subject covered?
  - 2.3 degree of specialism: is the subject covered in a way the target audience understands? On the other hand, content may also be too simple. And the target audience can differ per subject!
  - 2.4 relevance: relevance of the content for the subject (this is different from 1.1 encyclopaedicity!)
- 3. requirements of form:
  - 3.1 correctness of the language: spelling, grammar, punctuation
  - 3.2 encyclopaedicity of language: no weaseling, no boasting or needlessly difficult language: the text should be straight to the point and educational (this is related to 1.3 neutrality!).
  - 3.3 style of the text: is the 'wording' plain and clear? Is it 'easy to read'?
  - 3.4 structure: is the structure clear? This includes the lede and a logical division into sections/paragraphs.
  - 3.5 lay-out: is the lay-out clear? It shouldn't distract from the content either.
- 4. requirements of the project:
  - 4.1 findability of individual pages: is the reader able to find the content they are searching for?
  - 4.2 project completeness: the number of different subjects covered by the project.
  - 4.3 project balance: different areas of content have the same level of completeness.
  - 4.4 project consistency: the requirement that all pages in a project follow the same set-up or format in things like categorisation, lay-out, etc.
Are these quality factors different for different projects?
Browsing through this list, one will automatically find that the relative importance of these factors may be perceived differently by each user. The different projects hosted by Wikimedia may have different ideas about their relative importance too. The readers of the Urdu and German Wikipedias will on average have different fields of interest, due to cultural, sociological and geographical differences. Some quality factors may be perceived to be less important in some projects than in others, due to differences in objectives and quality standards between the projects. This is reflected by the fact that the guidelines on quality vary enormously between different projects!
The list above refers to Wikipedias, yet similar lists can be made for other projects. For example: Wiktionaries have different format requirements, Wikinews projects have different source requirements, and Wikisource projects have no requirements of content at all (they are simply copies of the source). 'Encyclopaedicity of content' has its counterpart in a 'dictionaricity' requirement for Wiktionaries, for example.
The quality of change
Wikiprojects are dynamic sources of information: their content is changing every moment. Some projects have a high rate of change, others are slower. The quality of a project increases or decreases with every change (edit) made. The total amount of 'quality' in a project is not constant; it grows or shrinks with time. In this section I try to analyse which things add and which destroy quality.
By simple reasoning one can come to the conclusion that there are only two types of edits: those that add quality and those that remove quality. However, there are two more types of edits to be considered, bringing the total number of edit types to four:
- Edits only adding quality (of any type)
- Edits only removing quality (of any type)
- Neutral edits (example: an edit at an English project which changes the British spelling of an entire article into American spelling)
- Edits that both add and remove quality.
The first type includes many things: from the removal of linguistic errors (quality factor 3.1) to the addition of content (quality factor 2.2) to the removal of POV (quality factor 1.3). This type of edit is truly positive since it increases the total quality. Similarly, the second type includes the introduction of spelling errors, the destruction of wiki-syntax (perhaps caused by the inexperience of the editor; I'll come to that later), or the removal of good content. The first two types have a clear effect on quality. If we want to increase quality, we should stimulate the first type and try to discourage the second type.
The other two types are less easy to assess. Neutral edits seem at first to be of little concern. They don't change the total amount of quality. However, they may have a demotivating effect on other editors (I'll come to that later too). The fourth type is probably a major type of edit, yet seldom thought of. An edit may be both constructive and destructive, mainly because one quality factor increases while another decreases. For example: the quality factor 'neutrality' increases, but the edit contains errors in grammar and/or punctuation. Or the other way around.
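The four edit types can be sketched as a small classification over per-factor changes. This is only an illustration under assumptions of my own: the signed scores, the factor labels used as dictionary keys, and the names `EditType` and `classify_edit` are all hypothetical, and nothing here says how such scores would actually be measured.

```python
from enum import Enum


class EditType(Enum):
    ADDS = 1      # edit only adds quality
    REMOVES = 2   # edit only removes quality
    NEUTRAL = 3   # edit changes no quality factor
    MIXED = 4     # edit adds to some factors and removes from others


def classify_edit(deltas):
    """Classify an edit from its per-factor quality changes.

    `deltas` maps a quality-factor label to a hypothetical signed score:
    positive if the edit improved that factor, negative if it harmed it,
    zero if it left the factor unchanged.
    """
    gains = any(d > 0 for d in deltas.values())
    losses = any(d < 0 for d in deltas.values())
    if gains and losses:
        return EditType.MIXED
    if gains:
        return EditType.ADDS
    if losses:
        return EditType.REMOVES
    return EditType.NEUTRAL


# The example from the text: neutrality improves, but grammar gets worse.
print(classify_edit({"neutrality": 1, "language correctness": -1}))
```

Note that the sketch deliberately does not net the scores against each other: whether a mixed edit is a gain or a loss overall depends on how much weight one gives each factor, which is exactly what makes the fourth type hard to assess.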
There are three main types of editors or users at Wikimedia projects with regard to their effect on quality:
- The destructive user type: those that remove quality (vandals, some 'true believers').
- The 'problematic' users. Those that have an effect on the total amount of quality by writing new content or helping with maintenance, but are perceived as having a largely negative effect on quality. Many of their edits will eventually be reverted. The community comes to regard a user as 'problematic' by giving the decrease in some quality factors more weight than the increase in others.
- The 'good' users. Those that have a positive effect on the total amount of quality. Their edits are usually not reverted and are perceived as constructive by the community. This perception means (as with problematic users, but in the opposite direction) that the quality factors that increase by the added content are seen as more important than the quality factors that decrease. There are at least 17 ways in which quality can increase (see above), but I'd like to split 'good' users into two groups:
  - The quality user type. Those that increase the total amount of quality by creating new content and in that way increase quality factors of content and demand (factors/requirements 1.1 to 2.4).
  - The maintenance user type. Those that increase the total amount of quality by reverting destructive users or making small edits that mainly increase the quality factors of form and of the project (requirements/factors 3.1 to 4.4).
Of course, these user types are end members of a black-and-white classification. Most users will be somewhere in between two or more of these types.
Editors have otherwise been grouped into IP-users, normal users, rollbackers, admins, bureaucrats, stewards, etc. Discussion and research about quality change has mainly focussed on the difference between the first type and the rest.
The law of wiki-erosion
At first thought, the destructive type of edit seems to be most damaging to quality. However, this type is easier to spot and will more often be reverted within hours of being made. Another important factor is which quality factor the destructive edit decreases. Among the quality factors of content, encyclopaedicity is the easiest to judge, neutrality is more difficult, and balance is the hardest. The chance that an edit destructive to balance is reverted is smaller than the chance for an edit introducing non-encyclopaedic content, simply because the maintenance user who judges the edit doesn't know enough about the content.
The English Wikipedia states that vandalism is any addition, removal, or change of content made in a deliberate attempt to compromise the integrity of Wikipedia. Vandalism clearly belongs to the second edit type I identified above, by definition. Most vandalistic edits will reduce 1.1 (encyclopaedicity of content). As we can see, there are more edit types removing quality than just vandalism.
Since vandalism is easy to spot, the long-term removal of quality due to plain vandalism is probably insignificant compared to the other destructive edits (edit type 2) and the edits that are both destructive and constructive (edit type 4). In searching for ways to increase quality, we should therefore focus on the other quality-removing edits and the editors that make them.
Let's first look at the process itself. If we assume the rate of change and the percentages of all four edit types to remain constant, the rates of quality increase and quality decrease remain constant too. Wherever the decrease outweighs the increase, this means any page in the project is subject to a slow decay in quality, which I call wiki-erosion. Quality is guarded by the community though. The ability to revert destructive edits of all kinds is related to the amount of knowledge in the community. This means destruction can only go as far as a certain quality level. If the community is larger, that level will be higher; if it is smaller, the level will be lower. Thus, in the long term, quality of any sort will stand a larger risk of being destroyed at small projects, even though the wiki-erosion rate there is much smaller.
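The reasoning above can be made concrete with a toy model. It is a rough sketch under assumptions of my own, not a measurement of any project: quality is a single number, every step adds a fixed `gain` and removes a fixed `loss`, and the community's knowledge is represented by a `floor` below which destructive edits are always noticed and reverted. The function name, the parameters and all numbers are hypothetical.

```python
def simulate_erosion(start, floor, gain, loss, steps):
    """Return the quality level after each step of a toy erosion model.

    start: initial quality of a page (arbitrary points)
    floor: quality level the community knows enough to defend
    gain:  quality added per step by constructive edits
    loss:  quality removed per step by unreverted destructive edits
    """
    quality = start
    history = []
    for _ in range(steps):
        quality += gain  # constructive edits always stick
        # Damage above the floor goes unnoticed; damage that would push
        # quality below the floor is spotted and reverted.
        quality = max(quality - loss, min(quality, floor))
        history.append(quality)
    return history


# Same page and same edit mix, guarded by a large and by a small community.
large = simulate_erosion(start=80, floor=50, gain=1, loss=3, steps=30)
small = simulate_erosion(start=80, floor=20, gain=1, loss=3, steps=30)
print(large[-1], small[-1])  # 50 20
```

In this sketch both pages erode at the same rate, but the page in the larger community stops eroding sooner and at a higher level. Lowering `gain` and `loss` for the small project would slow its erosion without changing the level at which it ends up.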
Earlier strategies to increase quality
- AndyZ's suggestions for quality standards at wp-en
- User:Bhneihouse - Barriers to quality
- Wikiproject Vandalism studies at wp-en
- User:Opabinia regalis - recent mainspace changes survey. Limited (n = 250) survey of recent changes at wp-en, mainly trying to establish their type. Note that the classes aren't objective though.
- User:Woudloper - Onvertrouwde bewerkingen. A survey (n = 750) of the type of recent changes by IP-users and new users at wp-nl (in Dutch).
Notes
- Free translation of nl:Wikipedia:Schrijfwedstrijd#Beoordelingscriteria.
- Actually, some members of the jury have afterwards stated they won't include this requirement in their judgement. To be honest, I think it is the least important requirement for the contest. If I were to rewrite it, I would include a sentence about the title choice to make it more relevant.
- "The simplest statements evoke the most wisdom: verbose language and fancy technical words are used to convey shallow thought." (Day, 1994)
- I use the term 'true believer' to mean any user with a strong POV. See also en:Wikipedia:The Truth and Piotrus' essay "On the most dangerous of mindsets" for assessments of this type of user.
- See en:Wikipedia:Vandalism.