User:Eloquence/WikiQA

Still a work in progress. Good enough for reading, but some more examples will follow.

As of September 2006, there are plans to implement a Quality Assurance model based on two types of revision flags: "sighted revisions" [1] and "reviewed revisions" [2]. Whereas revision sighting is meant to be a simple mechanism to reduce the visibility of vandalism, article review is oriented towards identifying "accurate" articles without major gaps.

These proposed changes have the potential to dramatically alter Wikipedia's participatory model. For the better, because they could increase our credibility and attract more qualified writers. For the worse, because they could reduce the level of participation, cause frustration, and lead to a shift towards a much more restricted model of editing and reviewing articles than is currently practiced. I would like to comment on some aspects of these changes, propose specific modifications, and suggest some long-term strategic considerations.

Showing IP users the last sighted revision

Recommendations in a nutshell:

Quality protection for some pages. On selected pages, as an alternative to page protection, all users should see the last sighted revision.
Instant diffs. On quality-protected pages, trusted users who are allowed to mark revisions as sighted should see a diff to the last sighted revision with a link to instantly approve the changes.
Same revisions served for all users. Under no circumstances should unregistered users be shown different revisions of an article than registered and logged in users.

One proposed aspect of "sighted revision" is to show users who are not logged in the last revision of the article which has a "sighted" flag. While the proposal suggests that a global setting could adjust this preference, it also recommends enabling it for the German Wikipedia.

One of the most attractive aspects of wiki editing is that changes to an article are visible immediately. There is no pre-approval process and no waiting period. When you want to make a correction, you can do so quickly, and this feeling of instant satisfaction arguably drives users towards contributing more. Understandably, this model has its drawbacks: malicious changes are also immediately visible, and recently vandalized revisions are currently not distinguished in any way from those which have received extensive community feedback and review.

The proposal to flag individual revisions as "sighted" seeks to address this by identifying those revisions which have at least passed essential review. The right to flag revisions in that way is to be assigned broadly (it is not clear what the requirements will be, but they may very well be far below the threshold for adminship), and users with that privilege do not have to flag their own edits, as they are assumed to be trustworthy.

It is not the first proposal of that nature. Indeed, MediaWiki has long supported a feature called "Recent Changes Patrolling", where any user can flag changes from the Special:Recentchanges page as "patrolled" (see Help:Patrolled edit). Doing so has no consequences beyond the "Recent changes" page; the feature is designed to more efficiently manage the process of reviewing edits made to a wiki. The feature has received very little use, though some wiki communities have used it to focus on a subset of edits: new page creations.

The "sighted revision" feature is more sophisticated, in that information about sighted revisions is to be shown in multiple places. The fact that edits by trusted users do not need to be flagged also greatly reduces the workload of reviewing changes. On the other hand, the existing "RC patrol" should make it obvious that there are serious scalability issues with any review process.

If the process does not scale, there is a risk of users waiting for hours, days or even weeks for their changes to articles to be effectively approved. In the model where IP users do not see the latest version of an article, such a waiting period can be intensely frustrating. It therefore seems unwise to enable any kind of global setting before the scalability of the process is absolutely secure and the waiting period for sighting changes hardly ever exceeds a few minutes.

Even with a scalable process, the idea of showing logged in users a different revision than IP users is highly questionable. In such a model, URLs no longer reliably identify the same resource. Wherever a URL is used as a reference point, the information it refers to would depend on whether the user is a registered and logged in user of the wiki or not. This could lead to confusing situations in cases where significant changes have not yet been sighted. Discussions between IP users and registered users may be hampered by misunderstandings about revisions.

A much simpler alternative exists: if there is a global setting to show the latest sighted revision, it should apply to all users. However, registered users who are part of the sighting group should see a diff to the latest revision on pages with unreviewed changes. This would speed up changeset processing, and make the process of review more transparent.

However, as noted above, a global setting to show sighted revisions in preference to unsighted ones should not be enabled unless and until it is found to scale sufficiently well, and to not have a dramatic negative impact on the user experience. Instead, revision preference should first be enabled on a per-page level, allowing administrators to "quality protect" pages. This would be an alternative to full protection or semi-protection, and allow edits to be made where it is currently impossible. The criteria for quality protecting pages could be expanded over time, allowing for community-directed application of the functionality, rather than an a priori assumption of scalability.
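The selection rule recommended in this section can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Revision` class, the function names, and the fallback to the latest revision when a page has no sighted revision yet are all hypothetical and are not part of the proposal or of MediaWiki.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Revision:
    rev_id: int
    sighted: bool  # True once a trusted user has flagged this revision


def revision_to_serve(history: List[Revision], quality_protected: bool) -> Revision:
    """Pick the revision shown to every user, logged in or not.

    `history` is assumed to be non-empty and ordered oldest to newest.
    """
    latest = history[-1]
    if not quality_protected:
        return latest
    for rev in reversed(history):
        if rev.sighted:
            return rev
    # Assumption: with no sighted revision yet, fall back to the latest one.
    return latest


def pending_diff(
    history: List[Revision], quality_protected: bool, in_sighting_group: bool
) -> Optional[Tuple[int, int]]:
    """Return (served_id, latest_id) if a trusted user should be offered a diff."""
    served = revision_to_serve(history, quality_protected)
    latest = history[-1]
    if in_sighting_group and served.rev_id != latest.rev_id:
        return (served.rev_id, latest.rev_id)
    return None
```

Note that `revision_to_serve` takes no argument describing the user: the served revision depends only on the page, so a URL identifies the same content for everyone. Only the additional diff offer depends on membership in the sighting group.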

Changes vs. revisions

Recommendation:

Useful diffs. The functionality to flag revisions as sighted should only be available from diffs to the last sighted revision, to avoid gaps in the review process.

The idea to show the latest revision to registered users highlights a critical flaw of thinking in the current proposal: it is highly revision-centric, whereas any scalable process of change review needs to be changeset-centric. Whether it is for the purpose of sighting or reviewing an article, iterations of either process should focus on identifying whether the changes (diffs) since the last review are beneficial. This has user interface consequences, as diffs need to be integrated into the process wherever possible. For example, it might be valuable to show a trusted user a diff when they are editing an article with unreviewed changes. That way, they can quickly indicate whether the most recent edits are trustworthy.

It is also critical that the user interface to flag revisions as being sighted is only available from pages showing useful diffs. For instance, a page showing only the diff to the most recent edit should not be considered useful when there are older edits which have not been sighted. Otherwise, earlier vandalism is easily overlooked in the process of flagging revisions.
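The "useful diff" condition can be stated as a small check. The following is a minimal sketch, assuming revision IDs are kept in a list ordered oldest to newest and sighted IDs in a set; the function names are hypothetical, and the choice to refuse flagging from any diff when nothing has been sighted yet is an assumption rather than something the proposal specifies.

```python
from typing import List, Optional, Set


def last_sighted_id(rev_ids: List[int], sighted: Set[int]) -> Optional[int]:
    """Return the newest sighted revision ID, or None if there is none."""
    for rev_id in reversed(rev_ids):
        if rev_id in sighted:
            return rev_id
    return None


def may_flag_from_diff(
    rev_ids: List[int], sighted: Set[int], diff_old: int, diff_new: int
) -> bool:
    """Offer the 'mark as sighted' control only on a useful diff.

    A diff is treated as useful when it starts at the last sighted
    revision, so that no unsighted edit is skipped.
    """
    if diff_new not in rev_ids:
        return False
    base = last_sighted_id(rev_ids, sighted)
    if base is None:
        # Assumption: with no sighted baseline, no diff is sufficient.
        return False
    return diff_old == base and rev_ids.index(diff_new) > rev_ids.index(base)
```

With revisions 10 to 13 and only 10 and 11 sighted, a diff from 11 to 13 qualifies, while a diff from 12 to 13 does not, because the unsighted edit in revision 12 would go unexamined.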

Reviewing revisions

Recommendation:

Limited use. This feature should only be used to enrich existing community processes such as the "Featured article candidates" with machine-readable metadata, or new community processes to determine the best known available revision of an article.

The proposal also includes the notion of a higher quality review flag, the "reviewed revision." What exactly the quality standards for a "reviewed revision" would be is left open. However, the proposal recommends that only a selected group of users, "reviewers", are allowed to flag articles as being "free of errors and significant gaps."

I consider this, by far, the most problematic part of the stable versions proposal for the following key reasons:

it does not facilitate consensus-based community review processes
it creates a new class of users, without consideration of the social impact and potential for bias
it does not support distributed review work in small chunks, but only processes which are focused on single persons, which are known to scale poorly
its notion of quality is a simplistic binary flag: an article is either "reviewed" or not

Thus, I believe that the quality of the metadata that will be collected through this process must be considered to be poor at best. Nevertheless, some of these flaws apply only to the proposed use of the feature that is described in the proposal, and are not necessarily implicit in the implementation. For instance, the right to flag articles as reviewed could be a purely bureaucratic right, i.e., the users who exercise it must do so because a certain process of review has been followed. For example, the review feature could be used to enhance existing community review processes, such as Wikipedia:Featured article candidates and its equivalents in various languages.

As such, the flag could be seen as a machine-readable indication that an article revision has reached the gold standard of community review. For any more ambiguous statement about the quality of an article revision, I consider the proposed feature to be far too simplistic and not sufficiently collaborative.

An alternative would be to use the flag to indicate the "best available" revision of an article, as determined by a new process of community quality annotation. A "best available" revision could be identified for a far greater number of articles.

Quality annotation

Recommendation:

Positive, machine-readable quality annotation. Either by means of adding (reader-invisible) templates within the wiki source text or outside the wiki page model.

The proposed "reviewed revision" model does not address the need to systematically distribute the process of annotating and improving the quality of wiki pages. However, it can be used to express the outcome of such processes. Within the existing Wikipedia model, we can clearly distinguish examples of positive and negative quality annotation. Positive quality annotation includes featured article status and good article status. Negative quality annotation includes per-article tags such as "NPOV dispute" tags for articles whose neutrality is disputed, or per-segment tags such as Template:Fact for claims lacking citations.

Positive tags are not typically added to articles, but only to the discussion page, and their number is very limited. Negative tags are numerous and very specific; note, for example, the high number of "cleanup tags", including statements such as "This article uses excessive clichés and jargon." or "This article contains passages or phrases in German that are in need of copyediting."

On the other hand, there are presently no mechanisms to express that an article's or section's sources have been verified, that its neutrality has been checked, that it is found to be comprehensive, that it complies with the manual of style, and so on. Instead, these observations are expected to be made in toto during processes like "featured article candidates" and "peer review". The above-mentioned new review process is even worse, in that it expects them to be made by a single person.

Why have Wikipedians not already created such positive quality annotation tags? A number of reasons speak against it:

The presence of visible meta-tags in an article has, to the reader, almost become synonymous with problems in an article.
Authors of an article are likely to make positive quality judgments about it, especially when a revision that meets all positive criteria is given special treatment. Negative quality judgments, on the other hand, are typically made by people who were not involved in the initial authoring process.
The validity of the annotations will quickly expire as the article itself is changed. This is also a problem for negative annotation, but such tags are used infrequently, whereas the goal of positive annotation would be full coverage of an entire article.
Even if the tags were made invisible to casual readers, their presence in the wiki page source could still be a distraction to editors.

Is it, then, at all possible to devise a quality annotation model that does not suffer from these problems? I believe so. Further, I posit that such a model should meet the following requirements:

useful for positive and negative quality annotations
revision-aware, i.e., it should be possible to view the quality annotations for any revision of the page
section-aware, i.e., there should be a machine-readable distinction between annotations that apply to a whole page, or only a subset

Any quality annotation can essentially be expressed as a tuple: a person expresses a statement about an object. The object is a specific revision of a wiki page, either the entire revision, or a subset thereof. The question is then where such annotations will be made by the user. I believe that a case can be made for the annotations to be made in the same location where the annotated content is being authored.
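As a rough illustration, the tuple could be modelled as a small record type. The class and field names below are hypothetical, and the rule that a page-level annotation also covers each section of that revision is an assumption made for the sketch.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class QualityAnnotation:
    user: str       # the person making the statement
    statement: str  # e.g. "Sources checked" or "NPOV disputed"
    positive: bool  # positive or negative quality annotation
    page: str       # title of the annotated wiki page
    rev_id: int     # the specific revision the statement refers to
    section: Optional[str] = None  # None means the entire revision

    def covers(self, rev_id: int, section: Optional[str] = None) -> bool:
        """Tell whether this annotation applies to a revision or one of its sections."""
        if rev_id != self.rev_id:
            return False
        # Assumption: a page-level annotation covers every section.
        return self.section is None or self.section == section
```

Keeping `rev_id` and `section` in the record is what makes an annotation revision-aware and section-aware in the sense of the requirements listed above.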

When authors do not see existing quality annotations while modifying content, there is a great risk that they will make changes that break existing annotation references, without either removing the annotation or applying it to the changed article. For example, if a section of an article has been marked as fact-checked, but the section is then split up, what should happen with its annotation? Such decisions are best left to the editor making the changes, or subsequent editors noticing them.

A simple model of positive quality annotation could use the existing template logic of MediaWiki, e.g., as follows:

  {{QA|
    Sources checked   = [[User:Erik|Erik]] 14:17, 11 September 2006 (CEST) ; [[User:Klaus|Klaus]] 14:16, 11 September 2006 (CEST)|
    NPOV disputed  = [[User:Heinz|Heinz]] 14:16, 11 September 2006 (CEST)|
    Style checked = ~~~~
  }}
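To give an idea of how a template call like this could be turned into machine-readable data, here is a minimal Python sketch. It is not MediaWiki's parser: the function names are hypothetical, it assumes the saved wikitext in which `~~~~` has already been expanded to a full signature, and it only recognises signatures of the form shown above.

```python
import re
from typing import Dict, List

# Matches "[[User:Name|Name]] HH:MM, D Month YYYY" as in the example above.
SIGNATURE = re.compile(
    r"\[\[User:(?P<user>[^|\]]+)\|[^\]]*\]\]\s+"
    r"(?P<time>\d{1,2}:\d{2}), (?P<date>\d{1,2} \w+ \d{4})"
)


def split_top_level(text: str) -> List[str]:
    """Split on '|' characters that are not inside [[...]] links."""
    parts: List[str] = []
    current: List[str] = []
    depth = 0
    i = 0
    while i < len(text):
        two = text[i:i + 2]
        if two == "[[":
            depth += 1
            current.append(two)
            i += 2
        elif two == "]]" and depth > 0:
            depth -= 1
            current.append(two)
            i += 2
        elif text[i] == "|" and depth == 0:
            parts.append("".join(current))
            current = []
            i += 1
        else:
            current.append(text[i])
            i += 1
    parts.append("".join(current))
    return parts


def parse_qa(wikitext: str) -> Dict[str, List[Dict[str, str]]]:
    """Map each QA parameter name to the signatures found in its value."""
    body = wikitext.strip()
    if not (body.startswith("{{QA") and body.endswith("}}")):
        raise ValueError("not a QA template call")
    result: Dict[str, List[Dict[str, str]]] = {}
    for part in split_top_level(body[2:-2])[1:]:  # skip the template name
        if "=" not in part:
            continue
        name, value = part.split("=", 1)
        result[name.strip()] = [m.groupdict() for m in SIGNATURE.finditer(value)]
    return result
```

Splitting has to respect `[[...]]` links, because the signatures themselves contain pipe characters.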

To distinguish between article and section level annotations, different templates, "QA" and "QA section", could be created. The output of these templates could be rendered invisible to editors by default. In order to meet the other requirements described above, changes to MediaWiki would have to be made:

In order to make quality annotations machine-readable, template parameters and values, ideally on a section-level, would have to be tracked in the database. This would be useful in any case. It might be possible to use Semantic MediaWiki for this purpose.
In order to make quality annotations revision-aware, there would have to be a way to get from a signature to a specific revision of a page. The timestamped signature is not fully precise (it does not include seconds), but retrieving the first change made by a given user in a given minute should be sufficient.
To make the approach easy to use, it would be useful to be able to configure different default stylesheets for readers and editors, or to make the selection of custom stylesheets more intuitive.
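The lookup from a signature to a revision, as described in the second point above, might look roughly like this. The sketch assumes the page history is available as `(rev_id, user, timestamp)` tuples and that the signature time has already been converted to the same time zone as the stored timestamps; both the data layout and the function name are hypothetical.

```python
from datetime import datetime
from typing import List, Optional, Tuple

# Hypothetical history row: (revision ID, user name, save time with seconds).
HistoryRow = Tuple[int, str, datetime]


def revision_for_signature(
    history: List[HistoryRow], user: str, signed_minute: datetime
) -> Optional[int]:
    """Return the first revision saved by `user` within the signed minute."""
    candidates = [
        (ts, rev_id)
        for rev_id, rev_user, ts in history
        if rev_user == user
        and ts.replace(second=0, microsecond=0) == signed_minute
    ]
    if not candidates:
        return None
    # The earliest save in that minute is taken as the annotated revision.
    return min(candidates)[1]
```

If a user saves twice within the same minute, the earlier revision is chosen, which matches the rule suggested above.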

This leaves us with the problem of cluttering up the source text of wiki pages with quality annotations. The extent of this problem depends on the number of section-level annotations, as article-level annotations can all be stored in the same location (e.g. the beginning or the end of the page). Section-level annotations might in fact act as a motivating factor to review the entire article, in order to move the annotations to the end. However, ideally, it should be possible to fold quality annotations while editing the text.

Unfortunately, while folding is a standard feature in many source code editors, it is missing in the standard browser textarea control. It might be possible to implement folding-like mechanisms in WYSIWYG editors. As it stands, the minor annoyance of quality annotations on the section level cannot be easily dealt with. If the problem turns out to be too great, it might be advisable to only use page-level annotations for the time being.

The output of the annotation process could be used by reviewers to identify the best available revision of an article, especially if such annotations could be easily tracked across revisions. Reviewers would be uninvolved users who only make judgment calls about the quality of the metadata, rather than the quality of the article. For instance, they would take into account whether the annotations have been made by the authors of the article, or by independent reviewers; whether there are multiple reviewers supporting the key assertions; whether there are ongoing disputes, and so on.
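The kind of summary a reviewer might want when judging the metadata can be sketched as follows. This is purely illustrative: the tuple layout, the function name, and the three output categories are assumptions, and the sketch deliberately produces no score, since the judgment call itself is left to the reviewer.

```python
from typing import Dict, Iterable, List, Set, Tuple

# Hypothetical annotation tuple: (user, statement, positive)
Annotation = Tuple[str, str, bool]


def summarize_annotations(
    annotations: Iterable[Annotation], authors: Set[str]
) -> Dict[str, object]:
    """Group the annotations of one revision for a reviewer's judgment call."""
    independent: Dict[str, Set[str]] = {}
    self_asserted: Set[str] = set()
    disputes: List[str] = []
    for user, statement, positive in annotations:
        if not positive:
            disputes.append(statement)
        elif user in authors:
            self_asserted.add(statement)
        else:
            independent.setdefault(statement, set()).add(user)
    return {
        # Number of non-author users supporting each positive statement.
        "independent_support": {s: len(u) for s, u in independent.items()},
        # Positive statements backed only by the article's own authors.
        "author_only": sorted(self_asserted - set(independent)),
        # Negative annotations still attached to the revision.
        "open_disputes": sorted(set(disputes)),
    }
```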

Involvement of outside experts

While a wiki-internal QA model might work well to split the work of annotating the quality of an article into small packages, it does not address the need to get input from people who have no involvement in the wiki process. Such people will likely always be turned away by complex processes, even if they can be made more accessible using WYSIWYG technology. Essentially, the most we can ask from an uninvolved expert is to give us their opinion about a given revision of an article.

It might nevertheless be useful to facilitate the process of such feedback, for instance, by adding a "submit review" tab to the navigation bar of the standard skin for unregistered users. When clicked, the tab would present a form asking the user for information about themselves (any academic affiliation, contact information, and so on), and provide them with a simple text field to state their opinion about the article. (Uploading a document such as a Word or OpenOffice.org file might also be supported, as many users prefer to type in a word processor.)

In addition, the user would be asked to state whether they want their review to appear publicly on the site, or to be sent to a private mailing list where only trusted editors will see it. Public reviews would be attached to the discussion page of the article (sans any private information) in addition to being sent to the private list. In either case, the community of editors could look into the review and apply recommended changes as needed. Naturally, the review form should record which revision of the article the reviewer has looked at.
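The routing of a submitted review could be sketched like this. All names are hypothetical, and treating the `contact` field as the only private information is an assumption; a real form would need a considered policy on which fields count as private.

```python
from dataclasses import asdict, dataclass
from typing import Dict, List, Tuple


@dataclass
class ExternalReview:
    page: str
    rev_id: int       # the revision the reviewer actually looked at
    reviewer_name: str
    affiliation: str
    contact: str      # assumed to be the private part of the submission
    text: str
    public: bool      # reviewer's choice: public or private list only


def route(review: ExternalReview) -> List[Tuple[str, Dict[str, object]]]:
    """Return (destination, payload) pairs for one submitted review."""
    # Every review reaches the private list with all submitted details.
    deliveries: List[Tuple[str, Dict[str, object]]] = [
        ("private_list", asdict(review))
    ]
    if review.public:
        # The discussion page copy omits the private contact details.
        public_copy = {
            "page": review.page,
            "rev_id": review.rev_id,
            "reviewer_name": review.reviewer_name,
            "affiliation": review.affiliation,
            "text": review.text,
        }
        deliveries.append(("talk_page", public_copy))
    return deliveries
```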