Article count reform vote

See Article count reform

As of March 13th 2003, 20:00 GMT, no new options may be proposed for this voting round. We will now vote on which solution should be used. The voting system we use is average voting (also known as range voting). Simply put, you assign each option you care about a value from 1 to 6 (1 being very good, 6 very bad); the mean average for each option is calculated, and the best-rated option wins. The deadline for determining the outcome of the vote is Monday, March 17th, 20:00 GMT.

Duplicate and invalid (such as 7 or -2) votes are not counted.
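
The counting rule above can be sketched roughly as follows. This is only an illustration, not the script used for the actual tally: the names `tally` and `winner`, the ballot format (a list of voter, option, rating tuples), and the choice to keep a voter's first rating when a duplicate appears are all assumptions.

```python
from collections import defaultdict

def tally(ballots):
    """Return {option: mean rating}; lower is better (1 = very good)."""
    seen = set()                       # (voter, option) pairs already counted
    ratings = defaultdict(list)
    for voter, option, rating in ballots:
        if rating not in range(1, 7):  # invalid values such as 7 or -2
            continue
        if (voter, option) in seen:    # assumption: repeat votes are ignored
            continue
        seen.add((voter, option))
        ratings[option].append(rating)
    return {opt: sum(r) / len(r) for opt, r in ratings.items()}

def winner(ballots):
    """Option with the lowest (best) mean rating."""
    means = tally(ballots)
    return min(means, key=means.get)
```

Voters who skip an option simply contribute nothing to that option's mean, which matches "each option you care about".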

Proposed solutions

VOTING IS NOW CLOSED. DO NOT ADD ANY VOTES BELOW.

Note that some of the solutions below can be combined; specifically, an article size criterion can be combined with one or more extra restriction(s) on what constitutes an article.

Article size

Please note: None of the article count proposals (including non-zero) count articles that contain nothing but whitespace (blanks, tabs, newlines), and none of them count redirects.
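
A size criterion of this kind might look like the sketch below. It assumes that size is measured in UTF-8 bytes after surrounding whitespace is stripped and that a redirect is a page starting with `#REDIRECT`; the function name `is_counted` is hypothetical, and the real software may measure size differently.

```python
def is_counted(text, threshold=0):
    """True if a page counts as an article under a byte-size threshold."""
    stripped = text.strip()            # whitespace-only pages become empty
    if stripped.upper().startswith("#REDIRECT"):
        return False                   # assumption: redirects start with #REDIRECT
    # "greater than N bytes": a page of exactly N bytes is not counted
    return len(stripped.encode("utf-8")) > threshold
```

With `threshold=0` this is the non-zero proposal; passing 5, 20, 100, 250, or 500 gives the other fixed-size proposals.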

Non-zero: An article is counted if it is greater than 0 bytes in size.

Pro:

Very simple, straightforward definition. It may count some stub articles, but in general, even stubs should be considered articles -- or removed. Blanked articles are not counted.

Contra:

Perhaps the threshold should be slightly higher, to avoid counting recently added nonsense pages.
- Retort: this is a very small noise fluctuation (mostly less than 10%), and best handled by blanking, removing, or turning them into articles as they are found.
Bot-generated articles are counted, which may be considered a problem if the count is supposed to reflect actual human effort.
- Retort: but if it's meant to reflect content, such articles should be counted. And human effort can be seen in edit counts.
Articles that have been incorrectly blanked (e.g. with a single space or a single new line) will show up in the count.
- Trailing whitespace is removed on save, so such a page would be zero bytes long.

Votes for this option:

1 (very good): Brion VIBBER Kurt Jansson Jan Pedersen Didier Urosp Schewek, TakuyaMurata Vulture RoseParks, Toby
2: Eloquence Eclecticology Yann Aldie Formulax Snark Lorenzarius, Maveric149 Athymik
3: Snoyes ffx The Anome Ryo Andre Engels Geoffrey
4: Cordyph Nicolas
5: Tomos Scipius
6 (very bad): Ams80 Ducker Aoineko Smurf Wizzer MyRedDice Koyaanis Qatsi Quercusrobur Anthere Giskart

5 bytes: An article is counted if it is greater than 5 bytes in size no matter what.

Pro:

Eliminates essentially blanked articles inadvertently left with blank spaces or lines.

- This is a non-problem, see above.

Votes for this option:

1 (very good): Eclecticology Formulax
2: Cordyph Lorenzarius Vulture, Toby
3: Snoyes ffx Brion VIBBER, TakuyaMurata Menchi RoseParks
4: Eloquence Aldie Kurt Jansson Wizzer Andre Engels
5: Tomos Nicolas Urosp
6 (very bad): Yann Ams80 Ducker Smurf Ryo Jan Pedersen Didier Schewek Geoffrey MyRedDice Koyaanis Qatsi, Maveric149 Anthere Aoineko Giskart Scipius

20 bytes: An article is counted if it is greater than 20 bytes in size.

Pro:

Could solve the problem of nonsense articles being counted.

Contra:

20 bytes is shorter |<-- than this line, not a significant improvement in thresholding.
Easy to manipulate
Too big; 20 bytes of Kanji can be enough for an article.
- Or maybe not. Examples: 21bytes, 20 bytes

Votes for this option:

1 (very good): Eloquence Snoyes MyRedDice Formulax Aldie Cordyph
2: Lorenzarius Menchi Quercusrobur, Toby
3: ffx Brion VIBBER Wizzer Andre Engels Urosp Geoffrey Vulture RoseParks
4: Eclecticology The Anome, TakuyaMurata
5: Ams80 Ducker Tomos Kurt Jansson Aoineko Nicolas Koyaanis Qatsi Anthere Scipius
6 (very bad): Yann Smurf Ryo Jan Pedersen Didier Schewek, Maveric149 Giskart

100 bytes: An article is counted if it is greater than 100 bytes in size.

Pro:

The smallest stubs (one line or 2 short lines) are no longer counted

Contra:

Some consider small stubs legitimate articles too.

Votes for this option:

1 (very good): Yann Smurf Andre Engels Geoffrey MyRedDice Koyaanis Qatsi Anthere
2: Eloquence ffx Formulax Cordyph Wizzer Aoineko Scipius
3: Brion VIBBER The Anome Ryo Nicolas Menchi RoseParks
4: Snoyes Ams80 Ducker Vulture Giskart
5: Tomos, TakuyaMurata, Toby
6 (very bad): Eclecticology Aldie Kurt Jansson Jan Pedersen Didier Lorenzarius Urosp Schewek, Maveric149

250 bytes: An article is counted if it is greater than 250 bytes in size.

Pro:

Stub articles are no longer counted.

Contra:

Some legitimate articles might always be this small.
- Retort: That's questionable; perhaps we should merge content that will be a permanent stub.
Some crap articles, including datadump'd stubs, are bigger than this size.

Votes for this option:

1 (very good): MyRedDice Koyaanis Qatsi Anthere Giskart Scipius
2: Yann Ams80 ffx Formulax Andre Engels Nicolas Geoffrey
3: Brion VIBBER Ducker Tomos Wizzer
4: Eloquence Ryo
5: Snoyes Aoineko Cordyph Urosp
6 (very bad): Eclecticology Aldie Kurt Jansson Smurf Jan Pedersen Didier Lorenzarius Schewek, TakuyaMurata Menchi Vulture RoseParks, Maveric149 Quercusrobur, Toby

500 bytes: An article is counted if it is greater than 500 bytes in size.

Pro:

Stub articles are no longer counted.

Contra:

Threshold is too high.
The number of articles in the English Wikipedia would drop below 100,000.
- Retort: Delaying the implementation may solve the problem.
- Retort: If the number of usable, information-containing articles really is less than 100,000, so be it; we don't need to force lower thresholds to get nice, meaningless numbers.

Votes for this option:

1 (very good): Ams80 Tomos, Maveric149 Anthere Scipius
2: Ducker Koyaanis Qatsi
3: Formulax Nicolas Giskart
4: ffx Andre Engels
5: Eloquence Geoffrey
6 (very bad): Eclecticology Snoyes Yann Brion VIBBER Aldie Kurt Jansson Aoineko Smurf Cordyph Wizzer Ryo Jan Pedersen Didier Lorenzarius Urosp Schewek, TakuyaMurata Menchi Vulture MyRedDice RoseParks Quercusrobur, Toby

Dynamic: For example, calculate the average size of stub articles, then use it as the threshold.

Pro:

Instead of an arbitrary number, use a number that is at least based on statistical data.

Contra:

Determining what is a stub may be a problem (see below).
- Retort: Use the <stub> tag or flag.
The threshold varies over time.
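
As a rough sketch of the dynamic idea, assuming the sizes of known stubs are already available (for example via a stub tag or flag), the threshold is just their mean. The function names and the list-of-sizes input are hypothetical.

```python
def dynamic_threshold(stub_sizes):
    """Mean size in bytes of the pages currently identified as stubs."""
    return sum(stub_sizes) / len(stub_sizes)

def count_articles(page_sizes, stub_sizes):
    """Count pages strictly larger than the current stub average."""
    threshold = dynamic_threshold(stub_sizes)
    return sum(1 for size in page_sizes if size > threshold)
```

Because `stub_sizes` changes as stubs are written or expanded, the same page can move in and out of the count, which is the "varies over time" objection.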

Votes for this option:

1 (very good): Ducker, TakuyaMurata
2: Geoffrey
3: MyRedDice
4: Eloquence Tomos Formulax Lorenzarius
5: Snoyes Ams80 The Anome Cordyph Ryo, Toby
6 (very bad): Eclecticology Yann Brion VIBBER Aldie Kurt Jansson Smurf Wizzer Jan Pedersen Andre Engels Didier Nicolas Urosp Schewek Vulture RoseParks Koyaanis Qatsi, Maveric149 Anthere 62.255.64.6 Aoineko Giskart Scipius

Compactness of language: There seems to be an issue with the compactness of different written languages. If we choose to define articles by being a certain size, e.g. 250 bytes, then perhaps there should be a scaling factor for each language. To do this we could take a passage in English of a certain size, say 2500 bytes, and get a native speaker of each language to translate it into their language. If in Japanese (for example) the translation is 1500 bytes, then the criterion for an article in Japanese should be 150 bytes instead of 250 bytes. [Or, an idea that just came to me: find a very common text, War and Peace or a Dickens novel or even the Bible, and use that as the base comparison.]

The second method seems better than the first, since a fresh translation will differ in size from (I think it will generally be larger than) an original text. It is also more objective.

Note: could use the translations file for basic language-verbosity info to a 1st approximation
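
The scaling described in this proposal works out as below. The function name `scaled_threshold` is made up for illustration; the figures are the ones from the Japanese example in the proposal.

```python
def scaled_threshold(base_threshold, reference_size, translated_size):
    """Scale an English byte threshold by a language's relative text size."""
    return base_threshold * translated_size / reference_size

# 2500-byte English passage, 1500-byte Japanese translation, 250-byte base
japanese_threshold = scaled_threshold(250, 2500, 1500)
```

Here `japanese_threshold` comes to 150 bytes, the figure given in the proposal.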

Votes for this option:

1 (very good): Ams80 Ducker Formulax Geoffrey, MyRedDice TakuyaMurata
2: The Anome
3: Nicolas
4: Ryo Lorenzarius
5: Eloquence Cordyph Wizzer Andre Engels Vulture, Toby
6 (very bad): Yann Brion VIBBER Tomos Aldie Kurt Jansson Smurf Jan Pedersen Didier Urosp Schewek RoseParks, Maveric149 Anthere Athymik Aoineko Giskart Scipius

Further restrictions

No further restrictions: Only the above size criterion should be used.

Pro:

Simplicity.

Contra:

Article definition is not very accurate.
We still must not count redirects, user pages, talk pages, etc.

Votes for this option:

1 (very good):
2: Andre Engels, Toby
3: Eclecticology Urosp
4: Eloquence Snoyes Kurt Jansson Anthere Ams80
5: The Anome Cordyph
6 (very bad): Ducker Aldie Brion VIBBER (This suggestion doesn't seem to make sense. The article count is, by definition, a count of non-redirect pages in article namespace. This seems to suggest counting redirects, user pages, and talk pages, which defeats the purpose. It may have been miswritten.) Tomos Smurf Formulax Wizzer Ryo Didier Lorenzarius Nicolas Geoffrey, TakuyaMurata Vulture MyRedDice RoseParks, Maveric149 Athymik Aoineko Giskart Scipius

Comma: Only an article that includes a comma is counted.

Pro:

Compatible with the current system
History of article statistics doesn't break
Normal articles in English will contain a comma.

Contra:

Unfair because some languages, notably Japanese, don't use commas much.
- Retort: Those languages could be excluded from the comma rule.
A bizarre criterion that can be thwarted by adding a comma to every article that lacks one.
- Retort: No more bizarre than trying to define an article by its size or the number of edits. Every criterion can be thwarted.

Votes for this option:

1 (very good): Kurt Jansson
2:
3:
4: Andre Engels Geoffrey Anthere
5: Eloquence ffx Ducker Menchi Scipius
6 (very bad): Eclecticology Snoyes Yann Brion VIBBER Aldie The Anome Tomos Aoineko Smurf Formulax Cordyph Wizzer Ryo Didier Lorenzarius Nicolas Urosp Schewek, TakuyaMurata Vulture MyRedDice RoseParks Koyaanis Qatsi, Maveric149 Quercusrobur Athymik Ams80, Toby Giskart

Language-dependent punctuation: Only an article that includes particular punctuation dependent on language is counted. (e.g. 。 or 、 in Japanese)

Pro:

The English Wikipedia remains untouched.
Most languages use certain punctuation.

Contra:

Requires an internal decision for each language
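
One way to picture this rule, together with the comma rule it generalizes, is a per-language table of required marks. The table contents are assumptions made for illustration (in particular the Japanese marks 、 and 。), and each Wikipedia would have to decide its own entry.

```python
# Hypothetical per-language table; each Wikipedia would choose its own marks.
REQUIRED_PUNCTUATION = {
    "en": {","},         # the existing comma rule
    "ja": {"、", "。"},   # assumption: ideographic comma and full stop
}

def has_required_punctuation(text, lang):
    """True if the text contains at least one mark required for its language."""
    marks = REQUIRED_PUNCTUATION.get(lang, {","})  # assumption: comma by default
    return any(mark in text for mark in marks)
```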

Votes for this option:

1 (very good): Kurt Jansson Menchi Anthere
2: TakuyaMurata
3: Tomos Cordyph Lorenzarius
4: Eloquence ffx Ducker Urosp Aoineko
5: Ryo Andre Engels Geoffrey Vulture, Toby Scipius
6 (very bad): Eclecticology Snoyes Brion VIBBER Aldie The Anome Smurf Formulax Wizzer Didier Nicolas Schewek MyRedDice RoseParks Koyaanis Qatsi, Maveric149 Athymik Ams80 Giskart

Link: Only pages with at least one link (existing or broken) are counted.

Pro:

Would remove unchecked newbie pages, as well as some arguably non-encyclopedic content, while keeping in most legitimate articles.

Contra:

Might still lose some legitimate articles.
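
A minimal sketch of the link test, assuming links are written in the usual `[[...]]` wiki syntax. Whether the target exists is deliberately not checked, since broken links count too; the name `has_link` is hypothetical.

```python
import re

# Matches [[Target]] or [[Target|label]]; empty brackets do not count.
WIKI_LINK = re.compile(r"\[\[[^\[\]]+\]\]")

def has_link(text):
    """True if the page text contains at least one wiki link."""
    return WIKI_LINK.search(text) is not None
```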

Votes for this option:

1 (very good): Yann The Anome Aoineko Ryo Andre Engels Didier Geoffrey MyRedDice Anthere, Toby Scipius
2: Eloquence Snoyes Cordyph Urosp
3: ffx Brion VIBBER Tomos Wizzer Lorenzarius Nicolas RoseParks
4: Aldie Ducker Ams80 Giskart
5: Vulture Athymik
6 (very bad): Eclecticology Kurt Jansson Smurf Formulax Jan Pedersen Schewek, TakuyaMurata, Maveric149

Stub flag: Stub articles are flagged in some unique way so that an alternative count can be provided that does not include stubs. For example, by linking to them from en:find or fix a stub (or equivalent). Other similar flags could be necessary (to exclude various lists, which are not stubs), but they can be implemented the same way.

Pro:

More accurate definition of "article", stub criterion is provided by humans, not by some arbitrary byte size.
On en:wikipedia we already link stubs in this way

Contra:

Extra effort.
- Not really, we already link to "This article is a stub" from many stub pages. This information is stored already (try "What links here" on the stub page), it just needs to be standardized.
There are many stubs that contain useful information
- So it might be useful to provide both counts.
More confusing meta information in articles for new editors
- It's already there.
Defining what qualifies for a stub flag would be a whole new debate
- We already do this.
(similar to the <ARTICLE>-tag further down)
- No, not at all, we already do flag stubs, we just don't use the info.

Votes for this option:

1 (very good): Eloquence Aldie Smurf MyRedDice
2: ffx The Anome Formulax Geoffrey
3: Ducker Scipius
4: Wizzer Ryo
5: Snoyes, Toby
6 (very bad): Eclecticology Yann Brion VIBBER Tomos Kurt Jansson Aoineko Cordyph Andre Engels Didier Lorenzarius Nicolas Urosp Schewek, TakuyaMurata Vulture RoseParks, Maveric149 Anthere Ams80 Giskart

Minimum edits: An article is counted only if a certain number of edits has been made (e.g. 2).

Pro:

Bot articles no longer counted.

Contra:

Some people want bot articles to be counted
- Suggestion: we could offer a variety of counts, rather than just one
Article may be perfect even though it has only been edited once. (eg moved from Nupedia, etc)
We may need another round of voting to set the required # of edits

Votes for this option:

1 (very good): Giskart
2: Eloquence Geoffrey
3: Nicolas Urosp
4: Tomos Wizzer
5: Snoyes ffx Formulax Vulture, Toby
6 (very bad): Eclecticology Yann Brion VIBBER Aldie Ducker Kurt Jansson Aoineko Smurf Cordyph Ryo Andre Engels Didier Lorenzarius Schewek, TakuyaMurata RoseParks, Maveric149 Anthere Ams80 Scipius

Minimum contributors: An article is counted only if a certain number of contributors have edited it (e.g. 3).

Pro:

Bot articles no longer counted.
By agreeing on the number of contributors needed, we can assume a "certain" degree of quality for the articles (this is an average, of course), meaning the article has been read, re-thought, re-modeled, etc by different people.

Contra:

Some people want bot articles to be counted
- Suggestion: we could offer a variety of counts, rather than just one
Article may be perfect even though it has only been edited once. (eg moved from Nupedia, etc)
We may need another round of voting to set the required # of contributors
Is more a 'quality' measure.
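
Both history-based rules (minimum edits and minimum contributors) can be sketched with a single check, assuming an article's history is available as a list of contributor names, one entry per edit. That input format and the function name are assumptions.

```python
def passes_history_rule(history, min_edits=1, min_contributors=1):
    """history: contributor names, one entry per edit, oldest first."""
    enough_edits = len(history) >= min_edits
    enough_people = len(set(history)) >= min_contributors  # distinct names
    return enough_edits and enough_people
```

A bot-created page that nobody else has touched has one edit and one contributor, so `min_edits=2` or `min_contributors=3` (the example values from the proposals) would exclude it, along with an untouched import from Nupedia.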

Votes for this option:

1 (very good): MyRedDice Anthere (with a number set differently depending on wikipedias:-)) Giskart
2: Eloquence The Anome Nicolas Geoffrey
3: Wizzer Koyaanis Qatsi
4: ffx Tomos Urosp
5: Snoyes Formulax Vulture
6 (very bad): Eclecticology Yann Brion VIBBER Aldie Ducker Kurt Jansson Aoineko Smurf Cordyph Ryo Andre Engels Didier Lorenzarius Schewek, TakuyaMurata RoseParks, Maveric149 Ams80, Toby Scipius

Two paragraphs: An article is counted only when it is two paragraphs long or more.

Pro:

Encourages people to break up paragraphs appropriately, which is good style

Contra:

English encyclopaedia contains many one-paragraph articles
- Retort: but how many of these are stubs that we don't want to count?
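
A paragraph test could look something like this, assuming paragraphs are separated by at least one blank line, as in wiki text. The function names are hypothetical.

```python
import re

def paragraph_count(text):
    """Number of non-empty blocks separated by one or more blank lines."""
    blocks = re.split(r"\n\s*\n", text.strip())
    return sum(1 for block in blocks if block.strip())

def is_counted_by_paragraphs(text, minimum=2):
    return paragraph_count(text) >= minimum
```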

Votes for this option:

1 (very good): Ducker MyRedDice
2: The Anome
3: Tomos Geoffrey Vulture
4: Formulax Cordyph
5: Eloquence Snoyes ffx Andre Engels Nicolas Ams80
6 (very bad): Eclecticology Yann Brion VIBBER Aldie Kurt Jansson Aoineko Smurf Wizzer Ryo Didier Lorenzarius Urosp Schewek, TakuyaMurata RoseParks, Maveric149 Anthere, Toby Giskart Scipius

<ARTICLE> Tag: A tag is added to all entries that can be considered articles.

Pro:

Small articles will be included
Lists can be excluded, if so desired.

Contra:

Some people may put the tag on things that others may not consider articles.
Extra effort.
More confusing meta information in articles for new editors
- Retort: Not very complicated or hard to understand

Votes for this option:

1 (very good): ffx Aldie Wizzer
2: Formulax Geoffrey
3: Ducker
4:
5: Urosp Athymik
6 (very bad): Eloquence Eclecticology Snoyes Yann Brion VIBBER Tomos Kurt Jansson Aoineko Smurf Cordyph Ryo Andre Engels Didier Lorenzarius Nicolas Schewek, TakuyaMurata Vulture MyRedDice RoseParks, Maveric149 Anthere Ams80, Toby Giskart Scipius

Independent systems for each Wikipedia

The choice is decided by each Wikipedia. Most could end up with the same system.

Votes for this option:

1 (very good): Aoineko Schewek , TakuyaMurata, Maveric149 Athymik, Toby
2: ffx Ryo
3: Ducker Cordyph
4: Vulture
5: Andre Engels Urosp
6 (very bad): Yann Brion VIBBER Aldie Tomos Kurt Jansson Smurf Formulax Wizzer Didier Lorenzarius Nicolas Geoffrey (This might impede the same software being used in all versions; if we decide here, that system can be hard-coded in.) MyRedDice RoseParks Anthere Ams80 Giskart Scipius

Divide the size of the database by a certain byte size to determine the number of articles.

Pro:

Simple to calculate
More difficult to manipulate the number of articles.

Contra:

Estimation lacks accuracy, especially because it includes talk pages.
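
The arithmetic behind this proposal is a single integer division. The sketch below uses hypothetical names, and the divisor is whatever "certain byte size" would be agreed on.

```python
def estimated_article_count(database_bytes, bytes_per_article):
    """Estimate the article count as total database size over a fixed size."""
    return database_bytes // bytes_per_article
```

Because `database_bytes` covers every page, talk pages included, the estimate rises with discussion as well as with article text, which is the objection raised above.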

Votes for this option:

1 (very good):
2:
3: TakuyaMurata
4: The Anome Tomos, Maveric149
5: Eloquence Snoyes Ducker Cordyph Wizzer Nicolas Geoffrey Giskart
6 (very bad): Eclecticology Aldie Aoineko Brion VIBBER (offtopic; this does not in any way measure the number of articles) Smurf Formulax Ryo Andre Engels Didier Lorenzarius Urosp Vulture MyRedDice RoseParks Anthere Ams80, Toby Scipius