Wiki stats other than the article count

From Meta, a Wikimedia project coordination wiki

Maintaining additional statistics is a distinct issue from the article count, though it often serves the same purposes (inter-language comparison, comparison with commercial encyclopaedias, marking milestones).


Voting doesn't really make sense on these, as they are not mutually exclusive: every single one can be produced and provided if anyone is interested. --Brion VIBBER 02:31 14 Mar 2003 (UTC)

I agree. Consider the votes more an expression of interest in any of those options being implemented, not a vote to actually implement them. --Eloquence 21:38 14 Mar 2003 (UTC)

No additional statistics: No additional stats need to be provided.

Votes for this option:


Total size: The total size of all articles (uncompressed) is shown.

Pro:

  • Another measure of achievement
  • May motivate some to write longer articles rather than create new stubbish ones
  • Might be a fairer comparison between Wikipedias with differing stub/dictionary policies

Contra:

  • May motivate large datadumps rather than real article writing

Votes for this option:


Total pages in A4 paper: The total number of pages, measured in sheets of A4 paper, is counted.

Pro:

  • Easier to compare with paper-based encyclopaedias
  • Another measure of achievement
  • May motivate some to write longer articles rather than create new stubbish ones

Contra:

  • May need language-specific coefficients to convert byte size into A4 pages
  • Font size and other typographical settings strongly affect the result
  • A4 paper is not well known in North America
  • Depends on font size / font type / page formatting / ...
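
The conversion discussed above could be sketched roughly as follows. This is only an illustration: the function name and both coefficients (bytes per character, characters per A4 page) are assumptions that would have to be chosen per language and per page layout, and nothing here is an agreed-upon method.

```python
def estimate_a4_pages(total_bytes: int, bytes_per_char: float, chars_per_page: float) -> float:
    """Estimate how many A4 pages a given amount of article text would fill.

    bytes_per_char and chars_per_page are language- and layout-specific
    coefficients supplied by the caller (hypothetical parameters, not
    measured values).
    """
    if bytes_per_char <= 0 or chars_per_page <= 0:
        raise ValueError("coefficients must be positive")
    # bytes -> characters -> pages
    return total_bytes / bytes_per_char / chars_per_page


if __name__ == "__main__":
    # Placeholder coefficients purely for demonstration.
    print(estimate_a4_pages(3_000_000, bytes_per_char=1.0, chars_per_page=3000.0))
```

The result is only as good as the coefficients, which is exactly the objection raised in the list above.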

Votes for this option:


Total number and frequency table of words: The number of words in each article is counted. The total number of words and a graphical histogram are offered.
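
A minimal sketch of how such a word count and histogram might be produced. It assumes that words are separated by whitespace (which does not hold for every language) and that articles are available as a mapping from title to text; the function names and the bin width of 100 words are illustrative choices, not part of the proposal.

```python
from collections import Counter


def word_count(text: str) -> int:
    # Assumption: words are whitespace-separated. Languages written
    # without spaces between words would need a different tokenizer.
    return len(text.split())


def word_stats(articles: dict, bin_width: int = 100):
    """Return (total words, histogram of per-article word counts).

    The histogram maps the lower bound of each bin (0, 100, 200, ...)
    to the number of articles whose word count falls in that bin.
    """
    counts = [word_count(body) for body in articles.values()]
    histogram = Counter((n // bin_width) * bin_width for n in counts)
    return sum(counts), dict(sorted(histogram.items()))


def render_histogram(histogram: dict, bin_width: int = 100) -> str:
    # One text row per bin, one '#' per article in that bin.
    return "\n".join(
        f"{start}-{start + bin_width - 1}: {'#' * n}"
        for start, n in histogram.items()
    )


if __name__ == "__main__":
    total, hist = word_stats({"A": "one two three", "B": "x " * 150, "C": ""})
    print(total)
    print(render_histogram(hist))
```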

Pro:

  • A more common measure for text than the number of bytes
  • The growth of single articles can also be shown

Contra:

  • Number of words depends on the language

Votes for this option:


Frequency Table ([1]) by 50 bytes: The number of articles in each size range (1-50, 51-100, ... 951-1000, 1001-1050, ... 29951-30000, 30001+) is calculated once a week.

Yes, but rather than a table, this should be plotted as a graph! That avoids the problem of having a huge table to get up to the 30,000 size, and it allows us to present the information visually - a picture is worth a thousand words...
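
For illustration, the 50-byte binning could be computed along these lines. This is a sketch under assumptions: article sizes are given as byte counts, bins are 50 bytes wide starting at 1, and everything above 30,000 bytes goes into one open-ended bin; the function names are made up for this example. The resulting counts could feed either a table or a plot.

```python
from collections import Counter


def size_bin_label(size: int, width: int = 50, cap: int = 30000) -> str:
    """Return the label of the size range an article of `size` bytes falls in."""
    if size > cap:
        return f"{cap + 1}+"
    # Sizes 1..width share bin 0; an empty article (size 0) is also
    # placed in the first bin, which is an arbitrary choice.
    index = max(size - 1, 0) // width
    low = index * width + 1
    return f"{low}-{low + width - 1}"


def frequency_table(sizes, width: int = 50, cap: int = 30000) -> Counter:
    # Count how many articles fall into each size range.
    return Counter(size_bin_label(s, width, cap) for s in sizes)


if __name__ == "__main__":
    table = frequency_table([12, 49, 51, 999, 1000, 45000])
    for label, count in table.items():
        print(label, count)
```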

Pro:

  • Gives a more in-depth understanding of each language part of Wikipedia
  • Especially useful for monitoring behaviour of small articles.

Contra:

  • Not sensitive to hour-to-hour activities
    • Retort: Nobody needs these statistics so frequently
  • Not useful over the entire range; articles up to 30,000 bytes would fall into 600 ranges.

Votes for this option:


Frequency Table by 250/1000 bytes: The number of articles in each size range (1-250, 251-500, 501-750, 751-1000, 1001-2000, ... 9001-10000, 10001+) is calculated once a month.
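
Because these ranges are not all the same width, one possible way to compute the table is to keep a sorted list of upper bounds and look each article size up with a binary search. The sketch below assumes sizes are byte counts; the names `UPPER_BOUNDS`, `bucket_labels` and `bucketed_frequency_table` are invented for this example.

```python
from bisect import bisect_left

# Upper bounds of the closed ranges: 250, 500, 750, then 1000..10000 in steps of 1000.
UPPER_BOUNDS = [250, 500, 750] + list(range(1000, 10001, 1000))


def bucket_labels(bounds):
    """Build labels such as '1-250', '251-500', ..., '10001+' from the upper bounds."""
    labels = []
    low = 1
    for high in bounds:
        labels.append(f"{low}-{high}")
        low = high + 1
    labels.append(f"{low}+")  # open-ended final range
    return labels


def bucketed_frequency_table(sizes, bounds=UPPER_BOUNDS):
    labels = bucket_labels(bounds)
    table = {label: 0 for label in labels}
    for size in sizes:
        # bisect_left finds the first upper bound >= size; sizes above
        # every bound land in the final open-ended range.
        table[labels[bisect_left(bounds, size)]] += 1
    return table


if __name__ == "__main__":
    for label, count in bucketed_frequency_table([120, 260, 980, 1500, 20000]).items():
        print(label, count)
```

Keeping empty ranges in the output (with a count of 0) makes tables from different languages line up row by row.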

Pro:

  • Gives a more in-depth understanding of each language part of Wikipedia
  • A comprehensive picture behind the "article count" can be shown
  • Capturing various aspects of the growth, it encourages writers to improve existing articles rather than just creating new ones.
  • Better than the too-fine (1-50/51-100/...) table.

Contra:

  • The general 'size'-related issues (see above).

Votes for this option:


Frequency Table by vigintiles: Shows the sizes of the articles ranked N/20, 2N/20, ... 19N/20 when all articles are sorted by size, where N is the total number of articles.
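
Since the method is relatively unfamiliar, a small sketch may help. It sorts the article sizes and picks the article at rank kN/20 for k = 1 to 19. Rounding the rank up when kN/20 is not a whole number (the "nearest-rank" convention) is an assumption of this example, as is the function name; Python's standard `statistics.quantiles(data, n=20)` offers an interpolating alternative.

```python
def vigintiles(sizes):
    """Return the 19 vigintile cut points of the article sizes.

    Uses the nearest-rank convention: the k-th vigintile is the value at
    rank ceil(k * N / 20) in the ascending list (ranks start at 1).
    """
    ordered = sorted(sizes)
    n = len(ordered)
    if n == 0:
        raise ValueError("at least one article size is required")
    result = []
    for k in range(1, 20):
        rank = -(-k * n // 20)  # integer ceiling of k * n / 20
        result.append(ordered[rank - 1])
    return result


if __name__ == "__main__":
    # 100 articles with sizes 1..100 bytes, purely as demonstration input.
    print(vigintiles(range(1, 101)))
```

The tenth value in the result is the median article size, which is often more telling than the mean when a wiki has many stubs.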

Pro:

  • I think this would give a good comparison between languages of the quality of articles in each language as opposed to just the number of articles in each language.

Contra:

  • Relatively unknown method

Votes for this option: