List of Wikipedias by sample of articles

This is an archived version of this page, as edited by MarsRover at 06:10, 19 April 2008 (fix language links). It may differ significantly from the current version.

This page contains a list of the largest Wikipedias under the auspices of the Wikimedia Foundation for various languages. Test Wikipedias are listed at the Wikimedia Incubator Wiki project.

This list of Wikipedias uses the List of articles every Wikipedia should have (total: 1052 on the 3rd of April) as a sample. The list actually used is at the end of List of Wikipedias by sample of articles/Source code and can differ slightly.

For every Wikipedia, the articles in this sample list are retrieved (based on interwiki links from the English Wikipedia) and the number of characters in each is counted (excluding comments and the interwiki text at the bottom of the article). The size of each article is then adjusted for each language by multiplying it by the language weight. The articles are divided into four classes: "absent" (i.e. non-existent; size = 0), "stubs" (fewer than 10,000 characters), "articles" (between 10,000 and 30,000 characters) and "long articles" (more than 30,000 characters). The average weighted size of the non-absent articles in the sample is also calculated.

Finally, a score is computed from the formula rawscore = stubs + articles*4 + long.articles*9. To give a consistent scale, the raw score is normalized by dividing it by the maximum score and multiplying by 100. The maximum score is maxscore = (absent + stubs + articles + long.articles)*9, the value reached if every article in the sample were a long article. The final score is score = rawscore / maxscore * 100. The language editions are then listed in order of decreasing score.
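The classification and scoring steps described above can be sketched in a few lines of Python. This is only an illustration, not the program from the Source code subpage: the function names `classify` and `score` are hypothetical, and treating a weighted size of exactly 30,000 as an "article" rather than a "long article" is an assumption, since the description does not say which side the boundary falls on.

```python
def classify(chars, weight):
    """Return the class of one sampled article.

    chars  -- raw character count (0 if the article does not exist)
    weight -- language weight from the table
    """
    if chars == 0:
        return "absent"
    size = chars * weight  # weighted size in characters
    if size < 10_000:
        return "stub"
    if size <= 30_000:  # assumption: exactly 30,000 still counts as an article
        return "article"
    return "long"


def score(absent, stubs, articles, long_articles):
    """Normalized score: rawscore / maxscore * 100."""
    raw = stubs + articles * 4 + long_articles * 9
    max_score = (absent + stubs + articles + long_articles) * 9
    return raw / max_score * 100


# Counts taken from the English (en) row of the table.
print(round(score(0, 71, 331, 650), 2))  # 76.52
```

With the counts from the Deutsch row (6, 190, 432, 424), the same function gives 60.56, which matches the listed score.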

A copy of the program used to obtain this list is in List of Wikipedias by sample of articles/Source code.

Absent articles for major Wikipedias are in List of Wikipedias by sample of articles/Absent Articles.

Last Update: 18 April 2008

Rank Wiki Language Weight Average Article Size (wt. chars) Absent (0k) Stubs (< 10k) Articles (10-30k) Long Art. (> 30k) Score
1 en English 1.0 44 119 0 71 331 650 76.52
2 de Deutsch 1.0 33 099 6 190 432 424 60.56
3 fr Français 1.0 28 427 4 273 439 336 53.37
4 es Español 1.1 25 154 2 325 457 268 48.22
5 it Italiano 1.1 22 389 29 339 436 247 45.52
6 zh 中文 3.7 23 562 1 429 376 245 43.75
7 ru Русский 1.4 21 455 1 424 407 220 42.59
8 ja 日本語 1.9 15 918 10 485 424 133 35.68
9 pt Português 1.1 15 335 14 529 381 128 33.85
10 pl Polski 1.1 13 803 14 588 343 107 30.87
11 hu Magyar 1.1 14 093 93 544 298 115 29.32
12 fi Suomi 1.1 13 056 35 615 300 102 28.87
13 sv Svenska 1.1 11 719 12 666 281 93 27.75
14 cs Čeština 1.3 12 221 61 590 316 85 27.66
15 he עברית 1.2 11 458 37 595 352 68 27.62
16 nl Nederlands 0.9 11 100 20 618 353 61 27.24
17 vi Tiếng Việt 1.1 15 000 220 477 241 114 26.06
18 no Norsk (Bokmål) 1.2 10 839 45 675 255 77 25.22
19 ca Català 1.1 9 811 6 728 265 53 23.92
20 tr Türkçe 1.3 10 728 66 661 256 63 23.92
21 uk Українська 1.3 9 935 43 695 251 62 23.86
22 sr Српски / Srpski 1.4 11 253 133 611 229 79 23.64
23 hr Hrvatski 1.3 9 042 120 674 206 52 20.76
24 sk Slovenčina 1.3 10 050 179 618 197 57 20.29
25 ro Română 1.1 9 586 170 636 193 53 19.91
26 da Dansk 1.2 8 025 56 790 157 48 19.56
27 el Ελληνικά 1.1 9 891 277 541 179 55 18.50
28 bg Български 1.1 7 796 127 713 177 35 18.34
29 ko 한국어 2.5 7 333 99 757 158 38 18.28
30 id Bahasa Indonesia 1.0* 6 400 95 776 152 28 17.30
31 ar العربية 1.0 5 487 55 834 148 15 16.49
32 sl Slovenščina 1.2 6 834 113 768 153 18 16.29
33 gl Galego 1.0* 7 803 220 635 176 20 16.06
34 eo Esperanto 1.1 5 625 13 926 92 21 15.66
35 th ไทย 1.0 6 000 170 739 121 22 15.01
36 fa فارسی 1.2 5 558 211 707 116 18 14.08
37 ms Bahasa Melayu 1.0* 7 996 374 520 129 29 13.70
38 lt Lietuvių 1.0* 5 248 159 776 108 7 13.45
39 is Íslenska 1.0* 2 749 24 978 45 5 12.71
40 simple Simple English 1.0* 3 610 66 922 59 5 12.71
41 sh Srpskohrvatski / Српскохрватски 1.0* 6 570 371 555 109 17 12.08
42 et Eesti 1.0* 4 367 212 764 61 12 11.82
43 bs Bosanski 1.0* 4 716 265 695 90 2 11.33
44 la Latina 1.1 3 759 231 764 45 12 11.11
45 nn Nynorsk 1.0* 4 482 278 702 61 11 11.04
46 eu Euskara 1.0* 4 105 268 711 65 8 11.02
47 lv Latviešu 1.0* 5 274 392 568 82 10 10.41
48 af Afrikaans 1.0* 8 507 581 358 88 25 9.88
49 mk Македонски 1.0* 4 757 420 565 54 12 9.40
50 ps پښتو 1.0* 33 785 921 23 36 72 8.61
51 ka ქართული 1.0* 3 821 418 591 34 8 8.45
52 cy Cymraeg 1.0* 2 390 362 676 12 2 7.84
53 bn বাংলা 1.0* 3 783 503 498 40 9 7.82
54 br Brezhoneg 1.0* 4 304 497 504 45 6 7.79
55 zh-yue 粵語 3.7** 7 076 659 327 50 16 7.09
56 ta தமிழ் 0.9 3 416 500 520 28 3 6.97
57 lb Lëtzebuergesch 1.0* 4 467 586 421 36 8 6.73
58 yi ייִדיש 1.0* 3 199 547 474 23 7 6.65
59 qu Runa Simi 1.0* 2 658 519 501 32 0 6.64
60 sq Shqip 1.0* 4 156 583 428 36 5 6.52
61 scn Sicilianu 1.0* 2 474 514 517 20 1 6.40
62 ast Asturianu 1.0* 3 636 555 467 28 2 6.31
63 hi हिन्दी 1.0* 2 472 539 489 20 3 6.30
64 oc Occitan 1.0* 3 555 575 441 32 3 6.30
65 ml മലയാളം 1.0* 5 093 637 365 45 5 6.23
66 sw Kiswahili 1.0* 2 986 539 503 9 1 5.79
67 io Ido 1.0* 1 674 516 534 1 1 5.78
68 tl Tagalog 1.0* 3 202 627 396 25 3 5.53
69 zh-min-nan Bân-lâm-gú 1.2 1 947 550 496 6 0 5.49
70 ga Gaeilge 1.0* 4 317 666 354 25 7 5.46
71 nds Plattdüütsch 1.0* 4 266 691 326 25 10 5.45
72 ur اردو 1.0* 4 148 684 328 33 6 5.43
73 be-x-old Беларуская (тарашкевіца) 1.0* 3 737 661 363 28 0 5.02
74 su Basa Sunda 1.0* 9 101 844 143 53 11 4.80
75 az Azərbaycan 1.0* 2 355 655 384 10 2 4.67
76 ku Kurdî / كوردی 1.0* 2 220 650 391 11 0 4.59
77 an Aragonés 1.0* 2 614 660 381 10 1 4.54
78 ia Interlingua 1.0* 2 694 700 338 13 1 4.21
79 mr मराठी 1.0* 2 703 744 295 8 5 3.93
80 tg Тоҷикӣ 1.0* 1 936 725 320 5 2 3.78
81 mn Монгол 1.0* 4 034 775 257 17 3 3.72
82 als Alemannisch 1.0* 7 423 859 153 33 7 3.68
83 te తెలుగు 1.0* 6 408 858 153 35 6 3.66
84 jv Basa Jawa 1.0* 3 134 742 297 12 0 3.65
85 vec Vèneto 1.0* 3 394 794 242 14 2 3.34
86 be Беларуская 1.0* 3 393 794 243 13 2 3.31
87 gd Gàidhlig 1.0* 2 176 769 275 8 0 3.24
88 li Limburgs 1.0* 3 718 796 241 14 1 3.23
89 vo Volapük 1.0* 1 609 766 283 1 2 3.22
90 fy Frysk 1.0* 3 623 791 247 14 0 3.20
91 kn ಕನ್ನಡ 1.0* 4 957 859 170 15 7 3.10
92 uz O‘zbek 1.0* 2 727 807 233 10 2 3.07
93 bat-smg Žemaitėška 1.0* 1 493 776 274 2 0 2.98
94 cv Чăваш 1.0* 2 626 804 241 7 0 2.84
95 hy Հայերեն 1.2 2 481 826 218 8 0 2.64
96 ht Krèyol ayisyen 1.0* 1 332 817 232 2 1 2.63
97 nrm Nouormand/Normaund 1.0* 1 981 822 230 0 0 2.43
98 pms Piemontèis 1.0* 3 385 874 169 6 3 2.32
99 sco Scots 1.0* 1 914 839 212 1 0 2.28
100 fo Føroyskt 1.0* 2 385 860 185 7 0 2.25
101 nah Nāhuatl 1.0* 2 511 862 184 4 1 2.21
102 fur Furlan 1.0* 2 851 863 183 6 0 2.19
103 mt Malti 1.0* 4 280 909 127 14 2 2.12
104 pam Kapampangan 1.0* 5 642 912 124 15 1 2.04
105 nov Novial 1.0* 1 564 875 173 4 0 2.00
106 zh-classical 古文 / 文言文 3.7** 5 234 904 135 11 1 1.99
107 kk Қазақша 1.0* 4 479 903 138 10 1 1.98
108 bar Boarisch 1.0* 4 730 916 122 13 1 1.93
109 dv ދިވެހިބަސް 1.0* 3 619 918 122 10 2 1.90
110 wa Walon 1.0* 2 345 889 160 3 0 1.82
111 sa संस्कृतम् 1.0* 1 163 889 161 2 0 1.78
112 lij Líguru 1.0* 1 466 888 164 0 0 1.73
113 jbo Lojban 1.0* 1 218 889 163 0 0 1.72
114 ceb Sinugboanong Binisaya 0.8 2 024 902 146 4 0 1.71
115 ln Lingala 1.0* 1 710 904 144 4 0 1.69
116 diq Zazaki 1.0* 2 635 903 148 1 0 1.61
117 nds-nl Nedersaksisch 1.0* 2 871 905 146 1 0 1.58
118 kw Kernewek/Karnuack 1.0* 2 001 907 145 0 0 1.53
119 frp Arpitan 1.0* 1 426 911 141 0 0 1.49
120 ne नेपाली 1.0* 3 203 936 110 5 1 1.47
121 os Иронау 1.0* 1 896 917 134 1 0 1.46
122 am አማርኛ 1.0* 1 166 918 134 0 0 1.42
123 new नेपाल भाषा 1.0* 4 758 967 73 11 1 1.33
124 wuu 吴语 3.7** 15 682 1 001 36 9 6 1.33
125 ang Englisc 1.0* 1 822 933 116 2 0 1.31
126 ksh Ripoarisch 1.0* 2 247 942 109 0 1 1.25
127 hsb Hornjoserbsce 1.0* 2 312 940 111 1 0 1.21
128 ilo Ilokano 1.0* 2 713 940 111 1 0 1.21
129 tpi Tok Pisin 1.0* 4 623 975 68 7 2 1.20
130 se Sámegiella 1.0* 1 160 939 113 0 0 1.19
131 arc ܐܪܡܝܐ 1.0* 1 115 943 109 0 0 1.15
132 gv Gaelg 1.0* 2 013 947 104 1 0 1.14
133 wo Wolof 1.0* 2 533 961 87 3 1 1.14
134 lad Dzhudezmo 1.0* 3 259 958 90 4 0 1.12
135 bpy ইমার ঠার/বিষ্ণুপ্রিয়া মণিপুরী 1.0* 4 890 965 83 1 2 1.11
136 gu ગુજરાતી 1.0* 3 046 968 77 7 0 1.11
137 rm Rumantsch 1.0* 1 871 966 81 5 0 1.07
138 yo Yorùbá 1.0* 1 806 950 100 0 0 1.06
139 ay Aymar 1.0* 742 955 97 0 0 1.02
140 cbk-zam Chavacano de Zamboanga 1.0* 21 788 1 024 15 7 6 1.02
141 vls West-Vlams 1.0* 3 737 970 77 2 1 0.99
142 lmo Lumbaart 1.0* 3 951 982 62 8 0 0.99
143 co Corsu 1.0* 1 862 966 85 1 0 0.94
144 ie Interlingue 1.0* 2 249 977 72 2 1 0.94
145 fiu-vro Võro 1.0* 1 177 964 88 0 0 0.93
146 crh Qırımtatarca 1.0* 1 516 966 86 0 0 0.91
147 so Soomaaliga 1.0* 4 033 990 58 2 2 0.89
148 war Winaray 1.0* 1 522 974 78 0 0 0.82
149 csb Kaszëbsczi 1.0* 1 581 976 76 0 0 0.80
150 sc Sardu 1.0* 1 934 983 67 2 0 0.79
151 nap Nnapulitano 1.0* 2 278 980 70 1 0 0.78
152 tt Tatarça / Татарча 1.0* 2 398 986 62 3 0 0.78
153 pdc Deitsch 1.0* 1 349 980 72 0 0 0.76
154 kab Taqbaylit 1.0* 2 632 990 59 3 0 0.75
155 si සිංහල 1.0* 12 286 1 024 21 3 4 0.73
156 ky Кыргызча 1.0* 1 064 984 68 0 0 0.72
157 pag Pangasinan 1.0* 2 131 993 56 3 0 0.72
158 ba Башҡорт 1.0* 2 082 990 61 1 0 0.69
159 eml Emiliàn e rumagnòl 1.0* 2 594 995 56 0 1 0.69
160 lo ລາວ 1.0* 1 696 990 62 0 0 0.65
161 sd سنڌي، سندھی ، सिन्ध 1.0* 18 318 1 037 6 5 4 0.65
162 iu ᐃᓄᒃᑎᑐᑦ 1.0* 769 993 59 0 0 0.62
163 km ភាសាខ្មែរ 1.0* 2 691 1 003 46 3 0 0.61
164 tk تركمن / Туркмен 1.0* 1 588 995 57 0 0 0.60
165 mg Malagasy 1.0* 1 348 1 001 51 0 0 0.54
166 cu Словѣньскъ 1.0* 1 232 1 000 50 0 0 0.53
167 pa ਪੰਜਾਬੀ 1.0* 5 402 1 020 26 6 0 0.53
168 mi Māori 1.0* 2 585 1 011 39 2 0 0.50
169 rmy romani - रोमानी 1.0* 1 664 1 006 46 0 0 0.49
170 tet Tetun 1.0* 2 923 1 006 46 0 0 0.49
171 gn Avañe'ẽ 1.0* 1 733 1 011 40 1 0 0.46
172 map-bms Basa Banyumasan 1.0* 1 497 1 010 42 0 0 0.44
173 kg KiKongo 1.0* 1 309 1 011 41 0 0 0.43
174 na dorerin Naoero 1.0* 1 791 1 011 41 0 0 0.43
175 udm Удмурт кыл 1.0* 3 032 1 014 38 0 0 0.40
176 roa-rup Armãneashce 1.0* 1 828 1 016 36 0 0 0.38
177 ks कश्मीरी / كشميري 1.0* 3 114 1 023 26 2 0 0.36
178 ig Igbo 1.0* 3 330 1 027 24 0 1 0.35
179 zea Zeêuws 1.0* 3 479 1 022 30 0 0 0.32
180 haw Hawai`i 1.0* 1 088 1 024 28 0 0 0.30
181 mzn مَزِروني 1.0* 1 261 1 024 28 0 0 0.30
182 ty Reo Mā`ohi 1.0* 1 144 1 025 27 0 0 0.29
183 ce Нохчийн 1.0* 2 028 1 029 23 0 0 0.24
184 to faka Tonga 1.0* 1 788 1 032 18 0 0 0.19
185 stq Seeltersk 1.0* 1 730 1 035 17 0 0 0.18
186 roa-tara Tarandíne 1.0* 5 424 1 043 7 2 0 0.16
187 pap Papiamentu 1.0* 2 743 1 042 9 1 0 0.14
188 or ଓଡ଼ିଆ 1.0* 827 1 043 9 0 0 0.10
189 pi पाऴि 1.0* 3 506 1 047 4 1 0 0.08
190 bh भोजपुरी 1.0* 1 681 1 047 5 0 0 0.05
191 glk گیلکی 1.0* 815 1 047 5 0 0 0.05
  • Weights marked "*" are not available for that language, so the default weight of 1.0 is used.
  • Weights marked "**" are borrowed from a known related language (e.g. 'zh').
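
The two footnotes amount to a lookup with two fallbacks, which might look like the sketch below. The names `KNOWN_WEIGHTS`, `RELATED` and `language_weight` are hypothetical, and only a few weights from the table are included for illustration. How the actual program stores its weights is not stated here.

```python
DEFAULT_WEIGHT = 1.0

# A few known weights copied from the table (illustrative subset).
KNOWN_WEIGHTS = {"en": 1.0, "zh": 3.7, "ja": 1.9, "ru": 1.4}

# Languages that borrow the weight of a related language ("**" rows).
RELATED = {"zh-yue": "zh", "zh-classical": "zh", "wuu": "zh"}


def language_weight(code):
    """Return (weight, marker) for a language code.

    marker is "" for a known weight, "**" for a weight borrowed from
    a related language, and "*" when the default of 1.0 is used.
    """
    if code in KNOWN_WEIGHTS:
        return KNOWN_WEIGHTS[code], ""
    related = RELATED.get(code)
    if related in KNOWN_WEIGHTS:
        return KNOWN_WEIGHTS[related], "**"
    return DEFAULT_WEIGHT, "*"


print(language_weight("zh-yue"))  # (3.7, '**')
print(language_weight("is"))      # (1.0, '*')
```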