List of Wikipedias by sample of articles

From Meta, a Wikimedia project coordination wiki
This is an archived version of this page, as edited by MarsRover (talk | contribs) at 16:41, 3 May 2008 (Recalculated zh:, simple:, ca:, es: for May 2 update). It may differ significantly from the current version.

This page contains a list of the largest Wikipedias under the auspices of the Wikimedia Foundation for various languages. Test Wikipedias are listed at the Wikimedia Incubator Wiki project.

This list of Wikipedias uses the List of articles every Wikipedia should have (total: 1052 on the 3rd of April) as a sample; the list actually used is at the end of List of Wikipedias by sample of articles/Source code and can differ slightly. For every Wikipedia, the articles in this sample list are retrieved (based on interwiki links from the English Wikipedia) and the number of characters in each is counted (excluding "comments" and the "interwiki" text at the bottom of the article). The size of each article is then adjusted for each language by multiplying it by the language weight. The articles are divided into four classes: "absent" (i.e. non-existent; size = 0), "stubs" (fewer than 10,000 characters), "articles" (between 10,000 and 30,000 characters) and "long articles" (more than 30,000 characters). The average weighted size of the non-absent articles in the sample is also calculated. Finally, a score is computed with the formula rawscore = stubs + articles*4 + long.articles*9. To give a consistent scale, the raw score is normalized by dividing it by the maximum score and multiplying by 100. The maximum score, reached if every article in the sample were a long article, is maxscore = (absent + stubs + articles + long.articles)*9. The final score is therefore score = rawscore / maxscore * 100. The language editions are then listed in order of decreasing score.
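The classification and scoring described above can be sketched in a few lines of Python. This is an illustrative sketch only, not the program from the Source code subpage: the function names are hypothetical, and the treatment of sizes of exactly 10,000 or 30,000 weighted characters (counted here as "articles") is an assumption, since the text does not say which class the boundaries fall into.

```python
STUB_LIMIT = 10_000   # below this weighted size an article is a stub
LONG_LIMIT = 30_000   # above this weighted size an article is a long article


def classify(raw_chars, weight):
    """Classify one sample article by its weighted size.

    raw_chars is the character count after comments and interwiki text
    are removed; 0 means the article does not exist in that language.
    """
    size = raw_chars * weight
    if size == 0:
        return "absent"
    if size < STUB_LIMIT:
        return "stub"
    if size <= LONG_LIMIT:  # assumption: both boundaries count as "article"
        return "article"
    return "long"


def average_weighted_size(weighted_sizes):
    """Average weighted size over the non-absent articles only."""
    present = [s for s in weighted_sizes if s > 0]
    return sum(present) / len(present) if present else 0.0


def score(absent, stubs, articles, long_articles):
    """Normalized score: rawscore / maxscore * 100."""
    rawscore = stubs + articles * 4 + long_articles * 9
    maxscore = (absent + stubs + articles + long_articles) * 9
    return rawscore / maxscore * 100


# Counts taken from the "en" and "de" rows of the table on this page.
print(round(score(0, 69, 332, 650), 2))   # 76.61
print(round(score(7, 190, 430, 425), 2))  # 60.57
```

Both printed values agree with the Score column for the English and German rows, which is a quick check that the formula is being read correctly.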

A copy of the program used to obtain this list is in List of Wikipedias by sample of articles/Source code.

Absent articles for major Wikipedias are in List of Wikipedias by sample of articles/Absent Articles.

Last Update: 2 May 2008

Columns: rank, wiki code, language, weight, average article size (weighted chars), absent (0k), stubs (< 10k), articles (10-30k), long articles (> 30k), score
1 en English 1.0 44 366 0 69 332 650 76.61
2 de Deutsch 1.0 33 182 7 190 430 425 60.57
3 fr Français 1.0 28 753 6 271 436 339 53.51
4 es Español 1.1 25 386 2 321 456 273 48.61
5 it Italiano 1.1 22 558 26 341 433 252 45.85
6 zh 中文 3.7 23 787 0 428 378 246 43.87
7 ru Русский 1.4 21 672 2 419 410 221 42.75
8 ja 日本語 1.9 16 027 11 483 425 133 35.70
9 pt Português 1.1 15 542 14 521 386 131 34.26
10 pl Polski 1.1 13 890 16 586 343 107 30.85
11 hu Magyar 1.1 14 244 95 537 302 117 29.58
12 fi Suomi 1.1 13 174 35 614 300 103 28.95
13 sv Svenska 1.1 11 725 4 673 281 94 27.92
14 cs Čeština 1.3 12 328 59 592 313 87 27.77
15 he עברית 1.2 11 521 36 595 351 70 27.77
16 nl Nederlands 0.9 11 141 23 615 352 61 27.19
17 vi Tiếng Việt 1.1 15 025 216 477 244 115 26.28
18 no Norsk (Bokmål) 1.2 10 941 45 673 255 79 25.39
19 uk Українська 1.3 10 172 41 688 257 65 24.33
20 ca Català 1.1 9 926 1 728 268 55 24.24
21 tr Türkçe 1.3 10 977 68 658 256 64 23.99
22 sr Српски / Srpski 1.4 11 265 131 611 231 79 23.72
23 hr Hrvatski 1.3 9 132 118 672 209 53 20.97
24 sk Slovenčina 1.3 10 182 179 612 201 59 20.58
25 ro Română 1.1 9 668 168 632 198 54 20.17
26 da Dansk 1.2 8 093 56 788 159 48 19.62
27 el Ελληνικά 1.1 9 938 277 539 180 56 18.62
28 bg Български 1.1 7 855 128 708 179 36 18.48
29 ko 한국어 2.5 7 381 99 755 159 39 18.40
30 id Bahasa Indonesia 1.0* 6 449 89 781 152 29 17.44
31 ar العربية 1.0 5 469 38 853 144 17 16.71
32 sl Slovenščina 1.2 6 884 111 769 154 18 16.34
33 gl Galego 1.0* 7 915 218 636 177 21 16.19
34 eo Esperanto 1.1 5 638 0 938 92 22 15.89
35 th ไทย 1.0 6 098 169 735 126 22 15.18
36 fa فارسی 1.2 5 945 204 703 123 22 14.71
37 ms Bahasa Melayu 1.0* 7 991 365 527 130 30 13.91
38 lt Lietuvių 1.0* 5 252 143 790 111 7 13.71
39 simple Simple English 1.0* 3 546 0 986 60 6 13.52
40 is Íslenska 1.0* 2 763 24 978 45 5 12.71
41 sh Srpskohrvatski / Српскохрватски 1.0* 6 588 371 555 109 17 12.08
42 et Eesti 1.0* 4 368 209 765 63 12 11.92
43 bs Bosanski 1.0* 4 738 265 695 90 2 11.33
44 la Latina 1.1 3 796 225 768 47 12 11.24
45 nn Nynorsk 1.0* 4 479 272 708 60 12 11.15
46 eu Euskara 1.0* 4 123 264 715 65 8 11.06
47 lv Latviešu 1.0* 5 356 389 570 82 11 10.53
48 af Afrikaans 1.0* 8 492 577 360 90 25 9.98
49 mk Македонски 1.0* 4 794 417 567 55 12 9.46
50 ka ქართული 1.0* 3 846 419 590 34 8 8.44
51 ps پښتو 1.0* 32 656 920 26 37 69 8.40
52 bn বাংলা 1.0* 3 822 501 498 41 9 7.87
53 cy Cymraeg 1.0* 2 395 360 678 12 2 7.86
54 br Brezhoneg 1.0* 4 321 493 508 45 6 7.84
55 ta தமிழ் 0.9 3 425 483 537 28 3 7.15
56 zh-yue 粵語 3.7** 7 080 658 328 50 16 7.10
57 lb Lëtzebuergesch 1.0* 4 476 585 422 36 8 6.74
58 qu Runa Simi 1.0* 2 647 513 507 32 0 6.71
59 yi ייִדיש 1.0* 3 205 547 474 23 7 6.65
60 sq Shqip 1.0* 4 212 580 430 37 5 6.58
61 scn Sicilianu 1.0* 2 476 513 518 20 1 6.41
62 hi हिन्दी 1.0* 2 539 534 493 21 3 6.39
63 ml മലയാളം 1.0* 5 170 636 363 48 5 6.34
64 oc Occitan 1.0* 3 559 572 443 32 3 6.33
65 ast Asturianu 1.0* 3 637 554 468 28 2 6.32
66 sw Kiswahili 1.0* 3 031 524 517 10 1 5.98
67 io Ido 1.0* 1 677 514 536 1 1 5.80
68 tl Tagalog 1.0* 3 249 625 397 26 3 5.58
69 ga Gaeilge 1.0* 4 326 662 358 24 8 5.56
70 nds Plattdüütsch 1.0* 4 324 690 326 25 11 5.54
71 zh-min-nan Bân-lâm-gú 1.2 1 953 549 497 6 0 5.50
72 ur اردو 1.0* 4 147 682 329 33 6 5.45
73 be-x-old Беларуская (тарашкевіца) 1.0* 3 802 659 364 29 0 5.07
74 su Basa Sunda 1.0* 9 105 844 143 53 11 4.80
75 az Azərbaycan 1.0* 2 367 652 387 10 2 4.70
76 an Aragonés 1.0* 2 647 657 383 11 1 4.60
77 ku Kurdî / كوردی 1.0* 2 227 649 392 11 0 4.60
78 ia Interlingua 1.0* 2 702 700 338 13 1 4.21
79 mr मराठी 1.0* 2 805 741 298 8 5 3.96
80 fy Frysk 1.0* 3 834 747 286 18 1 3.88
81 als Alemannisch 1.0* 7 608 853 156 36 7 3.83
82 tg Тоҷикӣ 1.0* 1 943 725 320 5 2 3.78
83 te తెలుగు 1.0* 6 626 856 154 35 7 3.77
84 mn Монгол 1.0* 4 034 774 258 17 3 3.73
85 jv Basa Jawa 1.0* 3 147 735 304 12 0 3.72
86 vec Vèneto 1.0* 3 425 788 246 16 2 3.46
87 be Беларуская 1.0* 3 432 794 243 13 2 3.31
88 li Limburgs 1.0* 3 692 793 244 14 1 3.26
89 gd Gàidhlig 1.0* 2 204 769 275 8 0 3.24
90 vo Volapük 1.0* 1 616 766 283 1 2 3.22
91 uz O‘zbek 1.0* 2 703 801 239 10 2 3.14
92 kn ಕನ್ನಡ 1.0* 4 976 859 170 15 7 3.10
93 bat-smg Žemaitėška 1.0* 1 494 773 277 2 0 3.01
94 cv Чăваш 1.0* 2 622 803 242 7 0 2.85
95 ht Krèyol ayisyen 1.0* 1 332 814 235 2 1 2.66
96 hy Հայերեն 1.2 2 492 825 219 8 0 2.65
97 nah Nāhuatl 1.0* 2 313 824 222 4 1 2.61
98 fur Furlan 1.0* 2 885 837 208 7 0 2.49
99 nrm Nouormand/Normaund 1.0* 1 979 820 232 0 0 2.45
100 pms Piemontèis 1.0* 3 378 872 171 6 3 2.34
101 mt Malti 1.0* 4 584 907 126 16 3 2.29
102 sco Scots 1.0* 1 921 839 212 1 0 2.28
103 fo Føroyskt 1.0* 2 436 860 185 7 0 2.25
104 kk Қазақша 1.0* 4 270 888 152 11 0 2.07
105 pam Kapampangan 1.0* 5 696 912 124 15 1 2.04
106 nov Novial 1.0* 1 571 872 176 4 0 2.03
107 zh-classical 古文 / 文言文 3.7** 5 227 903 137 11 1 2.01
108 bar Boarisch 1.0* 4 713 912 126 13 1 1.98
109 dv ދިވެހިބަސް 1.0* 3 674 919 121 10 2 1.89
110 wa Walon 1.0* 2 346 887 161 3 0 1.83
111 sa संस्कृतम् 1.0* 1 169 889 161 2 0 1.78
112 ceb Sinugboanong Binisaya 0.8 2 021 899 149 4 0 1.74
113 lij Líguru 1.0* 1 469 887 165 0 0 1.74
114 jbo Lojban 1.0* 1 223 889 163 0 0 1.72
115 ln Lingala 1.0* 1 715 904 144 4 0 1.69
116 diq Zazaki 1.0* 2 641 903 148 1 0 1.61
117 nds-nl Nedersaksisch 1.0* 2 882 904 147 1 0 1.59
118 kw Kernewek/Karnuack 1.0* 2 008 907 145 0 0 1.53
119 frp Arpitan 1.0* 1 434 911 141 0 0 1.49
120 ne नेपाली 1.0* 3 208 936 110 5 1 1.47
121 os Иронау 1.0* 1 904 917 134 1 0 1.46
122 wuu 吴语 3.7** 13 399 990 47 9 6 1.45
123 new नेपाल भाषा 1.0* 4 420 957 83 11 1 1.44
124 am አማርኛ 1.0* 1 227 918 134 0 0 1.42
125 ang Englisc 1.0* 1 809 932 118 2 0 1.33
126 ksh Ripoarisch 1.0* 2 283 941 110 0 1 1.26
127 hsb Hornjoserbsce 1.0* 2 307 938 113 1 0 1.24
128 ilo Ilokano 1.0* 2 755 938 113 1 0 1.24
129 tpi Tok Pisin 1.0* 4 629 975 68 7 2 1.20
130 se Sámegiella 1.0* 1 169 939 113 0 0 1.19
131 gv Gaelg 1.0* 2 045 944 107 1 0 1.17
132 wo Wolof 1.0* 2 489 958 90 3 1 1.17
133 arc ܐܪܡܝܐ 1.0* 1 122 943 109 0 0 1.15
134 lad Dzhudezmo 1.0* 3 239 956 92 4 0 1.14
135 bpy ইমার ঠার/বিষ্ণুপ্রিয়া মণিপুরী 1.0* 4 918 965 83 1 2 1.11
136 gu ગુજરાતી 1.0* 3 085 968 77 7 0 1.11
137 ay Aymar 1.0* 732 950 102 0 0 1.08
138 rm Rumantsch 1.0* 1 866 965 82 5 0 1.08
139 yo Yorùbá 1.0* 1 812 950 100 0 0 1.06
140 cbk-zam Chavacano de Zamboanga 1.0* 21 266 1 023 16 7 6 1.04
141 lmo Lumbaart 1.0* 3 891 980 64 8 0 1.01
142 vls West-Vlams 1.0* 3 741 970 77 2 1 0.99
143 co Corsu 1.0* 1 873 966 85 1 0 0.94
144 ie Interlingue 1.0* 2 256 977 72 2 1 0.94
145 fiu-vro Võro 1.0* 1 183 964 88 0 0 0.93
146 crh Qırımtatarca 1.0* 1 533 966 86 0 0 0.91
147 so Soomaaliga 1.0* 3 985 989 59 2 2 0.90
148 war Winaray 1.0* 1 528 974 78 0 0 0.82
149 csb Kaszëbsczi 1.0* 1 589 976 76 0 0 0.80
150 nap Nnapulitano 1.0* 2 244 979 72 1 0 0.80
151 sc Sardu 1.0* 1 962 983 67 2 0 0.79
152 tt Tatarça / Татарча 1.0* 2 404 987 62 3 0 0.78
153 kab Taqbaylit 1.0* 2 585 988 61 3 0 0.77
154 ky Кыргызча 1.0* 1 128 979 72 0 0 0.76
155 pdc Deitsch 1.0* 1 388 980 72 0 0 0.76
156 si සිංහල 1.0* 12 295 1 024 21 3 4 0.73
157 eml Emiliàn e rumagnòl 1.0* 2 568 992 59 0 1 0.72
158 pag Pangasinan 1.0* 2 137 993 56 3 0 0.72
159 ba Башҡорт 1.0* 2 089 990 61 1 0 0.69
160 km ភាសាខ្មែរ 1.0* 2 764 997 52 3 0 0.68
161 lo ລາວ 1.0* 1 706 989 63 0 0 0.67
162 sd سنڌي، سندھی ، सिन्ध 1.0* 19 566 1 037 5 5 4 0.64
163 iu ᐃᓄᒃᑎᑐᑦ 1.0* 773 993 59 0 0 0.62
164 tk تركمن / Туркмен 1.0* 1 593 995 57 0 0 0.60
165 mg Malagasy 1.0* 1 354 1 001 51 0 0 0.54
166 cu Словѣньскъ 1.0* 1 240 999 50 0 0 0.53
167 pa ਪੰਜਾਬੀ 1.0* 5 407 1 020 26 6 0 0.53
168 mi Māori 1.0* 2 591 1 011 39 2 0 0.50
169 kg KiKongo 1.0* 1 293 1 006 46 0 0 0.49
170 rmy romani - रोमानी 1.0* 1 671 1 006 46 0 0 0.49
171 tet Tetun 1.0* 2 930 1 006 46 0 0 0.49
172 map-bms Basa Banyumasan 1.0* 1 459 1 007 45 0 0 0.48
173 gn Avañe'ẽ 1.0* 1 742 1 011 40 1 0 0.46
174 na dorerin Naoero 1.0* 1 798 1 011 41 0 0 0.43
175 udm Удмурт кыл 1.0* 3 046 1 014 38 0 0 0.40
176 roa-rup Armãneashce 1.0* 1 834 1 016 36 0 0 0.38
177 ks कश्मीरी / كشميري 1.0* 3 128 1 023 26 2 0 0.36
178 ig Igbo 1.0* 3 251 1 026 25 0 1 0.36
179 zea Zeêuws 1.0* 3 524 1 019 33 0 0 0.35
180 haw Hawai`i 1.0* 1 141 1 022 30 0 0 0.32
181 mzn مَزِروني 1.0* 1 265 1 024 28 0 0 0.30
182 ty Reo Mā`ohi 1.0* 1 150 1 025 27 0 0 0.29
183 ce Нохчийн 1.0* 2 015 1 027 25 0 0 0.26
184 stq Seeltersk 1.0* 1 811 1 033 19 0 0 0.20
185 to faka Tonga 1.0* 1 791 1 032 18 0 0 0.19
186 roa-tara Tarandíne 1.0* 5 428 1 043 7 2 0 0.16
187 pap Papiamentu 1.0* 2 755 1 042 9 1 0 0.14
188 or ଓଡ଼ିଆ 1.0* 835 1 043 9 0 0 0.10
189 pi पाऴि 1.0* 3 519 1 047 4 1 0 0.08
190 bh भोजपुरी 1.0* 1 681 1 047 5 0 0 0.05
191 glk گیلکی 1.0* 820 1 047 5 0 0 0.05
  • Weights marked "*" have no weight available, so the default weight of 1.0 is used.
  • Weights marked "**" use the weight of a known related language (e.g. 'zh').
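The two fallback rules in these notes could be expressed roughly as follows. This is a sketch under assumptions: the dictionaries hold only a handful of values copied from the table, and the names KNOWN_WEIGHTS, RELATED and resolve_weight are hypothetical, not taken from the actual program.

```python
DEFAULT_WEIGHT = 1.0

# A few language weights copied from the table (not the full set).
KNOWN_WEIGHTS = {"en": 1.0, "zh": 3.7, "ja": 1.9, "ko": 2.5}

# Wikis marked "**" in the table borrow the weight of a related language.
RELATED = {"zh-yue": "zh", "zh-classical": "zh", "wuu": "zh"}


def resolve_weight(code):
    """Return (weight, marker) for a wiki code.

    marker is "" for a known weight, "**" when the weight is borrowed
    from a related language, and "*" when the default of 1.0 is used.
    """
    if code in KNOWN_WEIGHTS:
        return KNOWN_WEIGHTS[code], ""
    related = RELATED.get(code)
    if related in KNOWN_WEIGHTS:
        return KNOWN_WEIGHTS[related], "**"
    return DEFAULT_WEIGHT, "*"


print(resolve_weight("zh"))      # (3.7, '')
print(resolve_weight("zh-yue"))  # (3.7, '**')
print(resolve_weight("id"))      # (1.0, '*')
```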