List of Wikipedias by sample of articles

From Meta, a Wikimedia project coordination wiki
This is an archived version of this page, as edited by MarsRover (talk | contribs) at 05:55, 5 April 2008 (...to be more precise). It may differ significantly from the current version.

This page contains a list of the largest Wikipedias under the auspices of the Wikimedia Foundation for various languages. Test Wikipedias are listed at the Wikimedia Incubator Wiki project.

This list of Wikipedias uses the List of articles every Wikipedia should have (total: 1052 on 3 April) as a sample. The list actually used is at the end of List of Wikipedias by sample of articles/Source code and can differ slightly. For every Wikipedia, the articles in this sample list are retrieved (based on interwiki links from the English Wikipedia) and the number of characters in each is counted (excluding the "interwiki" text at the bottom of the article). The size of each article is then adjusted for each language by multiplying it by the language weight.

The articles are divided into four classes: "absent" (i.e. non-existent; size = 0), "stubs" (size under 10,000 characters), "articles" (size between 10,000 and 30,000 characters) and "long articles" (size over 30,000 characters). The average weighted size of the non-absent articles in the sample is also calculated.

Finally, a score is computed. The raw score is rawscore = stubs + articles*4 + long.articles*9. To give a consistent scale, the raw score is normalized by dividing it by the maximum score and multiplying by 100. The maximum score, reached when every article in the sample is a long article, is maxscore = (absent + stubs + articles + long.articles)*9. The final score is score = rawscore / maxscore * 100. The language editions are then listed in order of decreasing score.
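The classification and scoring described above can be sketched in a few lines of Python. This is an illustration only, not the program from the Source code subpage: the function names are hypothetical, the classes are assumed to be decided on the weighted size, and a size of exactly 30,000 is assumed to count as an "article", since the description does not say which side the boundaries fall on.

```python
from collections import Counter

STUB_LIMIT = 10_000   # below this weighted size an existing article is a stub
LONG_LIMIT = 30_000   # above this weighted size an article is a long article


def classify(weighted_size):
    """Return the class of one article (boundary handling is an assumption)."""
    if weighted_size <= 0:
        return "absent"
    if weighted_size < STUB_LIMIT:
        return "stub"
    if weighted_size <= LONG_LIMIT:
        return "article"
    return "long"


def score(absent, stubs, articles, long_articles):
    """Normalized score on a 0-100 scale, following the formulas above."""
    raw_score = stubs + articles * 4 + long_articles * 9
    max_score = (absent + stubs + articles + long_articles) * 9
    return raw_score / max_score * 100


def score_from_sizes(char_counts, weight):
    """Score one language edition from raw character counts (0 = absent)."""
    counts = Counter(classify(n * weight) for n in char_counts)
    return score(counts["absent"], counts["stub"],
                 counts["article"], counts["long"])


# Counts taken from the English and Deutsch rows of the table.
print(round(score(0, 75, 333, 644), 2))    # 76.08
print(round(score(13, 187, 428, 424), 2))  # 60.36
```

Both printed values match the Score column for those rows, which suggests the formula is applied to the class counts exactly as stated.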

A copy of the program used to generate this list is at List of Wikipedias by sample of articles/Source code.

Absent articles for major Wikipedias are in List of Wikipedias by sample of articles/Absent Articles.

Last Update: 3 April 2008

No. | Language | Weight | Average article size (wt. chars) | Absent (0k) | Stubs (< 10k) | Articles (10-30k) | Long art. (> 30k) | Score
1 English 1.0 43 932 0 75 333 644 76.08
2 Deutsch 1.0 33 129 13 187 428 424 60.36
3 Français 1.0 28 050 11 269 445 327 52.72
4 Español 1.1 24 878 7 327 453 265 47.78
5 Italiano 1.1 22 528 35 329 440 247 45.59
6 中文 3.7 24 226 5 426 364 254 44.15
7 Русский 1.4 21 584 7 416 403 226 42.90
8 日本語 1.9 16 046 12 477 427 136 36.01
9 Português 1.1 15 327 20 522 382 128 33.82
10 Polski 1.1 13 802 18 583 345 106 30.81
11 Magyar 1.1 14 242 100 533 301 117 29.50
12 Suomi 1.1 13 148 44 605 303 100 28.70
13 Svenska 1.1 11 563 19 663 281 89 27.33
14 עברית 1.2 11 416 43 593 350 66 27.32
15 Česky 1.3 12 138 68 588 316 80 27.17
16 Nederlands 0.9 11 087 25 613 355 59 27.08
17 Norsk (bokmål) 1.2 10 795 54 668 253 77 25.06
18 Tiếng Việt 1.0* 13 475 227 495 230 100 24.45
19 Türkçe 1.3 10 658 73 664 248 63 23.57
20 Українська 1.3 9 795 50 691 251 59 23.53
21 Català 1.1 9 617 23 719 258 52 23.44
22 Hrvatski 1.3 9 226 151 647 201 52 20.29
23 Slovenčina 1.3 9 953 184 616 196 55 20.03
24 Română 1.1 9 495 176 631 195 50 19.66
25 Dansk 1.2 7 962 61 787 155 48 19.44
26 Српски / Srpski 1.0* 8 156 138 682 190 42 19.22
27 Ελληνικά 1.1 10 186 280 532 178 62 19.03
28 한국어 2.5 7 387 104 750 158 40 18.40
29 Български 1.1 7 898 130 709 177 36 18.39
30 Bahasa Indonesia 1.0* 7 007 102 751 162 36 18.22
31 العربية 1.0 5 414 70 825 142 15 16.14
32 Slovenščina 1.2 6 808 125 756 154 17 16.11
33 Esperanto 1.1 5 923 16 914 100 22 15.97
34 Galego 1.0* 7 744 231 630 170 20 15.75
35 ไทย 1.0 6 027 185 724 123 20 14.74
36 Bahasa Melayu 1.0* 8 577 381 505 133 33 14.09
37 فارسی 1.2 5 387 225 698 117 12 13.46
38 Lietuvių 1.0* 5 241 166 770 106 7 13.31
39 Simple English 1.0* 3 629 72 918 57 5 12.58
40 Íslenska 1.0* 2 768 47 952 48 5 12.56
41 Srpskohrvatski/ Српскохрватски 1.0* 6 616 373 552 108 18 12.12
42 Eesti 1.0* 4 358 217 762 59 12 11.70
43 Bosanski 1.0* 4 740 274 685 90 3 11.32
44 Nynorsk 1.0* 4 572 280 700 60 12 11.07
45 Euskara 1.0* 4 113 278 701 65 8 10.91
46 Afrikaans 1.0* 9 200 582 344 96 30 10.54
47 Latviešu 1.0* 5 161 399 566 77 10 10.18
48 Latina 1.0* 3 654 321 682 38 10 9.77
49 Македонски 1.0* 4 748 423 563 53 12 9.34
50 پښتو 1.0* 35 302 926 19 32 75 8.68
51 ქართული 1.0* 3 785 421 589 34 7 8.33
52 Brezhoneg 1.0* 4 733 498 491 55 8 8.27
53 বাংলা 1.0* 3 958 509 492 38 12 7.95
54 Cymraeg 1.0* 2 346 388 652 11 1 7.45
55 Basa Sunda 1.0* 14 362 846 96 86 23 6.84
56 தமிழ் 0.9 3 401 511 510 27 3 6.82
57 Lëtzebuergesch 1.0* 4 503 593 415 36 8 6.66
58 ייִדיש 1.0* 3 227 550 470 24 7 6.65
59 Quechua 1.0* 2 668 522 498 32 0 6.61
60 Shqip 1.0* 4 227 586 424 37 5 6.52
61 Sicilianu 1.0* 2 498 514 516 21 1 6.43
62 Asturianu 1.0* 3 651 561 461 28 2 6.24
63 മലയാളം 1.0* 5 160 648 355 43 6 6.14
64 हिन्दी 1.0* 2 453 550 478 20 2 6.10
65 Occitan 1.0* 3 541 598 421 29 3 5.96
66 Ido 1.0* 1 676 520 530 1 1 5.74
67 Tagalog 1.0* 3 207 629 394 25 3 5.51
68 Plattdüütsch 1.0* 4 280 692 325 25 10 5.44
69 Gaeilge 1.0* 4 350 672 346 26 7 5.42
70 Kiswahili 1.0* 3 042 575 467 9 1 5.41
71 Bân-lâm-gú 3.7** 1 633 558 492 2 0 5.28
72 اردو 1.0* 4 329 714 298 34 5 5.06
73 Беларуская (тарашкевіца) 1.0* 3 763 667 356 29 0 4.99
74 粵語 3.7** 2 128 663 374 13 2 4.69
75 Kurdî / كوردی 1.0* 2 249 654 387 10 1 4.60
76 Azərbaycan 1.0* 2 157 658 382 9 1 4.52
77 Aragonese 1.0* 2 606 664 377 10 1 4.50
78 Interlingua 1.0* 2 721 702 336 13 1 4.19
79 मराठी 1.0* 2 761 748 290 9 5 3.92
80 Basa Jawa 1.0* 3 732 758 275 17 2 3.81
81 Тоҷикӣ 1.0* 1 936 725 320 5 2 3.78
82 Alemannisch 1.0* 7 193 859 155 32 6 3.56
83 Монгол 1.0* 4 080 789 244 16 3 3.54
84 తెలుగు 1.0* 6 484 865 148 33 6 3.53
85 Vèneto 1.0* 3 504 796 239 15 2 3.35
86 Беларуская 1.0* 3 371 800 237 13 2 3.24
87 Gàidhlig 1.0* 2 124 771 274 7 0 3.19
88 Frysk 1.0* 3 627 795 242 14 0 3.15
89 Volapük 1.0* 1 607 776 273 1 2 3.12
90 Limburgs 1.0* 3 759 801 239 11 1 3.08
91 ಕನ್ನಡ 1.0* 5 006 861 168 15 7 3.08
92 Žemaitėška 1.0* 1 505 782 268 2 0 2.92
93 O‘zbek 1.0* 2 802 822 218 10 2 2.92
94 Հայերեն 1.2 2 845 829 208 14 0 2.79
95 Чăваш 1.0* 2 671 816 228 7 0 2.71
96 Krèyol ayisyen 1.0* 1 322 817 232 2 1 2.63
97 Nouormand/Normaund 1.0* 1 975 824 228 0 0 2.41
98 Piemontèis 1.0* 3 361 874 169 6 3 2.32
99 Nahuatl 1.0* 2 589 863 182 5 1 2.23
100 Scots 1.0* 1 947 845 206 1 0 2.22
101 Furlan 1.0* 2 851 864 182 6 0 2.18
102 Malti 1.0* 3 990 910 126 15 1 2.06
103 Novial 1.0* 1 570 875 173 4 0 2.00
104 Kapampangan 1.0* 5 710 914 123 14 1 1.99
105 Boarisch 1.0* 4 741 917 121 13 1 1.92
106 ދިވެހިބަސް 1.0* 3 627 918 122 10 2 1.90
107 Қазақша 1.0* 4 331 907 135 9 1 1.90
108 Føroyskt 1.0* 2 460 891 155 6 0 1.89
109 Walon 1.0* 2 336 889 160 3 0 1.82
110 संस्कृतम् 1.0* 1 162 889 161 2 0 1.78
111 Líguru 1.0* 1 464 888 163 1 0 1.76
112 Lojban 1.0* 1 218 889 163 0 0 1.72
113 Lingala 1.0* 1 709 904 144 4 0 1.69
114 Sinugboanong Binisaya 0.8 1 906 908 140 4 0 1.65
115 古文 / 文言文 3.7** 1 423 904 147 1 0 1.59
116 Zazaki 1.0* 2 670 906 145 1 0 1.57
117 Nedersaksisch 1.0* 2 894 908 143 1 0 1.55
118 Kernewek/Karnuack 1.0* 2 013 909 143 0 0 1.51
119 Arpitan 1.0* 1 421 911 141 0 0 1.49
120 Ripoarisch 1.0* 3 540 942 105 2 3 1.48
121 नेपाली 1.0* 3 203 936 110 5 1 1.47
122 Иронау 1.0* 2 042 917 134 1 0 1.46
123 Amharic 1.0* 1 190 918 134 0 0 1.42
124 Englisc 1.0* 2 350 934 113 5 0 1.40
125 नेपाल भाषा 1.0* 4 794 967 73 11 1 1.33
126 Ilokano 1.0* 3 314 940 108 4 0 1.31
127 Hornjoserbsce 1.0* 2 242 942 109 1 0 1.19
128 Sámegiella 1.0* 1 158 939 113 0 0 1.19
129 Tok Pisin 1.0* 4 768 977 66 7 2 1.18
130 Wolof 1.0* 2 553 961 87 3 1 1.14
131 ܐܪܡܝܐ 1.0* 1 131 946 106 0 0 1.12
132 ইমার ঠার/বিষ্ণুপ্রিয়া মণিপুরী 1.0* 4 899 965 84 1 2 1.12
133 Gaelg 1.0* 2 007 950 101 1 0 1.11
134 Dzhudezmo 1.0* 3 561 962 85 5 0 1.11
135 ગુજરાતી 1.0* 3 043 969 76 7 0 1.10
136 Rumantsch 1.0* 1 867 966 81 5 0 1.07
137 Lumbaart 1.0* 4 095 982 61 9 0 1.02
138 Yorùbá 1.0* 1 730 954 96 0 0 1.02
139 Aymar 1.0* 746 957 95 0 0 1.00
140 Chavacano de Zamboanga 1.0* 21 614 1 025 15 6 6 0.98
141 Interlingue 1.0* 2 246 977 72 2 1 0.94
142 Corsu 1.0* 1 888 967 84 1 0 0.93
143 West-Vlams 1.0* 3 679 977 70 2 1 0.92
144 Qırımtatarca 1.0* 1 514 966 86 0 0 0.91
145 Võro 1.0* 1 192 968 84 0 0 0.89
146 Winaray 1.0* 1 983 974 76 2 0 0.89
147 Tatarça / Татарча 1.0* 2 977 986 58 6 0 0.87
148 吴语 1.0* 4 927 1 005 40 5 2 0.82
149 Kaszëbsczi 1.0* 1 579 976 76 0 0 0.80
150 Nnapulitano 1.0* 2 314 981 70 1 0 0.78
151 Pangasinan 1.0* 2 791 995 53 3 1 0.78
152 Sardu 1.0* 1 960 985 65 2 0 0.77
153 Deitsch 1.0* 1 347 980 72 0 0 0.76
154 Кыргызча 1.0* 1 060 983 69 0 0 0.73
155 Sinhalese 1.0* 12 421 1 024 21 3 4 0.73
156 Emiliàn e rumagnòl 1.0* 2 799 995 55 1 1 0.72
157 Türkmen 1.0* 1 588 995 57 0 0 0.60
158 Malagasy 1.0* 1 370 1 001 51 0 0 0.54
159 Khmer 1.0* 2 857 1 013 36 3 0 0.51
160 Māori 1.0* 2 520 1 011 39 2 0 0.50
161 KiKongo 1.0* 1 307 1 011 41 0 0 0.43
162 Basa Banyumasan 1.0* 1 510 1 011 41 0 0 0.43
163 Armãneashce 1.0* 1 824 1 016 36 0 0 0.38
164 Igbo 1.0* 3 333 1 027 24 0 1 0.35
165 Zeêuws 1.0* 3 479 1 022 30 0 0 0.32
166 Reo Ma'ohi 1.0* 1 140 1 025 27 0 0 0.29
167 Hawai`i 1.0* 1 105 1 027 25 0 0 0.26
168 faka Tonga 1.0* 1 784 1 032 18 0 0 0.19
169 Tarandine 1.0* 5 424 1 043 7 2 0 0.16
170 Oriya 1.0* 823 1 043 9 0 0 0.10
171 पाऴि 1.0* 3 496 1 047 4 1 0 0.08
172 भोजपुरी 1.0* 1 668 1 047 5 0 0 0.05
173 گیلکی 1.0* 1 082 1 047 5 0 0 0.05
  • Weights marked "*": no language weight is available, so the default weight of 1.0 is used.
  • Weights marked "**": the weight of a known related language (e.g. 'zh') is used.
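
The fallback rules in the two notes above could be expressed roughly as follows. This is a minimal sketch under stated assumptions: the dictionaries, the function name and the language codes other than 'zh' are hypothetical stand-ins chosen for illustration, and only a few weights from the table are included.

```python
# Known weights (values from the table; keys are assumed language codes).
KNOWN_WEIGHTS = {"en": 1.0, "zh": 3.7, "ru": 1.4, "ja": 1.9}

# Hypothetical mapping from an unweighted language to a related, weighted one.
RELATED_LANGUAGE = {"zh-yue": "zh", "zh-min-nan": "zh", "zh-classical": "zh"}

DEFAULT_WEIGHT = 1.0


def language_weight(code):
    """Return (weight, marker), where marker is "", "*" or "**"."""
    if code in KNOWN_WEIGHTS:
        return KNOWN_WEIGHTS[code], ""
    related = RELATED_LANGUAGE.get(code)
    if related in KNOWN_WEIGHTS:
        # Borrow the weight of the related language and flag it with "**".
        return KNOWN_WEIGHTS[related], "**"
    # Nothing known: fall back to the default and flag it with "*".
    return DEFAULT_WEIGHT, "*"


print(language_weight("zh"))      # (3.7, '')
print(language_weight("zh-yue"))  # (3.7, '**')
print(language_weight("xx"))      # (1.0, '*')
```

Returning the marker with the weight keeps the table's "*" and "**" annotations tied to the lookup that produced them.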