List of Wikipedias by sample of articles

From Meta, a Wikimedia project coordination wiki
This is an archived version of this page, as edited by MarsRover (talk | contribs) at 15:36, 26 July 2008 (actually july 26 update). It may differ significantly from the current version.

This page contains a list of the largest Wikipedias under the auspices of the Wikimedia Foundation for various languages. Test Wikipedias are listed at the Wikimedia Incubator Wiki project.

This list of Wikipedias uses the List of articles every Wikipedia should have (total: 1047 on 2 July 2008) as a sample. The list actually used is at the end of List of Wikipedias by sample of articles/Source code and can differ slightly. For every Wikipedia, the articles in the sample list are retrieved (via interwiki links from the English Wikipedia) and the number of characters in each is counted, excluding comments and the interwiki text at the bottom of the article. The size of each article is then adjusted for each language by multiplying it by the language weight.

The articles are divided into four classes: "absent" (non-existent; size = 0), "stubs" (size below 10,000 characters), "articles" (size between 10,000 and 30,000 characters) and "long articles" (size above 30,000 characters). The average weighted size of the non-absent articles in the sample is also calculated.

Finally, a score is computed. The raw score is rawscore = stubs + articles*4 + long.articles*9. To keep the scale consistent, the raw score is normalized by dividing it by the maximum score and multiplying by 100, where maxscore = (absent + stubs + articles + long.articles)*9. The final score is therefore score = rawscore / maxscore * 100. The language editions are then listed in order of decreasing score.
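The classification and scoring steps can be sketched in a few lines of Python. This is only an illustration of the description above, not the program from the Source code page: the function names `classify` and `score` are made up here, and treating sizes of exactly 10,000 or 30,000 as "articles" is an assumption about how the boundaries are handled.

```python
# Thresholds taken from the description; the boundary handling is an assumption.
STUB_LIMIT = 10_000   # weighted size below this counts as a stub
LONG_LIMIT = 30_000   # weighted size above this counts as a long article


def classify(raw_chars, weight):
    """Return the class of one article from its raw character count.

    raw_chars is assumed to already exclude comments and interwiki text.
    """
    size = raw_chars * weight  # weighted size
    if size == 0:
        return "absent"
    if size < STUB_LIMIT:
        return "stub"
    if size <= LONG_LIMIT:
        return "article"
    return "long"


def score(absent, stubs, articles, long_articles):
    """Normalized score on a 0-100 scale, following the formula in the text."""
    raw = stubs + articles * 4 + long_articles * 9
    max_score = (absent + stubs + articles + long_articles) * 9
    return raw / max_score * 100


# English row of the table: 0 absent, 59 stubs, 317 articles, 663 long articles.
print(round(score(0, 59, 317, 663), 2))  # 78.0
```

Feeding the counts from the German row (6, 168, 426, 439) into `score` gives 62.27 after rounding, which matches the table.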

A copy of the program used to obtain this list is in List of Wikipedias by sample of articles/Source code.

Absent articles for major Wikipedias are in List of Wikipedias by sample of articles/Absent Articles.

Last Update: 26 July 2008

Rank | Wiki | Language | Weight | Average Article Size (wt. chars) | Absent (0k) | Stubs (< 10k) | Articles (10-30k) | Long Art. (> 30k) | Score | Growth
1 en English 1.0 46 205 0 59 317 663 78.00 +0.34
2 de Deutsch 1.0 34 646 6 168 426 439 62.27 +0.85
3 fr Français 1.0 30 103 6 252 426 355 55.09 +0.48
4 es Español 1.1 27 654 7 283 449 300 51.11 +0.66
5 it Italiano 1.1 24 245 6 317 437 279 48.94 +0.99
6 ru Русский 1.4 23 564 1 373 414 251 45.86 +0.70
7 zh 中文 3.7 24 662 0 402 386 251 44.97 +0.37
8 ja 日本語 1.9 16 603 7 462 428 142 36.92 +0.46
9 pt Português 1.1 16 482 7 491 394 147 36.25 +0.95
10 pl Polski 1.1 15 013 16 559 346 118 32.14 +0.77
11 fi Suomi 1.1 13 724 20 597 306 116 30.64 +0.81
12 hu Magyar 1.1 14 985 86 519 312 122 30.64 +0.45
13 cs Čeština 1.3 13 069 47 569 329 94 29.21 +0.30
14 he עברית 1.2 12 005 30 574 359 75 28.74 +0.34
15 sv Svenska 1.1 12 169 4 651 287 97 28.57 +0.24
16 nl Nederlands 0.9 11 412 15 609 351 64 27.69 +0.10
17 vi Tiếng Việt 1.1 15 408 203 472 247 117 26.87 +0.19
18 no Norsk (Bokmål) 1.2 11 212 18 672 265 84 26.61 +0.27
19 sr Српски / Srpski 1.4 12 026 110 595 242 92 25.57 +0.54
20 uk Українська 1.3 10 664 26 667 280 66 25.46 +0.54
21 ca Català 1.1 10 592 1 705 267 66 25.31 +0.63
22 tr Türkçe 1.3 11 771 56 642 270 69 25.10 +0.37
23 hr Hrvatski 1.3 9 766 94 663 223 59 22.31 +0.46
24 sk Slovenčina 1.3 10 456 152 611 215 61 21.60 +0.29
25 ro Română 1.1 9 887 143 635 199 60 21.12 +0.43
26 da Dansk 1.2 8 404 48 782 159 50 19.98 +0.11
27 ko 한국어 2.5 7 984 76 750 164 49 19.75 +0.47
28 el Ελληνικά 1.1 10 111 242 546 195 56 19.57 +0.37
29 bg Български 1.1 8 141 116 696 191 36 19.08 +0.04
30 id Bahasa Indonesia 1.0* 6 486 55 805 151 28 17.76 -0.03
31 ar العربية 1.0 5 473 1 870 149 19 17.51 +0.11
32 sl Slovenščina 1.2 7 033 94 768 157 20 16.85 +0.22
33 gl Galego 1.0* 8 188 194 645 175 25 16.79 +0.06
34 eo Esperanto 1.1 5 903 2 909 104 24 16.48 +0.20
35 th ไทย 1.0 6 300 149 735 132 23 15.72 +0.13
36 fa فارسی 1.2 6 407 178 704 130 27 15.69 -0.02
37 lt Lietuvių 1.0* 5 338 106 812 113 8 14.29 +0.23
38 ms Bahasa Melayu 1.0* 8 067 336 542 131 30 14.29 +0.09
39 simple Simple English 1.0* 3 774 0 963 70 6 13.87 +0.12
40 nn Nynorsk 1.2** 5 542 229 705 88 17 12.94 +0.52
41 is Íslenska 1.0* 2 870 25 960 49 5 12.84 -0.01
42 sh Srpskohrvatski / Српскохрватски 1.0* 6 791 354 555 111 19 12.51 +0.19
43 et Eesti 1.0* 4 432 175 785 66 12 12.38 +0.22
44 bs Bosanski 1.0* 4 731 204 738 92 3 12.14 +0.14
45 la Latina 1.1 3 995 209 762 55 13 11.75 +0.15
46 lv Latviešu 1.0* 5 480 315 619 94 11 11.70 +0.28
47 eu Euskara 1.0* 4 323 244 717 70 8 11.43 +0.18
48 af Afrikaans 1.0* 8 911 528 388 92 31 11.07 +0.29
49 mk Македонски 1.0* 5 092 388 575 62 13 10.06 +0.26
50 cy Cymraeg 1.0* 2 374 203 814 19 3 9.81 +0.37
51 ka ქართული 1.0* 3 820 355 638 38 8 9.22 +0.56
52 zh-yue 粵語 3.7** 6 149 532 436 55 15 8.47 +0.94
53 br Brezhoneg 1.0* 4 572 461 524 45 9 8.39 +0.08
54 ps پښتو 1.0* 30 967 906 31 35 67 8.28 +0.09
55 bn বাংলা 1.0* 3 957 475 512 40 11 8.25 +0.26
56 ta தமிழ் 0.9 3 531 430 575 31 3 7.76 +0.12
57 ml മലയാളം 1.0* 5 188 527 453 53 6 7.69 +0.44
58 hi हिन्दी 1.0* 2 961 467 538 28 5 7.44 +0.33
59 sq Shqip 1.0* 4 549 547 440 45 7 7.30 +0.31
60 lb Lëtzebuergesch 1.0* 4 414 542 450 38 8 7.21 +0.17
61 qu Runa Simi 1.0* 2 624 471 536 32 0 7.10 +0.16
62 yi ייִדיש 1.0* 3 061 523 486 22 7 6.82 +0.13
63 sw Kiswahili 1.0* 3 060 453 574 11 1 6.71 +0.25
64 ga Gaeilge 1.0* 4 036 569 434 27 9 6.66 +0.06
65 oc Occitan 1.0* 3 660 548 453 34 3 6.59 +0.09
66 scn Sicilianu 1.0* 2 549 491 527 20 1 6.59 +0.06
67 ast Asturianu 1.0* 3 660 529 481 27 2 6.49 +0.14
68 tl Tagalog 1.0* 3 412 578 427 30 4 6.23 +0.30
69 nds Plattdüütsch 1.0* 4 813 661 336 27 15 6.19 +0.18
70 be-x-old Беларуская (тарашкевіца) 1.0* 4 252 606 391 42 0 5.98 +0.10
71 io Ido 1.0* 1 701 497 539 2 1 5.95 +0.18
72 ur اردو 1.0* 4 043 642 357 34 6 5.85 +0.22
73 zh-min-nan Bân-lâm-gú 1.2 1 984 536 497 6 0 5.57 +0.05
74 az Azərbaycan 1.0* 2 687 603 418 14 3 5.36 +0.39
75 te తెలుగు 1.0* 6 969 775 207 47 10 5.19 +0.24
76 su Basa Sunda 1.0* 9 751 830 139 58 12 5.12 +0.05
77 an Aragonés 1.0* 2 765 625 399 13 1 4.92 +0.17
78 ku Kurdî / كوردی 1.0* 2 303 628 399 12 0 4.78 +0.11
79 jv Basa Jawa 1.0* 3 509 653 369 16 1 4.73 +0.21
80 als Alemannisch 1.0* 7 888 812 175 44 8 4.52 +0.26
81 mn Монгол 1.0* 4 361 708 309 17 5 4.51 +0.25
82 be Беларуская 1.0* 3 951 714 297 26 2 4.48 +0.43
83 ia Interlingua 1.0* 2 788 681 343 14 1 4.36 +0.07
84 mr मराठी 1.0* 2 741 699 326 9 5 4.35 +0.21
85 fy Frysk 1.0* 4 009 727 292 19 1 4.03 +0.12
86 gd Gàidhlig 1.0* 2 247 702 329 8 0 3.86 +0.00
87 tg Тоҷикӣ 1.0* 2 079 712 319 6 2 3.86 +0.04
88 bat-smg Žemaitėška 1.0* 1 486 699 337 3 0 3.73 +0.62
89 vec Vèneto 1.0* 3 424 764 257 16 2 3.63 +0.06
90 li Limburgs 1.0* 3 769 771 251 16 1 3.46 +0.02
91 kn ಕನ್ನಡ 1.0* 5 099 834 181 16 8 3.39 +0.12
92 uz O‘zbek 1.0* 2 548 771 256 11 1 3.30 -0.03
93 vo Volapük 1.0* 1 696 751 285 1 2 3.28 +0.02
94 cv Чăваш 1.0* 2 653 776 256 7 0 3.04 +0.10
95 nah Nāhuatl 1.0* 2 302 793 239 4 2 2.92 +0.05
96 hy Հայերեն 1.2 2 601 804 225 10 0 2.83 +0.04
97 pam Kapampangan 1.0* 5 522 852 165 21 1 2.76 +0.58
98 ht Krèyol ayisyen 1.0* 1 413 798 238 2 1 2.73 +0.01
99 mt Malti 1.0* 5 775 888 127 19 5 2.65 +0.16
100 fo Føroyskt 1.0* 2 861 832 197 9 1 2.59 +0.06
101 fur Furlan 1.0* 2 903 819 213 7 0 2.58 +0.03
102 kk Қазақша 1.0* 4 539 848 178 11 1 2.47 +0.04
103 nrm Nouormand/Normaund 1.0* 2 041 810 229 0 0 2.45 +0.00
104 pms Piemontèis 1.0* 3 372 852 178 6 3 2.45 +0.03
105 sco Scots 1.0* 2 018 822 216 1 0 2.35 -0.01
106 bar Boarisch 1.0* 4 887 888 136 14 1 2.15 +0.06
107 zh-classical 古文 / 文言文 3.7** 5 303 883 141 15 0 2.15 +0.03
108 nov Novial 1.0* 1 624 861 174 4 0 2.03 +0.00
109 ceb Sinugboanong Binisaya 0.8 2 521 876 156 6 1 2.02 +0.02
110 wa Walon 1.0* 2 402 862 173 4 0 2.02 +0.02
111 lij Líguru 1.0* 1 500 866 172 1 0 1.88 +0.04
112 dv ދިވެހިބަސް 1.0* 3 436 908 118 11 1 1.83 -0.01
113 jbo Lojban 1.0* 1 248 868 171 0 0 1.83 -0.01
114 sa संस्कृतम् 1.0* 1 176 874 163 2 0 1.83 +0.00
115 wuu 吴语 3.7** 10 612 947 75 10 6 1.81 +0.07
116 gv Gaelg 1.0* 2 142 881 156 2 0 1.75 +0.05
117 nds-nl Nedersaksisch 1.0* 2 931 878 160 1 0 1.75 +0.03
118 ln Lingala 1.0* 1 782 891 144 4 0 1.71 +0.00
119 diq Zazaki 1.0* 2 863 885 151 2 0 1.70 +0.04
120 szl Ślůnski 1.0* 3 087 889 146 1 0 1.61 +0.09
121 am አማርኛ 1.0* 1 263 890 149 0 0 1.59 +0.01
122 ne नेपाली 1.0* 3 835 922 110 5 2 1.58 +0.02
123 rm Rumantsch 1.0* 3 575 928 102 7 2 1.58 +0.12
124 kw Kernewek/Karnuack 1.0* 2 058 895 144 0 0 1.54 +0.01
125 new नेपाल भाषा 1.0* 4 359 937 90 11 1 1.53 +0.02
126 frp Arpitan 1.0* 1 468 897 142 0 0 1.52 +0.01
127 os Иронау 1.0* 2 098 905 132 2 0 1.50 +0.05
128 ksh Ripoarisch 1.0* 2 396 915 121 2 1 1.48 +0.00
129 gan 贛語 3.7** 3 633 910 125 3 0 1.47 +0.07
130 ang Englisc 1.0* 1 840 916 121 2 0 1.38 -0.01
131 hsb Hornjoserbsce 1.0* 2 496 920 116 3 0 1.37 +0.05
132 vls West-Vlams 1.0* 3 638 933 101 3 1 1.31 +0.21
133 si සිංහල 1.0* 15 605 998 29 3 9 1.30 -0.04
134 ext Estremeñu 1.0* 4 337 945 86 8 0 1.26 n/a
135 hak Hak-kâ-fa / 客家話 1.0* 1 623 921 118 0 0 1.26 -0.29
136 ilo Ilokano 1.0* 2 785 925 113 1 0 1.25 +0.03
137 yo Yorùbá 1.0* 2 129 922 117 0 0 1.25 +0.04
138 lad Dzhudezmo 1.0* 3 392 939 95 5 0 1.23 +0.01
139 bpy ইমার ঠার/বিষ্ণুপ্রিয়া মণিপুরী 1.0* 4 740 944 91 1 2 1.21 +0.03
140 tpi Tok Pisin 1.0* 4 689 963 67 7 2 1.21 -0.01
141 lmo Lumbaart 1.0* 3 415 951 80 8 0 1.20 +0.09
142 arc ܐܪܡܝܐ 1.0* 1 168 928 111 0 0 1.19 +0.02
143 gu ગુજરાતી 1.0* 3 108 949 83 7 0 1.19 +0.03
144 se Sámegiella 1.0* 1 246 929 110 0 0 1.18 -0.01
145 wo Wolof 1.0* 2 615 946 89 3 1 1.18 +0.01
146 ay Aymar 1.0* 826 932 107 0 0 1.14 -0.01
147 gn Avañe'ẽ 1.0* 1 072 940 98 1 0 1.09 +0.03
148 cbk-zam Chavacano de Zamboanga 1.0* 19 448 1 007 19 7 6 1.08 +0.00
149 crh Qırımtatarca 1.0* 1 566 942 97 0 0 1.04 +0.11
150 ba Башҡорт 1.0* 1 935 952 85 2 0 0.99 +0.21
151 ie Interlingue 1.0* 2 266 960 76 2 1 0.99 +0.00
152 co Corsu 1.0* 1 894 950 88 1 0 0.98 +0.00
153 fiu-vro Võro 1.0* 1 249 947 92 0 0 0.98 +0.05
154 sc Sardu 1.0* 2 129 966 70 3 0 0.88 +0.02
155 nap Nnapulitano 1.0* 2 293 961 77 1 0 0.87 +0.00
156 so Soomaaliga 1.0* 2 321 974 62 2 1 0.84 -0.09
157 ky Кыргызча 1.0* 1 172 961 78 0 0 0.83 +0.00
158 war Winaray 1.0* 1 577 961 78 0 0 0.83 -0.01
159 bo བོད་སྐད་ 1.0* 886 962 77 0 0 0.82 -0.02
160 kab Taqbaylit 1.0* 2 386 973 63 3 0 0.80 -0.01
161 csb Kaszëbsczi 1.0* 1 672 965 74 0 0 0.79 +0.00
162 lo ລາວ 1.0* 1 613 965 74 0 0 0.79 +0.02
163 tt Tatarça / Татарча 1.0* 2 465 974 62 3 0 0.79 +0.02
164 pdc Deitsch 1.0* 1 445 967 72 0 0 0.77 +0.00
165 eml Emiliàn e rumagnòl 1.0* 2 542 976 62 0 1 0.76 +0.03
166 iu ᐃᓄᒃᑎᑐᑦ 1.0* 862 969 70 0 0 0.75 +0.04
167 pag Pangasinan 1.0* 2 232 979 57 3 0 0.74 +0.01
168 kaa Qaraqalpaq tili 1.0* 2 799 976 62 1 0 0.71 +0.02
169 bcl Bikol 1.0* 1 714 974 65 0 0 0.70 +0.02
170 sd سنڌي، سندھی ، सिन्ध 1.0* 16 443 1 022 8 5 4 0.68 +0.01
171 tk تركمن / Туркмен 1.0* 1 648 980 59 0 0 0.63 +0.03
172 km ភាសាខ្មែរ 1.0* 2 811 989 48 2 0 0.60 +0.01
173 mg Malagasy 1.0* 1 299 983 56 0 0 0.60 +0.00
174 my Burmese 1.0* 3 443 992 43 3 0 0.59 n/a
175 cu Словѣньскъ 1.0* 1 428 984 55 0 0 0.59 +0.01
176 pa ਪੰਜਾਬੀ 1.0* 5 504 1 006 26 7 0 0.58 +0.02
177 mi Māori 1.0* 2 539 994 43 2 0 0.55 +0.05
178 zea Zeêuws 1.0* 3 549 993 45 1 0 0.52 +0.01
179 na dorerin Naoero 1.0* 1 839 997 40 2 0 0.51 +0.02
180 kg KiKongo 1.0* 1 383 992 47 0 0 0.50 +0.00
181 rmy romani - रोमानी 1.0* 1 663 992 47 0 0 0.50 +0.00
182 map-bms Basa Banyumasan 1.0* 1 531 994 45 0 0 0.48 +0.00
183 tet Tetun 1.0* 2 805 994 45 0 0 0.48 -0.01
184 haw Hawai`i 1.0* 1 236 996 43 0 0 0.46 +0.02
185 ig Igbo 1.0* 3 360 1 009 29 0 1 0.41 +0.02
186 stq Seeltersk 1.0* 3 086 1 005 33 1 0 0.40 +0.03
187 roa-rup Armãneashce 1.0* 1 884 1 003 36 0 0 0.38 +0.00
188 udm Удмурт кыл 1.0* 3 080 1 003 36 0 0 0.38 -0.01
189 ks कश्मीरी / كشميري 1.0* 3 130 1 009 27 2 0 0.37 +0.01
190 myv Эрзянь (Erzjanj Kelj) 1.0* 1 409 1 005 34 0 0 0.36 +0.01
191 mzn مَزِروني 1.0* 1 384 1 012 27 0 0 0.29 -0.01
192 ty Reo Mā`ohi 1.0* 1 242 1 013 26 0 0 0.28 -0.01
193 ce Нохчийн 1.0* 2 045 1 014 25 0 0 0.27 -0.01
194 pap Papiamentu 1.0* 1 988 1 020 18 1 0 0.24 +0.01
195 to faka Tonga 1.0* 1 712 1 017 22 0 0 0.24 +0.02
196 roa-tara Tarandíne 1.0* 5 489 1 029 8 2 0 0.17 +0.01
197 sah Саха тыла (Saxa Tyla) 1.0* 2 264 1 027 12 0 0 0.13 +0.03
198 or ଓଡ଼ିଆ 1.0* 898 1 030 9 0 0 0.10 +0.00
199 pi पाऴि 1.0* 3 574 1 034 4 1 0 0.09 +0.01
200 bh भोजपुरी 1.0* 1 777 1 034 5 0 0 0.05 +0.00
201 glk گیلکی 1.0* 875 1 034 5 0 0 0.05 +0.00
202 hif Fiji Hindi 1.0* 1 255 1 036 3 0 0 0.03 n/a
  • Weights marked "*" have no weight available, so the default weight of 1.0 is used.
  • Weights marked "**" use the weight of a known related language (e.g. 'zh').
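
The two footnotes describe a fallback order for weights, which might look roughly like the sketch below. The names `KNOWN_WEIGHTS`, `RELATED` and `language_weight` are hypothetical, the weights are copied from a few rows of the table, and the mapping of the 3.7** rows to 'zh' is inferred from the matching weight rather than stated on this page.

```python
# A few measured weights copied from the table; the real list is longer.
KNOWN_WEIGHTS = {"en": 1.0, "de": 1.0, "ru": 1.4, "zh": 3.7, "ja": 1.9}

# Assumed related-language mapping for rows shown with "**" and weight 3.7.
RELATED = {"zh-yue": "zh", "zh-classical": "zh", "wuu": "zh", "gan": "zh"}

DEFAULT_WEIGHT = 1.0


def language_weight(code):
    """Return (weight, marker) where marker is "", "*" or "**" as in the table."""
    if code in KNOWN_WEIGHTS:
        return KNOWN_WEIGHTS[code], ""
    related = RELATED.get(code)
    if related in KNOWN_WEIGHTS:
        # No own weight, but a related language has one.
        return KNOWN_WEIGHTS[related], "**"
    # Nothing known: fall back to the default.
    return DEFAULT_WEIGHT, "*"


print(language_weight("zh-yue"))  # (3.7, '**')
print(language_weight("is"))      # (1.0, '*')
```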