List of Wikipedias by sample of articles

From Meta, a Wikimedia project coordination wiki
This is an archived version of this page, as edited by MarsRover (talk | contribs) at 02:41, 1 June 2008 (+[kaa:],[bcl:] to table). It may differ significantly from the current version.

This page contains a list of the largest Wikipedias under the auspices of the Wikimedia Foundation for various languages. Test Wikipedias are listed at the Wikimedia Incubator Wiki project.

This list of Wikipedias uses the List of articles every Wikipedia should have (total: 1051 articles as of 31 May 2008) as a sample. The list actually used is at the end of List of Wikipedias by sample of articles/Source code and can differ slightly. For every Wikipedia, the articles in this sample list are retrieved (based on interwiki links from the English Wikipedia) and the number of characters in each article is counted, excluding "comments" and the "interwiki" text at the bottom of the article. The size of each article is then adjusted for each language by multiplying it by the language weight.

The articles are divided into four classes: "absent" (i.e. non-existent; size = 0), "stubs" (size under 10,000 characters), "articles" (size between 10,000 and 30,000) and "long articles" (size over 30,000). The average weighted size of the non-absent articles in the sample is also calculated.

Finally, a score is computed from the following formula: rawscore = stubs + articles*4 + long.articles*9. To keep the scale consistent, the raw score is normalized by dividing by the maximum score and multiplying by 100. The maximum score, reached when every article in the sample is a long article, is maxscore = (absent + stubs + articles + long.articles)*9. The final score is score = rawscore / maxscore * 100. The language editions are then listed in order of decreasing score.
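The classification and scoring steps described above can be sketched in Python as follows. This is a minimal illustration and not the program from the Source code page. The names `classify`, `summarize` and `score` are hypothetical, and the handling of sizes of exactly 10,000 or 30,000 characters is an assumption, since the description does not say which class the boundaries fall into.

```python
from collections import Counter

STUB_LIMIT = 10_000   # below this weighted size, an article is a stub
LONG_LIMIT = 30_000   # above this weighted size, an article is a long article


def classify(weighted_size):
    """Return the class of one article from its weighted size in characters."""
    if weighted_size <= 0:
        return "absent"
    if weighted_size < STUB_LIMIT:
        return "stub"
    if weighted_size <= LONG_LIMIT:   # assumption: both boundaries count as "article"
        return "article"
    return "long"


def summarize(raw_sizes, weight):
    """Count the classes and average the weighted size of non-absent articles.

    raw_sizes holds one character count per sample article, 0 if missing.
    """
    weighted = [size * weight for size in raw_sizes]
    counts = Counter(classify(size) for size in weighted)
    present = [size for size in weighted if size > 0]
    average = sum(present) / len(present) if present else 0.0
    return counts, average


def score(absent, stubs, articles, long_articles):
    """Normalized score: rawscore / maxscore * 100."""
    raw_score = stubs + articles * 4 + long_articles * 9
    max_score = (absent + stubs + articles + long_articles) * 9
    return raw_score / max_score * 100


# Class counts taken from the English row of the table below.
print(round(score(0, 68, 327, 656), 2))  # 76.96
```

With the counts from the English row (0 absent, 68 stubs, 327 articles, 656 long articles), the raw score is 7280 and the maximum score is 9459, which gives the 76.96 shown in the table.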

A copy of the program used to obtain this list is in List of Wikipedias by sample of articles/Source code.

Absent articles for major Wikipedias are in List of Wikipedias by sample of articles/Absent Articles.

Last Update: 31 May 2008

Rank | Wiki | Language | Weight | Average article size (wt. chars) | Absent (0k) | Stubs (< 10k) | Articles (10-30k) | Long art. (> 30k) | Score | Growth
1 en English 1.0 44 689 0 68 327 656 76.96 +0.35
2 de Deutsch 1.0 33 765 9 185 425 431 60.99 +0.42
3 fr Français 1.0 29 236 7 270 430 344 53.77 +0.26
4 es Español 1.1 26 242 6 308 451 286 49.54 +0.93
5 it Italiano 1.1 22 967 9 340 442 260 47.02 +1.17
6 zh 中文 3.7 23 786 0 421 384 246 44.10 +0.23
7 ru Русский 1.4 22 338 5 406 407 233 43.67 +0.92
8 ja 日本語 1.9 16 175 10 483 422 136 35.89 +0.19
9 pt Português 1.1 15 772 14 508 392 137 34.98 +0.72
10 pl Polski 1.1 14 079 17 582 342 110 31.08 +0.23
11 hu Magyar 1.1 14 559 94 530 307 120 30.00 +0.42
12 fi Suomi 1.1 13 269 38 607 301 105 29.14 +0.19
13 cs Čeština 1.3 12 603 58 586 316 91 28.22 +0.45
14 sv Svenska 1.1 11 832 2 674 280 95 28.01 +0.09
15 he עברית 1.2 11 668 32 596 351 72 27.99 +0.22
16 nl Nederlands 0.9 11 231 18 624 346 63 27.22 +0.03
17 vi Tiếng Việt 1.1 15 224 214 478 241 118 26.47 +0.19
18 no Norsk (Bokmål) 1.2 10 971 21 689 259 82 26.04 +0.65
19 uk Українська 1.3 10 274 37 682 267 65 24.69 +0.36
20 ca Català 1.1 10 096 2 723 266 60 24.60 +0.36
21 sr Српски / Srpski 1.4 11 626 125 601 241 84 24.54 +0.82
22 tr Türkçe 1.3 11 190 62 654 265 66 24.49 +0.50
23 hr Hrvatski 1.3 9 307 113 671 213 54 21.24 +0.27
24 sk Slovenčina 1.3 10 332 162 620 210 59 21.05 +0.47
25 ro Română 1.1 9 732 166 630 201 54 20.30 +0.13
26 da Dansk 1.2 8 256 57 788 158 48 19.58 -0.04
27 ko 한국어 2.5 7 551 95 751 160 44 18.91 +0.51
28 el Ελληνικά 1.1 10 088 271 539 184 57 18.90 +0.28
29 bg Български 1.1 8 051 127 702 184 38 18.82 +0.34
30 id Bahasa Indonesia 1.0* 6 346 78 795 151 27 17.36 -0.08
31 ar العربية 1.0 5 497 37 853 141 20 16.88 +0.17
32 gl Galego 1.0* 7 985 206 645 177 23 16.49 +0.30
33 sl Slovenščina 1.2 6 920 107 769 157 18 16.48 +0.14
34 eo Esperanto 1.1 5 725 1 931 97 22 16.04 +0.15
35 th ไทย 1.0 6 225 165 737 125 24 15.36 +0.18
36 fa فارسی 1.2 6 194 199 703 122 27 15.16 +0.45
37 ms Bahasa Melayu 1.0* 8 003 360 530 131 30 14.00 +0.09
38 lt Lietuvių 1.0* 5 314 139 792 113 7 13.82 +0.11
39 simple Simple English 1.0* 3 587 0 982 63 6 13.62 +0.10
40 is Íslenska 1.0* 2 791 23 977 46 5 12.75 +0.04
41 nn Nynorsk 1.2** 5 362 262 689 85 15 12.31 +1.16
42 sh Srpskohrvatski / Српскохрватски 1.0* 6 633 368 556 110 17 12.15 +0.07
43 et Eesti 1.0* 4 420 207 766 64 12 11.97 +0.05
44 bs Bosanski 1.0* 4 794 262 696 90 3 11.45 +0.12
45 la Latina 1.1 3 857 225 761 53 12 11.43 +0.19
46 eu Euskara 1.0* 4 179 260 716 67 8 11.16 +0.10
47 lv Latviešu 1.0* 5 548 380 571 89 11 10.85 +0.32
48 af Afrikaans 1.0* 8 308 555 381 89 26 10.27 +0.29
49 mk Македонски 1.0* 4 899 410 569 57 13 9.68 +0.22
50 ka ქართული 1.0* 3 893 418 591 34 8 8.45 +0.01
51 br Brezhoneg 1.0* 4 517 486 511 45 9 8.16 +0.32
52 ps پښتو 1.0* 31 317 919 29 36 66 8.12 -0.28
53 cy Cymraeg 1.0* 2 485 351 683 14 3 8.10 +0.24
54 bn বাংলা 1.0* 3 864 502 498 40 10 7.92 +0.05
55 ta தமிழ் 0.9 3 505 462 556 29 3 7.40 +0.25
56 zh-yue 粵語 3.7** 6 974 650 335 51 15 7.13 +0.03
57 qu Runa Simi 1.0* 2 620 501 518 32 0 6.83 +0.12
58 lb Lëtzebuergesch 1.0* 4 461 579 428 36 8 6.81 +0.07
59 sq Shqip 1.0* 4 250 574 431 41 5 6.77 +0.19
60 yi ייִדיש 1.0* 3 180 543 476 23 7 6.68 +0.03
61 ml മലയാളം 1.0* 5 128 609 387 50 5 6.68 +0.34
62 hi हिन्दी 1.0* 2 584 520 506 20 4 6.58 +0.19
63 ga Gaeilge 1.0* 3 992 584 432 26 9 6.52 +0.96
64 scn Sicilianu 1.0* 2 498 508 522 20 1 6.46 +0.05
65 oc Occitan 1.0* 3 549 567 448 32 3 6.38 +0.05
66 ast Asturianu 1.0* 3 667 551 470 28 2 6.34 +0.02
67 sw Kiswahili 1.0* 3 049 494 545 11 1 6.32 +0.34
68 nds Plattdüütsch 1.0* 4 573 681 331 26 13 5.84 +0.30
69 io Ido 1.0* 1 694 515 534 1 1 5.78 -0.02
70 tl Tagalog 1.0* 3 250 617 404 27 3 5.70 +0.12
71 be-x-old Беларуская (тарашкевіца) 1.0* 4 093 635 378 38 0 5.60 +0.53
72 ur اردو 1.0* 4 151 678 333 33 6 5.49 +0.04
73 zh-min-nan Bân-lâm-gú 1.2 1 962 551 494 6 0 5.48 -0.02
74 su Basa Sunda 1.0* 9 470 840 143 55 12 4.98 +0.18
75 az Azərbaycan 1.0* 2 373 646 392 10 2 4.76 +0.06
76 ku Kurdî / كوردی 1.0* 2 233 645 395 11 0 4.64 +0.04
77 an Aragonés 1.0* 2 659 655 384 11 1 4.62 +0.02
78 ia Interlingua 1.0* 2 724 694 343 13 1 4.27 +0.06
79 als Alemannisch 1.0* 7 418 836 171 37 7 4.04 +0.21
80 mr मराठी 1.0* 2 793 733 305 8 5 4.04 +0.08
81 te తెలుగు 1.0* 6 799 846 160 38 7 3.96 +0.19
82 jv Basa Jawa 1.0* 3 176 721 317 13 0 3.90 +0.18
83 fy Frysk 1.0* 3 870 747 285 18 1 3.87 -0.01
84 gd Gàidhlig 1.0* 2 206 714 329 8 0 3.82 +0.58
85 tg Тоҷикӣ 1.0* 1 961 725 319 5 2 3.77 -0.01
86 mn Монгол 1.0* 4 066 771 260 17 3 3.75 +0.02
87 be Беларуская 1.0* 3 415 767 267 15 2 3.65 +0.34
88 vec Vèneto 1.0* 3 433 786 247 16 2 3.48 +0.02
89 li Limburgs 1.0* 3 689 790 246 14 1 3.29 +0.03
90 uz O‘zbek 1.0* 2 663 789 250 10 2 3.26 +0.12
91 vo Volapük 1.0* 1 635 765 283 1 2 3.22 +0.00
92 kn ಕನ್ನಡ 1.0* 5 140 857 170 15 8 3.20 +0.10
93 bat-smg Žemaitėška 1.0* 1 502 769 280 2 0 3.04 +0.03
94 cv Чăваш 1.0* 2 627 798 246 7 0 2.90 +0.05
95 nah Nāhuatl 1.0* 2 350 819 225 5 1 2.69 +0.08
96 ht Krèyol ayisyen 1.0* 1 365 812 236 2 1 2.67 +0.01
97 hy Հայերեն 1.2 2 519 822 221 8 0 2.67 +0.02
98 fur Furlan 1.0* 2 890 834 210 7 0 2.52 +0.03
99 nrm Nouormand/Normaund 1.0* 2 003 820 231 0 0 2.44 -0.01
100 mt Malti 1.0* 4 830 907 124 16 4 2.37 +0.08
101 pms Piemontèis 1.0* 3 407 871 171 6 3 2.35 +0.01
102 sco Scots 1.0* 1 926 833 217 1 0 2.34 +0.06
103 fo Føroyskt 1.0* 2 474 856 188 7 0 2.28 +0.03
104 kk Қазақша 1.0* 4 251 871 169 11 0 2.25 +0.18
105 pam Kapampangan 1.0* 5 807 910 124 16 1 2.08 +0.04
106 bar Boarisch 1.0* 4 856 906 130 14 1 2.06 +0.08
107 zh-classical 古文 / 文言文 3.7** 5 347 903 134 13 1 2.06 +0.05
108 nov Novial 1.0* 1 596 872 175 4 0 2.02 -0.01
109 wa Walon 1.0* 2 394 880 167 4 0 1.93 +0.10
110 dv ދިވެހިބަސް 1.0* 3 748 918 120 10 2 1.88 -0.01
111 ceb Sinugboanong Binisaya 0.8 2 181 893 153 5 0 1.83 +0.09
112 sa संस्कृतम् 1.0* 1 189 889 160 2 0 1.78 +0.00
113 lij Líguru 1.0* 1 484 885 166 0 0 1.75 +0.01
114 jbo Lojban 1.0* 1 235 886 165 0 0 1.74 +0.02
115 ln Lingala 1.0* 1 722 902 145 4 0 1.70 +0.01
116 nds-nl Nedersaksisch 1.0* 2 910 894 156 1 0 1.69 +0.10
117 diq Zazaki 1.0* 2 652 901 149 1 0 1.62 +0.01
118 ne नेपाली 1.0* 3 769 934 110 5 2 1.56 +0.09
119 kw Kernewek/Karnuack 1.0* 2 022 906 145 0 0 1.53 +0.00
120 am አማርኛ 1.0* 1 217 908 143 0 0 1.51 +0.09
121 wuu 吴语 3.7** 12 609 984 52 9 6 1.50 +0.05
122 frp Arpitan 1.0* 1 449 910 141 0 0 1.49 +0.00
123 new नेपाल भाषा 1.0* 4 355 954 85 11 1 1.46 +0.02
124 os Иронау 1.0* 1 937 917 133 1 0 1.45 -0.01
125 gv Gaelg 1.0* 2 170 922 127 2 0 1.43 +0.26
126 ang Englisc 1.0* 1 788 926 123 2 0 1.38 +0.05
127 ksh Ripoarisch 1.0* 2 227 929 120 0 1 1.37 +0.11
128 hsb Hornjoserbsce 1.0* 2 300 933 116 2 0 1.31 +0.07
129 rm Rumantsch 1.0* 2 355 954 90 6 1 1.30 +0.22
130 ilo Ilokano 1.0* 2 802 939 110 1 0 1.21 -0.03
131 tpi Tok Pisin 1.0* 4 659 974 68 7 2 1.21 +0.01
132 se Sámegiella 1.0* 1 203 939 112 0 0 1.18 -0.01
133 bpy ইমার ঠার/বিষ্ণুপ্রিয়া মণিপুরী 1.0* 4 772 959 88 1 2 1.16 +0.05
134 lad Dzhudezmo 1.0* 3 274 953 94 4 0 1.16 +0.02
135 wo Wolof 1.0* 2 579 959 88 3 1 1.15 -0.02
136 arc ܐܪܡܝܐ 1.0* 1 150 943 108 0 0 1.14 -0.01
137 ay Aymar 1.0* 780 943 108 0 0 1.14 +0.06
138 gu ગુજરાતી 1.0* 3 046 965 79 7 0 1.13 +0.02
139 yo Yorùbá 1.0* 1 909 945 106 0 0 1.12 +0.06
140 cbk-zam Chavacano de Zamboanga 1.0* 18 985 1 018 20 7 6 1.08 +0.04
141 vls West-Vlams 1.0* 3 673 967 80 2 1 1.03 +0.04
142 lmo Lumbaart 1.0* 3 923 979 64 8 0 1.01 +0.00
143 ie Interlingue 1.0* 2 237 972 76 2 1 0.98 +0.04
144 co Corsu 1.0* 1 889 965 85 1 0 0.94 +0.00
145 fiu-vro Võro 1.0* 1 205 963 88 0 0 0.93 +0.00
146 so Soomaaliga 1.0* 3 861 985 62 2 2 0.93 +0.03
147 crh Qırımtatarca 1.0* 1 569 965 86 0 0 0.91 +0.00
148 nap Nnapulitano 1.0* 2 234 975 75 1 0 0.84 +0.04
149 war Winaray 1.0* 1 543 972 79 0 0 0.84 +0.02
150 sc Sardu 1.0* 1 931 979 70 2 0 0.82 +0.03
151 si සිංහල 1.0* 13 282 1 022 21 3 5 0.82 +0.09
152 kab Taqbaylit 1.0* 2 466 985 63 3 0 0.79 +0.02
153 ky Кыргызча 1.0* 1 120 976 75 0 0 0.79 +0.03
154 csb Kaszëbsczi 1.0* 1 644 977 74 0 0 0.78 -0.02
155 tt Tatarça / Татарча 1.0* 2 438 986 62 3 0 0.78 +0.00
156 lo ລາວ 1.0* 1 620 978 73 0 0 0.77 +0.10
157 pdc Deitsch 1.0* 1 402 978 73 0 0 0.77 +0.01
158 eml Emiliàn e rumagnòl 1.0* 2 583 991 59 0 1 0.72 +0.00
159 pag Pangasinan 1.0* 2 186 993 55 3 0 0.71 -0.01
160 iu ᐃᓄᒃᑎᑐᑦ 1.0* 801 985 66 0 0 0.70 +0.08
161 ba Башҡорт 1.0* 2 113 989 61 1 0 0.69 +0.00
162 bcl Bikol 1.0* 1 694 987 64 0 0 0.68 n/a
163 sd سنڌي، سندھی ، सिन्ध 1.0* 17 292 1 035 7 5 4 0.67 +0.03
164 gn Avañe'ẽ 1.0* 1 493 997 53 1 0 0.60 +0.14
165 tk تركمن / Туркмен 1.0* 1 631 994 57 0 0 0.60 +0.00
166 km ភាសាខ្មែរ 1.0* 2 666 999 48 2 0 0.59 -0.09
167 mg Malagasy 1.0* 1 313 997 54 0 0 0.57 +0.03
168 cu Словѣньскъ 1.0* 1 357 1 000 51 0 0 0.54 +0.01
169 pa ਪੰਜਾਬੀ 1.0* 5 411 1 019 26 6 0 0.53 +0.00
170 mi Māori 1.0* 2 648 1 010 39 2 0 0.50 +0.00
171 rmy romani - रोमानी 1.0* 1 683 1 004 47 0 0 0.50 +0.01
172 kaa Qaraqalpaq tili 1.0* 2 324 1 008 42 1 0 0.49 n/a
173 kg KiKongo 1.0* 1 287 1 005 46 0 0 0.49 +0.00
174 tet Tetun 1.0* 2 779 1 005 46 0 0 0.49 +0.00
175 zea Zeêuws 1.0* 3 668 1 008 42 1 0 0.49 +0.14
176 map-bms Basa Banyumasan 1.0* 1 528 1 007 44 0 0 0.47 -0.01
177 na dorerin Naoero 1.0* 1 704 1 012 39 0 0 0.41 -0.02
178 udm Удмурт кыл 1.0* 3 068 1 014 37 0 0 0.39 -0.01
179 ig Igbo 1.0* 3 206 1 023 27 0 1 0.38 +0.02
180 roa-rup Armãneashce 1.0* 1 864 1 015 36 0 0 0.38 +0.00
181 ks कश्मीरी / كشميري 1.0* 3 160 1 022 26 2 0 0.36 +0.00
182 haw Hawai`i 1.0* 1 169 1 021 30 0 0 0.32 +0.00
183 mzn مَزِروني 1.0* 1 321 1 024 27 0 0 0.29 -0.01
184 ty Reo Mā`ohi 1.0* 1 178 1 024 27 0 0 0.29 +0.00
185 stq Seeltersk 1.0* 2 339 1 028 22 1 0 0.27 +0.07
186 ce Нохчийн 1.0* 2 057 1 026 25 0 0 0.26 +0.00
187 pap Papiamentu 1.0* 2 024 1 032 17 1 0 0.22 +0.08
188 to faka Tonga 1.0* 1 683 1 030 21 0 0 0.22 +0.03
189 roa-tara Tarandíne 1.0* 5 437 1 042 7 2 0 0.16 +0.00
190 or ଓଡ଼ିଆ 1.0* 856 1 042 9 0 0 0.10 +0.00
191 pi पाऴि 1.0* 3 543 1 046 4 1 0 0.08 +0.00
192 bh भोजपुरी 1.0* 1 688 1 046 5 0 0 0.05 +0.00
193 glk گیلکی 1.0* 848 1 046 5 0 0 0.05 +0.00
  • Weights marked "*": no language weight is available, so the default weight of 1.0 is used.
  • Weights marked "**": the weight of a known related language (e.g. 'zh') is used.
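
The fallback rule in these two notes could be expressed as a small lookup, sketched below in Python. The function name `resolve_weight` and both dictionaries are hypothetical. The known weights are copied from a few rows of the table. The related-language mapping is an assumption inferred from the "**" rows: the page only names 'zh' as an example, and the pairing of 'nn' with 'no' is a guess based on their matching weight of 1.2.

```python
DEFAULT_WEIGHT = 1.0

# A few weights copied from the table above (not the full set).
KNOWN_WEIGHTS = {"en": 1.0, "zh": 3.7, "no": 1.2, "ceb": 0.8}

# Assumed mapping from a Wikipedia without its own weight to a related language.
RELATED_LANGUAGE = {
    "zh-yue": "zh",
    "zh-classical": "zh",
    "wuu": "zh",
    "nn": "no",  # assumption: inferred from the matching weight of 1.2
}


def resolve_weight(code):
    """Return (weight, marker) for a language code.

    marker is "" for a known weight, "**" for a weight borrowed from a
    related language, and "*" for the default weight.
    """
    if code in KNOWN_WEIGHTS:
        return KNOWN_WEIGHTS[code], ""
    related = RELATED_LANGUAGE.get(code)
    if related in KNOWN_WEIGHTS:
        return KNOWN_WEIGHTS[related], "**"
    return DEFAULT_WEIGHT, "*"


for code in ("zh", "wuu", "af"):
    weight, marker = resolve_weight(code)
    print(f"{code}: {weight}{marker}")
```

Returning the marker together with the weight keeps the table's footnote symbols tied to the rule that produced each value.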