Logos.Wiktionary

From Meta, a Wikimedia project coordination wiki

Ultimate Wiktionary is a proposed project to bring the best of all the Wiktionaries together. It will give us one project where all lexicological content of all the Wiktionaries and the communities of all the Wiktionaries can be merged. The intention of this project is to create synergy and prevent the endless duplication of labour.

There is much that can be read about what we can do once we have an important lexicological resource. In his keynote speech at Wikimania, Jimbo named words and lexicological information as one of the ten important information areas that should be Free.

Even with all the information we have in the Wiktionaries, we are only scratching the surface of the lexicological content we need in order to present comprehensive information to our public. There are many languages, and the Dutch language alone has over 222,930 words. When you consider that the Dutch Wiktionary has only 19,577 words, and that many of the articles we have are for words in one of 293 languages, you will understand that our aim of having all words in all languages is a bit of an undertaking.

Logos is a company based in Italy that shares the dream that we, the Wiktionarians, have: for many years they have been working on a resource where you can find free lexicological information on the Internet. Their dicologos dictionary now has more than 8,000,000 words in 232 languages. This resource, which started within the Logos company, now boasts a community of more than 6,000 registered users.

While working on Ultimate Wiktionary, we paid a lot of attention to getting the content needed to make our dream come true. We found the content of Logos and their community rather quickly: Sabine knew them professionally, and Sj got into contact with them while working on Wikimania 2005. This resulted in a presentation by Rodrigo Vergara at Wikimania, where he demonstrated this apple of his eye. We had a lot of contact with Sj prior to Wikimania, and in the week before Wikimania he even went to Modena to talk to Logos about cooperation in the new project.

For Rodrigo there was one thing that was really important: he wanted recognition for the work done by the community of translators that created this most beautiful resource. This was discussed in a meeting with Jimbo and later with other members of the board. It is therefore proposed to use both the names "Logos" and "Wiktionary" in the new project by putting it at the URL http://logos.wiktionary.org. Doing this would be in keeping with the tradition in Chile, where Rodrigo is from, of giving children the names of both parents. The Logos Wiktionary would be a beautiful name: logos means word, and Wiktionary is a combination of wiki and dictionary, so it expresses very well what the project is about.

One of the other important consequences of this cooperation is that this huge and important resource would be licensed under the GFDL. Where it has always been freely available as a reference, it would now be free to use in many more new ways.

Some numbers

The following list is a breakdown of the number of words per language in the Logos "dicologos" dictionary. The counts cover headwords only; with inflections, the numbers would be much bigger.

ENGLISH - 947147
ITALIAN - 678000
SPANISH - 410740
FRENCH - 481497
GERMAN - 467000
ESPERANTO - 106470
PORTUGUESE - 85247
SWEDISH - 61874
LATIN - 56660
ARABIC - 52982
PERSIAN - 50876
DUTCH - 49956
BASQUE - 41106
DANISH - 37785
POLISH - 26380
FINNISH - 24162
GREEK - 22103
JAPANESE - 911
NORWEGIAN - 20720
RUSSIAN - 19639
TURKISH - 19480
CATALAN - 19170
SLOVAK - 18869
CHINESE - 18767
HUNGARIAN - 18623
ROMAGNOLO - 18145
ALBANIAN - 17835
HINDI - 16677
GALICIAN - 16118
CZECH - 15824
VALENCIAN - 15503
FURLAN - 13141
GUJARATI - 12571
WELSH - 12461
MARATHI - 11512
PUNJABI - 11248
SANSKRIT - 11025
BRETON - 10854
ROMANIAN - 10673
SERBIAN - 9867
MAPUNZUGUN - 8647
GUARANI - 8494
HEBREW - 7830
CALABRESE - 7188
BRAZILIAN PORTUGUESE - 7119
VENETIAN - 7046
LATVIAN - 6181
FLEMISH - 5327
CROATIAN - 5242
IRISH - 5042
PIEMONTESE - 4805
SARDINIAN CAMPIDANESU - 4773
ESTONIAN - 4445
BRESCIANO - 4434
BULGARIAN - 4320
MUDNéS - 3988
SICILIAN - 3807
TRADITIONAL CHINESE - 3555
NAPULITANO - 3443
INDONESIAN - 2914
KOREAN - 2777
AFRIKAANS - 2751
UKRAINIAN - 2724
AYMARA - 2658
BOLOGNESE - 2496
LIMBURGIAN - 2418
ZENEIZE - 2279
DZORATÂI - 2165
ARAGONES - 2000
ROMAN - 1910
RAPANUI - 1905
REGGIANO - 1820
QUECHUA - 1803
THAI - 1761
YIDDISH - 1741
PADUAN - 1694
GRIKO SALENTINO - 1632
LITHUANIAN - 1606
JUDEO-SPANISH - 1582
BENGALI; BANGLA - 1573
PARMIGIANO - 1507
KURDISH KURMANJI - 1486
ICELANDIC - 1472
SLOVENIAN - 1432
TRIESTINO - 1429
MOKSHAN - 1396
MALTESE - 1354
ASTURIAN - 1254
MANTUAN - 1236
OCCITAN - 1189
LOMBARDO OCCIDENTALE - 1165
PAPIAMENTU - 1142
SARDINIAN LOGUDORESU - 1125
LUNFARDO - 1117
WALLON - 1093
LEONESE - 1051
AZERI (LATIN SCRIPT) - 964
SWAHILI - 902
KURDISH SORANI - 885
OLD GREEK - 865
MAASAI - 846
SAAMI - 801
MALAY - 724
MALAGASY - 710
FERRARESE - 703
GALEGO EONAVIEGO - 679
MARCHIGIANO - 650
SARDINIAN (LIMBA SARDA UNIFICADA) - 604
LADIN - 576
CATANESE - 557
BERGAMASCO - 538
UMBRO-SABINO - 513
PROVENÇAL - 497
CORNISH - 490
FRISIAN - 477
SAMMARINESE - 411
BYELORUSSIAN - 404
FAEROESE - 384
SWISS GERMAN - 377
LUXEMBOURGISH - 357
MONGOLIAN - 347
TREVISAN - 339
CALÓ - 333
MACEDONIAN - 332
VIESTANO - 293
BOSNIAN - 281
TURKMEN - 277
PERUGINO - 274
COSENTINO - 271
CHECHEN - 269
MAORI - 262
SAMOAN - 247
CORSICAN - 240
MODENESE ORIENTALE - 238
KONKANI - 231
SCOTS GAELIC - 223
PUGLIESE - 223
SHONA - 216
SANGO - 216
MANX - 212
TAGALOG - 211
ROMANSH - 207
KAZAKH - 205
URDU - 202
VIETNAMESE - 198
GEORGIAN - 189
LINGALA - 176
BIELLESE - 172
ARMENIAN - 162
LECCESE - 158
SOMALI - 149
FANESE - 146
AZERI (ARABIC SCRIPT) - 131
UZBEK - 101
LIGURIAN - 96
TAJIK - 92
WOLOF - 92
OLD NORSE - 73
PUTENZESE - 69
ZULU - 64
NUORESE - 53
FIJI - 50
TATAR - 49
MAYA - 46
CANADIAN FRENCH - 45
SCHWÄBISCH - 43
REGGIANO ARSÀVE - 42
KIRGHIZ - 40
(AFAN) OROMO - 40
SUNDANESE - 39
ARAMAIC - 38
BISLAMA - 38
UIGHUR - 37
HAWAIIAN - 33
NAHUATL - 32
KUNZA - 23
AMHARIC - 22
TIGRINYA - 22
BIHARI - 19
INTERLINGUE - 19
INNU - 19
SINGHALESE - 19
GREENLANDIC - 14
ROMANI - 14
SETSWANA - 14
SRANAN - 14
TAMIL - 13
PASHTO; PUSHTO - 12
SPANGLISH - 10
CHINOOK - 8
LOMBARDO ORIENTALE - 8
XHOSA - 8
TRENTINO - 8
CREE - 6
MIRANDESE - 6
CAMBODIAN - 6
DARI - 5
HAUSA - 4
VOLAPUK - 4
KANNADA - 4
AFAR - 3
BANTU - 3
BURMESE - 3
SESOTHO - 3
RHAETO-ROMANCE - 3
MALAYALAM - 3
BASHKIR - 3
ABKHAZIAN - 3
JAVANESE - 3
BEHDINI - 2
YORUBA - 2
TONGA - 2
KINYARWANDA - 2
KIRUNDI - 2
NEPALI - 2
KAWÉSQAR(ALACALUFE) - 2
LAOTIAN - 2
ASSAMESE - 1
TELUGU - 1
ORIYA - 1
MAYANGNA - 1
HMONG - 1
ILUKO - 1
BHUTANESE - 1