Wikimedia Blog/Drafts/Swedish Wikipedia hits a million articles

Swedish Wikipedia hits a million articles

On June 15, 2013, Swedish Wikipedia hit one million articles, joining the club of the English, Dutch, German, French, Italian, Russian and Spanish Wikipedias. The article that broke the barrier was about the butterfly species Erysichton elaborata. There is, however, one fact that separates this million-article milestone from almost all others.

The one-millionth article was not manually created by a human, but written by a piece of software (a "bot"). The bot in this case, Lsjbot, collects data from different sources and then compiles the information into a format that fits Wikipedia. To date, Lsjbot has created about 454,000 articles, almost half of the articles on Swedish Wikipedia.

Lsj, Sverker Johansson, who runs Lsjbot

Bot-created articles have led to some debate, both before Lsjbot started its run and now. First, there was a lengthy discussion on Swedish Wikipedia after the initial proposal by Lsjbot's operator, science teacher Sverker Johansson. The Swedish Wikipedia community was wary, having learned the lessons from previous conflicts about article-creating bots, including rambot in 2002. But there was also curiosity, so a series of test runs was made to make sure that the articles turned out okay.

After review, the Swedish Wikipedia editor community gave its approval. Lsjbot started by creating articles about different species of animals and plants, articles that are largely uncontroversial and that can share a similar format without feeling mechanical.

Other criticism has come from prolific article writer Achim Raschka in German Wikipedia's Kurier. Here the main complaint was that the article is short: only four sentences long. This is a valid complaint. Even if longer articles are not always better, they tend to contain more information.

Therein lies the rub. The bots use as many datasets as their operators can find, but many sources are behind paywalls or are incomplete across an entire taxon (covering only selected species). The upside is that each statement in bot-created articles is supported by references, something that is not true of many other articles. This means that more references are added to Wikipedia by bots than by humans. This is of course not in itself a sign of quality, but it gives human contributors a starting point to search for more information. As with any article on Wikipedia, readers can also help make bot-created articles better.

Is this the future for Wikipedia, to let software create articles? With Wikidata, it is certainly becoming easier to use software to create articles, something that can benefit the smaller Wikipedias. But we still need more humans to help determine which sources are high quality, what information is presented correctly and what qualifies as clear writing.

So far, bots have shown that they are much quicker to create articles. In that respect, I, for one, bow to our robot overlords.

Lennart Guldbrandsson, Swedish Wikipedia editor

Notes