Talk:List of Wikipedias by expanded sample of articles

From Meta, a Wikimedia project coordination wiki

Stats

I see the mean and median sizes are identical for every WP here. I realize this is statistically possible, but it seems a bit implausible! :-) I also like the idea of using the alternate language weights. This will be useful. A. Mahoney (talk) 12:25, 21 August 2013 (UTC)

Oops, that was a mistake. Thanks for spotting it. Boivie (talk) 06:25, 22 August 2013 (UTC)

Hello, Boivie. How long does your bot make this sheet? :) Zemliakov (talk) 09:38, 6 September 2013 (UTC)

I don't understand exactly what you're asking for. But it takes a few hours to run the code, and I intend to run it once a month for a while. When I have found time to clean up the messy parts of the code I plan to publish it here somewhere, so it will be easy for someone else to update this page when I no longer do it. Boivie (talk) 16:57, 6 September 2013 (UTC)

Any ideas why (760*2 + 1453*3 + 7721*4) / 400 ≠ 92.30 for enwiki? Since I see the same for other wikis, I have no complaints, but I am curious. --Igel B TyMaHe (talk) 19:18, 17 March 2014 (UTC)

That score should show the percentage of the maximum possible points. The formula as written assumes that there are 10000 articles in the list. So it should really be 100 * (stubs*2 + articles*3 + long.articles*4) / (total.items*4). That means enwiki should get 100 * (760*2 + 1453*3 + 7721*4) / (9957*4) = 92.30. Boivie (talk) 20:37, 17 March 2014 (UTC)
16 May 2014. enwiki: 100*(755*2+1457*3+7765*4)/(9957*4) = 92.75 ≠ 92.35. (755*2+1457*3+7765*4)/400 = 92.35. Is total.items now 10000? --Igel B TyMaHe (talk) 09:18, 25 May 2014 (UTC)
Yes, it was 10000 on the 16th of May. I forgot to update the number at the top of the page. Boivie (talk) 19:43, 25 May 2014 (UTC)
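The arithmetic in this thread can be checked with a short sketch. This is not the bot's actual code; the function name `score` and its parameter names are made up here, and only the formula and the enwiki figures come from the comments above.

```python
def score(stubs, articles, long_articles, total_items):
    """Percentage of the maximum possible points, per the formula quoted above.

    Stubs earn 2 points, articles 3, long articles 4; absent items earn 0.
    The maximum is 4 points for every item in the list.
    """
    points = stubs * 2 + articles * 3 + long_articles * 4
    max_points = total_items * 4
    return 100 * points / max_points


# enwiki, March 2014, with 9957 items in the list
march = round(score(760, 1453, 7721, 9957), 2)
# enwiki, 16 May 2014, with 10000 items in the list
may = round(score(755, 1457, 7765, 10000), 2)
print(march, may)
```

Dividing by 400 gives the same result as this formula only when total_items is exactly 10000, which is why the two calculations disagreed while the list held 9957 items.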

"Shortest"

I wonder what the point of the "shortest articles" listing is. At this scale, it only displays a 200-entry subset of missing articles anyway (except for :enwiki). Perhaps something like the Neglected article list from the List of Wikipedias by sample of articles would be more useful. — Yerpo Eh? 09:56, 31 March 2015 (UTC)

The point is to answer the question (that no one has asked): "If I want to improve my Wikipedia, where should I start?". So I suppose it's similar to the point of the Neglected page. I see some problems with using the Neglected page here. First, I see the Neglected page as a complement to the Absent Articles page: "What can I do besides creating the absent articles?" And here we don't have a page for (all) absent articles, because it would be too large. Secondly, I don't really like the edge factor. It seems to be more focused on improving scores than on improving Wikipedia. But the popularity factor is carried over to this page in a way. The absent articles are sorted with the most popular first, so you get the 200 most popular articles that are absent in each Wikipedia. Popularity here is counted as the number of languages that have the article. Boivie (talk) 12:53, 31 March 2015 (UTC)
Oh, if they are sorted by popularity, then it makes much more sense, yes. Sorry, I didn't look at it too closely, so I thought they were only selected by name or position within the expanded list of articles. — Yerpo Eh? 14:14, 31 March 2015 (UTC)
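The selection described above (absent articles, most popular first, capped at 200) might look roughly like the sketch below. The function `most_popular_absent`, the `sitelinks` mapping, and the sample item IDs are all hypothetical illustrations, not taken from the actual script.

```python
def most_popular_absent(sitelinks, wiki, limit=200):
    """Return up to `limit` items missing from `wiki`, most popular first.

    `sitelinks` maps an item ID to the set of wikis that have the article.
    Popularity is the number of wikis that have the article.
    """
    absent = [item for item, wikis in sitelinks.items() if wiki not in wikis]
    # Sort by descending popularity; break ties by item ID for a stable order.
    absent.sort(key=lambda item: (-len(sitelinks[item]), item))
    return absent[:limit]


# Made-up sample data for illustration only.
sample = {
    "Q1": {"enwiki", "dewiki", "frwiki"},
    "Q2": {"enwiki"},
    "Q3": {"enwiki", "dewiki", "slwiki"},
    "Q4": {"enwiki", "dewiki"},
}
result = most_popular_absent(sample, "slwiki")
print(result)
```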

Maithili

I suggest adding :maiwiki to the list. The pywikipedia framework was finally updated this month, so the wiki no longer registers as missing. Plus, the community seems to be quite active. — Yerpo Eh? 07:10, 16 June 2015 (UTC)

Please update

Please update the list early every month. It will be more useful. --AJITH MS (talk) 17:16, 7 September 2015 (UTC)

I've been trying to update this list on the 16th of each month. Why would it be more useful if it was updated on another date? Boivie (talk) 05:56, 8 September 2015 (UTC)

Internet access here is very limited, so we only get it early each month. I understand the situation. Sorry for my suggestion, and thanks for the information. --AJITH MS (talk) 10:11, 8 September 2015 (UTC)

Gothic Wikipedia

Why is the language column for Gothic Wikipedia ðミフᄇðミフ﾿ðミヘトðミフᄚðミヘツðミフᄚðミフᄊðミフᄈðミフᄚ, and not 𐌲𐌿𐍄𐌹𐍃𐌺 as in List of Wikipedias by sample of articles? Hanif Al Husaini (talk) 13:23, 26 February 2017 (UTC)

It's a code table issue. I fix the List of Wikipedias by sample of articles by hand every month (but not the sub-pages - see e.g. List of Wikipedias by sample of articles/Stubs). — Yerpo Eh? 15:56, 26 February 2017 (UTC)
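As an illustration of how a code table (character encoding) mismatch produces this kind of output, the sketch below decodes UTF-8 bytes with the wrong table. That the bot's output passes through Latin-1 specifically is an assumption made for the example; the real pipeline may mangle the text differently.

```python
original = "𐌲𐌿𐍄𐌹𐍃𐌺"

# Each Gothic letter lies outside the Basic Multilingual Plane,
# so it takes four bytes in UTF-8.
utf8_bytes = original.encode("utf-8")

# Reading those bytes with a single-byte code table (Latin-1 here, as an
# assumption) turns every byte into its own character, starting with "ð".
garbled = utf8_bytes.decode("latin-1")

# The damage is reversible as long as no bytes were lost along the way.
repaired = garbled.encode("latin-1").decode("utf-8")

print(len(utf8_bytes), repr(garbled[:4]), repaired == original)
```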

Absent Articles page

It would be helpful if the Absent Articles page (https://meta.wikimedia.org/wiki/List_of_Wikipedias_by_expanded_sample_of_articles/Shortest) could be extended to all language wikis. This could help editors easily identify missing articles and start them. Currently this page is populated for the first 40 wikis only. —The preceding unsigned comment was added by 132.183.13.69 (talk) 13:30, 5 July 2017

Unfortunately, such a page would be huge, so it is not practically possible. If the community is active and diverse, I encourage someone to figure out how to run the script locally and make a separate list somewhere in the project space. It can be easily modified to show all absent articles for one language. — Yerpo Eh? 16:49, 6 July 2017 (UTC)
I've done this for Latin -- I set it up as a copy of this list, but with links to the Latin pages if they exist, or a selection of other languages if they don't. See la:Vicipaedia:Paginae_quas_omnibus_Wikipediis_contineri_oportet/Expansio for the list, and see la:Usor:Amahoney/Myrias_epitome for our statistics. I'm happy to share the Perl code if it's useful. A. Mahoney (talk) 16:57, 12 July 2017 (UTC)

Weights of Chinese wikipedias

I noticed that the weights of zh.wiki and zh-classical.wiki are both 3.786. I think the weight for zh-classical.wiki should be higher, because Classical Chinese uses much shorter sentences to express the same thing.

Language | Example 1 | Example 2 | Example 3
Chinese | 走一千里路,是从迈第一步开始的。(14) | 我怎么能够将你比作夏天? / 你比夏天更美丽温婉 (20) | 过氧乙酸可以通过乙醛的自氧化反应制得。(18)
Classical Chinese | 千里之行,始于足下。(8) | 卿如夏日,载欣载和。 / 西风列列,众芳独嗟。(16) | 过氧乙酸者,乙醛自氧化制之。(12)
English | A journey of a thousand li begins with a single step. | Shall I compare thee to a summer’s day? / Thou art more lovely and more temperate | Peracetic acid is produced industrially by the autoxidation of acetaldehyde.

--Leiem (talk) 15:53, 8 July 2018 (UTC)
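The character counts given in parentheses above can be reproduced with a small sketch that counts letters and skips punctuation. The helper name `count_letters` is made up for this example, and the resulting ratios only describe these three sentence pairs; they are not a proposed weight.

```python
import unicodedata


def count_letters(text):
    """Count characters in a Unicode letter category, ignoring punctuation."""
    return sum(1 for ch in text if unicodedata.category(ch).startswith("L"))


# (modern Chinese, Classical Chinese) pairs from the comparison above
examples = [
    ("走一千里路,是从迈第一步开始的。", "千里之行,始于足下。"),
    ("我怎么能够将你比作夏天?你比夏天更美丽温婉", "卿如夏日,载欣载和。西风列列,众芳独嗟。"),
    ("过氧乙酸可以通过乙醛的自氧化反应制得。", "过氧乙酸者,乙醛自氧化制之。"),
]

counts = [(count_letters(m), count_letters(c)) for m, c in examples]
for modern, classical in counts:
    # Ratio of modern to classical length for the same content
    print(modern, classical, round(modern / classical, 2))
```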

Redirects are not accounted for in the absent column

For example, if I click on the absent articles for the Russian wiki, the first one is "elephant". This article doesn't exist there and redirects to elephantine. Yanpas (talk) 21:51, 20 July 2018 (UTC)

The script relies completely on Wikidata, so if a redirect is included there, it will be counted as an article. I'm not sure what the current policy is about listing redirects in Wikidata items, but it could probably be removed. In a wider context, it's a problem of content organization. Do we describe organisms in line with the common (usually English) use of their name or in line with taxonomy? We haven't really come to a consensus about it yet. — Yerpo Eh? 06:43, 22 July 2018 (UTC)

Please help updating this

The list was supposed to be updated around 16 August, but it still has not been updated a week later. Could somebody help update it? Thank you very much. --Yaukasin (talk) 04:47, 23 August 2018 (UTC)

It seems like I don't have time to get the script working on my computer, so I won't be able to keep on updating this list monthly anymore. If someone else would like to run the script and update the list, please do! A version of the script is at List of Wikipedias by expanded sample of articles/Source code. Boivie (talk) 09:59, 23 October 2018 (UTC)
@Boivie: I have tried to run this with pywikibot, but it seems that the print statements in the code are out of date and it requires a json module that I don't have. -Theklan (talk) 10:02, 28 October 2018 (UTC)
Yes, that kind of print statement was okay in Python 2, but not in Python 3, which is mostly used nowadays. I don't think it should be too difficult to install a module if you can control your environment. But I can't guarantee that you won't run into more problems along the way. Boivie (talk) 04:29, 31 October 2018 (UTC)
I've taken the liberty of updating the script so it doesn't return a ton of 'rvslots' notifications, and so it includes language editions that were started since the last update. Unfortunately, I cannot take responsibility for updating both sample rankings, but I can help occasionally. — Yerpo Eh? 14:33, 2 November 2018 (UTC)
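For anyone porting the print calls, the difference discussed above looks like this. It is a generic sketch, not a patch to the published script; the helper name `report` and the message text are made up.

```python
import io
import sys

# Python 2 statement form (a syntax error in Python 3):
#     print "Processing", name
#     print >>sys.stderr, "warning"
#
# Python 3 function form:


def report(message, stream=sys.stdout):
    """Write one line to the given stream using the Python 3 print function."""
    print(message, file=stream)


report("Processing enwiki")

# The same helper can write to any file-like object, e.g. for logging.
buffer = io.StringIO()
report("warning", stream=buffer)
```

The `2to3` tool that shipped with older Python 3 releases could convert such statements automatically, although the output still needs to be reviewed by hand.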

Does anyone have any idea why the script would return "Q902 has no wikidata item" and then quit (also: "UnboundLocalError: local variable 'pagetext' referenced before assignment")? I stumbled upon that error with the expanded list for Djibouti (Q977), which is why I didn't update it two weeks ago, but now it's happening with the 1000 list too. I would imagine this to happen if there was a redirect linked, but I clicked through all the interwikis and didn't find any such case. — Yerpo Eh? 18:51, 6 November 2018 (UTC)
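One common way an UnboundLocalError like this arises is when a variable is only assigned on the success path and then used afterwards regardless. The sketch below shows that general pattern with made-up names (`fetch_text_buggy`, `fetch_text_safe`, a plain dict instead of Wikidata); whether the script fails for this exact reason is an assumption.

```python
def fetch_text_buggy(item_id, pages):
    """'pagetext' is only bound when the lookup succeeds."""
    try:
        pagetext = pages[item_id]
    except KeyError:
        print(f"{item_id} has no wikidata item")
    # If the lookup failed, 'pagetext' was never assigned.
    return pagetext


def fetch_text_safe(item_id, pages):
    """Bind a default first so the failure path is well defined."""
    pagetext = None
    try:
        pagetext = pages[item_id]
    except KeyError:
        print(f"{item_id} has no wikidata item")
    return pagetext


pages = {"Q977": "Djibouti"}  # hypothetical stand-in for fetched items

try:
    fetch_text_buggy("Q902", pages)
    failed = False
except UnboundLocalError:
    failed = True

print(failed, fetch_text_safe("Q902", pages), fetch_text_safe("Q977", pages))
```

With the default in place, the caller can skip items that returned None instead of the whole run stopping.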

@Yerpo: @Boivie: I can't get it running. I get this error message:
Traceback (most recent call last):
  File "####\core\pwb.py", line 263, in <module>
    if not main():
  File "####\core\pwb.py", line 256, in main
    run_python_file(filename, [filename] + args, argvu, file_package)
  File "####\core\pwb.py", line 121, in run_python_file
    main_mod.__dict__)
  File ".\scripts\ListExpandedSample.py", line 15, in <module>
    import simplejson as json
ModuleNotFoundError: No module named 'simplejson'
<class 'ModuleNotFoundError'>
CRITICAL: Closing network session.
I don't know what to do now. -Theklan (talk) 18:58, 8 November 2018 (UTC)
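The traceback above says that the third-party simplejson package is not installed. Two usual ways around that are installing it (for example with `pip install simplejson`) or falling back to the json module in the standard library, as sketched below. That the script only uses calls both modules share, such as `loads` and `dumps`, is an assumption.

```python
try:
    # Third-party package; present only if it was installed separately.
    import simplejson as json
except ImportError:
    # ModuleNotFoundError is a subclass of ImportError, so this also
    # catches the error shown in the traceback. The standard library
    # module offers the same basic loads/dumps interface.
    import json

data = json.loads('{"entities": {"Q902": {"sitelinks": {}}}}')
print(sorted(data["entities"]))
```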