Talk:Lingua Libre

From Meta, a Wikimedia project coordination wiki

/Archive 2020/Special user-rights

Hi everyone. This page is opened as a contingency plan. You can continue or open conversations here, store information, or anything else. We will also periodically gather relevant status information here and ping everyone once we start coming back online. Please add your username below if you wish to be notified of our progress.

{{ping|Pamputt|Olaf|Poslovitch|WikiLucas00|Poemat|Eihel|Titodutta|सुबोध कुलकर्णी|Subodh (CIS-A2K)|Lyokoï|LoquaxFR|DSwissK|Vami|Bicolino34}}

Yug (talk) 13:07, 11 March 2021 (UTC)

Server fire and Backup?

Millions of websites offline after fire at French cloud services firm --Reuters.com
See the official Wikimedia France message in the #Lingua_Libre_2.3_-_Phoenix_Edition_ǃ section.
TL;DR: All recorded audios are already safely on Wikimedia Commons. 3 weeks are required to restore the website and database into proper shape. Some documentation, lists and translations may be lost.

Operations to prepare

Bots, Programming

@Poslovitch: were you able to save the request forms from the kur and cat communities? Yug (talk) 14:55, 14 March 2021 (UTC)

@Yug: No. I wasn't. --Poslovitch (talk) 15:46, 14 March 2021 (UTC)

@Olaf: Sascha Brawer from UNILEX just shared a Qrank system for Wikidata entities, based on aggregated Wikipedia article pageviews. Not actionable for us now, just stored here. For 2025, when all words of wikt will be recorded XD Yug (talk) 11:34, 16 March 2021 (UTC)

@Yug: Well, the top of this list is:
  1. Wikimedia main page
  2. United States Senate
  3. Donald Trump
  4. Joe Biden
  5. Bible
  6. Elizabeth II
  7. Kamala Harris
  8. 1918-1920 flu pandemic
  9. COVID-19 pandemic
  10. YouTube
  11. United States of America
Not directly useful. Still, after a bit of filtering, one could produce some interesting lists from this, for example lists of famous people or places in each country. Especially for English, because apparently the American point of view dominates the whole dataset. Olaf (talk) 15:52, 16 March 2021 (UTC)
After functional words, concepts (aka "articles") would be a fair heuristic for what to record next. I think any popular page is a popular concept and should have its audio. Far, far in the future :D Yug (talk) 20:21, 16 March 2021 (UTC)

Interesting contacts

Networking operations to prepare for once we are back online and MediaWiki is upgraded.

Wished contact | Organisation | Domain | Contact | LL volunteer
April | tailingua.com | Taiwanese languages | contact | Yug
April | Moedict | Chinese, Taiwanese languages | | Yug
April | ELARarchive.org | Endangered Languages | @elararchive | Yug
April | NTNU MTC | Chinese courses | Miao Lin-Zucker | Yug
*: footnote comments here.

Languages

101 Sign Language Project

Moved to Talk:Lingua Libre/SignIt. Yug (talk) 09:05, 14 April 2023 (UTC)

Lingua Libre 2.3 - Phoenix Edition ǃ

A Phoenix version of LiLi's logo!
TL;DR: All previously recorded audios are on Wikimedia Commons. We need 3 weeks to restore the website, Wikibase and plugins into proper shape and upgrade MediaWiki to 1.35. Some documentation, word lists and translations may be lost.
Hi everyone,


After the fire at OVHCloud, here is some news on Lingua Libre: how and when we will have it back up again.
Lingua Libre is hosted on two separate servers. The first one, which hosts our BlazeGraph and a copy of the data for each item in the Wikibase, is preserved. This is probably not the case to date with the second, which hosted our MediaWiki and Wikibase installation. Either way, your work is not lost: all the recordings are already and still available on Commons. The OVH fire forces us to restore data from our BlazeGraph. The uncertainty on the part of the host leads us to prioritize the reconstruction of a functional version of Lingua Libre under MediaWiki 1.35.

This requires us to:

  • Set up LinguaLibre with MediaWiki 1.35
  • Regenerate the elements of the Wikibase from the BlazeGraph.
  • Check in documentation, help pages and talk pages from Google, Yahoo, Yandex, Bing and Internet Archive caches.
  • Recreate word lists with bots: words without Wiktionary records (72 lists), frequent words (100+ lists), Marathi by letter, etc.
  • Rebuild site configurations and modules.
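
The cache-recovery step in the list above could be partly automated for the Internet Archive. This is a minimal sketch, not the tooling actually used for the restoration: it assumes the Wayback Machine availability endpoint at archive.org, and the helper names (`availability_url`, `closest_snapshot`, `lookup`) and the default timestamp are illustrative choices.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumption: the public Wayback Machine availability endpoint.
WAYBACK_API = "https://archive.org/wayback/available"


def availability_url(page_url, timestamp="20210301"):
    """Build the query URL asking for the snapshot closest to `timestamp`
    (YYYYMMDD; the default is an arbitrary date shortly before the fire)."""
    return WAYBACK_API + "?" + urlencode({"url": page_url, "timestamp": timestamp})


def closest_snapshot(response):
    """Return the URL of the closest available snapshot from a decoded
    JSON answer, or None when the page was never archived."""
    closest = response.get("archived_snapshots", {}).get("closest")
    if closest and closest.get("available"):
        return closest.get("url")
    return None


def lookup(page_url, timestamp="20210301"):
    """Network call: fetch and decode the availability answer for one page."""
    with urlopen(availability_url(page_url, timestamp), timeout=30) as resp:
        return closest_snapshot(json.load(resp))
```

Calling `lookup` on the address of a lost help page would return the archived copy's URL, or `None` if no snapshot exists. Search-engine caches (Google, Yahoo, Yandex, Bing) have no comparable lookup in this sketch and would still need to be checked by hand.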


Best case scenario recovery date: April 9, 2021. More news on that will come on the 2nd of April.
Best regards to everyone ǃ Thank you for your patience and help ǃ --Adelaide Calais WMFr (talk) 16:02, 19 March 2021 (UTC)

Thanks, Adelaide, for this announcement.
Greetings @Pamputt, Olaf, Poslovitch, WikiLucas00, Poemat, Eihel, Titodutta, सुबोध कुलकर्णी, Subodh (CIS-A2K), Lyokoï, LoquaxFR, DSwissK, and Vami:
So for us contributors, we now have a 3-week pause while developers work on restoring the data into a suitable form and doing the required verification. We will then have to lead the wiki pages restoration and wikification effort. Special help will be needed to restore things: WikiLucas for some templates & translation setup, Olaf for the wikt lists bot, Eihel for modules, and every willing volunteer for wikification of pages. Happy to have a clear plan ahead, after April 9. Yug (talk) 18:06, 19 March 2021 (UTC)
Thanks for the update, Adélaïde Calais WMFr. About the pages that may be lost, especially the help pages and other wiki pages: have they already been copied from the different caches (Google, Bing, Yahoo, etc.) by someone (Wikivalley or other), or do you still need to do it? Pamputt (talk) 19:47, 21 March 2021 (UTC)
@Pamputt: We made a first scraping of Google, Yahoo, Bing and Yandex cached pages by hand for English wiki pages. We then shared our files on https://github.com/hugolpz/lilidown. We haven't dug for translations nor for lists. It is needed, but I haven't had time these past days. A few languages (Gascon, Swedish, ...) have been especially active on translations and it would be a pity to lose those. Yug (talk) 22:46, 21 March 2021 (UTC)
Many thanks, Adélaïde Calais WMFr, for this optimistic update. We really appreciate the effort all of you are putting in to get all of us back to the recording studio soon. Take care... - सुबोध कुलकर्णी (talk) 04:51, 22 March 2021 (UTC)
Thanks Adélaïde Calais WMFr; since we're on the 5th of April, is there more news available? (cf "more news on the 2nd April"). Missing Lili too much :p Julien Baley 16:40, 5 April 2021 (UTC)
@Julien Baley: Adélaïde Calais WMFr made this announcement last week (in French, I quickly translated):
"We held our Lingua Libre rebuilding committee this morning.
To answer your questions about OVHCloud: the Lingua Libre machine is part of the special cases of the company, and will thus be treated among the last ones (no guarantees or dates have been given). We are therefore continuing our action plan following the 5 points mentioned above, in order to reassemble Lingua Libre ourselves.
So far, we have put together an Alpha version of Lingua Libre in MediaWiki 1.35 (the first task), and tests have been done on the BlazeGraph (second task). This ensures that the action plan is feasible.
Nevertheless, the final deployment still requires a lot of adjustments, which is why we are now readjusting the expected public launch date of Lingua Libre to 22 April 2021.
Thank you for your support, we will keep you posted as soon as we have more results."
All the best -- WikiLucas (🖋️) 18:54, 7 April 2021 (UTC)
Thanks Lucas, happy to get this update (although rather sad that it's bad news, of course). Julien Baley 20:49, 7 April 2021 (UTC)

@Adélaïde Calais WMFr Could you please share any update about the restoration of the site? Marathi community members are really missing it and are eager to start... Subodh (CIS-A2K) (talk) 05:16, 15 April 2021 (UTC)

Hi @Subodh (CIS-A2K): I posted a message from Adélaïde last week (just above the votes). The announced date for the re-opening is the 22nd of April and, according to the Tech Team working on the project, this deadline will be respected. Only one week to go! All the best -- WikiLucas (🖋️) 06:40, 15 April 2021 (UTC)

Phoenix > Marathi recordings

@Subodh (CIS-A2K):, the feedback I get from behind the scenes is positive (work is going well), but I don't have more details. I would like to plan the Marathi lists revamp with you.

Coming Marathi lists
List prefix | Word count
List:Mar/Letter_स-* | 4976
List:Mar/Letter_प-* | 4462
List:Mar/Letter_म-* | 3745
List:Mar/Letter_क-* | 3545
List:Mar/Letter_व-* | 3195
List:Mar/Letter_न-* | 2201
List:Mar/Letter_ब-* | 2183
List:Mar/Letter_अ-* | 2134
List:Mar/Letter_र-* | 1789
List:Mar/Letter_द-* | 1666
List:Mar/Letter_आ-* | 1623
List:Mar/Letter_ग-* | 1568
List:Mar/Letter_ज-* | 1524
List:Mar/Letter_त-* | 1507
List:Mar/Letter_श-* | 1376
List:Mar/Letter_ल-* | 1132
List:Mar/Letter_ह-* | 1102
List:Mar/Letter_च-* | 1089
List:Mar/Letter_उ-* | 1076
List:Mar/Letter_भ-* | 1025
List:Mar/Letter_others-* | 886
List:Mar/Letter_य-* | 809
List:Mar/Letter_फ-* | 791
List:Mar/Letter_ख-* | 766
List:Mar/Letter_ट-* | 652
List:Mar/Letter_घ-* | 645
List:Mar/Letter_ए-* | 480
List:Mar/Letter_इ-* | 456
List:Mar/Letter_ध-* | 446
List:Mar/Letter_ड-* | 420
List:Mar/Letter_ठ-* | 318
List:Mar/Letter_झ-* | 273
Total | 50,000
  1. Marathi lists for frequent words, split by letter, will be uploaded within 2 days of being back online.
  2. Old Marathi lists may need to be purged (deleted if redundant). This will reduce confusion.
  3. The list structure is shown in the table above. You may want to distribute these recording missions to your team members according to their efficiency.
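
One way to split the recording missions by efficiency, as suggested in point 3, is a simple greedy assignment. This is only a sketch under stated assumptions: the member names and efficiency weights are hypothetical, and the `distribute` helper is illustrative, not an existing Lingua Libre tool. Only the word counts come from the table.

```python
import heapq


def distribute(list_sizes, efficiency):
    """Greedy split: hand each list, largest first, to the member whose
    current load divided by efficiency is the smallest."""
    # Heap entries are (scaled load, member name); ties break on the name.
    heap = [(0.0, name) for name in sorted(efficiency)]
    heapq.heapify(heap)
    plan = {name: [] for name in efficiency}
    for prefix, words in sorted(list_sizes.items(), key=lambda kv: -kv[1]):
        load, name = heapq.heappop(heap)
        plan[name].append(prefix)
        heapq.heappush(heap, (load + words / efficiency[name], name))
    return plan


# Word counts taken from the table; members and weights are hypothetical.
sizes = {
    "List:Mar/Letter_स-*": 4976,
    "List:Mar/Letter_प-*": 4462,
    "List:Mar/Letter_म-*": 3745,
    "List:Mar/Letter_क-*": 3545,
}
weights = {"Member A": 2.0, "Member B": 1.0}  # A records about twice as fast as B

for member, lists in distribute(sizes, weights).items():
    print(member, lists)
```

The greedy rule is a heuristic and does not guarantee a perfectly proportional split, but it keeps whole lists together, so each speaker works through complete letters.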

Yug (talk) 10:52, 16 April 2021 (UTC)

@Subodh (CIS-A2K):, reopening is expected very soon (April 22 as announced, so less than a week away). Yug (talk) 15:00, 20 April 2021 (UTC)

Lingua Libre is up and running again ǃ

After the datacenter fire, and thanks to outstanding team work with WikiValley, Nicolas Vigneron, Yug, Poslovitch and Jitrixis, our baby is rising from the ashes.


We are so proud to present to you Lingua Libre – Phoenix Edition! Same record wizard as before, but functional ExternalTool and dataset page, all running on MediaWiki 1.35.


Have fun with it!


--Adelaide Calais WMFr (talk) 12:55, 22 April 2021 (UTC)

Thanks and congratulations on the recovery. Blue Rasberry (talk) 14:27, 22 April 2021 (UTC)
Congratulations! Olaf (talk) 14:57, 22 April 2021 (UTC)
That's a great gift on World Book & Copyright Day! We all salute those who strove hard to make this happen in such difficult times. The Marathi community is very happy to resume its work. Thanks again, --सुबोध कुलकर्णी (talk) 13:49, 23 April 2021 (UTC)
P.S. - BTW, the site is showing an "unsecured site" message. One has to enable advanced settings to go to the site. Could you please clarify?
Notifying @Pamputt, Poemat, Eihel, Titodutta, Subodh (CIS-A2K), LoquaxFR, Vami, and Bicolino34: in case they didn't see the announcement. -- WikiLucas (🖋️) 14:34, 23 April 2021 (UTC)

Lack of Translation Tag

It was found that the word "Info" inside the infobox is not translate-tagged. I'm not familiar with that part. Could someone tag it, please? Thanks. -- (Dasze) 10:48, 18 September 2021 (UTC)

Done, thanks for letting us know! -- WikiLucas (🖋️) 21:02, 18 September 2021 (UTC)