Community Wishlist Survey 2023/Bots and gadgets/A more performant bot to replace ListeriaBot


A more performant bot to replace ListeriaBot

  • Problem: Since the release of ListeriaBot version 2, lists with many links to large entities (for example, links to countries) fail to generate. See some previous discussions: 1, 2. The issue is reported in the official repository: magnusmanske/listeria_rs#66
  • Proposed solution: Create a new, more performant bot or fix the memory issues with ListeriaBot. An enhanced version of ListeriaBot, one that can carry us into the 2030s.
  • Who would benefit: ListeriaBot is used to automatically maintain lists of Wikidata items on various Wikipedias. It is used intensively by Women in Red in various languages, as well as by other projects (3,204 lists in English, 1,152 in French).
  • More comments: A similar proposal by MarioGom was merged into this one.
  • Phabricator tickets:
  • Proposer: Edelseider (talk) 18:13, 23 January 2023 (UTC)

Discussion

Tracked in Phabricator:
Task T329380
Also see ListeriaBot returns "Last line: ERROR: Login failed". M2k~dewiki (talk) 18:18, 10 February 2023 (UTC)

I think I fixed the problem that the "bot passwords" login did not work via API on some WMF wikis (any insight into that?) by switching the login to OAuth. I also fixed the table column problem, which was caused by a pull request that I foolishly merged. Right now the bot is somewhat limited by the Toolforge Kubernetes constraints; relaxed constraints (more RAM, more CPUs per pod) would help there. Barring serious malfunction, the bot should now update every list on every wiki once every two days. If that does not work for some tasks, TABernacle provides a possible alternative. --Magnus Manske (talk) 14:37, 13 February 2023 (UTC)

I often get a "Killed by memory overload" message if I don't limit the number of entries to 1000 with LIMIT 1000 (the exact threshold depends, for example, on the number of columns). For 1500 or 2000 entries I get the "Killed by memory overload" message. M2k~dewiki (talk) 14:40, 13 February 2023 (UTC)
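
A minimal sketch of the workaround described above, assuming a Listeria-style query whose result variable is ?item; the items and properties shown are purely illustrative and not taken from any actual list. Capping the result set with LIMIT keeps the bot under its memory ceiling, at the cost of an incomplete list.

    # Illustrative query only: restrict a Listeria list to at most 1000 rows
    # so the bot is not killed for exceeding its memory limit.
    SELECT ?item WHERE {
      ?item wdt:P31 wd:Q5 ;          # instance of (P31): human (Q5)
            wdt:P106 wd:Q1028181 .   # occupation (P106): painter (Q1028181)
    }
    LIMIT 1000                       # reported to fail above roughly 1000-2000 rows

On a real list page this query would sit in the template parameter that Listeria reads its SPARQL from; the LIMIT clause is the only part specific to the workaround.
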
@Magnus Manske: @M2k~dewiki: Even a slight change to the query (see here), and it is "killed by OS for overloading memory". It has been like this for a long time and it still is. --Edelseider (talk) 15:50, 13 February 2023 (UTC)
What is concerning is that the bot loads the entire item for each item involved. This is really not a good thing. GZWDer (talk) 18:27, 17 February 2023 (UTC)
Indeed. I haven't looked into this for a while, but the problem used to be not so much the number of results as the number of big items present in some fields, such as certain geographical entities. MarioGom (talk) 12:38, 18 February 2023 (UTC)
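
To give a rough sense of why such items are expensive if the bot fetches them in full, the Wikidata Query Service exposes per-item statement and sitelink counts; a small illustrative query (the country items chosen here are arbitrary examples) is:

    # Illustrative only: statement and sitelink counts for a few large
    # geographical items, as exposed by the Wikidata Query Service.
    SELECT ?item ?itemLabel ?statements ?sitelinks WHERE {
      VALUES ?item { wd:Q142 wd:Q183 wd:Q30 }   # France, Germany, United States
      ?item wikibase:statements ?statements ;
            ?item wikibase:sitelinks ?sitelinks .
      SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
    }

A list whose cells link to many such items would, on this reading, hold many large entity payloads in memory at once, which fits the out-of-memory behaviour reported above.
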

Voting