Talk:Community Wishlist/W403
strongly support, fits the strategy of WMF
Due to the interactivity, there are no automatisms and in the end it is the human who controls. Presumably you have to cooperate with large providers (reasoning and deep research to map “reality” are not built on the side), which we should do anyway in order to bring our strength of human "knowledge curation" into play. See "Humans first" as the strategy of the Foundation 2025-2028: no AI articles but tools to support authors; see strategy here. Wortulo (talk) 05:22, 12 July 2025 (UTC)
Following up on your wish about AI-supported article checks for accuracy and updates
Hello @M2k~dewiki,
Thank you for submitting this thoughtful wish, and for being such a loyal and engaged contributor to the Community Wishlist over time.
This is a compelling idea, especially given the growing interest in how AI might support content quality and maintenance across Wikipedia. We’re currently looking into which team might be best positioned to evaluate the feasibility of this request, and we’ll get back to you once we’ve identified the right point of contact. ARamadan-WMF (talk) 08:04, 15 July 2025 (UTC)
- Hi @M2k~dewiki! As mentioned above this is well aligned with our strategy longer term. We don't plan to develop such a tool in the current fiscal year, but I'd love to connect with you again when we start making our next annual plan to explore this further. @Samwalton9 (WMF) is the Product Manager of the team who would likely own something like this, so I'm tagging him here as well to get you connected. Thank you again for submitting this wish! SPerry-WMF (talk) 23:07, 16 July 2025 (UTC)
Outdated articles
Very interested in this subject – you may want to check out c:Category:Wikipedia updating. Articles becoming outdated, especially in smaller Wikipedias (and German WP is one of the largest), is one of the main reasons why I think there is a great need for, and great benefit in, a translation that keeps articles in sync with the most up-to-date article (usually that's the English Wikipedia article, except for things that are country-specific). If you translate it the normal way (for example by machine translating an English Wikipedia article and then improving it), it will quickly become outdated. Charts used in articles are also widely outdated and I've created Commons categories like c:Category:Charts showing data through 2010 to enable updating them. I think what is suggested here is a good idea but there are probably many caveats to this – for example it may be difficult to scale: it may work with 1000 articles but not with many times as many, and as the news report elaborates there are many false positives, or correct detections with false suggested corrections. One approach that one could consider is to have AI compare articles to the highest-quality equivalent article (usually ENWP) and then ask it which things are missing and where the articles contradict each other. Prototyperspective (talk) 22:33, 24 July 2025 (UTC)
However
I really think that some communities would not want to internalize the use of AI as part of their processes. * Pppery * it has begun 15:46, 10 October 2025 (UTC)
- It could also and would more likely be done by external volunteers instead of some fully-integrated internal process. People can use AI on any Wikipedia to produce reports on outdatedness and inaccuracy + errors, whether or not editors act on these or find them useful is the other question along with how well this works (it doesn't work very well currently apparently with many false positives and similar things but was still useful enough to fix some errors etc in real-world practice). Prototyperspective (talk) 16:07, 10 October 2025 (UTC)
Been doing this

See en:User:Polygnotus/barfoo (197 articles) and en:User:Polygnotus/barfoo2 (1437 articles) for pre-generated stuff I still have to work through.
We don't need the WMF for this, we can easily have a userscript with a Next button that loads the article and shows the suggestion next to the article.
Claude is far better than these other models, in my testing.
I also have one for on the fly feedback generation, see en:User:Polygnotus/Scripts/AI_Proofreader Polygnotus (talk) 01:15, 22 October 2025 (UTC)
Update: I now have a userscript that loads pregenerated suggestions for improvements from Claude and displays them next to the relevant article, see en:User:Polygnotus/Scripts/Backlog.js. Polygnotus (talk) 04:46, 22 October 2025 (UTC)
@M2k~dewiki: See above. Polygnotus (talk) 04:51, 22 October 2025 (UTC)
@ARamadan-WMF, Samwalton9 (WMF), and SPerry-WMF: We do not want any AI-related tools from the WMF. Also, see above, this already exists. Polygnotus (talk) 08:06, 22 October 2025 (UTC)
- @Polygnotus This is an interesting exploration, thanks for taking a look! I haven't read through the wish fully yet but I did want to ask you about your comment here that "We do not want any AI-related tools from the WMF". When you say "AI" can you clarify a little more what you mean? I ask because there are already a range of Machine Learning-powered tools deployed on Wikimedia projects, for example the ORES models for filtering Recent Changes, and anti-vandalism bots like ClueBot NG. Do you classify this as AI, or are you particularly concerned about usage of LLMs? Samwalton9 (WMF) (talk) 08:53, 22 October 2025 (UTC)
- @Samwalton9 (WMF) Being precise with words is difficult, as a human.

- What I mean is:
- The WMF should at least quadruple the amount of resources it spends on nerds. Since that hasn't happened, the time of nerds is in short supply, and there are a lot of things that need to be done.
- I am afraid that the WMF will spend a large amount of resources doing all kinds of stuff with AI, because the nerds who work for the WMF lack someone who says "you have to do this sucky task before you can play with AI models". As a nerd I know how much it sucks to have to rewrite 20 year old PHP code. I know how much it sucks to improve CirrusSearch/Elasticsearch. I know how much it sucks to make incremental improvements to something that already exists, I would much rather be bolting my shiny new interface on top of a dinosaur codebase. I would much rather be playing around on Liftwing.
- I added an important task to a Kanban board of the relevant WMF team and I was told by a reliable source, and this is a quote, that WMF teams decide for themselves what they work on. This is truly terrible news. No IT companies work like that, because IT companies usually have a responsibility to the shareholders to make money. Usually people get fired if they don't do the work that is necessary but only the fun stuff. If the team decides what to work on then no one has a vision, there is no leader who ensures all the noses are pointed in the right direction, and that the important stuff gets done (first, ideally). The stuff the community wants and needs and has been asking for is not fun, it's just work. If teams decide what to do then people can just do the thing they find fun that looks good on their CV (another over-engineered new interface, yet another unwanted LLM project).
- And this is what has been happening for a long time. I have seen Phabricator. I know. One quick example, they actually wasted time creating a new Community Wishlist interface, instead of actually doing the stuff the community wanted.
- As a nerd I need someone who tells me which tasks are important, and what I am working on this week. If I don't have that person, I achieve little but I have a lot of fun. I am one of the few people on this website who is not anti-AI, I see it as a useful tool that saves me a lot of time. But it is also dangerous, especially for organizations like the WMF. We need the WMF to rewrite MediaWiki from the ground up. The amount of workarounds on top of workarounds the community has created to deal with the fact that MediaWiki lacks the stuff we need is amazing, but it is also a mess.
- So we need a full rewrite, with a lot of stuff integrated that is now external. Ideally a system that can differentiate between data and metadata (the mixing of which leads to all kinds of problems).
- Give me a professional JavaScript coder for a day or 3 and I can help them bundle the spaghetti code shown above and make something worth using. We'd combine the AI Proofreader that sends the wikicode of an article to Gemini/ChatGPT/OpenAI with the tool that does AI Source Verification (does this source support this statement?) and the new tool that goes through a list of pregenerated suggestions on how to improve articles. Quick win for you guys. But there is a real danger the WMF goes all-in on fun AI stuff and neglects the important stuff (even more/longer than it already has). Hope that helps, Polygnotus (talk) 09:02, 22 October 2025 (UTC)
- Since you have the "(WMF)" in your username, and I am speculating about a company I know little about, do you think my fear is reasonable? Am I making sense? Or is it completely unfounded? Polygnotus (talk) 12:44, 22 October 2025 (UTC)
- Thanks for taking the time to write all this. I'm sorry for how much I'm about to write, but I wanted to engage from my personal opinion on all of this.
- I totally hear the concern about going "all-in on fun AI stuff", but thankfully that's not what we're doing! There are one or two teams running small experiments - seeing where AI might be useful - such as the Future Audiences team, but most Product teams at WMF are chugging along doing the same kinds of non-AI feature improvements and deployments as we did before recent AI developments. On my team for example, although we're deploying software that makes use of machine learning models, we're not using LLMs. This is why I asked where you drew the line, because "AI" has come to mean many things!
- Regarding putting a ticket on a Kanban, team 'kanban's on Phabricator generally refer to the work that the team has scoped, prioritised, and decided to work on right now (like, this week). Most teams have a bunch of processes that go into making sure that the team is working on the most impactful tasks, and ensuring that when work is prioritised to be actively worked on (i.e. moved onto the kanban), it is well understood, properly scoped, and ready for a software engineer to pick up. That's why you were asked not to put tickets directly on the kanban - if any community member could do that no product teams would ever get anything built - their software engineers would be perpetually confused which of the hundreds of tickets they should be working on! You're more than welcome to file Phabricator tickets and tag them with the relevant team/backlog/product, and those tickets will be reviewed alongside the rest to determine how high priority they are.
- Anyway that's just a clarification about that particular issue - to the broader point, Product & Tech has an annual planning process (the latest iteration is documented here) through which we define high level goals for the year, then define buckets of work and specific projects that can help push towards that goal. For most P+T teams, improving Wikipedia for editors is the main goal they're working towards, with various focuses. For a tangible example again, my team is Moderator Tools - our team focuses on tools for active editors, patrollers, and administrators. This year we're working on a project to build a dashboard for patrollers which can help them review edits. That work is part of the WE1.3 'Key Result', as described on that annual plan page I linked earlier, which says "By the end of Q3, 10% of contributors who were presented with a homepage aimed towards new moderators visited it two weeks in a row." and ladders up to WE1 which says "Contributions increase because volunteers are offered compelling opportunities and understand their impact". That's all part of the Wiki Experiences bucket, which is about the reading and editing experiences on Wikimedia projects. At the team level, we're then constantly talking to users, making suggestions, sharing updates, and doing research to make sure we're building something that will be useful and make users editing experiences better. Everything we decide to build is interrogated internally - is this really the most impactful thing we could spend our time on? How could we verify if it's going to be useful quickly, so we can change focus if it isn't? What data or interviews can we do to check our understanding of what will be helpful to build? It's really hard to get this 100% right all the time, but every product team is doing its best to build genuinely useful experiences and spend its time in the most effective way. 
We have to balance this with maintaining what's already deployed, of course, which is why my team has picked up projects like Nuke improvements and PageTriage fixes. We can never fix every bug, or make every requested improvement, but teams do already do a significant amount of maintenance work. To take my team as an example again, although we're ostensibly focusing on the Dashboard project at the moment, on our kanban this week we're also ensuring temporary accounts work in PageTriage, reviewing a large number of patches which aim to improve the performance of Special:RecentChanges, enabling users to define different messages for different user groups in Automoderator, fixing a duplicate-entry Watchlist bug, and adding Undo to mobile diffs. It's a slightly more maintenance-heavy week than usual this week, but we've always got something ongoing that's fixing up some existing software.
- I can't comment on the idea of rewriting MediaWiki from the ground up, except to say that I can't imagine it happening - rewrites are a huge undertaking and it seems unlikely to me, personally, that the benefits of that would outweigh all the improvements we could otherwise make in the meantime. Samwalton9 (WMF) (talk) 15:37, 27 October 2025 (UTC)
- @Samwalton9 (WMF) Thanks for your detailed answer. My brain has a tendency to worry, and it is difficult to silence your own brain, you know?
- What we desperately need from the Moderator Tools team is a way to mass-undo edits. See Community_Wishlist/W448. It doesn't have to be beautiful, some quick and hacky userscript would be good enough. We need this yesterday.
- Interesting that you guys work so anti-agile (waterfall). Where I come from the mantra is fail fast, which is incompatible with long-term planning (imo, although opinions differ). Annual plans sound nice until you are stuck.
- Creating a new dashboard is precisely what I am saying should not be done. I described it as "bolting my shiny new interface on top of a dinosaur codebase" and "people can just do the thing they find fun that looks good on their CV (another over-engineered new interface...)".
- We don't need new dashboards, finding a place where our attention is needed is incredibly easy because it is everywhere. Everyone already knows where to focus their attention (usually the stuff they find interesting/rewarding), and people will recruit other users to join them on their quests. We don't need a dashboard to find places where attention is needed, we got en:WP:MAINT, en:WP:TASKCENTER, en:WP:BACKLOG et cetera.
- And, looking at mw:Moderator_Tools/Dashboard, the idea behind making a dashboard is based on incorrect assumptions.
- Organising your work can't be done through such a dashboard, and is currently not one of the most pressing problems.
- to be presented with the impact of their contributions would be nice, but we don't need a dashboard for that, and a dashboard is a bad way to do that.
- It uses the word 'moderator' but we don't have those (we have administrators). It talks about "newer contributors", but only a tiny percentage of newer users are fit to jump into moderation tasks, and 99.999% should just deal with content first to get a feel of the place and understand what we do.
- Newer contributors who jump into vandalfighting and the like are often a giant problem, and it really sucks to have to tell some overenthusiastic newbie that they caused damage while they thought they were helping (I've had to do that multiple times).
- Looking at mw:Special:MyLanguage/Moderator_Tools/Dashboard#August_2025:
- "We hope to introduce users to the idea of reviewing the contributions of others and removing bad content from their wiki." Having new(er) users try to patrol the edits made by others, when they have no clue what is good and what is bad and how to deal with people, is a bad idea. This is why en:WP:CVUA exists, although that also has a bunch of problems. What ends up happening is that a bunch of overenthusiastic young users revert everything on sight, or revert randomly, and then get into problems with those who try to help them or those whose edits they reverted. People are overly eager to police edits made by other people, and tend to overestimate their ability to do that correctly. We have a problem with new potential users being bitten, we certainly don't have a problem getting newer users to review other people's edits (we do have a problem stopping them doing that, and reducing the harm they cause).

- "Displays a selection of recent edits which may need to be reverted but have not yet been patrolled or reverted." We don't need a dashboard for that, it is already on its own page which is better because if you are working on that task all that other stuff is just a bunch of distractions. This is way too little horizontal space for the information required to determine if clicking the edit is worth it or not. To give enough information about each edit, while presenting a bunch of them to choose from, you need an entire screen.
- "Lists discussions on key noticeboards" We don't need a dashboard for that. We have individual pages for each noticeboard and an individual page to look at recent edits, because such tasks need the full width of a screen. And people don't task-switch between patrolling recent changes and participating in noticeboard discussions. Most people are interested in one or two noticeboards, maybe 3, but not all of them. You can see the problem in the sketch already: titles of discussions are far too long to fit in a 3rd of a dashboard. The lead section of any discussion is far too long to fit in a 3rd of a screen, unless you have a giant screen. Webdesigners with giant expensive monitors often produce designs that are unusable for the hoi polloi.
- "we are designing a module which provides a brief overview of the critical information necessary to make judgements about reverting bad edits." Again, this is not necessary on a dashboard. Anyone who is able to do moderation tasks should know all that stuff already. A dashboard is a sitrep, it is not the place to give advice.
- "We are considering retaining the impact module" We don't need this either on a dashboard, we already have it on its own page. It would just be a way to fill wasted space.
- It even shows the amount of users awaiting unblock, which is pointless data to anyone who is not an administrator, and therefore unable to deal with that backlog. It shows pages pending review, but the people who work on unblocking users are not the same as those who work on reviewing pages. If you want to show relevant data you can't use a one-size-fits-all approach, you'd need to determine what to show based on factors such as account age, experience level, advanced permissions, et cetera.
- So it sounds like this dashboard will be a jumbled mess of random stuff with no clear audience in mind. No one needs all that stuff on one page.
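A minimal sketch of what "determine what to show based on factors such as account age, experience level, advanced permissions" could look like, in Python. Everything here is hypothetical: the `User` fields, the module names, the group names, and the thresholds are invented for illustration and are not taken from the Moderator Tools dashboard or from MediaWiki.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class User:
    # All fields are illustrative assumptions, not a real MediaWiki schema.
    account_age_days: int
    edit_count: int
    groups: set[str] = field(default_factory=set)


def modules_for(user: User) -> list[str]:
    """Return only the dashboard modules this user can actually act on."""
    modules: list[str] = []
    # Hypothetical experience threshold before suggesting patrolling work.
    if user.account_age_days >= 90 and user.edit_count >= 500:
        modules.append("recent_edits_to_review")
    # Unblock requests are only actionable by administrators.
    if "sysop" in user.groups:
        modules.append("unblock_requests")
    # Page review needs a reviewing permission (group names are assumptions).
    if user.groups & {"patroller", "sysop"}:
        modules.append("pages_pending_review")
    return modules


print(modules_for(User(account_age_days=10, edit_count=5)))
print(modules_for(User(account_age_days=400, edit_count=2000, groups={"sysop"})))
```

With rules like these, a brand-new account gets an empty list rather than a backlog it cannot act on, while an administrator sees all three modules.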
- The Annual Plan appears to be a list of vague platitudes.
- Something like "Key result WE1.3: By the end of Q3, 10% of contributors who were presented with a homepage aimed towards new moderators visited it two weeks in a row." appears to be worded that way to ensure that target can be hit, not because it is a meaningful metric.
- If you look at how often a page gets visited then you are measuring the quality and quantity of inbound links, and the funnel that brings people to that page.
- If you want to measure if building a specific dashboard was a good idea or a waste of time you need to look at (perceived) value added and things like actual user engagement and CTR and stuff like that.
- "In the future, if this idea is validated" If? Of course the idea will be "validated" if you don't use a reasonable metric to gauge success or failure. This is why you don't think up such metrics a year in advance; they are only meaningful when you know what you need to measure and how, and how you should interpret the result.
- So to me it sounds like someone came up with the idea to have measurable results (which is a good idea), but demanded that teams put measurable results in the annual plan (bad idea), a year in advance (very bad idea). Of course no team wanted to commit to something unachievable and it is very difficult to predict the future, so the "key result" is meaningless.
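The difference between a return-visit figure and an engagement figure such as CTR can be made concrete with a small Python sketch. The data shapes below (a mapping from user to the week indices in which they visited, plus impression and click counts) are invented for illustration; they are not the WMF's instrumentation.

```python
from __future__ import annotations


def returned_two_weeks_in_a_row(visits: dict[str, set[int]], presented: set[str]) -> float:
    """Share of users shown the page who visited it in two consecutive weeks.

    Weeks are assumed to be a running index (1, 2, 3, ...), not calendar
    week numbers that reset each year.
    """
    if not presented:
        return 0.0
    returning = 0
    for user in presented:
        weeks = visits.get(user, set())
        if any(week + 1 in weeks for week in weeks):
            returning += 1
    return returning / len(presented)


def click_through_rate(impressions: int, clicks: int) -> float:
    """Clicks on dashboard items divided by the number of times items were shown."""
    return clicks / impressions if impressions else 0.0


# Hypothetical sample: four users were shown the page, only A came back the next week.
visits = {"A": {1, 2}, "B": {1, 3}, "C": {2}}
presented = {"A", "B", "C", "D"}
print(returned_two_weeks_in_a_row(visits, presented))   # 0.25
print(click_through_rate(impressions=200, clicks=5))    # 0.025
```

In this made-up sample the return-visit figure comfortably clears a 10% target even though only 2.5% of the items shown were ever clicked, which is the gap between visits and actual engagement described above.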
- All software I write sucks, but once I reach the 5th iteration of the same idea the code becomes far more elegant (read: less terrible). MediaWiki is still in iteration 1 after 20 years... Having a team rewrite it from scratch should be a priority (not for you specifically, but for the WMF). Doing a ship of Theseus-style rewrite means we can take advantage of modern techniques and approaches. I also just really hate PHP, especially 20 year old PHP.
- "Everything we decide to build is interrogated internally" Exactly, that is one of the major problems! What you need is someone who interrogates what you decide to build externally. I am happy to help, and there is also the Product and Technology Advisory Council.
- 2 ways you can deal with this feedback: 1) Assume I must be evil/an idiot 2) see if there is value in external feedback from someone who does not work for the WMF.
- I am happy to provide feedback on any plan you may come across, feel free to post on my enwiki talkpage. I am not some luddite who hates everything AI/ML. Even if you disagree with me, and even if it is too late to turn back, it is valuable to get another perspective. I am not sure how far along you guys are with this plan, is it still possible to pivot? If not, can we maybe do the massUndo tool as a side project? To me, it sounds like a quick win. The community will be grateful because it will save them a lot of time and it is not very difficult to code (we'll have to figure out some way to generate lists, but I have some ideas). Polygnotus (talk) 06:28, 28 October 2025 (UTC)
- Can't possibly address your whole large reply but I'd like to chime in on one thing you said: "We don't need new dashboards, finding a place where our attention is needed is incredibly easy because it is everywhere. Everyone already knows where to focus their attention (usually the stuff they find interesting/rewarding), and people will recruit other users to join them on their quests. We don't need a dashboard to find places where attention is needed, we got en:WP:MAINT, en:WP:TASKCENTER, en:WP:BACKLOG et cetera." To me this view is understandable and I may to some degree agree with it, as dashboards usually aren't needed and not that useful in practice. However, this depends on which kind of dashboards, and you are terribly wrong at "Everyone already knows where to focus their attention", which is 100% the case for people like you and me and quite likely everyone reading this on the talk page; however there are countless thousands of newly registered and potential users who don't know what they could do, as well as some long-standing users who have weird editing styles and have not even 1k edits after a few years. There aren't many things that would be as impactful as connecting people looking to get started or looking for further things to do with tasks, particularly tasks they are interested in. The existing things are not used, not known much, and most importantly not well-designed, showing random boring pages to correct typos in etc. But maybe what you said was specific to moderator dashboards; I think the usefulness there may be lower and I'd only develop that as part of a broader personalized (e.g. only show tasks where more experience is needed based on account age & unreverted contributions count) & personalizable tasks center. Prototyperspective (talk) 16:06, 28 October 2025 (UTC)
- @Prototyperspective The 'moderator' dashboard thing is for people who have some experience. "thousands of newly registered and potential users who don't know what they could do" is an en:WP:NEWCOMERTASK thing. Sam Walton is from the Moderator Tools team, for stuff related to newcomers you should talk to the Growth Team which works on a set of features to encourage newcomers to make edits. Polygnotus (talk) 02:04, 29 October 2025 (UTC)
- Thinking about the name of your team, you are probably also the right person to talk to about Community_Wishlist/W450, right? This would save a tremendous amount of time and effort. Polygnotus (talk) 13:02, 28 October 2025 (UTC)