Research talk:Language-Agnostic Topic Classification

Hi Isaac (WMF) and Diego (WMF) - can I ask what the difference is between these two tools, and whether we can get more documentation for them?

Also, should the top of this research page point to the list-building.toolforge.org directly? Thanks. - Fuzheado (talk) 13:02, 5 October 2023 (UTC)

Hey @Fuzheado, this is my bad for not having a good central page that describes the various initiatives and how they relate to each other. This meta page was actually intended to cover the work behind what is now this model on LiftWing (documentation), which takes a single article and classifies it according to a basic, established taxonomy of topics. It's available to anyone, but the primary motivation was enabling cross-wiki analyses of pageview/editing/content trends and supporting tools that, for instance, help newcomers find relevant tasks by making topics a filter available in Search. I'll update the page to make this clearer.
Then there's a second line of work, which is related to the tool prototypes that you linked to above. The goal of that work is to let users build any "topic" by identifying a few examples and then receiving algorithmic assistance in expanding them into a larger list. That work is best described under this meta page for ad-hoc topic models. Some more background that's not well captured on Meta at the moment but is tracked via task T341988: we've continued to push forward with the tool found at https://list-building.toolforge.org/, which has been updated based on the findings from the meta page I linked to above. The other tool (https://a-list-bulding-tool.toolforge.org/) was another prototype that informed this work but, to the best of my knowledge, is not currently being updated. https://list-building.toolforge.org/ now incorporates three approaches for finding related articles into a single interface. We're working with the WMF Campaigns team and others to get feedback from campaign organizers on its utility before determining how best to formalize it. Which is to say, it's still very much a prototype: the data can be out of date and, if it goes down, it depends on me figuring out how to bring it back up. If you have a use case for it or an interest in it, though, I'd love to hear about it and figure out how we might support it, because it's also still in a flexible space. Isaac (WMF) (talk) 14:52, 5 October 2023 (UTC)