Talk:Wikimedia Rust developers user group
Getting started
Welcome! Are there other goals people would like to accomplish or tasks to work on? What would you like to see this group do? Legoktm (talk) 07:13, 12 February 2021 (UTC)
- @Legoktm Is this a Wikimedia user group? If so, has this group applied to AffCom for recognition yet? Thanks. —MarcoAurelio (talk) 11:03, 2 March 2021 (UTC)
- @MarcoAurelio: that's the plan. I was going to wait until after our first meeting (which I'm behind on scheduling..) before starting the application process though. Legoktm (talk) 18:08, 2 March 2021 (UTC)
- @Legoktm Thanks for your reply. I'll categorise the main page accordingly then. Makes sense to wait until the first meeting :-) Good luck. —MarcoAurelio (talk) 18:19, 2 March 2021 (UTC)
First meeting
I'd like to have our first realtime meeting around the end of February (in about 2 weeks). Is there a preference for a video/audio call like on https://meet.wmcloud.org/? Or would people be more comfortable with doing it on IRC? Any day of the week/timezone preference? Legoktm (talk) 07:18, 12 February 2021 (UTC)
- I'm ok with both DCaro (WMF) (talk) 08:27, 12 February 2021 (UTC)
- Personally, I'd prefer video call over IRC. --Magnus Manske (talk) 12:41, 17 February 2021 (UTC)
Advertising this group
So far at:
- wikitech-l and cloud-l
- m:Tech (also #wikimedia-tech)
- en.wp's bot noticeboard (also #wikipedia-en-bag)
Other places? @Enterprisey: maybe there are some Rust channels you're in (I recall something about Discord) you could spread it in? Legoktm (talk) 07:54, 12 February 2021 (UTC)
- Yeah; they're primarily about Rust, but I'll certainly post where I can. Enterprisey (talk) 23:35, 12 February 2021 (UTC)
- Perhaps we could create a server just for this. Firestar464 (talk) 02:45, 23 February 2021 (UTC)
Anybody interested in collaboration? Max Semenik (talk) 16:13, 1 April 2021 (UTC)
- I want to work on dump processing with data analytics tools. Stealth project on GitHub: Wikidumptools, spoiler:
- Any form of collaboration or general talk about Rust would be great. --Count Count (talk) 10:39, 11 May 2021 (UTC)
- @MaxSem: FYI --Count Count (talk) 13:58, 11 May 2021 (UTC)
What are you working on? (January 2022)
I vastly underestimated how much time I'd have to contribute to organizing this group last year, my apologies. Let's try to kick off 2022 properly :)
So, what are you working on related to Rust this month? Do you want help with something, or are you looking for something to do? Want some feedback or just want to show off? Legoktm (talk) 06:27, 12 January 2022 (UTC)
- I'll start! This month I'm working on continuing porting some of my Python/PHP bots to Rust [1], [2]. I've also been hacking away at automatically combining API queries, see this ticket for details, I want to keep improving that and would welcome help/contributions. Legoktm (talk) 06:33, 12 January 2022 (UTC)
- I am currently investigating how Rust can improve the archiving and bare ref citation fixing I do on enwiki.
- One thing I would like to know is whether there are any tools for processing the database dumps in Rust. I use the database dumps for both tasks. Having something like that in Rust would be faster than the current setup.
- Another thing that would be interesting is same-page concurrency: for example, taking one page and analyzing multiple links at the same time instead of going through them in order synchronously. Figuring out how to do that could be helpful.
- Enterprisey has also been very understanding of my work, which I appreciate. When my Rust skills improve I'll have to start contributing. Rlink2 (talk) 03:35, 13 January 2022 (UTC)
- Rlink2, for the dumps, I personally use a fork of the parse_wiki_text crate and the partial XML dump parser crate and my SQL dump parser crate. I also have a crate that translates the XML dump into CBOR and other formats, which could be refactored into a general XML dump parser. My crate grabs all fields from the XML dump files, at least pages-articles.xml and pages-meta-current.xml (not sure if I've tested it on pages-meta-history.xml), whereas the partial XML dump parser retrieves only the more interesting fields, like the title and wikitext. If you find parse_wiki_text useful, maybe I could publish a new crate on crates.io, because the original maintainer has disappeared, and anybody interested in the group here can be added as maintainers so that hopefully it's never abandoned again. Erutuon (talk) 19:27, 13 January 2022 (UTC)
- Republishing the parse_wiki_text crate seems like a good idea since people are clearly using it, you're more than welcome to add it to the mwbot-rs organization to help increase bus factor. Though maybe we can come up with a better name that doesn't have so_many_underscores :)
- Do you also have a copy of the parse_mediawiki_dump repository? If not we can download it from crates.io and import that into Git. Legoktm (talk) 08:24, 14 January 2022 (UTC)
- I do have a copy of parse_mediawiki_dump and it's all in a fork on GitHub. Erutuon (talk) 04:49, 15 January 2022 (UTC)
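For the dump-processing question in this thread, a rough sketch of streaming through an uncompressed XML dump without loading it into memory might look like the following. It uses only the standard library and a naive line scan for title elements, which is an assumption about the dump layout rather than real XML parsing; the crates mentioned in the replies handle the format properly. The function name `titles` and the sample input are made up for illustration.

```rust
use std::io::{BufRead, BufReader, Read};

/// Collect page titles from an uncompressed MediaWiki XML dump.
/// Assumption: each `<title>...</title>` element sits on a single line,
/// and only the `&amp;` entity is unescaped (other entities are left as-is).
fn titles<R: Read>(dump: R) -> std::io::Result<Vec<String>> {
    let mut out = Vec::new();
    // BufReader streams the input line by line instead of reading it whole.
    for line in BufReader::new(dump).lines() {
        let line = line?;
        let trimmed = line.trim();
        if let Some(rest) = trimmed.strip_prefix("<title>") {
            if let Some(title) = rest.strip_suffix("</title>") {
                out.push(title.replace("&amp;", "&"));
            }
        }
    }
    Ok(out)
}

fn main() -> std::io::Result<()> {
    // Tiny hypothetical sample standing in for pages-articles.xml.
    let sample = "<mediawiki>\n  <page>\n    <title>Rust &amp; Wikimedia</title>\n    <revision><text>Hello</text></revision>\n  </page>\n  <page>\n    <title>Tokio</title>\n  </page>\n</mediawiki>\n";
    for title in titles(sample.as_bytes())? {
        println!("{title}");
    }
    Ok(())
}
```

Because `titles` accepts any `Read`, the same function could be pointed at a `File` or at a decompressing reader instead of an in-memory sample.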
- For concurrency you can use tokio to do something like:
    let links = vec![...];
    let mut handles = vec![];
    for link in links {
        // Spawn a new task for each link
        handles.push(tokio::spawn(async move {
            do_something(link).await
        }));
    }
    for handle in handles {
        // Wait for each spawned task to finish
        let result = handle.await.unwrap();
        // Do something with each result...
    }
- I've been meaning to write a blog post on how best to do this in bots... Legoktm (talk) 08:19, 14 January 2022 (UTC)
- As promised, I published "Building fast Wikipedia bots in Rust" on my blog which steps through building a fast+concurrent bot. Legoktm (talk) 08:34, 21 January 2022 (UTC)
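For readers without an async runtime at hand, the same spawn-then-join shape can be sketched with standard-library scoped threads. This is a swapped-in technique, not the tokio approach from the reply above, and `check_link` is a hypothetical stand-in for whatever per-link work a bot does.

```rust
use std::thread;

/// Hypothetical per-link work; a real bot would make an HTTP request here.
fn check_link(link: &str) -> (String, bool) {
    (link.to_string(), link.starts_with("https://"))
}

/// Run `check_link` on every link of one page at the same time and
/// return the results in the original order.
fn check_all(links: &[&str]) -> Vec<(String, bool)> {
    // Scoped threads may borrow `links` because they are all joined
    // before `thread::scope` returns.
    thread::scope(|s| {
        let handles: Vec<_> = links
            .iter()
            .map(|&link| s.spawn(move || check_link(link)))
            .collect();
        handles
            .into_iter()
            .map(|h| h.join().expect("worker thread panicked"))
            .collect()
    })
}

fn main() {
    let links = ["https://example.org/a", "http://example.org/b"];
    for (link, secure) in check_all(&links) {
        println!("{link}: https = {secure}");
    }
}
```

One thread per link is fine for a handful of links; for pages with hundreds of links, some limit on the number of simultaneous workers would probably be needed.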
- I might finally make a search engine for translations in English Wiktionary. Various people have mentioned various translation-related searches that they'd like to do, like finding all translations to a given language, but it hasn't been possible in any kind of systematic way, though I do have a template search engine that can sometimes kind of work. This would involve parsing the translation sections in English entries from the XML dump, figuring out a database schema, and writing an executable to generate it when each dump comes out. Translation sections contain a translation header template displaying one of the definitions of the English word and under it various templates that link to non-English words with that meaning. I will start using my program that generates the CBOR template dumps. Erutuon (talk) 19:27, 13 January 2022 (UTC)
- The database is pretty well filled out and the website (source) has a couple of search queries available (translations using a given language code, all translations listing a given word). Erutuon (talk) 03:06, 23 January 2022 (UTC)
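The parsing step described in this thread can be illustrated with a small sketch. It assumes translation links look like `{{t|fr|chien}}` or `{{t+|de|Hund|m}}` (language code first, then the word), and it uses plain string searching rather than a real wikitext parser, so nested templates are not handled. The struct and function names are made up for this example and are not taken from the actual project.

```rust
/// One translation link: target language code and the translated word.
#[derive(Debug, PartialEq)]
struct Translation {
    lang: String,
    word: String,
}

/// Extract `{{t|lang|word|...}}` and `{{t+|lang|word|...}}` templates from wikitext.
/// Assumption: templates are not nested and the first two positional
/// parameters are the language code and the word.
fn translations(wikitext: &str) -> Vec<Translation> {
    let mut out = Vec::new();
    let mut rest = wikitext;
    while let Some(start) = rest.find("{{") {
        let after = &rest[start + 2..];
        // Stop if a template is never closed.
        let Some(end) = after.find("}}") else { break };
        let mut parts = after[..end].split('|').map(str::trim);
        let name = parts.next().unwrap_or("");
        if name == "t" || name == "t+" {
            if let (Some(lang), Some(word)) = (parts.next(), parts.next()) {
                out.push(Translation { lang: lang.into(), word: word.into() });
            }
        }
        // Continue scanning after the closing braces.
        rest = &after[end + 2..];
    }
    out
}

fn main() {
    let section = "{{trans-top|domestic animal}}\n* French: {{t+|fr|chien|m}}\n* German: {{t|de|Hund|m}}\n{{trans-bottom}}";
    for t in translations(section) {
        println!("{} -> {}", t.lang, t.word);
    }
}
```

Rows of this shape (language code, word, plus the English entry they came from) are roughly what a "all translations to a given language" query would need to index.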
- Hello, I am here because Legoktm pinged me via IRC. (Thank you!) Unfortunately, I did not have any particular purpose when I joined the IRC channel; I had just heard that Rust is something new and good. That's why I am posting this so late. But just now I've decided to write a bot for creating pages on WikiApiary based on Miraheze's complete wiki list, just for fun, and I think it would be good to write my first Rust program using the mwbot-rs library instead of Python or Node. Lens0021 (talk) 10:48, 29 January 2022 (UTC)
Updates?
It has been a while since this had any activity. Any updates? @Legoktm @Enterprisey @Magnus Manske 0xDeadbeef (talk) 17:56, 9 August 2022 (UTC)
- @0xDeadbeef: just busy with other things. I've been working on some Rust tools and bots, just haven't had anything exciting worth writing about I think. Do you want to share what you're interested in working on / learning, or have been working on? That would be a good way to restart discussions :) Legoktm (talk) 00:16, 11 August 2022 (UTC)
- @Legoktm: Yes, I have started working on a library that would allow type-safe construction of requests, and responses would either be parsed as JSON values or extracted via generic type combinators. So far this is only a start as many request actions are not written yet, and recently I found less motivation to continue the work mainly because I did not find bot tasks that would be interesting to implement alongside improving the library. 0xDeadbeef (talk) 04:42, 28 August 2022 (UTC)
- https://github.com/fee1-dead/wiki in case anyone is interested. 0xDeadbeef (talk) 04:43, 28 August 2022 (UTC)
- Nice! I'll look at the code a bit more in depth tomorrow, but I know that @Enterprisey was also working on type-safe requests (don't remember the Git link offhand). I was approaching it from the opposite direction, generating type-safe response structs. Legoktm (talk) 05:31, 28 August 2022 (UTC)
- A procedural macro is an interesting approach. I was thinking more about using types with generic params to make it easy to build upon ([1]) and would provide an easier way to extend such an API. 0xDeadbeef (talk) 14:14, 29 August 2022 (UTC)
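To make the idea of type-safe request construction more concrete, here is a minimal sketch in which each request type declares its own response type through a trait. Everything here (the `Request` trait, `PageLength`, `send`) is hypothetical and is not the API of the library linked in this thread or of mwbot-rs; the transport is replaced by a closure so the example runs without a network, and the stub response is a bare number rather than real API JSON.

```rust
/// A request knows its query parameters and the type its response decodes to.
trait Request {
    type Response;
    fn params(&self) -> Vec<(&'static str, String)>;
    /// Decode the raw response body; a real client would parse JSON here.
    fn decode(&self, body: &str) -> Option<Self::Response>;
}

/// Hypothetical request for a page's byte length.
struct PageLength {
    title: String,
}

impl Request for PageLength {
    type Response = u64;
    fn params(&self) -> Vec<(&'static str, String)> {
        vec![
            ("action", "query".to_string()),
            ("prop", "info".to_string()),
            ("titles", self.title.clone()),
        ]
    }
    fn decode(&self, body: &str) -> Option<u64> {
        body.trim().parse().ok()
    }
}

/// Generic sender: the return type follows from the request type, so a
/// caller cannot mix up which response belongs to which request.
/// Note: parameters are joined without URL encoding to keep the sketch short.
fn send<R: Request>(req: &R, transport: impl Fn(&str) -> String) -> Option<R::Response> {
    let query: Vec<String> = req
        .params()
        .iter()
        .map(|(k, v)| format!("{k}={v}"))
        .collect();
    req.decode(&transport(&query.join("&")))
}

fn main() {
    let req = PageLength { title: "Rust".to_string() };
    // Stub transport standing in for an HTTP call.
    let len = send(&req, |query| {
        println!("would request: {query}");
        "1234".to_string()
    });
    println!("{len:?}");
}
```

The associated type is what ties the two halves of the discussion together: request builders and response structs can be written separately, and the trait implementation records which pairs belong together.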
2022 report
Pinging some people I know have been working on Wikimedia+Rust things this year: @Magnus Manske, @0xDeadbeef, @Erutuon, @KHarlan (WMF), @EGardner (WMF)
I started putting together a report of Rust things over the past year, mostly based on what I had seen recently and things I personally did, it would be great if you all could also add what you all worked on: Wikimedia Rust developers user group/2022 report. If there's not a proper section for what you worked on, or if it's still in progress, please add it anyways and we can rearrange things as needed. Legoktm (talk) 08:30, 29 December 2022 (UTC)
- I added my projects to that page, thanks for the ping, @Legoktm! KHarlan (WMF) (talk) 14:02, 10 January 2023 (UTC)