Abstract Wikipedia/Updates/2024-01-11


The Joy of Collaboration

This Monday, we hosted our monthly Volunteers’ Corner. And, as it has been for the last few times, my personal highlight during this Volunteers’ Corner was the collaborative creation of a new function, following the open question and answer session.

Given that we just released lists, this week we unsurprisingly created a function that works on lists: “is any true”, a function that takes a list of Booleans and figures out if any of them are true. A recording of the session is available on Wikimedia Commons.
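For readers who want to see the behaviour spelled out, here is a minimal Python sketch of what such a function does. The name is_any_true is only illustrative, and this is not the implementation created during the session.

```python
def is_any_true(values: list[bool]) -> bool:
    """Return True if at least one element of the list is True."""
    for value in values:
        if value:
            return True
    # No True element was found; this also covers the empty list.
    return False
```

An empty list is a typical edge case for the tests: with this sketch it yields False.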

I want to share a personal observation from these collaborative sessions.

I have been writing software for more than 30 years, and besides a very small number of pair programming sessions (which I quite enjoyed), I always considered it a rather solitary activity. When I got into the flow, I could code for hours, building up the system I was working on. This was also true when I was working with others on a system, usually working on an individual task or fixing a bug. And I usually really enjoyed this work.

Creating functions in Wikifunctions feels very different. And my experience is very different depending on whether I am editing Wikifunctions by myself, or whether we are talking about the collaborative sessions.

Let’s talk about the solitary sessions first. When working on Wikifunctions – and here I also include experience on Beta Wikifunctions and on my local development machine – I like creating compositions, but I often notice that I am missing some function that I need for the composition I want to build. So I usually end up creating a whole group of functions at once, and for each I want to create at least three tests and three implementations. I quickly end up creating dozens of objects, and I sometimes get a little lost while doing so. This will probably get better once we have more of the basic functions in place.

The more interesting experience, though, was the collaborative sessions. And I have to say, these are much more fun than I expected them to be! We discuss the details of the function together and start some tests. By the time we get to the implementations, there are already more tests because someone else created them, and while we are doing one implementation, more tests keep coming in, and we discuss them. By the time we connect the implementations, sometimes more than one is already ready. With the tests it feels almost like a kind of ping-pong coding: trying to find edge cases while at the same time ensuring the implementations are robust enough for them. That also works really well with the tests being separate from the implementations and functions, which is conducive to test-driven development. I really enjoy these sessions; they feel a lot more like a collaborative wiki than 'normal' coding.

I am not yet sure what to think about this experience. Is it just me? Is this a general experience? Is this something that should flow into our designs? If so, how? What are your experiences with Wikifunctions so far? Let us know; we would love to hear your feedback. Maybe we should have these collaborations more frequently, and have them volunteer-driven?

One thing we decided to try out now is to extend the next Volunteers’ Corner to an hour, as it was very cramped each time. The next Volunteers’ Corner will be on 5 February 2024 from 18:30 to 19:30 UTC. You are all welcome to join, even for part of the time.

New section: Function of the Week

I want to start highlighting one function each week in these updates. Suggestions for highlights will be welcome, and if anyone wants to set up a process that is different from “Denny picks something at his own whim”, I'd be happy to take those suggestions. Until then, I will just pick something each week and discuss it here.

[Image: Manipulating strings]

And I mean not only to present it quickly, but indeed to discuss it: what could we do differently with the function, what can we learn from it, how can we use it?

The function I am kicking this section off with is reverse string. I am doing so because it was my daughter’s favorite function during the development of Wikifunctions, admittedly for a rather childish reason (but then again, she does go to elementary school). She loved it because she could make the computer say “bad words” even though she didn’t enter the bad words into the system. So she would give an argument such as “diputs” and laugh uncontrollably for a while because the website her daddy is working on was showing a naughty word.

[Image: Reversing a string using a stack]

The function takes a string and returns the string with the first letter last, the second letter second to last, and so on, until the last letter of the input is the first letter of the output. Some languages have built-in support for this: C++ offers a reverse function in its standard library that works on strings, and Java offers a reverse method on StringBuilder. However, neither of the two languages we currently support, JavaScript and Python, comes with a standard reverse function for strings, which is almost surprising. Both languages offer many different ways to accomplish that task, and the current implementations in Wikifunctions offer three different ways:

  • This JavaScript implementation uses the rather new spread syntax to turn the input string into an array of characters, and then uses the reverse function that arrays have, before joining the characters with empty strings to form a result string again.
  • The other JavaScript implementation uses a more traditional approach, a constructor function, to turn the input into an array, and then the same series of reversing and joining to get to the result.
  • The Python implementation uses an idiomatic, albeit arcane, syntax for slicing a sequence (applied to a string, it automatically treats the string as a sequence of characters).
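To make the patterns above concrete, here is a small Python sketch. It is not the code stored on Wikifunctions: the function names are made up for illustration, and the second function only mirrors, in Python, the "turn into an array, reverse, join" pattern that the JavaScript implementations are described as using.

```python
def reverse_by_slice(text: str) -> str:
    # A slice with step -1 walks the string from its last character to its first.
    return text[::-1]


def reverse_by_list(text: str) -> str:
    # Turn the string into a list of characters, reverse that list in place,
    # then join the characters with the empty string.
    characters = list(text)
    characters.reverse()
    return "".join(characters)
```

Both variants operate on codepoints, so they agree with each other on every input, including the inputs where a codepoint-level reversal is not what a reader would expect.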

We also have a new implementation using composition, using a pattern that will likely be used frequently (basically the same pattern we saw for the JavaScript implementations): first we turn the string into a list of codepoints, and then use the brand new reverse a list function created last week, just after lists were made available. Then we turn that list of codepoints back into a string again.
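The composition can be pictured as three small functions chained together. The sketch below uses Python as a stand-in; the helper names are hypothetical and do not correspond to the actual function identifiers on Wikifunctions.

```python
def string_to_codepoints(text: str) -> list[int]:
    # Step 1: turn the string into a list of Unicode codepoints.
    return [ord(character) for character in text]


def reverse_list(items: list[int]) -> list[int]:
    # Step 2: reverse the list (returns a new list, leaves the input untouched).
    return list(reversed(items))


def codepoints_to_string(codepoints: list[int]) -> str:
    # Step 3: turn the list of codepoints back into a string.
    return "".join(chr(codepoint) for codepoint in codepoints)


def reverse_string(text: str) -> str:
    # The composition: each step feeds its result into the next one.
    return codepoints_to_string(reverse_list(string_to_codepoints(text)))
```

The appeal of this pattern is that each step is a general-purpose function that can be reused in other compositions.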

The function has six tests and, for a short time, had a seventh, which was then removed. We’ll get to it in a bit. Three of the tests take simple Latin-character strings (abc, stressed, and kayak), one features a longish hexadecimal number, and one contains an emoji. One uses the string Q1, which should be quite straightforward but currently fails due to a bug in Wikifunctions. It would be great to see more tests with other alphabets, particularly right-to-left scripts or scripts that have many ligatures.

The test that was removed featured an emoji, but from the perspective of Unicode this is a very different emoji: whereas the test that remains contains “😂”, the one that was removed contains “🚵🏻‍♀️”. What’s so different between these two emojis? This takes us deep into how Unicode works, but in short, the first one is represented by a single codepoint, 128514, whereas the second is represented by a series of five codepoints: 128693, 127995, 8205, 9792, 65039. These five codepoints mean, in turn, “mountain bicyclist”, “modifier skin tone Fitzpatrick scale 1-2” (i.e. white skin), “zero width joiner”, “female sign”, “variation selector: colorful and image like”. All five codepoints together create a single emoji, or grapheme cluster, as the Unicode standard calls it.
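The difference between the two emojis can be checked with a few lines of Python. This is only an inspection sketch: the helper name codepoints is made up, and the emojis are written with escape sequences so that the invisible characters are easy to see.

```python
import unicodedata


def codepoints(text: str) -> list[int]:
    # One integer per Unicode codepoint in the string.
    return [ord(character) for character in text]


laughing = "\U0001F602"
cyclist = "\U0001F6B5\U0001F3FB\u200D\u2640\uFE0F"

print(codepoints(laughing))  # a list with a single codepoint
print(codepoints(cyclist))   # a list with five codepoints

# Print the official Unicode character name of each codepoint in the sequence.
for character in cyclist:
    print(ord(character), unicodedata.name(character, "<unnamed>"))
```

Note that len(cyclist) is 5 in Python, even though a reader sees a single emoji.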

If we turn the order of these five codepoints around, we get a different result than what we might expect: “♀‍🏻🚵”. This is being resolved by a new function that reverses the string on a grapheme level, but that doesn’t have a working implementation yet. I think that’s the right approach, but I wonder if the original function should already deal with graphemes, instead of codepoints (which means we could rename the latter function to become the actual reverse string function).
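To illustrate the idea of reversing on a grapheme level, here is a deliberately simplified Python sketch. It is not the Wikifunctions function and it does not implement the full Unicode segmentation rules: it only keeps combining marks, variation selectors, skin tone modifiers, and zero-width-joiner sequences attached to the character before them. Cases such as flag emojis or Hangul syllable composition are not handled, and all names are made up for illustration.

```python
import unicodedata

ZWJ = "\u200D"  # zero width joiner


def extends_previous(character: str) -> bool:
    # True if this character should stay glued to the one before it.
    codepoint = ord(character)
    return (
        character == ZWJ
        or 0xFE00 <= codepoint <= 0xFE0F      # variation selectors
        or 0x1F3FB <= codepoint <= 0x1F3FF    # emoji skin tone modifiers
        or unicodedata.category(character) in ("Mn", "Me", "Mc")  # combining marks
    )


def split_clusters(text: str) -> list[str]:
    # Group codepoints into approximate grapheme clusters.
    clusters: list[str] = []
    for character in text:
        if clusters and (extends_previous(character) or clusters[-1].endswith(ZWJ)):
            clusters[-1] += character
        else:
            clusters.append(character)
    return clusters


def reverse_graphemes(text: str) -> str:
    # Reverse the order of the clusters, leaving each cluster intact.
    return "".join(reversed(split_clusters(text)))
```

With this sketch, the five codepoints of the cyclist emoji stay together in their original order, while the clusters around them swap places.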

This led to discussions on the Project chat about exactly this topic.

Other considerations could be around capitalization and punctuation. For example, the following sentence:

“In the dance of life, find your rhythm!”

would, with a naïve application of reversion, look like this:

“!mhtyhr ruoy dnif ,efil fo ecnad eht nI”

A different reversion might aim to keep the punctuation and capitalization natural:

“Mhtyhr ruoy dnif, efil fo ecnad eht ni!”
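One way such a "natural" reversion could work is sketched below in Python. The rules are assumptions chosen to reproduce the example above, not a specification: sentence-final punctuation stays at the end, commas and similar marks are re-attached to the word before them, and the pattern of upper and lower case is kept by position. The function name is made up.

```python
import re


def reverse_naturally(sentence: str) -> str:
    # Keep sentence-final punctuation aside so it stays at the end.
    body = sentence.rstrip(".!?")
    terminal = sentence[len(body):]

    # Plain codepoint reversal of the remaining text.
    reversed_body = body[::-1]

    # After reversal a comma sits before its word (" ,efil");
    # move it back behind the preceding word ("dnif, efil").
    reversed_body = re.sub(r"(\s+)([,;:]+)", r"\2\1", reversed_body)

    # Re-apply the original capitalization pattern, letter by letter.
    was_upper = [character.isupper() for character in body if character.isalpha()]
    result = []
    letter_index = 0
    for character in reversed_body:
        if character.isalpha():
            result.append(character.upper() if was_upper[letter_index] else character.lower())
            letter_index += 1
        else:
            result.append(character)
    return "".join(result) + terminal
```

Other choices are equally defensible, for example keeping the capital letter attached to the word "In", which is part of why such a function would need discussion before being written.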

Other considerations could be applied to digraphs such as “ph”, “th” or “ch” in English.

Even a seemingly small and simple function such as reversing strings can take us down several rabbit holes already. And whereas the straightforward reversal of Unicode codepoints is simple, the more complex functions have the potential to provide much more value to the end users, as they match closer with their expectations and are more difficult to find in other places.

Wikifunctions updates moving to Wikifunctions

These regular Wikifunctions updates will soon be hosted on Wikifunctions itself, moving off from Meta. Meta was a great place to host us, thank you, but it makes sense to actually host the updates ourselves on Wikifunctions. For now, we do not plan on moving all of the archives, but we will point to them.

Recent changes to the software

Our main focus last quarter was on better support for types (T343469); with the list support that shipped last week, our work in this area will now focus on custom, user-defined types (like number, or datetime, or GPS position, or… whatever the community wants!). This and other bigger ideas will be part of our team's planning over the next few weeks – more to report soon!

We've changed the way the main error type that users will see works. Previously, Z507/Error in evaluation would reply with the request wrapped in a Z99/Quote, and an error for what went wrong, also wrapped in a Quote. Now, we only wrap the first one, which should make for some much more understandable errors in these cases (T349026).

Outside of the big-ticket items, we worked on a community-inspired simplification and re-design of the Function page (coming next week!) and on some improvements to our documentation and installation guides. We also had to make some emergency fixes so that our code kept working following some upstream changes in MediaWiki.