Abstract Wikipedia/Updates/2023-09-20

Renderers and parsers for types

Wikifunctions currently supports two types: Strings and Booleans. To make Wikifunctions useful, we need to support many more types, such as numbers, dates, geocoordinates, and eventually Wikidata lexemes and items. Types define what kind of inputs and outputs the functions in Wikifunctions can have.

With Wikifunctions, we don’t want to simply repeat what different programming languages have done. Where possible, we want to build on, and gently update, the lessons learned from programming language research and experience, and make sure that we are as inclusive as possible.

Strings and Booleans were very carefully chosen for the first deployment of Wikifunctions. Strings were chosen because they are just a specific sequence of characters and do not depend on the user’s language. Booleans were chosen because they are a key basis of logic flow in programming. Further, they can be fully translated in Wikifunctions: the two values, True and False, are each represented by a Wikifunctions object that can have names in any of the languages we support. Since the initial deployment, more than a dozen translations have been added! If you can add more, that would be great.

One interesting candidate for the next type is whole numbers. This raises a big question: how should we represent an integer?

Most programming languages have two answers to that. Internally, they usually represent an integer as a binary string of a fixed length, so that numbers can be stored and processed efficiently. In human-readable source code, they usually represent it as a sequence of Arabic numerals, e.g. 4657388. Some programming languages are nice enough to allow grouping of the digits: in Ada, for example, you may write 4_657_388 or, if you prefer the Indian system, 46_57_388, which makes these numbers a bit more readable.
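
Python is one language that allows the same kind of digit grouping. The following is a minimal sketch, assuming Python 3.6 or later; the variable names are purely illustrative:

```python
# Three spellings of the same integer literal; the underscores are ignored.
plain = 4657388
western_grouping = 4_657_388
indian_grouping = 46_57_388

# Internally the value is stored in binary; bit_length() reports how many
# bits are needed to hold it.
bits_needed = plain.bit_length()

print(plain == western_grouping == indian_grouping)
print(bits_needed)
```

All three literals denote the same value; the grouping exists only for the benefit of the human reader.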

But programming languages where one can write ৪৬,৫৭,৩৮৮ using Bengali numerals, referring to the same number, are rare. For Wikifunctions, we want to rectify this, to make sure that the whole system supports every human language fluently and consistently.

Internally, we will represent numbers, like every other object, as ZObjects. The above number would be represented internally as follows (using the prototype ZID from the Beta, since we don’t yet have the respective type in the real Wikifunctions):

{
  "Z1K1": "Z10015",
  "Z10015K1": "4657388"
}

Or, with labels in English:

{
  "type": "positive integer",
  "value": "4657388"
}

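The relationship between the two forms can be sketched in a few lines of Python. This is only an illustration: the label table below is hypothetical and hard-coded, and the helper name with_labels is our own invention, not part of Wikifunctions.

```python
# The ZObject from the text, written as a Python dict.
zobject = {"Z1K1": "Z10015", "Z10015K1": "4657388"}

# Hypothetical English label table, hard-coded here for illustration only.
ENGLISH_LABELS = {
    "Z1K1": "type",
    "Z10015": "positive integer",
    "Z10015K1": "value",
}

def with_labels(obj: dict[str, str], labels: dict[str, str]) -> dict[str, str]:
    """Replace every key and value that has a known label; keep the rest."""
    return {labels.get(k, k): labels.get(v, v) for k, v in obj.items()}

labelled = with_labels(zobject, ENGLISH_LABELS)
print(labelled)
```

The digits themselves have no label, so they pass through unchanged; only the identifiers are swapped for readable names.
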
Even though this solves the internal representation, we want to avoid displaying this raw object in the interface wherever possible. Instead, we plan to allow the Wikifunctions community to attach a 'renderer' and a 'parser' to each type. The renderer is a function that takes an object of the given type (in this case, an object of the type positive integer) and a language, and returns a string. The parser is the opposite: it takes a string and a language, and returns an object of the type positive integer.
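
To make the two signatures concrete, here is a rough sketch in Python (3.9 or later). The names PositiveInteger, Renderer, Parser, render_plain and parse_plain are all assumptions made for this illustration, and the dict merely stands in for the real ZObject; real renderers and parsers would be Wikifunctions functions written by the community.

```python
from typing import Callable

PositiveInteger = dict[str, str]                   # hypothetical stand-in for the ZObject
Renderer = Callable[[PositiveInteger, str], str]   # (object, language) -> string
Parser = Callable[[str, str], PositiveInteger]     # (string, language) -> object

def render_plain(obj: PositiveInteger, language: str) -> str:
    """Simplest possible renderer: show the stored digits, ignoring the language."""
    return obj["value"]

def parse_plain(text: str, language: str) -> PositiveInteger:
    """Simplest possible parser: accept only ASCII digits, ignoring the language."""
    if not (text.isascii() and text.isdigit()):
        raise ValueError(f"not a positive integer: {text!r}")
    return {"type": "positive integer", "value": text}

renderer: Renderer = render_plain
parser: Parser = parse_plain

number = {"type": "positive integer", "value": "4657388"}
# Parsing the rendered string gives back the original object.
round_trip = parser(renderer(number, "en"), "en")
print(round_trip == number)
```

A useful property to aim for is the round trip shown at the end: parsing what the renderer produced for a language should return the original object.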

This would allow the Wikifunctions community to create functions for each type and language that would decide how the values of the type are going to be displayed in the given language. In a Bengali interface, the above number can then be displayed in the most natural representation for Bengali, which might be ৪৬,৫৭,৩৮৮.
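
As an illustration of what such a renderer might do for Bengali, the following Python sketch converts the internal digit string to Bengali numerals with Indian-style grouping. The function names and the choice of grouping are assumptions for this example; the actual rendering rules would be decided by the community.

```python
# Bengali digits occupy U+09E6 to U+09EF, in the order 0 to 9.
BENGALI_DIGITS = "".join(chr(0x09E6 + i) for i in range(10))
TO_BENGALI = str.maketrans("0123456789", BENGALI_DIGITS)

def group_indian(digits: str) -> str:
    """Group as the last three digits, then pairs: 4657388 -> 46,57,388."""
    head, tail = digits[:-3], digits[-3:]
    groups = []
    while head:
        groups.insert(0, head[-2:])
        head = head[:-2]
    return ",".join(groups + [tail])

def render_bengali(value: str) -> str:
    """Render a string of ASCII digits with Bengali numerals and Indian grouping."""
    return group_indian(value).translate(TO_BENGALI)

print(render_bengali("4657388"))
```

For the number used throughout this text, the sketch produces ৪৬,৫৭,৩৮৮.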

When a user enters a number, we will use the parsing function to turn the input into the internal representation. It is then up to the Wikifunctions community to decide how flexible the parser should be: whether it accepts only ৪৬,৫৭,৩৮৮, whether ৪৬৫৭৩৮৮ would be just as good, or whether 4657388 is accepted as well (or even exclusively).
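
The range of possible policies can be sketched as flags on a parser. This Python example is only an assumption about how such a parser might look: the function name parse_bengali and the flags require_grouping and allow_ascii are hypothetical, and in lenient mode the sketch simply ignores commas rather than validating their positions.

```python
import re

BENGALI_ZERO = 0x09E6  # code point of the Bengali digit zero
# Indian-style grouping: up to three digits, or pairs followed by a final triple.
INDIAN_GROUPED = re.compile(r"[0-9]{1,3}|[0-9]{1,2}(,[0-9]{2})*,[0-9]{3}")

def parse_bengali(text: str, require_grouping: bool = False,
                  allow_ascii: bool = True) -> dict[str, str]:
    """Turn user input into the internal form; the flags model community policy."""
    normalised = []
    for ch in text:
        if BENGALI_ZERO <= ord(ch) <= BENGALI_ZERO + 9:
            normalised.append(str(ord(ch) - BENGALI_ZERO))
        elif ch == "," or (allow_ascii and ch in "0123456789"):
            normalised.append(ch)
        else:
            raise ValueError(f"unexpected character: {ch!r}")
    ascii_text = "".join(normalised)
    if require_grouping and not INDIAN_GROUPED.fullmatch(ascii_text):
        raise ValueError("expected Indian-style digit grouping")
    value = ascii_text.replace(",", "")
    if not value:
        raise ValueError("no digits found")
    return {"type": "positive integer", "value": value}

for candidate in ["৪৬,৫৭,৩৮৮", "৪৬৫৭৩৮৮", "4657388"]:
    print(candidate, parse_bengali(candidate)["value"])
```

With the default flags all three spellings are accepted; setting require_grouping=True or allow_ascii=False narrows the parser towards the stricter options.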

Note that we made a lot of assumptions in the above text: for example, using the ZID from the Beta, calling the type “positive integer”, and assuming that the internal representation of positive integers is a string of Arabic numerals without formatting (instead of, say, hexadecimal, base 64, or a binary number, which could also be good solutions). All of these decisions are up to you; we made these assumptions here only so that we could talk concretely about the proposal.
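
To show that the choice of internal representation does not change the number itself, here is a small Python sketch comparing a few candidate string encodings. It is an illustration only, with variable names of our own choosing, and does not reflect any decision about the format.

```python
value = 4657388

# Candidate internal string representations of the same number.
decimal_text = str(value)
hex_text = format(value, "x")
binary_text = format(value, "b")

# Each representation converts back to the same integer without loss.
same = int(decimal_text, 10) == int(hex_text, 16) == int(binary_text, 2) == value
print(decimal_text, hex_text, binary_text, same)
```

Whichever encoding is chosen, renderers and parsers would hide it from most users.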

We plan to implement this proposal incrementally, over the coming weeks and months. At first, we will likely accept only the internal representation (just as it currently works on the Beta); we will then add renderers, and finally parsers.

We are looking forward to hearing your feedback on this plan.