Abstract Wikipedia Updates
Selecting the right implementation
Until recently, Wikifunctions selected an implementation at random: whenever someone called a function that had multiple implementations available, Wikifunctions would pick one of them randomly.
Implementations of the same function can have wildly different runtime behavior. Some can be very slow, and others can be very fast: sorting a list of 100,000 random numbers using bubble sort can take a minute on a current processor, but with quicksort the same list of numbers can be sorted in less than two hundredths of a second, which is faster than the blink of an eye. Much faster.
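To make the gap concrete, here is a minimal Python sketch that times both algorithms on the same input. It is only an illustration: the function names are ours, the quicksort is a simple copy-based variant, and the list is much smaller than 100,000 entries because interpreted Python is far slower than the compiled code the figures above presumably refer to. Actual timings will vary by machine.

```python
import random
import time


def bubble_sort(values):
    """Return a sorted copy using bubble sort (quadratic time)."""
    items = list(values)
    for end in range(len(items) - 1, 0, -1):
        swapped = False
        for i in range(end):
            if items[i] > items[i + 1]:
                items[i], items[i + 1] = items[i + 1], items[i]
                swapped = True
        if not swapped:  # already sorted, stop early
            break
    return items


def quicksort(values):
    """Return a sorted copy using a simple, copy-based quicksort."""
    items = list(values)
    if len(items) <= 1:
        return items
    pivot = items[len(items) // 2]
    smaller = [x for x in items if x < pivot]
    equal = [x for x in items if x == pivot]
    larger = [x for x in items if x > pivot]
    return quicksort(smaller) + equal + quicksort(larger)


def timed(func, data):
    """Run func(data) and return (result, elapsed seconds)."""
    start = time.perf_counter()
    result = func(data)
    return result, time.perf_counter() - start


rng = random.Random(0)
data = [rng.random() for _ in range(2000)]  # kept small on purpose

bubble_result, bubble_seconds = timed(bubble_sort, data)
quick_result, quick_seconds = timed(quicksort, data)

print(f"bubble sort: {bubble_seconds:.4f} s")
print(f"quicksort:   {quick_seconds:.4f} s")
```

Because bubble sort needs a number of comparisons that grows with the square of the list length, the difference widens quickly as the list gets longer.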
In Wikifunctions, functions should be accompanied by testers. The capitalization function we talked about earlier has, as this is being written, only one tester, which checks that capitalizing the word “test” returns “Test”. If all goes well, Wikifunctions will run each tester on each implementation. The results of these tests are stored: does the implementation pass, how many resources does it require, and other meta-data. This run-time information is also shown to the user in a pop-up on request, for people interested in the back-end details.
Wikifunctions now ranks the implementations based on this meta-data, and updates the internal order of the implementations. Test failures result in downgrades, and quick results lead to a better ranking. And so, for the last few weeks, we have been selecting the first implementation in that ranking instead of picking one at random. Here is an example of that reordering working in practice (but alas, diffs are not implemented yet).
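One way to picture such a ranking is sketched below in Python. The exact scoring that Wikifunctions uses is not described here, so the rule in this sketch (fewer failed testers first, then lower total run time) is purely an assumption, as are the names `TesterRun` and `rank_implementations` and the made-up timings.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TesterRun:
    """Stored outcome of one tester on one implementation (illustrative)."""
    passed: bool
    seconds: float


def rank_implementations(runs_by_impl):
    """Order implementation names from best to worst.

    Assumed rule: failures weigh most (each failure is a downgrade),
    and among equally reliable implementations the faster one wins.
    """
    def sort_key(name):
        runs = runs_by_impl[name]
        failures = sum(1 for run in runs if not run.passed)
        total_seconds = sum(run.seconds for run in runs)
        return (failures, total_seconds)

    return sorted(runs_by_impl, key=sort_key)


# Made-up measurements for three hypothetical implementations.
runs_by_impl = {
    "bubble_sort": [TesterRun(True, 1.20), TesterRun(True, 0.90)],
    "quicksort": [TesterRun(True, 0.02), TesterRun(True, 0.01)],
    "broken_sort": [TesterRun(False, 0.01), TesterRun(True, 0.01)],
}

ranking = rank_implementations(runs_by_impl)
selected = ranking[0]  # the implementation a function call would use

print(ranking)   # ['quicksort', 'bubble_sort', 'broken_sort']
print(selected)  # quicksort
```

Note that in this sketch a fast but failing implementation still ranks below a slow but correct one, which matches the intent that test failures lead to downgrades.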
This should lead to a considerable reduction in used resources, and to more consistent behavior of Wikifunctions. Function calls should produce timeouts less often. This should also relieve the Wikifunctions community from worrying about inefficient implementations and whether we should accept them or not. Often, algorithms which are simpler are easier to read and verify, but are slower: bubble sort is a good example of this, compared with quicksort. Bubble sort is generally regarded as much easier to explain and understand than quicksort. Having both allows the results of the simpler implementation to be compared with the results of the more complex implementation, with both passing the same suite of testers, and thus increases our confidence in the overall system. At the same time, we can in practice use the more efficient implementation and thus reduce overall resource usage.
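Comparing a simple implementation against a faster one can be sketched as follows in Python. The helper `find_disagreement` and the randomly generated inputs are assumptions made for this illustration; in Wikifunctions the comparison happens through the shared testers rather than through a helper like this.

```python
import random


def bubble_sort(values):
    """Simple reference implementation: easy to read and verify."""
    items = list(values)
    for end in range(len(items) - 1, 0, -1):
        for i in range(end):
            if items[i] > items[i + 1]:
                items[i], items[i + 1] = items[i + 1], items[i]
    return items


def quicksort(values):
    """Faster implementation whose results we want to gain confidence in."""
    items = list(values)
    if len(items) <= 1:
        return items
    pivot = items[len(items) // 2]
    smaller = [x for x in items if x < pivot]
    equal = [x for x in items if x == pivot]
    larger = [x for x in items if x > pivot]
    return quicksort(smaller) + equal + quicksort(larger)


def find_disagreement(reference, candidate, inputs):
    """Return the first input on which the two functions differ, or None."""
    for data in inputs:
        if reference(data) != candidate(data):
            return data
    return None


rng = random.Random(42)
inputs = [
    [rng.randint(-50, 50) for _ in range(rng.randint(0, 30))]
    for _ in range(200)
]

mismatch = find_disagreement(bubble_sort, quicksort, inputs)
if mismatch is None:
    print("implementations agree on all inputs")
else:
    print(f"implementations disagree on {mismatch}")
```

Agreement on a sample of inputs is not a proof of correctness, but every agreeing input is further evidence that the faster implementation behaves like the one that is easier to verify by reading.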
With this, the first version of a major element that will work behind the scenes of Wikifunctions has been put into place, and we have delivered another goal of the current phase.
Maria Keet’s reflection on Abstract Wikipedia so far
Maria Keet has been an active and central part of the Natural Language Generation Workstream. She is a professor at the University of Cape Town, South Africa, and her collaboration with Ariel Gutman on the template language and her arguments have been mentioned in the fellows’ evaluation and the answer. Maria has now written down her own reflections and published them on her blog:
The text is very accessible, gives context, explains some of the issues that low-resource languages face, and makes suggestions on how to proceed. Maria also describes some of the frustrating challenges she encountered in having her voice heard and recognized. That part makes for a painful read, and points to necessary changes.
To repeat her closing words:
The mountain we’ll keep climbing, be it with or without the Abstract Wikipedia project. If Abstract Wikipedia is to become a reality and flourish for many languages soon, it needs to allow for molehills, anthills, dykes, dunes, and hills as well, and with whatever flowers available to set it up and make it grow.
We are thankful to Maria for her ongoing contributions. We hope that we can achieve a more inclusive space, with the goal of making contributing a more wholesome experience.
Talk about Abstract Wikipedia in Sweden
Professor Aarne Ranta will give a talk on Natural Language Generation and Abstract Wikipedia on Thursday, April 20th, 2023 at 17:30 local time, in the Maritime Museum and Aquarium in Göteborg, Sweden. The in-person event is free for the public. The talk will be given in Swedish.
You can find more information about the talk in Swedish here: