Jump to content

Abstract Wikipedia/Generic function type

From Meta, a Wikimedia project coordination wiki

Generic vs pre-generic function types

[edit]

In the pre-generic function model description, the type of functions is always Z8/Function, and that’s also the case in the current implementation. In the current function model description however, function types are described as being generics, i.e. types that are generated by a function. The idea was that we would eventually do a migration, similar to what we did with List and TypedList.

Given how involved the lists migration was, the question arises whether we should do such a migration or whether we can avoid it.

Option 1: Leave everything as is

[edit]

Pros:

  • Nothing to do.

Cons:

  • In higher-order functions (that is functions that take functions as arguments or return functions as values), the user has no information about what type of function is expected as an argument or return value. For example, the sort function that sorts a list of integers could take a comparison function as an argument. Currently the type of that argument would just be Z8/Function, giving no further information that it should be a function from a pair of integers to a boolean. This holds both when calling the higher-order functions and when implementing it by composition.
  • It is not possible to do static type checking (if we ever want to do that in the future)?
  • Inconsistent with the other generic types (lists, pairs, etc)

Option 2: Migrate to a generic function type

[edit]

Z8/Function becomes a function that takes a function signature (i.e. argument declarations + return type) and returns a Z4/Type (similarly to TypedList being a function that takes a type of elements and returns a Z4/Type).

For example, the comparison function IsLessThan would have type Z8/Function([Integer, Integer], Boolean). That is:

{
  Z1K1: Z7
  Z7K1: Z8
  Z8K1: [Z17, {... Z17K1: ZInteger}, {... Z17K1: ZInteger}]  // Benjamin array notation
  Z8K2: ZBoolean
}

where ZInteger is the ZID of the Integer type and ZBoolean is the ZID of the Boolean type.

The type of Z8/Function would itself be Z8/Function([List(Z17/Argument declaration), Z4/Type], Z4/Type), that is it is itself a function from a list of argument declarations and return type to a return type.

It might even be possible to take in a list of Z4 instead of a list of Z17 as input because the type does not care about the name of the arguments. Alternatively, we could do away with the Z8K1/arguments and Z8K2/return type fields of function objects completely, because those would now live in the Z1K1/type field.

Pros:

  • Function types are proper generic types.

Cons:

  • The migration may be very involved. Might be slightly easier than the lists migration because there is no change in the canonical form syntax (like we had with Benjamin arrays)?

Option 3: Introduce a separate generic function type

[edit]

Because we do not do static type checking, the fields that we require for functions are always the same:

  • Z8K1/arguments: List(Z17/Argument declaration)
  • Z8K2/return type: Z4/Type
  • Z8K3/testers: List(Tester)
  • Z8K4/implementations: List(Implementation)
  • Z8K4/identity: ?

Notice that these fields and types are constant, no matter what the argument declarations / return type of the function are; this is exactly what the current (pre-generic) Z8/Function requires.

We could therefore introduce a separate TypedFunction (similar to what we did with TypedList during the migration) generic type that takes the argument declarations and return type, and always evaluates to Z8/Function.

Note that, without additional custom logic, we would not enforce in any way that the function object that we validate indeed has the same argument lists and return type.

Here is a proof of concept: https://gerrit.wikimedia.org/r/c/mediawiki/services/function-orchestrator/+/822057

(Aside: In a statically typed world, we would require the implementation to be typed, i.e. it would need to be an implementation that returns an object whose type is precisely the return type of the function. That means Implementation would probably also need to be generic, and compositions would need to somehow account for that, but that is a whole other story).

Pros:

  • Can be used to annotate argument types and return types in higher-order functions, giving more information to the user.

Cons:

  • We would not enforce that the function objects actually have the same argument lists and return type without additional logic.
  • It is not clear how robust this solution is or whether it would create other problems.

Type variables

[edit]

In order to truly have polymorphic functions, we need to allow function arguments to appear in the return type of the function signature.

For example, the polymorphic sorting function has type . That is, the function first takes as an argument a type , then a list of elements of type , then a comparison function on objects of type , and returns a list of elements of type .

The first argument to the function, , must appear in the return type of the function signature, . It also needs to appear in the type of the other arguments, and , but this can be seen supported through currying.

This means that no matter whether we choose option 2 or 3, we need to extend the scoping rules to bind the arguments of a function in its return type. Also, if we need to compare types (e.g. to do type checking or to perform faster validation), we need to take care of alpha-equivalence).