Skip to content

Polymorphism

David Pavlík edited this page Sep 6, 2024 · 5 revisions

Polymorphic functions are such that use type variables in their spec. For example lists:map/2 with -spec lists:map(fun ((A) -> B), [A]) -> [B] is a polymorphic function. It takes a function from A to B (where A and B are the type variables) and a list of As, and returns a list of Bs (by applying the function on all elements of the given list). Because lists:map/2 uses these type variables, it can be called with arguments of different shapes (for example with a function that converts integers to booleans together with a list of booleans but also with the function length/1 together with a list of lists).

Gradualizer is quite good at working with polymorphic functions. However, as polymorphic functions increase the abstraction and cognitive complexity, it may be sometimes puzzling to understand what exactly is going on or why is Gradualizer complaining. So let's break it down.

Typechecking polymorphic functions

As with all other functions, we can either check their own definitions or we can check places where they are called. In case of polymorphic functions, the former is easier to understand, so we will start with that.

When a function uses a type variable in its spec, it can make no assumption about it. It is not the polymorphic function itself but its caller who decides which concrete type will be used in place of the type variable. For example, consider the following polymorphic function:

-spec swap_pair({A, B}) -> {B, A}.
swap_pair({X, Y}) -> {Y, X}.

The function needs to return a pair of the type {B, A}. But A and B can be any type, as we can call swap_pair/2 with any pair, consider for instance swap_pair({123, true}) and swap_pair({#{a => b}, {c, d, 42}}. So how does it get values of type A and B? It has no other option than to use elements of the input tuple because these are exactly of the types A and B. Also, because it can make no assumptions about A and B, it can perform no operations on X or Y that would require them to be of any particular shape (for example a map or an integer). Therefore, no definition that would behave differently from the definition above shouldn't pass typechecking (unless it uses dynamic types). In particular all the definitions below would be rejected by Gradualizer:

swap_pair({X, X}) -> {X, X}. %% We cannot assume that A = B
swap_pair({X, Y}) -> {X, X}. %% We cannot assume that A = B
swap_pair({X, Y}) -> {Y + 1, X}. %% We cannot assume that Y is a number (or that B is a numeric type)
swap_pair({X, Y}) -> {Y, 42}. %% We cannot assume that A is 42
swap_pair({X, {Y, _}}) -> {X, Y}. %% We cannot assume any particular shape of B

It works in the same way with the aforementioned map/2 function:

-spec map(fun ((A) -> B), [A]) -> [B].
map(Fun, [Head | Tail]) -> [Fun(Head) | map(Fun, Tail)];
map(Fun, []) -> [].

We can call Fun only with values of type A which we can get only from the list. Every other call to Fun would fail but the call with Fun(Head) is correct because we know that Head is of the type A and Fun inputs A.

As easy as it may sound, it may easily get confusing. Consider for example this function -spec some_fun(fun ((A) -> A)) -> boolean(). That function that it inputs (let's call it F) is just a function that returns the same type that it inputs (i.e., it can be for example fun (integer()) -> integer()). In particular, F does not have to be a polymorphic function that can take anything, so we cannot call F(42) in the body of some_fun/1. There is no way to express that a function takes a polymorphic function, hopefully, this is rarely needed in practice and you can always fallback to any().

To summarize, keep in mind that when writing polymorphic functions, it is always the caller who decides how the type variables will be instantiated (i.e., what will be the concrete types) and the function definition cannot make any assumptions about it.

Implementation details

The implementation is quite simple. We make a distinction between two kinds of type variables: flexible type variables (used when typechecking polymorphic calls) and rigid type variables (used here). When typechecking a function definition, all type variables in its type spec are converted to a special form {rigid_var, erl_anno:anno(), atom()} (which is not present in erl_parse). There are no special subtyping rules for rigid type variables, so only the generic subtyping rules apply to them, namely reflexivity and the rules for none(), gradualizer:top() and any(). This ensures that the function definition can make no assumptions about them and can use them only at places where they are expected. It is important that this form is distinct from the plain {var, erl_anno:anno(), atom()} (that is used for flexible type variables) so that no flexible type variable rules apply to them and no constraints are generated from the use of rigid type variables. Also, because they are not flexible type variables, they can be used as bounds for flexible type variables when typechecking polymorphic calls (see the next section). In essence, Gradualizer treats rigid type variables as some concrete type it knows nothing about.

Typechecking polymorphic calls

Typechecking calls to polymorphic functions is the second half of the polymorphism support. Its job is to checks whether all the arguments given to a polymorphic function together make sense and to determine the result type of the function call. For example in the call lists:map(fun integer_to_binary/1, [1, 7, 42]), it has to check that the types integer() and [1 | 7 | 42] are in some way compatible (and thus integer_to_binary/1 can take [1, 7, 42]) and it must choose what type to give to the call (in this case it's [binary()]).

Before we dive deeper, note that polymorphic calls are properly checked only when the --solve_constraints flag is given to Gradualizer. Otherwise, all type variables are replaced with any() and no polymorphic type information propagate. This flag is still considered experimental so use it with caution.

In the description of how the typechecking of polymorphic calls, it is easy to get lost in technical terms, so let's instead look at a few examples of how it works (all examples assume the --infer flag in addition to --solve_constraints). Consider the following specs:

-spec lists:append([A], [A]) -> [A].
-spec lists:filter(fun ((A) -> boolean()), [A]) -> [A].
-spec positive(number()) -> boolean().

In the call lists:append([1, 7], [42, 7, 12]), the type variable A is instantiated to both 1 | 7 and 42 | 7 | 12, so we take a union of them to get the resulting A. The resulting type of this call is then [1 | 7 | 42 | 12]. Note that we can pass any two lists to lists:append/2 and the call will be still valid (because we can take a union of any two types). Also note that the spec above is equivalent to the following one: -spec lists:append([A], [B]) -> [A | B].

Now let's consider the call lists:filter(fun positive/1, [1, 3, -5]). The resulting type of this call will be [1 | 3 | -5] because the type variable A is instantiated to 1 | 3 | -5. There is the extra -5 which obviously won't be in the resulting list, but this is okay as the list [1, 3] can also be seen as of the type [1 | 3 | -5], and it's the best type we can infer based on the information we have.

And now, let's see a bad guy: lists:filter(fun positive/1, [potato]). We know that positive/1 can take only numbers, therefore Gradualizer checks that all instantiations of A are numbers. And it founds out that they are not, because potato is not a number! Therefore it raises a type error that tells us that A is in this call instantiated to both potato and number(), and that potato should be a subtype of number() but it is not.

These three examples should give a taste of how the typechecking of polymorphic calls in Gradualizer behaves. More details may be found in the next subsection.

Implementation details

In Gradualizer, the typechecking of polymorphic calls is based on the Local Type Argument Synthesis from the paper Local Type Inference by Pierce and Turner (chapter 3). The paper is well written and may be worth reading if you are interested in understanding how Gradualizer works under the hood and already have some understanding of the theory behind typechecking and especially subtyping. There has also been a Master's thesis written about the implementation of LTI in Gradualizer, so it might also be worth checking out (mainly chapters 3 to 5) if you are interested.

The type system used in the LTI paper is very simple, it has only base types, function types, type variables, top, and bottom. Our type system is much more convoluted, in particular by the gradual types (any()), unions, and intersection function types. Therefore, we extend must have extended it beyond the theory presented in the paper.

The inclusion of any() in the type system brings unsoundness whenever any() is used and it's the same here with LTAS. Whenever some type involved in the LTAS contains any(), the algorithm is unsound. If any() is present, it also does not hold that the inferred type is minimal with respect to subtyping, because the subtyping relation is no longer a partial order. However, although these theoretical properties are broken, the algorithm still works well in practice (with the necessary tradeoffs for any()).

In high-level terms, the LTAS algorithm in Gradualizer works like this:

  1. when typechecking a polymorphic call, the function type_check_poly_call/4 is called
  2. if --solve_constraints is not given, all type variables are replaced with any() and we proceed as usual
  3. the types of all arguments to the function call are inferred (even when in the typechecking mode)
  4. we then call subtype_with_constraints/3 verifying that all argument types are subtypes of the parameter types, gathering constraints along the way (we get to constraints in a bit)
  5. if some argument type is not a subtype of the respective parameter type, we throw a (plain) type error
  6. we then combine all the gathered constraints
  7. we verify that the constraints are satisfiable and if not, we throw a polymorphic type error saying that some lower bound of a type variable is not a subtype of a upper bound (this error is a complex one so we try to raise it as rarely as possible)
  8. then we take the constraints and compute a minimal substitution with respect to the result type of the called function (i.e., a substitution that minimizes the resulting type but still obeys the constraints)
  9. we apply the substitution on the result type and return it as the type of the application

The main ingredient of this algorithm are constraints. They are generated in the subtype_with_constraints/3 functions in the clauses for flexible type variables (and in clauses for any() in some edge cases). For example, when we check that boolean() <: A, we return the constraint that boolean() must be the lower bound of the type variable A (written {boolean() <: A}). It is important to note that the constraints (lower and upper bounds) never contain flexible type variables (which simplifies the algorithm). Constraints are combined by making LUB out of corresponding lower bounds and GLB out of corresponding upper bounds. They are satisfiable if all lower bounds are subtypes of their respective upper bounds.

Flexible type variables appear only in the type_check_poly_call/4 function and related functions (like subtyping, computing the substitution and so on). The type_check_poly_call/4 function returns concrete types without any flexible type variables, and at all other places, flexible type variables are replaced with any(), so that the rest of the code doesn't ever have to work with them. Exactly the same holds for constraints, they are solved immediately after their generation and do not propagate outside of type_check_poly_call/4 (thus local type inference). This is also why subtype/3 may assume that no flexible type variables were involved and no constraints are needed.

When referring to a polymorphic function (fun id/1), type_check_expr/2 replaces all type variables in its type to any(). Therefore for example id(fun id/1) is inferred to be of the type fun ((any()) -> any()).

The dynamic type any() is treated as any other type in our version of LTAS. In particular, it may be used as a lower bound for some (flexible) type variable. This is useful when merging constraints. For example, when we generate constraints {any() <: A} and {integer() <: A}, we combine them as {integer() | any() <: A}. We may then choose integer() | any() in the substitution for A and it will mean that the resulting type must contain integer() but may also be greater (thus allowing to for example matching a bool on such type, but disallowing calling atom_to_list on it).

When typechecking a polymorphic call in type_check_expr_in/3, we call type_check_poly_call/4 and then check that the inferred type is a subtype of the expected type. We could instead first call subtype_with_constraints/3 with the result type and expected type and add this constraints to all later generated constraints (as described in section 4 of the paper) but this would present the polymorphic type error even when it's not really needed, so we do it differently.

For a list of known limitations, see test/known_problems/should_fail/poly_should_fail.erl.

Bounded quantification

In type specs, type variables may be upper-bounded by the :: notation. For example lists:keydelete/3 may be specced like this: -spec lists:keydelete(any(), pos_integer(), [Tuple]) -> [Tuple] when Tuple :: tuple(). The spec says that lists:keydelete/3 is polymorphic in a similar way that lists:filter/2 is polymorphic, but it works only on tuples. For example, based on this spec, lists:keydelete(b, 1, [{a, 42}, {b, 43}, {c, 44}] should result in the type [{a, 42} | {b, 43} | {c, 44}].

Gradualizer, however, does not currently support bounded quantification. All type variables that are bounded by any() or term() are used as plain type variables without bounds. And type variables with other bounds are replaced with their bounds. So for example the spec for lists:keydelete/3 gets replaced with -spec lists:keydelete(any(), pos_integer(), [tuple()]) -> [tuple()] .

Nevertheless, these bounds are in practice rarely used as bounds but mostly only as type comments so it's not much of a problem.