Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use-site Invariance #229

Open
eernstg opened this issue Feb 20, 2019 · 4 comments
Open

Use-site Invariance #229

eernstg opened this issue Feb 20, 2019 · 4 comments
Labels
feature Proposed language feature that solves one or more problems variance Issues concerned with explicit variance

Comments

@eernstg
Copy link
Member

eernstg commented Feb 20, 2019

In response to #213, this issue is a proposal for adding use-site invariance to Dart.

[Edit: Note that more recent discussions have typically used the syntax List<exactly int> where this proposal had List<invariant int>. Adjusted the text below to fit that.]

Introducing Use-site Variance

(You can safely skip to the next section if you already know about use-site variance.)

Use-site variance is known from languages like Java (using ? extends and ? super) and Kotlin (using in and out)), and the general idea is that each type parameter of a generic type annotation (e.g., the type of a variable or a formal parameter, or the return type of a method) may have an associated variance.

For instance, with a Java class declaration like public class Foo<X> {...}, it is possible to specify that a given variable foo has a declared type like Foo<? extends Number>, which makes it possible for foo to contain a reference to an instance of Foo<T> for any T which is a subtype of Number. (If we had used a plain Foo<Number> then the variable would only be allowed to refer to an instance of Foo<Number>, not Foo<Integer> etc).

In return for the added flexibility, usages of foo can only use a part of the public interface of Foo: A member can only be accessed if its type is such that X never occurs in any position that is not covariant. In particular, if Foo declares an instance method void bar(X x) then foo.bar(...) is a compile-time error (it does not matter what ... stands for, it's always an error), because bar is not in the accessible subset of the members of Foo for a variable like foo.

The basic idea is that we'll "filter" the members in the static type of the receiver, and only allow access to those members that satisfy some extra constraints. This filtering step could also be seen as a projection (transforming a bigger thing to a smaller but similar thing), which can be seen in the terminology of Kotlin: Use-site variance in Kotlin is called type projections.

In general, covariant positions in a type corresponding to a "read" operation and contravariant positions correspond to a "write" operation, so we could say that if we make a type argument corresponding to a type parameter X covariant then we are projecting the set of members down to the set of members "that may let me read an X, but won't let me write any".

Here is a Kotlin example where out is used to make the type of from covariant in its type argument:

fun copy(from: Array<out Any>, to: Array<Any>) { ... }

The copy function will accept a from argument whose type argument is an arbitrary type (rather than just one whose type argument is Any), but the to argument must be an Array<Any>. In return for the added flexibility on from, we can only read elements from that array, we cannot write them, but we can perform both reads and writes on to.

How would Use-site Variance fit in Dart?

In Dart, type parameters of generic classes are always covariant, which is unsound for certain usages (cf. #213). This means that it makes no sense to add variance annotations to Dart exactly as they occur in other languages, but we can add an annotation that eliminates the covariance which is otherwise admitted for every type argument in every type annotation which is a class type. This is the motivation for using the name 'use-site invariance' for this feature.

This proposal uses the modifier exactly to indicate that a given type argument is required to be invariant, but it is of course possible to give this marker many other syntactic forms.

Use-site invariance is particularly helpful in the case where the relevant type parameter X cannot be declared to be invariant (otherwise we could use declaration-site invariance, which is likely to be much less verbose, and more consistent).

One situation where this occurs is when a type parameter X is designed to "perform both reads and writes" for objects of type X (or any type which is covariant in X). It could also be because the change where X is made invariant would cause widespread breakage (e.g., because one class must now be split into several classes, with suitable subtype relationships among them).

For instance, the Dart List class will probably not be modified to class List<inout E> .., because this would be massively breaking. But it would be possible to apply use-site invariance in order to maintain the stronger typing discipline for each individual variable or parameter where this is considered helpful, and that wouldn't break any other usages of the given type.

The invariance of said variable or parameter would give developers a guarantee against variance related type errors, and the run time cost would also be avoided because the type checks are no longer needed.

Note that the combination of use-site invariance and declaration-site invariance allows for a relatively fine-grained control over the constraints and guarantees provided:

class A<inout X> {
  List<exactly X> l1;
  List<X> l2;
}

class B<X> {
  List<exactly X> l3;
  List<X> l4;
}

A<exactly num> a1; // Redundant invariance.
A<num> a2;
B<exactly num> b1;
B<num> b2;

With these declarations we would have guarantees like the following:

  • We can safely read elements from all the lists l1 .. l4 on the relevant receivers a1, a2, b1, or b2; likewise, code inside the classes can safely read elements from all the lists. Safe today, remains safe.
  • We can safely write a value with static type num to a1.l1, a2.l1, and b1.l3. Safety improved!
  • When writing a value with static type num to a1.l2, a2.l2, b1.l4, and b2.l4, we get the same dynamic checks that we have today—for every write to a list, or with any other generic type that uses a type parameter in a non-covariant position. Safety unchanged (could specify more invariance to get more safety).
  • It is safe for code inside class A to write an element of type X to l1, and similarly for code in B that writes an element to l3. Safety improved!
  • For code inside class A and B, writing to l2 respectively l4 is subject the the same dynamic type checks that we have today. Safety unchanged (could specify more invariance to get more safety).

Note that 'safety' in all cases also means 'good performance', because there is no need to check safe types at run time.

Finally, note that the type annotation on a1 is redundant because A has an invariant type parameter (due to declaration-site invariance), and then we top it up with use-site invariance for this variable. This might be considered to be an error (because redundancy is confusing) or it could be allowed (because that would allow some source code changes to be less breaking, and it doesn't actually cause any technical problems), but in any case it makes no difference: The type of a1 has the same properties as the type of a2.

So the point is that we're able to combine declaration-site invariance and use-site invariance in various ways, in order to obtain the desired type safety and performance.

In this context, use-site invariance works as a kind of "glue" that allows us to obtain invariance also in cases where we are working with classes that need more flexibility than declaration-site invariance can offer, either because of the very design of those classes, or because it would cause massive breakage to change those classes.

The Dynamic Type of an Instance.

It should be noted that we are already used to talk about the run-time type of an instance as a precise type. For example, let C be as follows:

class C<X> {
  X x;
  C(this.x);
}

If we consider the run-time type of the value of a literal like 17, or an invocation of a generative constructor like C(17), we can't respond "it's an Object" even though both instances would yield true if we test them with is Object. The run-time type of an instance o is understood to be the most specific type T such that o is T evaluates to true.

But then we should not say that the value o of C<num>(17) has run-time type C<num>, it is actually C<exactly num>. That is, for a generic class type, the run-time type of an instance will have the modifier exactly on all type arguments.

Reification of Invariance: Two Approaches

We can use a simple approach to use-site invariance where the property of having some amount of invariance is a property of the associated entity (such as a variable, a formal parameter, or a return value), but it can not be used in an actual type argument, and it can not be used in a superinterface.

For example, we can then have a variable whose type is List<exactly num>, and its value could be <num>[], but not <int>[]. But we could not have a variable with type List<List<exactly num>> and we could not create an instance like <List<exactly num>>[[]] or call a generic function like [1].map<List<exactly num>>((_) => <num>[]).

The point is that we could then avoid having any representation of use-site invariance in the types of instances at run-time, and in reified types. If a dynamic check is required then it is always concerned with the values of actual type arguments of the run-time type of an instance (that we are just about to assign to a variable, or pass to a formal parameter, or return). For example:

List<num> xs = ...; // Complex expression, could be a `List<int>`.
List<exactly num> ys = xs; // Check that the actual type argument is exactly `num`.

The reason why we have to constrain the usages of use-site invariance if we do not have reification is soundness:

// Assume that use-site invariance is _not_ reified in types, but
// allowed in type arguments. NB: THIS IS NOT SOUND!

main() {
  List<List<num>> xs = <List<num>>[];
  List<exactly List<exactly num>> ys = xs; // Downcast, implies dynamic check.
  xs.add(<int>[]); // Check dynamically that the argument `is List<num>`: OK.
  ys[0].add(3.14); // Statically safe, fails at runtime (shows unsoundness).
}

At the downcast, the value of xs has run-time type List<exactly List<num>>. We can check that the actual type argument is List<num>, and the representation for this is the same as that for List<exactly num> (remember, we don't reify the 'exactly' bit), so we have to make the downcast succeed.

We might try to use the static type of xs to be more strict, but we'd ultimately need to be able to handle an assignment from an expression of type dynamic, so we cannot generally enforce a typing discipline unless we can enforce it based on information which is reified at run time.

But this means that we can add <int>[] to this list via xs, and that will destroy the heap invariant (that is, we now have a variable with a type T whose value is not of type T). That situation destroys soundness, so we can get a failure when we pass a double in the subsequent invocation of add, even though that's statically safe.

Alternatively, we can use a more powerful approach: We could reify use-site invariance and allow types like List<List<exactly num>>, List<exactly List<exactly num>>, etc.

I have described the more powerful approach below, and the less powerful model where reification is avoided falls out as a special case.

Draft Proposal of Specification

The following is a preliminary version of the specification of this feature.

Grammar

Syntactically, this feature consists of one extra modifier which can be added to the specification of a type argument in a parameterized type. Here is the needed grammar update:

<typeArguments> ::= '<' typeArgumentList '>'

<typeList> ::= <typeArgument> (',' <typeArgument>)*

<typeArgument> ::= 'exactly'? <type>

It is a compile-time error if a <type> T is a parameterized type where one or more of the actual type arguments has the modifier exactly, unless it is used as a type annotation of a variable or a formal parameter, as a function return type, as an actual type argument, or as a bound of a type variable.

In the simple approach: Omit 'as an actual type argument, or as a bound of a type variable'.

Static Analysis

In the superinterface subtype rule, the exactly modifier is preserved. For example, List<exactly num> is a subtype of Iterable<exactly num>.

We add a new subtype rule that is capable of eliminating 'exactly': Consider a parameterized type S1 of the form C<W1, .., Ws>, where C is a generic class and Wj is of the form exactly? Tj, and another parameterized type S2 of the form C<Z1, .., Zs>, where Zj is of the form exactly? Tj. Then S1 <: S2 whenever the following holds for all j: If exactly is absent in Wj then exactly is absent in Zj.

For example:

List<exactly num> <: List<num>
List<int> <: List<num>
// But `List<int>` and `List<exactly num>` are unrelated.

List<exactly List<exactly num>> <: List<List<exactly num>> <: List<List<num>>
List<exactly List<num>> <: List<List<num>>
// But `List<exactly List<exactly num>>` is not a subtype of 
// `List<exactly List<num>>`, and `List<List<exactly num>>` and
// `List<exactly List<num>>` are also unrelated.

With the added subtype rule, some new situations arise where soundness is not guaranteed statically, but the general rules still apply: If assignability turns out to be justified by a supertype rather than a subtype, a dynamic check is generated.

Similarly, the new types enable the omission of many dynamic type checks, e.g., when adding an element to a list and in general when passing an argument to a covariant parameter. But this is again not a change that requires any specification changes, it follows as a consequence of the improved type information.

Discussion

One decision which is implied by this proposal is that we will not support use-site contravariance. I believe that this is not a deep decision, it would be easy to allow marking actual type arguments as contravariant as well as exactly, and the associated static and dynamic checks are well-understood.

However, contravariance is already present in Dart, in the shape of function types. It is often possible to use a function type (even though it is not quite as flexible as a class type), or to use invariance (declaration-site or use-site), which is arguably the most important special case of contravariance. So the decision to omit contravariance support is mainly based on the desire to keep the amount of complexity down. Nevertheless, we can presumably just add it if we want it.

Another decision to make is whether to reify invariance in types at run time. The reification has a certain space cost itself (one bit per type argument), and subtype checks would have to spend the time to check those bits.

On top of this, the use-site invariance mechanism is considerably simpler when we include the constraints that are required in order to maintain soundness in a setting where we do not reify invariance in types.

One possible approach could be to add the simple and constrained variant of use-site invariance to Dart first, and then possibly generalize this to include the full model. This would be a non-breaking extension because the constraints that couldn't be enforced in the simple approach could also not be expressed, so only new code will be able to both express and enforce the more complex typing requirements (such as nested occurrences of exactly).

Finally, the proposal makes it an error to use exactly in the specification of a type argument of a superinterface. This is because the type specified in a superinterface is already similar to the type of an instance, in the sense that its type arguments are the most precise ones that are applicable. Here is an example:

class A<X> {}
class B<X> extends A<exactly X> {} // Error.

The point is that whenever we encounter an instance o whose run-time type is B<T> for some type T, it will also make o is A<T> evaluate to true, and there is no proper subtype T1 <: T such that o is A<T1> is also true. This means that exactly on the type arguments of a superinterface is redundant. So we leave it out everywhere, rather than requiring it everywhere.

However, there is nothing special about using use-site invariance in nested positions in a superinterface (if we use the more powerful approach where exactly is allowed in a type argument at all). For example:

class MyListList<X> implements List<List<exactly X>> {
  ...
}

main() {
  List<exactly List<exactly num>> xs = MyListList<num>();
  xs.add(<num>[]); // Statically safe.
  xs[0].add(3.14); // Statically safe.
}
@eernstg eernstg added the feature Proposed language feature that solves one or more problems label Feb 20, 2019
@matanlurey
Copy link
Contributor

I hope the keyword is in not invariant, but otherwise 👏

@eernstg
Copy link
Member Author

eernstg commented Mar 4, 2019

@matanlurey wrote:

I hope the keyword is in not invariant

For brevity, we could consider = or maybe +-.

Variance has frequently been associated with read-write access (for invariance), read-only access (covariance) and write-only access (contravariance), and the latter two are associated with out and in in C# and Kotlin. Scala and others have used + for covariance and - for contravariance.

So we could use inout in Dart for invariance, in line with C#/Kotlin; or +-, in line with Scala etc; = seems to fit the semantics even though there's no immediate precedent. Those choices are more concise, which I guess would be the main issue with invariant. But you could say that inout, +-, and = are not really connected to any other parts of Dart, except maybe that = for invariance and ! for non-null fit together.

I think a plain in for invariance would be confusing because it means 'contravariant' in other languages (and it tends to evoke the notion of write-only access, at least for me, because that's the situation where you can "put things into the receiver" whose type has a type argument with an in marker).

@lrhn
Copy link
Member

lrhn commented Sep 18, 2019

Why only invariance?
Could we (say, using in/out/inout as text) declare contravariance at the use-site:

List<in num> l1 = <Object>[];
List<out num> l2 = <int>[];
List<inout num> l3 = <num>[];

and it would then be invalid to invoke membes where the type variable occurs contra to the declaration in the member signature. (Or, alteranatively, the only type satisfying the variable in the other direction would be Object? for out/Never for in) That is:

l1.add(42); // fine.
l1[0]; // compile-time error
// or: l1[0] has static type Object.
l2.add(42); // compile-time error - any call to `add` is, it's not about the argument type.
// Or l2.add( ..static type Never ..), which effectively makes it non-callable, but tecnically writable.
l2[0]; // fine, static type num
l3.add(42); fine (because 42 is a num)
l3[0]; // fine, static type num
l3 = <int>[];  // compile-time error.

We'll still have the "unsafely covariant" version just written as List<num>.

This obviously corresponds to Java's List<? extends num>, List<? super num> and List<num>, so the type theory is known. The syntax matches Kotlin's use-site generics, so there is precedence.

@eernstg
Copy link
Member Author

eernstg commented Sep 18, 2019

Could we (say, using in/out/inout as text) declare contravariance at the use-site:

Yes, that's exactly what we've been discussing the past few days. I believe this would not be a problem to support (it's well-understood, and presumably not much different from invariance in implementation effort). But it does pile more stuff on the set of features that developers will need to choose among (and there is a certain cost in being able to do the same thing in several different ways, in terms of developing some suitable rules of thumb about when to use what).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
feature Proposed language feature that solves one or more problems variance Issues concerned with explicit variance
Projects
None yet
Development

No branches or pull requests

3 participants