Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Does the Kernel Instantiation node check the type arguments. #31953

Closed
rakudrama opened this issue Jan 20, 2018 · 24 comments
Closed

Does the Kernel Instantiation node check the type arguments. #31953

rakudrama opened this issue Jan 20, 2018 · 24 comments
Assignees
Labels
area-front-end Use area-front-end for front end / CFE / kernel format related issues. area-vm Use area-vm for VM related issues, including code coverage, and the AOT and JIT backends. customer-dart-2 front-end-kernel P2 A bug or feature request we're likely to work on

Comments

@rakudrama
Copy link
Member

rakudrama commented Jan 20, 2018

Can the Kernel Instantiation node throw?

It would throw if type bounds are checked at instantiation, and potentially every generic method would need different instantiation code.

@leafpetersen said:

Yeah, good question, this has been nagging at me. The only situation in which there is an implied dynamic check on type arguments in a static invocation is when the bound of a type parameter itself involves a covariant type parameter from an outer class:

class A<T> {
  void foo<S extends T> (S x) {};
}

class B extends A<int> {
  void foo<S extends int> (S x) {x.isEven; }
}

A<Object> a = new B();
void Function(Object) foo = a.foo;  // Throws?
a.foo<Object>("hello"); // Throws
a.foo<Object>(3); // Still throws
foo("hello"); // Throws?

So the question is which of the two places should throw?

My assumption has been that implementations would put the check in the body of the method, and so the "convenient" place to throw is at the actual call site, not the instantiation site. This is what DDC does.

It's probably slightly nicer to the user to throw at the instantiation site, but this is a pretty corner case example.

Thoughts from the implementations? Things I'm missing? Lasse, Erik, thoughts?

@eernstg
Copy link
Member

eernstg commented Jan 21, 2018

I think we need to specify this (it is not implied by anything that we have already specified).

First, generic function types have no subtyping relationships to each other if they have different bounds (so Function<X extends num>(X x) and Function<X extends int>(X x) are simply unrelated types rather than subtypes of each other, in any order).

This means that an expression like a.foo is similar to what I've called a "contravariant expression", but it's even worse: Covariance in the type argument to the value of a doesn't impose contravariance on the type of a.foo, it causes a.foo to have completely unrelated types for different values of that type argument (that is, "the actual value of T"). You could say that it is a "wildly variant expression".

Hence, a.foo should be typed as a plain Function --- nothing more specific than that can soundly be assumed about the type of this expression. We do know that it takes exactly one type argument, and given that type argument, call it T0, we know that it is a void Function<S extends T0>(S) or a subtype thereof, but we have no way to express that in a type when we don't know the associated bound.

So we should rejoice in the fact that this situation will not arise very frequently (especially if we make sure to tell developers how unpredictable its typing properties are, and constantly nudge them away from using a class type parameter as a generic function bound!). But when it does arise and it is used for things like generic instantiation, it is basically a dynamic operation (same thing is true for a plain invocation of the function).

The next issue is whether to fail during generic instantiation, or at call sites.

If a generic instantiation (based on inference, in particular the type-from-context) is considered to be an application of a curried function (1, done here: provide type arguments, 2, done later: provide value arguments) then it seems natural to me that the provision of actual type arguments should be subject to static verification or (if static analysis cannot prove that the chosen type arguments will conform to the given constraints) dynamic checking, and failure must happen as a compile-time error, resp. a dynamic error at the time where the generic instantiation expression is evaluated. When the bound is not known we are in the dynamic case, so it's a dynamic error, at generic instantiation.

If a generic instantiation is considered to be syntactic sugar for the creation of a wrapper function whose signature is obtained from the signature of the target function by replacing the formal type parameters by the chosen type arguments (so void foo<S extends T>(S x) is wrapped as "void foo_wrapper(S1 x) => foo<S1>(x)", where S1 is the actual type argument used for the instantiation) then you could argue that we're simply creating that wrapper as requested by the developer, and the possible failure of calling foo<S1>(x) in the body of foo_wrapper will arise every time the wrapper is executed. Of course, that wrapper would presumably be created by an extra, synthetic getter on the class of the value of a, taking the type argument list and returning a closure (because we need access to the actual value of T when we create that closure, and that's a bit inconvenient at the call site).

I don't have a strong preference for one or the other conceptual model, but the second approach (the non-curried one, where the actual type arguments are provided together with value arguments rather than as a separate invocation) does readily admit a natural semantics for invocations where the type arguments are omitted: We just treat formal type parameters like optional formal value parameters and provide a default value for them based on instantiate-to-bound.

If we take that route then I think we can be justified in delaying the error to the call site.

@sjindel-google
Copy link
Contributor

Currently the VM is implementing the second approach.

@sjindel-google
Copy link
Contributor

Why is an Instantiation node is being placed here instead of an implicit downcast?

I thought the Instantiation node would be used for cases of type inference where we are sure
that we are passing in correct type arguments.

The kernel currently being generated from this examples is:

  class A<T extends core::Object> extends core::Object {
    default constructor •() → void
      : super core::Object::•()
      ;
    method foo<generic-covariant-impl generic-covariant-interface S extends test::A::T>(test::A::foo::S x) → void {}
  }
  class B extends test::A<core::int> {
    default constructor •() → void
      : super test::A::•()
      ;
    method foo<generic-covariant-impl S extends core::int>(test::B::foo::S x) → void
      return x.{core::int::isEven};
  }
  static method main() → dynamic {
    test::A<core::Object> a = new test::B::•();
    (core::Object) → void foo = a.{test::A::foo}<core::Object>;
    a.{test::A::foo}<core::Object>("hello");
    a.{test::A::foo}<core::Object>(3);
    foo.call("hello");
  }

The front-end is inserting an instantiation node with Object as the type argument without any reason
as I can tell.

@sjindel-google sjindel-google added customer-dart-2 front-end-fasta area-front-end Use area-front-end for front end / CFE / kernel format related issues. and removed front-end-fasta labels Jan 22, 2018
@sjindel-google sjindel-google added this to the 2.0 milestone Jan 22, 2018
@stereotype441
Copy link
Member

@sjindel-google The reason an instantiation node is being placed here rather than an implicit downcast ((core::Object) → void foo = a.{test::A::foo}<core::Object>) is because generic function types and non-generic function types are not related, and we don't insert implicit casts between unrelated types. Rather, when a generic function is torn off in a context where a non-generic function is expected, we insert an instantiation node to convert between the two unrelated types, and we infer the type parameter for the instantiation based on the context. Since the context in this case is (core::Object) -> void), and the tear-off's static type is <S>(S) -> void, instantiating with to <Object> is what's required to make the types match.

@kmillikin kmillikin added P2 A bug or feature request we're likely to work on and removed area-front-end Use area-front-end for front end / CFE / kernel format related issues. labels Jan 24, 2018
@kmillikin kmillikin assigned kmillikin and unassigned stereotype441 Jan 24, 2018
@sjindel-google
Copy link
Contributor

We need a verdict from the language team before we can proceed.

@lrhn
Copy link
Member

lrhn commented Feb 12, 2018

My personal opinion is that

void Function(Object) foo = a.foo;

throws.

We are in a situation where a type argument to a class occurs contravariantly (the bound of type parameter is similar to the type of a parameter, it's supplied when the function is called or type-specialized), so we do need a runtime type check. I prefer to check early.

You are doing a tear-off of a method with static type void Function<S extends Object>(S) to a type of void Function(Object). The coercion is based on static types, so the method will be torn off and instantiated with Object as type argument. That's an invalid type argument for the actual function declaration, so the instantiation fails.

While we could delay until you call the method, it will just make the error harder to debug - you get a warning that a type argument is wrong at a position where there are not type arguments in the code.

This is also the most restrictive view (throws early), so we are not allowing something that can come back and bite us later.

So:

void Function(Object) foo = a.foo;  // Throws!
a.foo<Object>("hello"); // Throws!
a.foo<Object>(3); // Still throws!
foo("hello"); // Not valid if the tear-off threw.

@leafpetersen WDYT?

@eernstg
Copy link
Member

eernstg commented Feb 12, 2018

I argued that the dynamic error can occur at each invocation if we use the model where type arguments and value arguments are passed in one step. This corresponds to an approach where the tear-off operation produces a wrapper which will perform the complete invocation, passing both type arguments and value arguments:

  // Inference reveals that a generic function is assigned to a 
  // variable whose type is a non-generic function type, hence
  // we perform generic instantiation, guided statically.
  Function(Object) f = a.foo;
  // The tear-off works similarly to `(Object o) => a.foo<Object>(o)`,
  // and each invocation will incur a dynamic error when `a` has type
  // `A<T>` for some `T` which is not a top type.

In contrast, we may also prefer to use a curried model (where it takes one step to pass type arguments and another step to pass value arguments):

  // Inference reveals that a generic function is assigned to a variable
  // whose type is a non-generic function type, hence we perform
  // generic instantiation, guided statically.
  Function(Object) f = a.foo;
  // This amounts to `.. = a.foo<Object>`, which is an invocation of the
  // instance method `foo` of the given `a` where the type argument
  // `Object` is passed --- so it fails at the tear-off, never reaching any
  // invocations.

With the curried model I find it difficult to justify a delayed error, it should happen at the location where we can meaningfully claim that the generic instantiation takes place, that is, "early".

With the uncurried model, I find it difficult to justify an early error, because we otherwise don't raise dynamic errors in the situation where a function literal is created, even if it happens to be statically knowable that the given actual types involved will create a dynamic error if/when that body is executed.

class C<T> {
  // We are _not_ going to raise a dynamic error when the initial
  // value of `f` is computed, even though it is known for some `T`
  // that every execution of the body of the function literal will incur
  // a dynamic error.
  var f = () => <T>[] as List<int>;
}

Similarly, we should never raise a dynamic error at the time where the tear-off wrapper (Object o) => a.foo<Object>(o) is created, even though we could check the actual type argument of a and detect that it is some non-top type.

That said, nobody seemed to have a particularly strong opinion about whether we should use the curried or the non-curried model. ;)

@sjindel-google
Copy link
Contributor

Can we specify that the location of the error may be chosen by the implementation?

@eernstg
Copy link
Member

eernstg commented Feb 12, 2018

Maybe it would be useful to know a bit more about the implications for the implementation. Is one choice hugely more desirable than the other?

(Of course, desirability has been considered from a language design perspective all the time, but if that doesn't provide a very obvious answer then maybe implementation considerations would step forward..)

@sjindel-google
Copy link
Contributor

Today the VM implements the "throw-at-call" behavior. This is more efficient that the "throw-at-instantiation" behavior given the current implementation of bounds checking. However, if we optimize bounds checks in the future, the "throw-at-instantiation" behavior may allow us to optimize more calls to generic functions which are also the target of an Instantiation.

Therefore, we would prefer if implementations are given the flexibility to throw the error at either location.

@eernstg
Copy link
Member

eernstg commented Feb 12, 2018

I believe that the usage of a type that covariantly depends on a type parameter from an enclosing entity (class or generic method) as the bound of a type parameter of a generic function will always be a weird corner case. Developers just shouldn't write code doing that (unless they are exclusively going to use that generic function in a scope where said type variable is still in scope), so we shouldn't be hugely friendly to them when they do that. ;-)

This basically means that the bounds on type arguments to generic functions should be compile-time constant types, rather than existential types for which it is only known that they have a certain upper bound (like S extends T in the declaration of foo in the initial text of this issue).

With existential open, the situation may change:

  A<T> a = ...;
  if (a is A<?X>) {
    // Here, `X` denotes the exact type argument of the
    // dynamic type of `a` at `A`.
    void Function<S extends X>(S) f = a.foo; // Safe.
    void Function(X) g = a.foo; // Instantiation with type argument `X`: Safe.
    ...
  }

But this would still work fine with both the curried and the uncurried model, it just happens to be safe so we don't have to worry so much about where to get the error: there ain't any.

@eernstg
Copy link
Member

eernstg commented Feb 13, 2018

Actually, the consistent approach to the generic instantiation of a function whose type argument bounds are existential would be to incur a compile-time error:

class A<T> { void foo<S extends T> (S x) {}; }
class B extends A<int> { void foo<S extends int> (S x) { x.isEven; }}
class C extends B { void foo<S extends int>(Object o) {}}
A<Object> a = new B();

// Assume `S`. Compile-time error for all `S` except `Null`: not a subtype of `a.T`.
// If `S` is `Null` then the initialization is a dynamic error (the downcast from
// `void Function(Null)` to `void Function(Object)` will fail at run time). The dynamic
// error will occur here because it's a plain (non-generic) function type subtype check
// that fails. Class `C` shows that it cannot be a static error: It might work. It's a
// non-issue whether a dynamic error associated with passing the actual type
// argument to the generic function will happen here or at each invocation, because
// we are now following the general rule which requires such an actual type
// argument to be known statically to satisfy the bound.
void Function(Object) fooNoInference = a.foo<S>;

// Error: infer `Null` (or raise error during inference), but then it is a dynamic
// error (here!) to initialize `foo`, as above.
void Function(Object) foo = a.foo;

// Compile-time errors: `Object` is not a subtype of the bound.
a.foo<Object>("hello");
a.foo<Object>(3);

// `foo` is null or has type `void Function(Object)`: Safe statically and
// dynamically to call it like this. But we won't reach this point, of course,
// because the initialization of `foo` will be a dynamic error (assuming that we
// get rid of the compile-time errors, too). If `a` had been an instance of `C`
// then we would reach this point and the invocation would run without problems.
foo("hello");

The reason why it should be a compile-time error to pass a type argument which is not statically known to be appropriate is that this is what we do otherwise:

class A<X extends num> {}
class B<Y> {
  A foo() => new A<Y>(); // Error: `Y does not extend num`.
}

It is not impossible for Y to satisfy the bound, it's just possible that it doesn't, and that's an error. So why would it suddenly not be an error when we are passing a type argument to a generic function?

@leafpetersen
Copy link
Member

@rakudrama @sigmundch Can you weigh in on implications for dart2js?

On usability grounds, I have a small preference for specifying that the check happens at the instantiation point (early). But it's a very small preference, and if there are negative performance implications I'm happy to defer to those.

It seems to me that as currently specified, you always know statically whether or not a covariance check on the bound is required, since we only do these for tearoffs. In this case, I would expect performance/code size implications to be small since you only need to generate code to do the check for the rare cases that this actually happens (instead of for any possible universal function).

On the other hand, if we generalize this feature by adding explicit syntax for doing this (f<int>), then this would not be limited to tearoffs, and in that case every generic function would have to be prepared for this. For methods, again, you know statically whether this is required or not, but for functions you won't. So doing the check eagerly here feels like it could have a noticeable code size and potentially performance cost.

On the gripping hand, if we specify the check as happening early now, we can change our minds later in an essentially non-breaking way.

If the implementation folks basically agree with my analysis above, then I would say that lets go with an early (instantiation time) error for now, which allows us to move it later if we need to.

If there is a belief that this will incur a non pay-as-you-go code size/performance cost, then I don't think it's worth it, and we should go with the late error.

Whichever we decide, changing an existing implementation to conform seems fairly low priority to me relative to other pressing issues at the moment.

@rakudrama
Copy link
Member Author

I think check-at-instantiation is the most usable version (avoiding re-checks on repeated calls, e.g. a sort predicate or fold-reducer).

It does mean larger code, since we have to be able to check some functions and not others. The pay-as-you-go cost is a special method on generic functions that require a run-time bounds check. I think we can make this pay-as-you-go in size at the runtime expense of a per-instantiation check for the presence of the bounds-check method.

Since dart2js will likely be used to ship code that has this kind of check disabled, lets go for the most usable version for the development.

@mit-mit
Copy link
Member

mit-mit commented Jul 9, 2018

Moving to area-language as it seems like we have not decided on the correct behaviour yet.

cc @leafpetersen @lrhn is this a blocker for stable?

@dgrove dgrove added this to the Dart2.1 milestone Jul 9, 2018
@eernstg
Copy link
Member

eernstg commented Jul 11, 2018

The feature specification generic-function-instantiation.md has been landed, and it includes the requirement that the above-mentioned dynamic error must occur at instantiation (that is, during tear-off). Hence removing the 'blocked' label.

@eernstg eernstg removed the status-blocked Blocked from making progress by another (referenced) issue label Jul 11, 2018
@eernstg eernstg removed their assignment Aug 27, 2018
@eernstg
Copy link
Member

eernstg commented Aug 27, 2018

Removing myself as an assignee, because the feature specification has been landed.

@sjindel-google
Copy link
Contributor

We have the wrong behavior for this in the VM, and need to fix it.

@sjindel-google sjindel-google self-assigned this Aug 27, 2018
@sjindel-google sjindel-google added area-vm Use area-vm for VM related issues, including code coverage, and the AOT and JIT backends. area-kernel and removed area-kernel labels Aug 27, 2018
@sjindel-google sjindel-google removed their assignment Aug 27, 2018
@kmillikin kmillikin added area-front-end Use area-front-end for front end / CFE / kernel format related issues. front-end-kernel and removed area-front-end Use area-front-end for front end / CFE / kernel format related issues. area-kernel labels Sep 19, 2018
@kmillikin
Copy link

We don't plan to change anything about this for the Dart 2.1 release.

@a-siva
Copy link
Contributor

a-siva commented Nov 4, 2022

Closing this issue as the current behaviour matches the specification.

@a-siva a-siva closed this as completed Nov 4, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area-front-end Use area-front-end for front end / CFE / kernel format related issues. area-vm Use area-vm for VM related issues, including code coverage, and the AOT and JIT backends. customer-dart-2 front-end-kernel P2 A bug or feature request we're likely to work on
Projects
None yet
Development

No branches or pull requests