Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix IDL subtyping #832

Closed
wants to merge 8 commits into from
Closed

Fix IDL subtyping #832

wants to merge 8 commits into from

Conversation

rossberg
Copy link
Contributor

@rossberg rossberg commented Nov 6, 2019

An attempt to address #364.

AFAICT, our previous discussions have established that we cannot have a subtyping semantics that simultaneously

  1. allows upgrading a record with additional optional fields,
  2. allows the same such record as inbound and outbound (and therefore round-tripping),
  3. is transitive,
  4. is coherent.

So some compromise seems necessary, even if slightly ugly. Based on our last discussion in Zurich, this PR proposes what I think is the minimal relaxation: it essentially keeps all the properties, except that (3) and (4) only hold separately.

That is, subtyping is coherent (there is only a single, unambiguous definition to convert the value given two types), it also is transitive declaratively, but composing the resulting conversions is not necessarily coherent with a direct conversion.

This works because the rules are formulated as inherently transitive, i.e., without a separate transitivity rule. With that, transitive composition is only relevant when dealing with multiple consecutive upgrades, and it is acceptable if the behaviour is not entirely coherent in that case, since users have sufficient control.

The only way to observe incoherence is if the user removes an optional field in one upgrade and reintroduces it with the same type in a later one, while some client is still multiple versions behind. It suffices to avoid the removal of optional record fields to avoid this corner case.

See the doc itself for some motivating examples.

I won't claim that this is super-pretty (the semi-compositional nature of the opt-field subtyping rules is not very puristic), but at least it seems fairly simple and practical. And AFAICT, we have not found a solution that would be closer to the hypothetical ideal.

Please try to shoot holes into it, I may well have missed something!

Edit: This PR also adds the type any as a top type and changes the meaning of reserved to be opt any, and dually adds deprecated as opt empty.

@nomeata
Copy link
Collaborator

nomeata commented Nov 6, 2019

This would be much easier to review if we had removed the broken stuff first and had a diff against a known-good state ;-). Guess I’ll look at https://github.com/dfinity-lab/motoko/compare/joachim/idl-remove-unsoundness..idl-sub2?expand=1

First thought:

It suffices to avoid the removal of optional record fields to avoid this corner case.

Can we then just forbid this?

I don't think it’s transitive: I have

record { a : opt Bool } <: record {} -- width subtyping
record {} <: record { a : opt Int } -- magic dust

but not

record { a : opt Bool } <: record { a : opt Int }

(Due to your “let’s fix transitivity rule” premise not (<datatype> <: opt <datatype'>))

There is some prose about that, so maybe the rule wasn’t quite what you intended, or there was to be a rule for that case too?

The only way to observe incoherence is if the user removes an optional field in one upgrade and reintroduces it with the same type in a later one, while some client is still multiple versions behind. It suffices to avoid the removal of optional record fields to avoid this corner case.

I think this is not true, we need transitivity not just when one canister upgrades two times, but also when two canisters upgrade once, I think. Consider three canisters A, B, C.

  • A changes iterface from service A { foo : () -> {} } to service A { foo' : () -> {a : opt Int} }
  • B changes iterface from service B { bar : (() -> {}) -> () } to service B { bar' : (() -> {a : opt Int}) -> () }
  • C composes B.bar(A.foo)

The fact that C is not broken requires transitivity of <:, and we would get accidential passing of an int value here, wouldn’t we? (Maybe it ain’t so bad, but “only way” does not seem to be correct.)

@nomeata
Copy link
Collaborator

nomeata commented Nov 6, 2019

If I read the rules correctly then we get

record { a : t1 } <: record { a : opt t2 } 

for all t1, t2, with a coercion that is f if t1 <: opt t2 ~> f, and const null otherwise? So essentially no static typechecking, and just dynamic best-effort casts?

(Please excuse my polemics :-))

@rossberg
Copy link
Contributor Author

rossberg commented Nov 6, 2019

but not

record { a : opt Bool } <: record { a : opt Int }

(Due to your “let’s fix transitivity rule” premise not ( <: opt <datatype'>))

Hm, I don't follow. Why would that premise make the rule inapplicable?

I think this is not true, we need transitivity not just when one canister upgrades two times, but also when two canisters upgrade once, I think.

Hah, interesting. Yes, I think you're right. But I'd also argue that such a higher-order usage doesn't make sense in practice without assuming some coordination between the evolution of the two canisters.

So essentially no static typechecking, and just dynamic best-effort casts?

It certainly has a best-effort characteristic to the opt field case (unavoidably?). But does it mean that there is no static type-checking? You still can't undermine the IDL types.

@rossberg
Copy link
Contributor Author

rossberg commented Nov 7, 2019

So essentially no static typechecking, and just dynamic best-effort casts?

It certainly has a best-effort characteristic to the opt field case (unavoidably?). But does it mean that there is no static type-checking? You still can't undermine the IDL types.

I added clarifications pointing out that the purpose of the auxiliary rule is mostly for ensuring dynamic compatibility transitively and that tooling should warn about static uses of this rule. Also added some explanation of using reserved and deprecated (a.k.a. empty).

@nomeata
Copy link
Collaborator

nomeata commented Nov 7, 2019

But I'd also argue that such a higher-order usage doesn't make sense in practice without assuming some coordination between the evolution of the two canisters.

Ugh. I thought that was precisely the goal here: Safe evolution of interfaces, even in the presence of higher order use!

It certainly has a best-effort characteristic to the opt field case (unavoidably?). But does it mean that there is no static type-checking? You still can't undermine the IDL types.

Sure, it’s still type-safe (like python ;-))

Hm, I don't follow. Why would that premise make the rule inapplicable?

My bad, I misread the premise as “not any opt type”. Guess it works then.

design/IDL.md Outdated Show resolved Hide resolved
Type annotations can be given to disambiguate.

Note: It is expected that reading IDL values (in diagnostics or logs) is by far the dominant use case.
For that reason, the syntax is designed for easy readability instead of convenient writability. To that end, it intentionally matches the type syntax.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sneakily sneaking controverial changes in with unrelated PRs… ;-)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not a change, just a clarification/rationale for what's there already. ;) But I'm happy to do that in a separate PR if you think that's worthwhile.

@nomeata
Copy link
Collaborator

nomeata commented Nov 7, 2019

Ok, the rules seem to make sense to me now.

I’m musing if piggy-bagging this on opt is right thing, though: In all other places, opt is really the opt you would expect… but as a type of a record field, it behaves quite differently. I am not sure where that thought leads me to (a different, second-level type constructor that only exists in record fields? a flag on the record field itself?) … but do you see where this is coming from?

Co-Authored-By: Joachim Breitner <[email protected]>
@rossberg
Copy link
Contributor Author

rossberg commented Nov 7, 2019

Ugh. I thought that was precisely the goal here: Safe evolution of interfaces, even in the presence of higher order use!

Define safe. ;)

What's the scenario? Some provider defines a field of type opt A and some client accidentally uses it at type opt B, where A and B are unrelated. As mentioned in the clarification, that should still be diagnosed with a warning to the client. The only situation in which the client doesn't get any diagnosis is when they never actually recompile their code -- and how could they? In that case, defaulting to null dynamically at least is safe, semantically meaningful, and still better than crashing the app.

So, yeah, it's clearly a compromise, but it doesn't break type safety or static checking in general.

I'm actually more concert about the opposite case, where the two field types match up accidentally.

It suffices to avoid the removal of optional record fields to avoid this corner case.

Can we then just forbid this?

How? It's just ordinary width subtyping in negative position. I don't see how you can forbid that without breaking record subtyping in general.

I’m musing if piggy-bagging this on opt is right thing

Yeah, I actually sympathise. But the special rules already only apply to fields. Having a separate second-class type constructor or flag would not actually improve anything, like, making it any more compositional. AFAICS, it would just duplicate rules and prevent certain (valid) uses. Edit: Plus, it's not clear how you would introduce the distinction from Motoko.

@nomeata
Copy link
Collaborator

nomeata commented Nov 7, 2019

In that case, defaulting to null dynamically at least is safe, semantically meaningful, and still better than crashing the app.

Right! I am not concerned about this case, and default to null is the right behavior. (“right” = “transitively coherent”)

I'm actually more concert about the opposite case, where the two field types match up accidentally.

Me too! E.g. where one party adds a new return field opt bool where the bool indicate whether the moon was full during the request, and the other part a new optional argument opt bool where the bool indicates whether to launch the missiles, and a third party is composing these two so that transitivity applies.

How? It's just ordinary width subtyping in negative position. I don't see how you can forbid that without breaking record subtyping in general.

Depends on what you mean with “breaking record subtyping”. Yes, you’d no longer have { a : opt t } <: {}, i.e. you cannot use a record with optional fields in a context that doens’t know about these fields. Would that be worse than we what we are building here? (genine question, I don't think I see all the implications of that decision).

would not actually improve anything, like, making it any more compositional.

It may not improve much, but it improves a little: If you have an argument of type launch_missiles : variant { #none; #some : t} and you decode some value x:t you know that whoever produced that x meant that to be passed to lauch_missiles. If you have launch_missiles : opt t you don’t get that guarantee. (I am not sure if it is significant, though).

Another interesting musing (this is not to propose any change, it is merely to isolate a core aspect of this proposal here): Consider

type quux t = record { field : opt t }

If we look at this type constructor abstractly, and its subtyping rules, we have

─────────────────────────
quux t1 <: quux t2

and with elaboration

─────────────────────────────────────────────────────────────────
quux t1 <: quux t2 ~> (if (t1 <: t2 ~> f) then map_quux f else quux null

what would we call this thing? dynamic maybe?

@rossberg
Copy link
Contributor Author

rossberg commented Nov 7, 2019

I'm actually more concert about the opposite case, where the two field types match up accidentally.

Me too! E.g. where one party adds a new return field opt bool where the bool indicate whether the moon was full during the request, and the other part a new optional argument opt bool where the bool indicates whether to launch the missiles, and a third party is composing these two so that transitivity applies.

Yes, but that case isn't a type safety problem as much as a semantic problem. It is more or less the usual argument against structural typing in general. Standard example is a an object type Shape with a draw method being confused with a type Gun with the same method.

You can generally only prevent such type confusion by some form of generative naming, but that is pretty much incompatible with strong modularity as in our scenario. shrug

How? It's just ordinary width subtyping in negative position. I don't see how you can forbid that without breaking record subtyping in general.

Depends on what you mean with “breaking record subtyping”. Yes, you’d no longer have { a : opt t } <: {}, i.e. you cannot use a record with optional fields in a context that doens’t know about these fields.

That means you could not add an optional field to a record, even an outbound one, doesn't it? Unless you are trying to differentiate co- from contra-variant uses -- but we already went through multiple attempts of doing that, and haven't found a way that doesn't fundamentally break the higher-order case where polarities switch. I suspect it is impossible, but would love to be proven wrong. ;)

─────────────────────────────────────────────────────────────────
quux t1 <: quux t2 ~> (if (t1 <: t2 ~> f) then map_quux f else quux null

what would we call this thing? dynamic maybe?

Not sure I understand. The condition is a static one, i.e., part of the elaboration, not a dynamic test in the resulting term.

design/IDL.md Outdated Show resolved Hide resolved
design/IDL.md Outdated
See the discussion of record subtyping for more details.
```
<constype> ::= ....
| reserved := opt any
Copy link
Contributor

@crusso crusso Nov 7, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the defs of rreserved and deprecated should be swapped. opt any carries no useful information and never will (deprecated), opt empty can be replaced by opt T, any T, so it's a reserved field inhabited initially only by null but by other non-null values on further refinement. Or have I got myself confused again?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After some pondering, I think you are correct, although it is a bit more nuanced. Once more, everything depends on direction.

In particular, for outbound records, refinement goes to subtypes. For normal fields you can go any :> T :> empty over time. Consequently, the natural progression for options also is opt any :> opt T :> opt empty.

But of course, it's the inverse for inbound records. So which should we pick? Ultimately, the subtyping rules for optional fields are such that both ways happen to "work".

But I believe that reserving a field name is primarily relevant for inbound records, because it ought to prevent clients from passing a record with additional fields that may be incompatible with intended future extensions. Such considerations are not necessary for records that are purely outbound, so for outbound-only records there is no point in reserving a field.

So, I agree that it should be inverted. Done.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was definitely thinking from the inbound perspective: reserving an new argument or deprecating an old argument in an existing operation.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the difference between

reserved    :=  opt any
deprecated  :=  opt empty

is rather small with our subtyping rules.

  • Evolving a field of that type forward behaves almost the same:

    For all t ≠ any, we have

    { foo : reserved }  <:  { foo : opt t } ~> fun _ = { foo = null }
    

    and for all t we have

    { foo : deprecated }  <:  { foo : opt t } ~> fun _ = { foo = null }
    

    So the forward-evolution works fine.

  • Deprecating an existing field t
    For all t ≠ opt t', we have

    { foo : t }  <:  { foo : reserve } ~> fun _ = { foo = null }
    

    (If t was an opt field, we still learn whether it was null or some).

    and for all t we have

    { foo : t  }  <:  { foo : deprecated } ~> fun _ = { foo = null }
    

So I am not sure there is a meaningful distinction between reserved and deprecated, and maybe we need to only use opt empty?

(opt any is a bit odd anyways, as it looks like bool, and we do not want to carry information in reserved or deprecated fields, do we?)

design/IDL.md Outdated Show resolved Hide resolved
Co-Authored-By: Claudio Russo <[email protected]>
@nomeata
Copy link
Collaborator

nomeata commented Nov 7, 2019

That means you could not add an optional field to a record, even an outbound one, doesn't it?

Yes. But one may argue that there is no use case for adding an “optional” field in an outbound record: If you are adding the field, why make it optional? You are producing the record, so you can add a guaranteed field.

The condition is a static one, i.e., part of the elaboration, not a dynamic test in the resulting term.

But we pick the elaboration function based on the type as described in the received message. I would consider this somewhat dynamic.

But I can rewite, I was just lazy:

──────────────────
quux t1 <: quux t2

and

t1 <: t2 ~> f                       not (t1 <: t2)
────────────────────────────────    ─────────────────────────────
quux t1 <: quux t2 ~> map_quux f    quux t1 <: quux t2 ~> λ_.null

@rossberg
Copy link
Contributor Author

rossberg commented Nov 7, 2019

But one may argue that there is no use case for adding an “optional” field in an outbound record: If you are adding the field, why make it optional?

Because an optional field is the only thing you can add when the same record type is also used inbound.

But we pick the elaboration function based on the type as described in the received message. I would consider this somewhat dynamic.

Yes, in that sense you are right. Not sure what conclusion to draw from that, though.

Still pondering over @crusso's comment regarding the reversed vs. deprecated...

@nomeata
Copy link
Collaborator

nomeata commented Nov 7, 2019

Because an optional field is the only thing you can add when the same record type is also used inbound.

Ah, right. Sorry. I keep forgetting the requirement to use the same record type in both positions.

@nomeata
Copy link
Collaborator

nomeata commented Nov 7, 2019

But we pick the elaboration function based on the type as described in the received message. I would consider this somewhat dynamic.

Yes, in that sense you are right. Not sure what conclusion to draw from that, though.

Me neither. But it has some unpleasant implementation implications. Previously, I can decode a message in on pass through the value, while (lazisly) looking at the types to understand the message. If the types don't line up, i trap.

Now, when decoding an opt t, I have to either

  • decide the subtyping relation, which, in the presence of recursion, is a non-trivial algorithm.
  • opportunistically start decoding as t, and when that fails because of a subtyping violation, backtrack to the last such thing, discard partially decoded values, and set to null.

Both certainly doable.

@rossberg
Copy link
Contributor Author

rossberg commented Nov 8, 2019

Ah, right, that sucks. Damn, I think you brought that up before. Hm... Is there any possible simplification to this distinction? (The only thing I can think of is restricting subtyping on opt to be invariant, such that you'd only need to check for equality. But I don't think that simplifies anything.)

design/IDL.md Outdated Show resolved Hide resolved
@matthewhammer
Copy link
Contributor

matthewhammer commented Nov 22, 2019

Okay, I have to be honest: I still can't make sense of this lattice-based, subtyping view on evolution. I really want to understand these ideas, but I get stuck each time. Let me try to expand.

For example:

a service of type T is upgradable to a version with another type T' if and only if T' is structural subtype of T, written T' <: T

...snip....

When evolving record types over time (see below), it sometimes is useful to reserve fields for future use, or deprecate old fields so that they are no longer used.

...snip....

reserved := opt empty
deprecated := opt any

Okay, so far so good.

But these definitions do not seem consistent with the larger evolution story, or I'm (still) very confused.

Let me explain my thinking with the example that I always use mentally: the lifetime of a single field of some record mentioned in an interface, in both argument and result positions.

Suppose this field type evolves from reserved (opt empty), to later be used at a precise type (opt Foo), and then sometime later still, replaced with a different field and marked as deprecated (opt any). More complex evolution are possible I think, but this 3-stage one suffices for now.

So I want to think of some binary relation of IDL types ("evolution") as relating the three stages with two steps:

  • reserved (opt empty) evolves-to opt Foo
  • opt Foo evolves-to deprecated (opt any)

Except that this is backwards from the standpoint of subtyping, is it not?

(With lattices, I can never tell if it's just my own feeling turned around.)

We have that any is the topmost (most general) type, and that empty is the (least general) bottommost type, and that evolution is defined as going "down" the lattice, to more specialized subtypes.

I currently do not see how the definitions here permit the single-field story above. What am I missing?

Aside: In my imagined use case, there is no "polarity" to this structure field, meaning that once it does exist, it is both consumed and produced by clients and by servers, like a piece of state or something, such as a counter, etc.

@nomeata
Copy link
Collaborator

nomeata commented Nov 22, 2019

I agree that deprecated := opt any is not really useful under the latest version of the proposal (the “in an opt field, use the present value if you can”), see #832 (comment). But that is just a side-point of Andreas’ overall plan here.

@matthewhammer
Copy link
Contributor

This rule allows the consecutive removal and later re-addition of a field with different types over multiple upgrades

I agree that this use case makes sense, as a matter of record evolution.

But again, how does this evolution pattern have anything to do with moving (monotonically) through a typing lattice? (Does it for anyone? Honest question. I can't see it.)

@matthewhammer
Copy link
Contributor

is there any difference between e1 and e4 or between e2 and e3? If yes, what is it? If not, how is the relation not isomorphic to subtyping?

Good question. I'm trying to read the rules in this PR to answer that.

I will look for a place where this holds:

S $> T ~> (f, g)

But where one or the other of these two is not derivable with your rules:

T <: S ~> f
S <: T ~> g

I'm not convinced this case exists; we'll see.

My sense is that it could, since the first (four-place) judgement is saying that S is permitted to evolve to T, and that there are ways of inter-converting between the two "times" of the type pair in that orientation (S as older, and T as newer). FWIW, I always imagine that the type in question is a record and that's both inbound and outbound, in your terms.

The second (three-place) relation is fundamentally different, since it poses the conversion question in a unidirectional way that flips the roles of S and T in time with each direction. That (still) seems unnecessarily confusing, and (perhaps?) wrong to me.

I'll look for a concrete case that distinguishes the rules; I'll have to write the four-place relation, in the process.

@nomeata
Copy link
Collaborator

nomeata commented Nov 23, 2019

But again, how does this evolution pattern have anything to do with moving (monotonically) through a typing lattice?

It doesn't! At least not monotonically. The rules as proposed by Andreas allow an record in argument position to evolve as follows

rec {} <: rec { foo : opt nat } <: rec { foo : opt empty } <: rec { } <: rec { foo : opt text } <: rec { foo : opt nat }

In other words: „For fields of type opt t or absent fields, all bets are off statically”. Dynamically, if a value is there and can be used, it will, otherwise its none.

(I am not sure if you have already appreciated that but don’t like it, or if it wasn’t phrase clearly enough, so excuse me if I am telling you nothing new here.)

@nomeata
Copy link
Collaborator

nomeata commented Nov 25, 2019

Andreas is out for the week, so I guess we won't resolve this this wekk. I think one open question is whether we require decoders to implement a full subtyping check, or if it is ok to do a weaker one like the one proposed in #870. Or do something else completely.

@matthewhammer
Copy link
Contributor

matthewhammer commented Dec 3, 2019

(I am not sure if you have already appreciated that but don’t like it, or if it wasn’t phrase clearly enough, so excuse me if I am telling you nothing new here.)

It's a chaotic mixture of the two, and I definitely appreciate people restating things that I'm trying to understand, even if I think I understand them already. Thanks!

So, when I commented last above I had poured over the (subtyping) rules as written here, trying to view them as an ordering that enforces some "one-way direction", unlike the pattern that we've been discussing, where opt fields do not seem to indicate any kind of movement along an ordering.

The cases of an actual ordering (in "evolution time") that seem to pop out to me:

  • Naturals may evolve to integers (adding negative possibilities), but not vice versa
  • Variants (sum types) can extend one way, but never the other.

So, what if (just as with records), we want a common sum type that appears in both inbound and outbound positions of the service's type?

IIUC, this seems like a fatal issue with this subtyping-based conception.

At first glance, it appears as though inbound "flags" are fundamentally different from outbound "return codes", except when you have real systems. For instance, @crusso suggested in one example that having variants means that we can log the calls of a service, and in so doing, place certain argument flags into returned result records. What happens when those types evolve under this subtyping model? As I've said many times, service type evolution is not related to subtyping, and it should support something that seems forbidden here. In particular seems like we've (maybe?) overcome the evolution issue for inout-bound records, but not inout-bound "flags" / "return codes" or other uses of variants in service APIs.

To me, it all seems like the same story again, but in a dual land of sums instead of products. (Again, am I missing something? Feel welcomed to restate what may feel obvious, but don't feel pressured to do so.)

@nomeata
Copy link
Collaborator

nomeata commented Dec 4, 2019

So, what if (just as with records), we want a common sum type that appears in both inbound and outbound positions of the service's type?
To me, it all seems like the same story again, but in a dual land of sums instead of products.

That’s a very good point, and shows that the “use same time in both positions and still evolve” is really tricky to attain.

As I've said many times, service type evolution is not related to subtyping, and it should support something that seems forbidden here.

Can we just put quotes around subtyping here? Essentially, we are trying to define a relation on the types of services (and thus, by extensions, other types) that guarantees safe upgrades. Whether we call this subtyping or evolution doesn’t bring us closer to a solution. We can call it “the damn relation” during the discussion, if that helps. In the end, we need to specify that relations (including semantics on values) formally. I don’t care what we call it, and wether it will look like conventional subtyping or not at that point.

@matthewhammer
Copy link
Contributor

matthewhammer commented Dec 4, 2019

Can we just put quotes around subtyping here?

If that's all that we can do, that's still an improvement.

But I'd prefer to permanently choose another word, like "safe evolution" or "valid evolution" or "evolution", etc. The word itself isn't that important to me, so long as it's not conflatable with subtyping, a problem that seems to persist (for me). The quotes don't go far enough for me. :)

BTW: I genuinely admire your ability to alpha-convert away meaningless naming distinctions. I cannot do that very well, and I constantly feel drawn into this (often subconscious) confusion where I start conflating the idea of subtyping as we know it with evolution, which we both agree is distinct. @rossberg still may disagree with us somewhat; that uncertainly makes me uncertain still, since I respect his view and often suspect that I'm overlooking something important.

Evolution ⊥ Subtyping

FWIW, Here's how I relate these distinct ideas, in one place (no need to reread my diatribe from October).

Both notions are about relaxing an otherwise rigid substitution principle, which assumes that nothing changes, and all types match perfectly for each small-step evaluation step. That's the simple ideal world of "intro PL theory" (e.g., STLC).

We don't want just the STLC, since that's too restrictive in the real world, where we need more freedoms. The nature of those extra substitution freedoms seem fundamentally different, because of how they relate to time, and what kinds of interactions they concern.

  • Subtyping is about permitting a more flexible way to compose a single value with a single evaluation context (e.g., arguments and results of a canister call, or anything else). Because of decentralization, there's the possibility of some "version time" passing as the small steps occur, which is novel, but that dimension of flexibility is different, and in the simplest case, we can assume it doesn't actually happen (or always fails, etc.). The point is that time is practical here, but is not essential.

  • But time is essential to evolution. In the simplest case, this only concerns permitting a more flexible way to evolve an actors for all future calls, but not any "current one" or any past ones.

The orderings are different because of these differences:

  • Evolution is about relating service behavior as it changes meaningfully over time, guided by human design and choices about how to proceed,
  • Subtyping is about relating values in a timeless lattice of all values, organized in a timeless ordering of "meanings". This lattice is totally independent of any human choices; it is Platonic.

These orderings do not relate directly, and using a new word would help reinforce that fact.

@matthewhammer
Copy link
Contributor

That’s a very good point, and shows that the “use same time in both positions and still evolve” is really tricky to attain.

I wanted to mention that in this realization, I was immediately reminded of @nomeata's design from April in Zurich, when he was advocating a way to collapse some of the distinctions between sums and products and make their common structure more apparent, and easier to intermix. I feel like that's worth revisiting, if not to use directly, just to learn from again. :)

@rossberg
Copy link
Contributor Author

rossberg commented Dec 6, 2019

@matthewhammer:

The orderings are different because of these differences:

  • Evolution is about relating service behavior as it changes meaningfully over time, guided by human design and choices about how to proceed,
  • Subtyping is about relating values in a timeless lattice of all values, organized in a timeless ordering of "meanings". This lattice is totally independent of any human choices; it is Platonic.

These orderings do not relate directly, and using a new word would help reinforce that fact.

Sorry, but I still I don't follow the distinction you are making. In most languages, subtyping even has to be declared, so is based on explicit human choices as well. And evolution is not just about which functions you can call, but more interestingly, about the values flowing along these calls. And that has all the usual properties and characteristics of subtyping, including co/contra-variance, higher-orderness, the question of coercive vs non-coercive representations, etc. So it looks like a duck and quacks like a duck...

The main issue we have is that we want a particularly lenient subtype relation here.

So, what if (just as with records), we want a common sum type that appears in both inbound and outbound positions of the service's type?

Yes, I agree, and I believe this is mentioned as an open question / future direction somewhere in the doc. To be honest, I don't have a good idea, regardless how you prefer modelling or naming the relation. It doesn't seem like other data formats bother much.

@nomeata
Copy link
Collaborator

nomeata commented Dec 6, 2019

Maybe, just a short in the dark to make things more concrete, one difference between a subtyping based solution and a “evolution” based solution could be that in an “evolution” based solution, you (maybe) know the history of a type (t2 evolved from t1 and s2 from s1) , and you (maybe) have the ability to have a function of type t2 -> s2 look at the argument, see if it s t1 or t2, and then behave differently when responding. Such a mechanism may be useful, or maybe it is horrible anyways, but it seems that thinking in subtypes prevents us from considering something like that.

@rossberg
Copy link
Contributor Author

rossberg commented Dec 6, 2019

Well, the upgraded type and similar variants we discussed expressed that as subtyping, I think. Not sure how the space would magically get larger with a different framing. Ultimately, it's all about "what set of types am I allowed to use in a context expecting this type?", which is what subtyping is.

As a practical matter, any additional dimension of annotation would also have to be expressible in a source language like Motoko if you want to allow inferring IDL from source.

@nomeata
Copy link
Collaborator

nomeata commented Dec 6, 2019

Well, the upgraded type and similar variants we discussed expressed that as subtyping, I think.

None of them made added a coercion to the result of a function based on the version of argument. And maybe that is simply a stupid idea. In fact it probably is. Just feebly trying to see the shape of the box that we might want to think outside of…

@matthewhammer
Copy link
Contributor

matthewhammer commented Dec 6, 2019

@rossberg :

In most languages, subtyping even has to be declared, so is based on explicit human choices as well.

Agreed. However, we also agree that in Motoko, this is not the case.

That fact is a key aspect of its type system design, AFAIK. So, why would we expect this lesson from other languages to (directly) apply? In fact, should we not expect the opposite, as is the case here (as it still seems to me)?

In Motoko, value types form a nice a lattice where subtyping is just about structure, and that structure has nothing to do with time or human authorship, at least directly (we are in a single frame of time, for a single instant, when we think about a single Motoko program's typing derivation, I claim; evolution does not directly apply; it's areadly been reconciled somehow outside this frame).

Actor types, as concerning this IDL upgrade/evolve question, are different than this single frame view. They exist outside this, at an "outer view" that contains many time instants, and many versions of each actor, potentially. In this view, we view actors as IDL services, described with IDL types that evolve over time. We can (and should) store this full history, as Joachim suggests above.

FWIW, I think Joachim's intuitions above about an implementation smell right to me.

And, even if this history is not directly represented (I think it should be, and it should be part of the static assets, perhaps enforced by the system to be correct somehow), it's there conceptually when we define those composed telescoping adaptor functions in my four place relation idea.

@matthewhammer
Copy link
Contributor

if this history is not directly represented...it's there conceptually when we define those composed telescoping adaptor functions in my four place relation idea.

To illustrate this picture, I can try to recover a small example from my October brain dump, when I get some time to better explain it.

@matthewhammer
Copy link
Contributor

And that has all the usual properties and characteristics of subtyping, including co/contra-variance, higher-orderness, the question of coercive vs non-coercive representations, etc. So it looks like a duck and quacks like a duck...

We are seeing different animals. I do not see a duck, I see a duck telescoping through spacetime.

Kidding aside, I strongly disagree with all the assertions above: We all agree that this leniency, as you call it, fundamentally breaks the typical contravariant structure of arrow typing any time arguments and results that telescope in time have any evolving algebraic (product or sum) structure!. That's basically all the interesting cases, as far as I can see now.

Given this fact, I still cannot see how to view any of this as subtyping. I also honestly cannot see why it still feels right to you. What am I (still) missing?

@nomeata
Copy link
Collaborator

nomeata commented Dec 6, 2019

Can we try to make concrete proposals, and then maybe discuss what to name them? Yes, my gut feeling sees many parallels with conventional subtyping, and some aspects where upgrade may be different. But I feel we are not making progress by discussing what we feel (heh), or how we call things.

This PR has a concrete proposal. It don’t find it pretty, it doesn’t make me proud and it feels more like dynamic typing than subtyping to me. But in a way, it is the best we have at the moment… so can we focus on tweaking it or, alternatively, provide concrete alternatives?

@matthewhammer
Copy link
Contributor

matthewhammer commented Dec 6, 2019

@nomeata

This PR has a concrete proposal...it is the best we have at the moment… so can we focus on tweaking it or, alternatively, provide concrete alternatives?

@rossberg, from the PR description:

Please try to shoot holes into it, I may well have missed something!

I suppose that I was doing that a little extensively, I guess. I agree that it'll be helpful to be more constructive, and more concrete, going forward. : )

@nomeata
Copy link
Collaborator

nomeata commented Dec 6, 2019

If you shoot concrete holes, that’s also very helpful!

@knl
Copy link
Contributor

knl commented Jan 6, 2020

I updated the description, to replace Zürich with Zurich -- it turns out that Unicode characters in the description break our CI :( We're fixing it, and this was just a quick-fix to allow other PRs to be built.

@dfinity-ci
Copy link

In terms of size, no changes are observed in 2 tests.
In terms of gas, no changes are observed in 2 tests.

@nomeata
Copy link
Collaborator

nomeata commented Nov 23, 2020

Closing, Candid discussion can happen at the Candid repo.

@nomeata nomeata closed this Nov 23, 2020
@nomeata nomeata deleted the idl-sub2 branch April 30, 2021 16:48
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants