[IDL] Optimistic subtyping #1959
Thanks for picking up that ball again. So this ties into #870 (semantics of deserialization), and would have to be solved together with that.

The original motivation was record fields, but you don’t talk about record fields here… I assume you’d also add a rule that treats a missing field as null?

My gut feeling is that this still amounts to “try to decode, turn failure into silent null”.

Also, so far we use the same relation for subtyping during decoding and subtyping for upgrades. You give evidence why during decoding […]. I could imagine that we keep the previous subtyping rules for upgrades.

Might be nice to write these properties down in terms of the […].

Does this proposal have the “coherence” and “transitive coherence” properties? I guess coherence holds by construction in the way you phrase the rules. Transitive coherence… no, that can’t hold. We have […]
Right.
Yes, I'm only focussing on the problematic rule.
That is correct. I was considering the history of how we arrived at the typed format. IIRC, our original motivation wasn't actually more checking, but a more compressed representation, e.g., saving the repetition of label indices in a vector etc. So from that perspective, a less typeful semantics could be argued for. It can also well be argued against, but then we need a feasible alternative, which we so far have failed to produce. :(
Probably not. In an actual specification, I'm sure we'll want to define two relations for these different use cases.
Yes, agreed, this property may require some qualification (or even removal).
Right, coherence cannot hold under this proposal. Desirable as it would be, I don't think coherence is possible given such a typing rule, regardless of what dynamic semantics we assign to it.
I get the feeling that there is an underlying shift in implicit goals, which I can’t phrase fully formally. Maybe we have been too ambitious previously. But under these goals, it seems to make sense. Maybe one way to phrase it formally in a pleasant way is the following: […]

My gut feeling says that this framing might give us a better framework to express some of the properties you list above?
Yes, that might make sense. The underlying goal shift is that the type annotation in the value is no longer viewed as prescriptive, but only as descriptive (it's like a transparent type annotation, not a coercive one). Although we never explicitly stated the former as an intention, I think we all implicitly assumed it and feel that losing it is somewhat unfortunate. But as you say, it might actually align better with implementation reality. An alternative, btw, would be to require decoders to always produce principal types, so that t1 and t3 always coincide. But that's significant extra work for both encoders (who must compute them) and decoders (who must check them), so it seems undesirable. But if you want to sell this to a type-inclined person, you could describe it as the core model minus the requirement to enforce it.
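To make that alternative concrete, here is a minimal Haskell sketch over a hypothetical toy fragment of the type language (all names here are illustrative, not from any real Candid implementation). Computing principal types needs a bottom type `empty` so that vacuous positions, such as a null option, get a least type:

```haskell
-- Hypothetical toy fragment of the IDL type and value language.
data Ty  = TEmpty | TNat | TText | TOpt Ty
  deriving (Eq, Show)

data Val = VNat Integer | VText String | VOpt (Maybe Val)
  deriving (Eq, Show)

-- The principal (least) type of a value. A null option carries no payload,
-- so the bottom type `empty` gives it a least type. Computing this is the
-- extra work an encoder would have to do, and checking it the extra work
-- a decoder would have to do.
principalTy :: Val -> Ty
principalTy (VNat _)        = TNat
principalTy (VText _)       = TText
principalTy (VOpt Nothing)  = TOpt TEmpty
principalTy (VOpt (Just v)) = TOpt (principalTy v)
```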
Makes sense and seems like a reasonable compromise to me. As for helping implementors: how do the Candid tools relate to that question (cc @chenyan-dfinity)? E.g., perhaps we can have a use case for […]
Closely related, we have so far glossed over coming up with a definition of when a serialised value is actually well-typed, i.e., abstractly, when […] holds. It seems like specifying this requires a separate, more restrictive, non-coercive notion of subtyping. Which may reinforce @matthewhammer's earlier point that upgradability is not exactly the same as subtyping -- at least we need to differentiate between the general relation and the smaller non-coercive subrelation. Notably, the larger relation does not actually have anything like "principal" types, since it lacks an antisymmetry property (e.g., with the permissive option rule, both opt nat <: opt text and opt text <: opt nat hold, yet the two types are distinct; there may be minimal types in some sense, but they are not unique). But the smaller relation ought to have principal types for the values it ascribes.
Humbly accepting this gracious validation with great inner joy. Thank you. : )
I believe this has turned into dfinity/candid#110 and is done.
After going through our various discussions and non-solutions to the upgradability problem again (see #1523), I am tempted to revisit one option that we only briefly discussed under the name of “optimistic” deserialisation or “backtracking”.
Here is the problem again. To maintain transitivity in the presence of record extension that can work both in contra-variant position and in higher-order cases (see below for a summary of the design constraints), we have concluded that it seems inevitable to have a rule

    opt t1 <: opt t2

that works for all t1, t2. The big question, then, is what operational behaviour to assign to this.
Previously, I had naively proposed the “ideal” semantics, which, as @nomeata pointed out, amounts to a runtime test: to deserialise a value of type opt t1 at type opt t2, check whether t1 <: t2 holds; if so, deserialise the payload, otherwise produce null.
The problem with that is that every deserialiser would have to implement subtyping on IDL types, which is an extra cost and an arguably unreasonable burden, and chances are that many implementations would take shortcuts.
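For illustration, a minimal Haskell sketch of that burden over a hypothetical toy fragment (all names illustrative, not any real decoder): the traversal itself is simple, but every option forces the decoder to carry a full implementation of subtyping.

```haskell
-- Hypothetical toy fragment of the IDL type and value language.
data Ty  = TNat | TText | TOpt Ty
  deriving (Eq, Show)

data Val = VNat Integer | VText String | VOpt (Maybe Val)
  deriving (Eq, Show)

-- Full subtyping on (this fragment of) IDL types, which every decoder
-- would have to ship under the "ideal" semantics.
sub :: Ty -> Ty -> Bool
sub TNat     TNat     = True
sub TText    TText    = True
sub (TOpt a) (TOpt b) = sub a b   -- covariant option (simplified)
sub a        (TOpt b) = sub a b   -- opt introduction (simplified)
sub _        _        = False

-- Type-directed decoding under the "ideal" semantics: at every option,
-- a full runtime subtype test decides between payload and null.
decode :: Ty -> Ty -> Val -> Maybe Val
decode TNat  TNat  v@(VNat _)  = Just v
decode TText TText v@(VText _) = Just v
decode (TOpt t1) (TOpt t2) (VOpt mv)
  | sub t1 t2 = VOpt <$> traverse (decode t1 t2) mv
  | otherwise = Just (VOpt Nothing)  -- incompatible payload types: null
decode _ _ _ = Nothing               -- genuine type error elsewhere
```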
Another option that we only briefly discussed but dismissed earlier was the “optimistic” approach, i.e., deserialisation will try to deserialise the optional value as t2, but if that fails it aborts and returns null for the option. (Perhaps not really “backtracking” since it never explores a second option but just aborts.)
For an implementation, this is a fairly natural way of handling things, but semantically, it raises at least two questions: how to specify this behaviour declaratively, and whether the resulting semantic property is still good enough.
For the former, the following might work:

    deser(v:t1, opt t2) = opt v’  if there exists t3 with t3 <: t1 and t3 <: t2, such that v:t3
    deser(v:t1, opt t2) = null    otherwise

In other words, if there is a “more principal” type t3 for the value, with which we can successfully deserialise, then we do so.
This corresponds to the observation that the only way in which an extra type-check in the ideal approach could fail (I believe) is by having a serialisation type that is not principal. For example, by containing empty vectors or variants with cases whose types don’t match between t1 and t2 but that never actually occur in the value. These pairs of types would never be explored for actual deserialisation, so would require an extra subtype check to maintain the semantics. In the above formulation, we simply allow these “redundant” type components to be ignored.
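To make the contrast concrete, here is a hypothetical sketch of the optimistic decoder over the same kind of toy fragment, extended with vectors (again illustrative names, not a real decoder). No subtype relation appears anywhere, a failing payload collapses to a silent null at the nearest enclosing option, and an empty vector decodes successfully at any element type because there are no elements to traverse:

```haskell
-- Hypothetical toy fragment, extended with vectors.
data Ty  = TNat | TText | TOpt Ty | TVec Ty
  deriving (Eq, Show)

data Val = VNat Integer | VText String | VOpt (Maybe Val) | VVec [Val]
  deriving (Eq, Show)

-- Optimistic decoding: a plain type-directed traversal, no subtype checks.
decode :: Ty -> Ty -> Val -> Maybe Val
decode TNat  TNat  v@(VNat _)  = Just v
decode TText TText v@(VText _) = Just v
decode (TVec t1) (TVec t2) (VVec vs) = VVec <$> traverse (decode t1 t2) vs
decode (TOpt t1) (TOpt t2) (VOpt mv) =
  Just (VOpt (mv >>= decode t1 t2))  -- failure becomes silent null
decode _ _ _ = Nothing

-- An empty vector never explores its mismatched element types:
--   decode (TOpt (TVec TNat)) (TOpt (TVec TText)) (VOpt (Just (VVec [])))
--     == Just (VOpt (Just (VVec [])))
-- The ideal semantics would instead null this option out,
-- since vec nat is not a subtype of vec text.
```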
Consequently, the semantic property of deserialisation generalises in a corresponding manner:
deser(v:t1, t2) = v’ if and only if there exists t3 with t3 <: t1 and t3 <: t2, such that v:t3.
Now, this is a bit odd, but do we think this property is “good enough” for our purposes? Notably, it is no worse than the property we’d get if we didn’t type-annotate serialised values in the first place but left them untyped (as we originally had). In a way, you could argue, we have brought this whole problem on ourselves only by extracting structural information about the interpretation of the serialised value into a “type”, a feature that other formats do not have. The intended use is for that type to be principal, but to make life easier for implementations, our semantics does not enforce that.
WDYT?
For the record, here is a summary of requirements for the upgradability relation and the operational semantics of deserialisation, which was previously scattered across various discussions:
Soundness: The upgradability relation implies that no deserialisation can fail.
Completeness: The upgradability relation covers all cases of successful deserialisation.
Transitivity: The upgradability relation implies that deserialisation cannot fail even across multiple upgrades.
Record Extensibility: Records are upgradable by adding new (optional) fields, even when they appear in both positive and negative positions, such that round-trips always remain possible (see the sketch after this list).
Higher-order Extensibility: The upgradability relation extends to higher-order cases, where there is no obvious owner of the input/output contract.
Language Injectivity: The upgradability relation does not depend on version or role annotations in interfaces that have no counterpart in source languages, i.e., an interface description can be generated from source-level types without special features or cross-version knowledge.
Simple Deserialisation: Deserialisation does not require (sub)type checks other than the type-directed traversal of the value blob.
Non-coercive Deserialisation: Deserialisation of a value is invariant across super- and sub-types.
Type Erasure: Deserialised values do not require carrying dynamic type information on the language side.
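To illustrate the record-extensibility requirement above, a hypothetical sketch in the same toy style, with records added (`decodeAt` and its field handling are illustrative, not any implementation's actual behaviour): a missing optional field decodes as null, an unknown extra field is dropped, so values round-trip between the old and extended versions of a record type.

```haskell
import qualified Data.Map as M

-- Hypothetical toy fragment with records; illustrative only.
data Ty  = TNat | TText | TOpt Ty | TRec (M.Map String Ty)
  deriving (Eq, Show)

data Val = VNat Integer | VText String | VOpt (Maybe Val)
         | VRec (M.Map String Val)
  deriving (Eq, Show)

-- Expected-type-directed decoding for records: fields the expected type
-- does not mention are dropped; a missing field succeeds only at an option
-- type, where it becomes null.
decodeAt :: Ty -> Val -> Maybe Val
decodeAt TNat  v@(VNat _)  = Just v
decodeAt TText v@(VText _) = Just v
decodeAt (TOpt t) (VOpt mv) = Just (VOpt (mv >>= decodeAt t))
decodeAt (TRec fts) (VRec fvs) = VRec <$> M.traverseWithKey field fts
  where
    field name t = case (M.lookup name fvs, t) of
      (Just v,  _)      -> decodeAt t v
      (Nothing, TOpt _) -> Just (VOpt Nothing)  -- added optional field: null
      (Nothing, _)      -> Nothing              -- missing required field
decodeAt _ _ = Nothing

-- {name = "x"} decoded at the extended type {name : text; nickname : opt text}
-- yields {name = "x"; nickname = null}; decoding that back at the old record
-- type drops `nickname` again, so the round-trip preserves the old fields.
```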