-
Notifications
You must be signed in to change notification settings - Fork 97
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix IDL subtyping #832
Fix IDL subtyping #832
Changes from 3 commits
5aa8380
54bb5e0
d30b03b
0350be2
f769d05
79d0e3a
a13bfdd
d864753
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -8,7 +8,7 @@ To document, discover, and interact with actors on the platform, we need an inte | |
exchange (names, parameter and result formats of actor methods) | ||
* Simple and canonical constructs (C-like; algebraically: sums, products, exponentials) | ||
* Extensible, backwards-compatible | ||
* Well-formedness is checkable and guaranteed by the platform | ||
* Well-formedness is checkable | ||
* Deterministic mapping to serialised representation | ||
* Human-readable and machine-readable | ||
* Declarative, usable as input to binding code generators | ||
|
@@ -77,8 +77,7 @@ This is a summary of the grammar proposed: | |
| bool | ||
| text | ||
| null | ||
| reserved | ||
| empty | ||
| any | empty | ||
|
||
<constype> ::= | ||
| opt <datatype> | ||
|
@@ -274,22 +273,22 @@ The type `null` has exactly one value (the *null* value) and therefor carries no | |
<primtype> ::= ... | null | ... | ||
``` | ||
|
||
#### Reserved | ||
#### Any and Empty | ||
|
||
The type `reserved` is a type with unknown content that ought to be ignored. Its purpose is to occupy field ids in records in order to prevent backwards/forwards compatibility problems, see the description of record types below. | ||
The type `any` is a type with unknown content (i.e., it is the type of all possible values) that ought to be ignored. Its primary purpose is to occupy yet unused field ids in records in order to prevent backwards/forwards compatibility problems, see the description of record type subtyping below. | ||
``` | ||
<primtype> ::= ... | reserved | ||
``` | ||
**Note:** This type has a similar role as *reserved fields* in proto buffers. | ||
|
||
|
||
#### Empty | ||
|
||
The type `empty` is the type of values that are not present. Its purpose is to mark variants that are not actually there, or -- as argument types of function reference -- indicate that they will not be called. | ||
Dually, the type `empty` is the type of unused or absent content (i.e., it is the type with no values). Its primary purpose is to occupy no longer used field ids in records in order to prevent backwards/forwards compatibility problems, see the description of record type subtyping below. | ||
It can also be used to mark variants that are not actually there, or -- as argument types of function reference -- indicate that they will not be called. | ||
``` | ||
<primtype> ::= ... | empty | ||
``` | ||
|
||
Technically, the types of the IDL form a lattice by subtyping, where `any` and `empty` are the top and bottom element. | ||
|
||
|
||
### Constructed Data | ||
|
||
*Constructed types* describe compound or aggregated forms of values. | ||
|
@@ -301,6 +300,17 @@ An *option* is a value of a specific data type that may be absent. | |
``` | ||
<constype> ::= opt <datatype> | ... | ||
``` | ||
##### Shorthands: Reserved and Deprecated | ||
|
||
When evolving record types over time (see below), it sometimes is useful to *reserve* fields for future use, or *deprecate* old fields so that they are no longer used. Defining the fields or keeping them around as "occupied" can prevent backwards/forwards compatibility problems. | ||
The types `reserved` and `deprecated` can be used for this purpose. | ||
See the discussion of record subtyping for more details. | ||
``` | ||
<constype> ::= .... | ||
| reserved := opt any | ||
| deprecated := opt empty | ||
``` | ||
|
||
|
||
#### Vectors | ||
|
||
|
@@ -537,16 +547,16 @@ That is, outbound message results can only be replaced with a subtype (more fiel | |
|
||
Subtyping applies recursively to the types of the fields themselves. Moreover, the directions get *inverted* for inbound function and actor references, in compliance with standard rules. | ||
|
||
**TODO: unsound, fix** | ||
|
||
To make these constraints as flexible as possible, two special rules apply: | ||
|
||
* An absent record field is considered equivalent to a present field with value `null`. Moreover, a record field of type `null` is a subtype of a field with type `opt <datatype>`. That way, That way, a field of option (or null) type can always be added to a record type, no matter whether in co- or contra-variant position. If an optional field is added to an inbound record, and he client did not provide it, the service will read it as if its value was null. | ||
rossberg marked this conversation as resolved.
Show resolved
Hide resolved
|
||
|
||
- in an outbound record, a field of option (or null) type can also be removed in an upgrade, in which case the client will read it as if its value was null; | ||
- in an inbound record, a field of option (or null) type can also be added, in which case the service will read it as if its value was null. | ||
|
||
Future extensions: defaults, including for variants? | ||
Taken together, the rules allow record types to support round-trips, i.e., occur in both inbound and outbound positions, while still remaining upgradable via new optional fields. | ||
|
||
Possible future extensions: defaults for options, including for variants? | ||
|
||
|
||
### Rules | ||
|
@@ -593,7 +603,7 @@ vec <datatype> <: vec <datatype'> | |
``` | ||
Furthermore, an option type can be specialised to either `null` or to its constituent type: | ||
``` | ||
------------------------ | ||
---------------------- | ||
null <: opt <datatype> | ||
|
||
not (null <: <datatype>) | ||
|
@@ -616,23 +626,20 @@ record { <fieldtype'>;* } <: record { } | |
|
||
<datatype> <: <datatype'> | ||
record { <fieldtype>;* } <: record { <fieldtype'>;* } | ||
---------------------------------------------------------------------------------------------- | ||
record { <nat> : <datatype>; <fieldtype>;* } <: record { <nat> : <datatype'>; <fieldtype'>;* } | ||
-------------------------------------------------------------------------------------------- | ||
record { <fieldtype>;* <nat> : <datatype> } <: record { <fieldtype'>;* <nat> : <datatype'> } | ||
``` | ||
|
||
**TODO: Rules below are unsound as is, need fixing!** | ||
|
||
In addition, record fields of `null` or option type can be removed, treating the absent field as having value `null`. | ||
``` | ||
record { <fieldtype>;* } <: record { <fieldtype'>;* } | ||
------------------------------------------------------------------- | ||
record { <fieldtype>;* } <: record { <nat> : null; <fieldtype'>;* } | ||
------------------------------------------------------------------ | ||
record { <fieldtype>;* } <: record { <fieldtype'>;* <nat> : null } | ||
|
||
record { <fieldtype>;* } <: record { <fieldtype'>;* } | ||
----------------------------------------------------------------------------- | ||
record { <fieldtype>;* } <: record { <nat> : opt <datatype>; <fieldtype'>;* } | ||
---------------------------------------------------------------------------- | ||
record { <fieldtype>;* } <: record { <fieldtype'>;* <nat> : opt <datatype> } | ||
``` | ||
TODO: What we want to achieve: Taken together, these rules ensure that adding an optional field creates both a co- and a contra-variant subtype. Consequently, it is always possible to add an optional field. In particular, that allows extending round-tripping record types as well. For example, | ||
Taken together, these rules ensure that adding an optional field creates both a co- and a contra-variant subtype. Consequently, it is always possible to add an optional field. In particular, that allows extending round-tripping record types as well. For example, | ||
``` | ||
type T = {}; | ||
actor { f : T -> T }; | ||
|
@@ -644,29 +651,65 @@ actor { f : T' -> T' }; | |
``` | ||
for all first-order uses of `T`, because both of the following hold: | ||
``` | ||
upgrade T' <: T | ||
upgrade T <: T' | ||
T' <: T (by usual width subtyping) | ||
T <: T' (by option field subtyping) | ||
``` | ||
And hence: | ||
``` | ||
upgrade (T' -> T') <: (T -> T) | ||
(T' -> T') <: (T -> T) | ||
``` | ||
for some version of a type `upgrade T`. | ||
Moreover, this extends to the higher-order case, where e.g. a function of the above type is expected as a parameter: | ||
``` | ||
actor { g : (h : T -> T) -> ()} | ||
``` | ||
Upgrading `T` as above contra-variantly requires | ||
``` | ||
upgrade (T -> T) <: (T' -> T') | ||
(T -> T) <: (T' -> T') | ||
``` | ||
which also holds. | ||
|
||
Note: Subtyping still needs to be transitive . We must not allow: | ||
Finally, one auxiliary rule is needed to ensure that subtyping is still transitive: | ||
``` | ||
not (<datatype> <: opt <datatype') | ||
rossberg marked this conversation as resolved.
Show resolved
Hide resolved
|
||
record { <fieldtype>;* } <: record { <fieldtype'>;* } | ||
------------------------------------------------------------------------------------------------ | ||
record { <fieldtype>;* <nat> : <datatype> } <: record { <fieldtype'>;* <nat> : opt <datatype'> } | ||
``` | ||
This rule allows the consecutive removal and later re-addition of a field with different types over multiple upgrades. For example: | ||
``` | ||
record {x : text} <: record {} <: record {x : opt nat} | ||
``` | ||
The field will also read as `null` in this case. | ||
Transitivity is necessary so that data compatibility is guaranteed in all cases under the earlier rules, even across multiple upgrades. | ||
|
||
Note that this rule does not cover the case where the new version of the field has the same type as the old: | ||
``` | ||
record {x : opt nat} <: record {} <: record {x : opt text} | ||
rossberg marked this conversation as resolved.
Show resolved
Hide resolved
|
||
``` | ||
In this case, a direct access of the old value under the new interface, without going through the intermediate version, will be handled according to the ordinary rule for options, keeping its value. | ||
The semantics has no way of knowing whether that is intended or not, but this behaviour avoids any semantic ambiguity. | ||
|
||
However, this implies that -- although subtyping is transitive and the respective conversions compose -- the composition of conversions is not always *coherent*, i.e., it may return values different from a direct conversion. | ||
Unfortunately, a fully coherent semantics seems impossible to achieve for this case. | ||
|
||
In practice, users are not recommended to remove optional fields in an upgrade, avoiding the scenario. | ||
Instead, the preferred way to avoid backwards compatibility issues is to *deprecate* such fields by giving them type `deprecated`, a shorthand for `opt empty`. | ||
Accordingly, both the following hold: | ||
``` | ||
record {x : opt text} <: record {} <: record {x : opt nat} | ||
record {x : deprecated} <: record {x : opt T} | ||
record {x : opt T} <: record {x : deprecated} | ||
``` | ||
**TODO: Sanity continues from here.** | ||
The first is for outbound uses and holds by usual subtyping, the second is for inbound uses, by the auxiliary rule above. | ||
|
||
Dually, forward compatibility can be ensured by *reserving* a field via the type `reserved`, a shorthand for `opt any`. | ||
Accordingly, both the following hold: | ||
``` | ||
record {x : opt T} <: record {x : reserved} | ||
record {x : reserved} <: record {x : opt T} | ||
``` | ||
The first use again is for outbound uses, by usual subtyping, the second for inbound ones, by the auxiliary rule above. | ||
|
||
Note: The auxiliary subtyping rule above should not be invoked in a static check for upgradability of a service, with the exception of these two cases. We expect tooling to warn about any such dependency on the rule. The only purpose of allowing the other cases is for dynamic coercions, i.e., to make sure that subtyping is transitive and therefore avoids unexpected runtime failure after multiple upgrades in all cases. | ||
|
||
|
||
#### Variants | ||
|
@@ -679,8 +722,8 @@ variant { } <: variant { <fieldtype'>;* } | |
|
||
<datatype> <: <datatype'> | ||
variant { <fieldtype>;* } <: variant { <fieldtype'>;* } | ||
------------------------------------------------------------------------------------------------ | ||
variant { <nat> : <datatype>; <fieldtype>;* } <: variant { <nat> : <datatype'>; <fieldtype'>;* } | ||
---------------------------------------------------------------------------------------------- | ||
variant { <fieldtype>;* <nat> : <datatype> } <: variant { <fieldtype'>;* <nat> : <datatype'> } | ||
``` | ||
|
||
#### Functions | ||
|
@@ -714,6 +757,7 @@ service { <name> : <functype>; <methtype>;* } <: service { <name> : <functype'>; | |
### Elaboration | ||
|
||
To define the actual coercion function, we extend the subtyping relation to a ternary *elaboration* relation `T <: T' ~> f`, where `f` is a suitable coercion function of type `T -> T'`. | ||
To express this function, we assume a simple lambda calculus over IDL values. | ||
|
||
|
||
#### Primitive Types | ||
|
@@ -724,15 +768,15 @@ To define the actual coercion function, we extend the subtyping relation to a te | |
<primtype> <: <primtype> ~> \x.x | ||
|
||
|
||
------------------ | ||
Nat <: Int ~> \x.x | ||
------------------- | ||
Nat <: Int ~> \x.+x | ||
|
||
|
||
------------------------------ | ||
<datatype> <: reserved ~> \x.x | ||
--------------------------------- | ||
<datatype> <: reserved ~> \x.null | ||
|
||
|
||
------------------------------ | ||
------------------------------------- | ||
empty <: <datatype> ~> \_.unreachable | ||
``` | ||
|
||
|
@@ -768,20 +812,23 @@ record { <fieldtype'>;* } <: record { } ~> \x.{} | |
<datatype> <: <datatype'> ~> f1 | ||
record { <fieldtype>;* } <: record { <fieldtype'>;* } ~> f2 | ||
---------------------------------------------------------------------------------------------- | ||
record { <nat> : <datatype>; <fieldtype>;* } <: record { <nat> : <datatype'>; <fieldtype'>;* } | ||
record { <fieldtype>;* <nat> : <datatype> } <: record { <fieldtype'>;* <nat> : <datatype'> } | ||
~> \x.{f2 x with <nat> = f1 x.<nat>} | ||
``` | ||
|
||
TODO: Fix | ||
``` | ||
record { <fieldtype>;* } <: record { <fieldtype'>;* } ~> f | ||
------------------------------------------------------------------- | ||
record { <fieldtype>;* } <: record { <nat> : null; <fieldtype'>;* } | ||
------------------------------------------------------------------ | ||
record { <fieldtype>;* } <: record { <fieldtype'>;* <nat> : null } | ||
~> \x.{f x; <nat> = null} | ||
|
||
record { <fieldtype>;* } <: record { <fieldtype'>;* } ~> f | ||
----------------------------------------------------------------------------- | ||
record { <fieldtype>;* } <: record { <nat> : opt <datatype>; <fieldtype'>;* } | ||
record { <fieldtype>;* } <: record { <fieldtype'>;* <nat> : opt <datatype> } | ||
~> \x.{f x; <nat> = null} | ||
|
||
not (<datatype> <: opt <datatype') | ||
record { <fieldtype>;* } <: record { <fieldtype'>;* } ~> f | ||
------------------------------------------------------------------------------------------------ | ||
record { <fieldtype>;* <nat> : <datatype> } <: record { <fieldtype'>;* <nat> : opt <datatype'> } | ||
~> \x.{f x; <nat> = null} | ||
``` | ||
|
||
|
@@ -1081,6 +1128,10 @@ Note: | |
|
||
To enable convenient debugging, we also specify a text format for IDL values. | ||
The types of these values are assumed to be known from context, so the syntax does not attempt to be self-describing. | ||
Type annotations can be given to disambiguate. | ||
|
||
Note: It is expected that reading IDL values (in diagnostics or logs) is by far the dominant use case. | ||
For that reason, the syntax is designed for easy readability instead of convenient writability. To that end, it intentionally matches the type syntax. | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Sneakily sneaking controverial changes in with unrelated PRs… ;-) There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. It's not a change, just a clarification/rationale for what's there already. ;) But I'm happy to do that in a separate PR if you think that's worthwhile. |
||
|
||
``` | ||
<val> ::= | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the defs of r
reserved
anddeprecated
should be swapped.opt any
carries no useful information and never will (deprecated),opt empty
can be replaced byopt T
, anyT
, so it's a reserved field inhabited initially only bynull
but by other non-null values on further refinement. Or have I got myself confused again?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
After some pondering, I think you are correct, although it is a bit more nuanced. Once more, everything depends on direction.
In particular, for outbound records, refinement goes to subtypes. For normal fields you can go
any :> T :> empty
over time. Consequently, the natural progression for options also isopt any :> opt T :> opt empty
.But of course, it's the inverse for inbound records. So which should we pick? Ultimately, the subtyping rules for optional fields are such that both ways happen to "work".
But I believe that reserving a field name is primarily relevant for inbound records, because it ought to prevent clients from passing a record with additional fields that may be incompatible with intended future extensions. Such considerations are not necessary for records that are purely outbound, so for outbound-only records there is no point in reserving a field.
So, I agree that it should be inverted. Done.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was definitely thinking from the inbound perspective: reserving an new argument or deprecating an old argument in an existing operation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the difference between
is rather small with our subtyping rules.
Evolving a field of that type forward behaves almost the same:
For all
t ≠ any
, we haveand for all
t
we haveSo the forward-evolution works fine.
Deprecating an existing field
t
For all
t ≠ opt t'
, we have(If
t
was anopt
field, we still learn whether it wasnull
orsome
).and for all
t
we haveSo I am not sure there is a meaningful distinction between
reserved
anddeprecated
, and maybe we need to only useopt empty
?(
opt any
is a bit odd anyways, as it looks likebool
, and we do not want to carry information in reserved or deprecated fields, do we?)