Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix IDL subtyping #832

Closed
wants to merge 8 commits into from
Closed
Changes from 3 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
143 changes: 97 additions & 46 deletions design/IDL.md
Original file line number Diff line number Diff line change
Expand Up @@ -8,7 +8,7 @@ To document, discover, and interact with actors on the platform, we need an inte
exchange (names, parameter and result formats of actor methods)
* Simple and canonical constructs (C-like; algebraically: sums, products, exponentials)
* Extensible, backwards-compatible
* Well-formedness is checkable and guaranteed by the platform
* Well-formedness is checkable
* Deterministic mapping to serialised representation
* Human-readable and machine-readable
* Declarative, usable as input to binding code generators
Expand Down Expand Up @@ -77,8 +77,7 @@ This is a summary of the grammar proposed:
| bool
| text
| null
| reserved
| empty
| any | empty

<constype> ::=
| opt <datatype>
Expand Down Expand Up @@ -274,22 +273,22 @@ The type `null` has exactly one value (the *null* value) and therefor carries no
<primtype> ::= ... | null | ...
```

#### Reserved
#### Any and Empty

The type `reserved` is a type with unknown content that ought to be ignored. Its purpose is to occupy field ids in records in order to prevent backwards/forwards compatibility problems, see the description of record types below.
The type `any` is a type with unknown content (i.e., it is the type of all possible values) that ought to be ignored. Its primary purpose is to occupy yet unused field ids in records in order to prevent backwards/forwards compatibility problems, see the description of record type subtyping below.
```
<primtype> ::= ... | reserved
```
**Note:** This type has a similar role as *reserved fields* in proto buffers.


#### Empty

The type `empty` is the type of values that are not present. Its purpose is to mark variants that are not actually there, or -- as argument types of function reference -- indicate that they will not be called.
Dually, the type `empty` is the type of unused or absent content (i.e., it is the type with no values). Its primary purpose is to occupy no longer used field ids in records in order to prevent backwards/forwards compatibility problems, see the description of record type subtyping below.
It can also be used to mark variants that are not actually there, or -- as argument types of function reference -- indicate that they will not be called.
```
<primtype> ::= ... | empty
```

Technically, the types of the IDL form a lattice by subtyping, where `any` and `empty` are the top and bottom element.


### Constructed Data

*Constructed types* describe compound or aggregated forms of values.
Expand All @@ -301,6 +300,17 @@ An *option* is a value of a specific data type that may be absent.
```
<constype> ::= opt <datatype> | ...
```
##### Shorthands: Reserved and Deprecated

When evolving record types over time (see below), it sometimes is useful to *reserve* fields for future use, or *deprecate* old fields so that they are no longer used. Defining the fields or keeping them around as "occupied" can prevent backwards/forwards compatibility problems.
The types `reserved` and `deprecated` can be used for this purpose.
See the discussion of record subtyping for more details.
```
<constype> ::= ....
| reserved := opt any
Copy link
Contributor

@crusso crusso Nov 7, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the defs of rreserved and deprecated should be swapped. opt any carries no useful information and never will (deprecated), opt empty can be replaced by opt T, any T, so it's a reserved field inhabited initially only by null but by other non-null values on further refinement. Or have I got myself confused again?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After some pondering, I think you are correct, although it is a bit more nuanced. Once more, everything depends on direction.

In particular, for outbound records, refinement goes to subtypes. For normal fields you can go any :> T :> empty over time. Consequently, the natural progression for options also is opt any :> opt T :> opt empty.

But of course, it's the inverse for inbound records. So which should we pick? Ultimately, the subtyping rules for optional fields are such that both ways happen to "work".

But I believe that reserving a field name is primarily relevant for inbound records, because it ought to prevent clients from passing a record with additional fields that may be incompatible with intended future extensions. Such considerations are not necessary for records that are purely outbound, so for outbound-only records there is no point in reserving a field.

So, I agree that it should be inverted. Done.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was definitely thinking from the inbound perspective: reserving an new argument or deprecating an old argument in an existing operation.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the difference between

reserved    :=  opt any
deprecated  :=  opt empty

is rather small with our subtyping rules.

  • Evolving a field of that type forward behaves almost the same:

    For all t ≠ any, we have

    { foo : reserved }  <:  { foo : opt t } ~> fun _ = { foo = null }
    

    and for all t we have

    { foo : deprecated }  <:  { foo : opt t } ~> fun _ = { foo = null }
    

    So the forward-evolution works fine.

  • Deprecating an existing field t
    For all t ≠ opt t', we have

    { foo : t }  <:  { foo : reserve } ~> fun _ = { foo = null }
    

    (If t was an opt field, we still learn whether it was null or some).

    and for all t we have

    { foo : t  }  <:  { foo : deprecated } ~> fun _ = { foo = null }
    

So I am not sure there is a meaningful distinction between reserved and deprecated, and maybe we need to only use opt empty?

(opt any is a bit odd anyways, as it looks like bool, and we do not want to carry information in reserved or deprecated fields, do we?)

| deprecated := opt empty
```


#### Vectors

Expand Down Expand Up @@ -537,16 +547,16 @@ That is, outbound message results can only be replaced with a subtype (more fiel

Subtyping applies recursively to the types of the fields themselves. Moreover, the directions get *inverted* for inbound function and actor references, in compliance with standard rules.

**TODO: unsound, fix**

To make these constraints as flexible as possible, two special rules apply:

* An absent record field is considered equivalent to a present field with value `null`. Moreover, a record field of type `null` is a subtype of a field with type `opt <datatype>`. That way, That way, a field of option (or null) type can always be added to a record type, no matter whether in co- or contra-variant position. If an optional field is added to an inbound record, and he client did not provide it, the service will read it as if its value was null.
rossberg marked this conversation as resolved.
Show resolved Hide resolved

- in an outbound record, a field of option (or null) type can also be removed in an upgrade, in which case the client will read it as if its value was null;
- in an inbound record, a field of option (or null) type can also be added, in which case the service will read it as if its value was null.

Future extensions: defaults, including for variants?
Taken together, the rules allow record types to support round-trips, i.e., occur in both inbound and outbound positions, while still remaining upgradable via new optional fields.

Possible future extensions: defaults for options, including for variants?


### Rules
Expand Down Expand Up @@ -593,7 +603,7 @@ vec <datatype> <: vec <datatype'>
```
Furthermore, an option type can be specialised to either `null` or to its constituent type:
```
------------------------
----------------------
null <: opt <datatype>

not (null <: <datatype>)
Expand All @@ -616,23 +626,20 @@ record { <fieldtype'>;* } <: record { }

<datatype> <: <datatype'>
record { <fieldtype>;* } <: record { <fieldtype'>;* }
----------------------------------------------------------------------------------------------
record { <nat> : <datatype>; <fieldtype>;* } <: record { <nat> : <datatype'>; <fieldtype'>;* }
--------------------------------------------------------------------------------------------
record { <fieldtype>;* <nat> : <datatype> } <: record { <fieldtype'>;* <nat> : <datatype'> }
```

**TODO: Rules below are unsound as is, need fixing!**

In addition, record fields of `null` or option type can be removed, treating the absent field as having value `null`.
```
record { <fieldtype>;* } <: record { <fieldtype'>;* }
-------------------------------------------------------------------
record { <fieldtype>;* } <: record { <nat> : null; <fieldtype'>;* }
------------------------------------------------------------------
record { <fieldtype>;* } <: record { <fieldtype'>;* <nat> : null }

record { <fieldtype>;* } <: record { <fieldtype'>;* }
-----------------------------------------------------------------------------
record { <fieldtype>;* } <: record { <nat> : opt <datatype>; <fieldtype'>;* }
----------------------------------------------------------------------------
record { <fieldtype>;* } <: record { <fieldtype'>;* <nat> : opt <datatype> }
```
TODO: What we want to achieve: Taken together, these rules ensure that adding an optional field creates both a co- and a contra-variant subtype. Consequently, it is always possible to add an optional field. In particular, that allows extending round-tripping record types as well. For example,
Taken together, these rules ensure that adding an optional field creates both a co- and a contra-variant subtype. Consequently, it is always possible to add an optional field. In particular, that allows extending round-tripping record types as well. For example,
```
type T = {};
actor { f : T -> T };
Expand All @@ -644,29 +651,65 @@ actor { f : T' -> T' };
```
for all first-order uses of `T`, because both of the following hold:
```
upgrade T' <: T
upgrade T <: T'
T' <: T (by usual width subtyping)
T <: T' (by option field subtyping)
```
And hence:
```
upgrade (T' -> T') <: (T -> T)
(T' -> T') <: (T -> T)
```
for some version of a type `upgrade T`.
Moreover, this extends to the higher-order case, where e.g. a function of the above type is expected as a parameter:
```
actor { g : (h : T -> T) -> ()}
```
Upgrading `T` as above contra-variantly requires
```
upgrade (T -> T) <: (T' -> T')
(T -> T) <: (T' -> T')
```
which also holds.

Note: Subtyping still needs to be transitive . We must not allow:
Finally, one auxiliary rule is needed to ensure that subtyping is still transitive:
```
not (<datatype> <: opt <datatype')
rossberg marked this conversation as resolved.
Show resolved Hide resolved
record { <fieldtype>;* } <: record { <fieldtype'>;* }
------------------------------------------------------------------------------------------------
record { <fieldtype>;* <nat> : <datatype> } <: record { <fieldtype'>;* <nat> : opt <datatype'> }
```
This rule allows the consecutive removal and later re-addition of a field with different types over multiple upgrades. For example:
```
record {x : text} <: record {} <: record {x : opt nat}
```
The field will also read as `null` in this case.
Transitivity is necessary so that data compatibility is guaranteed in all cases under the earlier rules, even across multiple upgrades.

Note that this rule does not cover the case where the new version of the field has the same type as the old:
```
record {x : opt nat} <: record {} <: record {x : opt text}
rossberg marked this conversation as resolved.
Show resolved Hide resolved
```
In this case, a direct access of the old value under the new interface, without going through the intermediate version, will be handled according to the ordinary rule for options, keeping its value.
The semantics has no way of knowing whether that is intended or not, but this behaviour avoids any semantic ambiguity.

However, this implies that -- although subtyping is transitive and the respective conversions compose -- the composition of conversions is not always *coherent*, i.e., it may return values different from a direct conversion.
Unfortunately, a fully coherent semantics seems impossible to achieve for this case.

In practice, users are not recommended to remove optional fields in an upgrade, avoiding the scenario.
Instead, the preferred way to avoid backwards compatibility issues is to *deprecate* such fields by giving them type `deprecated`, a shorthand for `opt empty`.
Accordingly, both the following hold:
```
record {x : opt text} <: record {} <: record {x : opt nat}
record {x : deprecated} <: record {x : opt T}
record {x : opt T} <: record {x : deprecated}
```
**TODO: Sanity continues from here.**
The first is for outbound uses and holds by usual subtyping, the second is for inbound uses, by the auxiliary rule above.

Dually, forward compatibility can be ensured by *reserving* a field via the type `reserved`, a shorthand for `opt any`.
Accordingly, both the following hold:
```
record {x : opt T} <: record {x : reserved}
record {x : reserved} <: record {x : opt T}
```
The first use again is for outbound uses, by usual subtyping, the second for inbound ones, by the auxiliary rule above.

Note: The auxiliary subtyping rule above should not be invoked in a static check for upgradability of a service, with the exception of these two cases. We expect tooling to warn about any such dependency on the rule. The only purpose of allowing the other cases is for dynamic coercions, i.e., to make sure that subtyping is transitive and therefore avoids unexpected runtime failure after multiple upgrades in all cases.


#### Variants
Expand All @@ -679,8 +722,8 @@ variant { } <: variant { <fieldtype'>;* }

<datatype> <: <datatype'>
variant { <fieldtype>;* } <: variant { <fieldtype'>;* }
------------------------------------------------------------------------------------------------
variant { <nat> : <datatype>; <fieldtype>;* } <: variant { <nat> : <datatype'>; <fieldtype'>;* }
----------------------------------------------------------------------------------------------
variant { <fieldtype>;* <nat> : <datatype> } <: variant { <fieldtype'>;* <nat> : <datatype'> }
```

#### Functions
Expand Down Expand Up @@ -714,6 +757,7 @@ service { <name> : <functype>; <methtype>;* } <: service { <name> : <functype'>;
### Elaboration

To define the actual coercion function, we extend the subtyping relation to a ternary *elaboration* relation `T <: T' ~> f`, where `f` is a suitable coercion function of type `T -> T'`.
To express this function, we assume a simple lambda calculus over IDL values.


#### Primitive Types
Expand All @@ -724,15 +768,15 @@ To define the actual coercion function, we extend the subtyping relation to a te
<primtype> <: <primtype> ~> \x.x


------------------
Nat <: Int ~> \x.x
-------------------
Nat <: Int ~> \x.+x


------------------------------
<datatype> <: reserved ~> \x.x
---------------------------------
<datatype> <: reserved ~> \x.null


------------------------------
-------------------------------------
empty <: <datatype> ~> \_.unreachable
```

Expand Down Expand Up @@ -768,20 +812,23 @@ record { <fieldtype'>;* } <: record { } ~> \x.{}
<datatype> <: <datatype'> ~> f1
record { <fieldtype>;* } <: record { <fieldtype'>;* } ~> f2
----------------------------------------------------------------------------------------------
record { <nat> : <datatype>; <fieldtype>;* } <: record { <nat> : <datatype'>; <fieldtype'>;* }
record { <fieldtype>;* <nat> : <datatype> } <: record { <fieldtype'>;* <nat> : <datatype'> }
~> \x.{f2 x with <nat> = f1 x.<nat>}
```

TODO: Fix
```
record { <fieldtype>;* } <: record { <fieldtype'>;* } ~> f
-------------------------------------------------------------------
record { <fieldtype>;* } <: record { <nat> : null; <fieldtype'>;* }
------------------------------------------------------------------
record { <fieldtype>;* } <: record { <fieldtype'>;* <nat> : null }
~> \x.{f x; <nat> = null}

record { <fieldtype>;* } <: record { <fieldtype'>;* } ~> f
-----------------------------------------------------------------------------
record { <fieldtype>;* } <: record { <nat> : opt <datatype>; <fieldtype'>;* }
record { <fieldtype>;* } <: record { <fieldtype'>;* <nat> : opt <datatype> }
~> \x.{f x; <nat> = null}

not (<datatype> <: opt <datatype')
record { <fieldtype>;* } <: record { <fieldtype'>;* } ~> f
------------------------------------------------------------------------------------------------
record { <fieldtype>;* <nat> : <datatype> } <: record { <fieldtype'>;* <nat> : opt <datatype'> }
~> \x.{f x; <nat> = null}
```

Expand Down Expand Up @@ -1081,6 +1128,10 @@ Note:

To enable convenient debugging, we also specify a text format for IDL values.
The types of these values are assumed to be known from context, so the syntax does not attempt to be self-describing.
Type annotations can be given to disambiguate.

Note: It is expected that reading IDL values (in diagnostics or logs) is by far the dominant use case.
For that reason, the syntax is designed for easy readability instead of convenient writability. To that end, it intentionally matches the type syntax.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sneakily sneaking controverial changes in with unrelated PRs… ;-)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not a change, just a clarification/rationale for what's there already. ;) But I'm happy to do that in a separate PR if you think that's worthwhile.


```
<val> ::=
Expand Down