Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

AIP-180 default breaking changes #1076

Merged
merged 5 commits into from
Jun 21, 2023
Merged
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
87 changes: 87 additions & 0 deletions aip/general/0180.md
Original file line number Diff line number Diff line change
Expand Up @@ -139,6 +139,93 @@ code, as such changes will be seen as breaking by those users.
whether a proposed change is likely to break users, and an expansive reading of
this guidance could ostensibly prevent _any_ change (which is not the intent).

#### Default values must not change

Default values are the values set by servers for resources when they are not
specified by the client. This section only applies to static default values within
fields on resources and does not apply to dynamic defaults such as the default IP
address of a resource.

Changing the default value is considered breaking and **must not** be done. The
slevenick marked this conversation as resolved.
Show resolved Hide resolved
default behavior for a resource is determined by its default values, and this
**must not** change across minor versions.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wanted to sanity check with @loudej on this one.

The motivation and client-friendliness makes total sense to me. At the same time, I'm wondering about how we do enable use cases like modifying the default disk type / instance type for compute APIs. New instance types come out all the time, and using them as the default helps GCP customers get more performance / $.

One way I could imagine that being solved is by having clients always intern the defaults, and having servers never set the default. WDYT?

Maybe if that was the solution then this guidance would still hold (don't change the server-side default).

Copy link

@rileykarson rileykarson Apr 26, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are direct API clients relying heavily on defaults? Changes [outside of controlled windows] are probably more harm than good for IaC. I would consider having gcloud/Console change defaults clientside at will, and updating IaC during breaking change windows (or never).

If volatility across all clients is desirable, a double simplex (whether client or server side, although server side would be a little nicer imo. with obvious bias as a client owner.) could be viable.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One way I could imagine that being solved is by having clients always intern the defaults, and having servers never set the default. WDYT?

So essentially mark all fields with defaults as required instead and bake the defaults into clients? It could work, but may have other usability issues like requiring a much larger set of values to be specified if a user is working with an API outside of a client.

Overall I think the suggestion of a double simplex can improve this situation greatly, as it would remove the direct connection between the unset value in an IaC client and the server-set default. Without that connection I expect we wouldn't see the issues that cause data loss when these defaults change.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are multiple points here.

  1. Default values can't really change without a new major version unless the default value was always specified to be a dynamically chosen default, in which case the value must be pinned for a specific resource. It shouldn't change once the resource has been created. Double simplex is probably the best option for such cases, but this approach shouldn't be applied to every field.

  2. The values need to be easily discoverable, such as for policy enforcement. In Kubernetes we didn't want admission controllers, asynchronous controllers, and other clients to need to try to figure out what default values were or how they were set or whether they were even defaults. Thus, Kubernetes explicitly sets all values that need to have values. However, that requires clients to gracefully accommodate that behavior. Server-side apply was necessary for Terraform integration. One idea we've discussed is (optionally) returning these values in validate_only responses and with a special Get request parameter.

  3. Always setting the defaults in clients seems like a recipe for inconsistency and maintenance problems to me. Plus it doesn't work for attributes that are newer than the clients. Effectively it creates a "blueprint", regardless of what type of client/surface was used to create the resource and whether it was hardcoded or data-driven.

We need to provide more complete guidance around unset values, default values, dynamically generated / allocated / assigned values, and other values transformed by the service (semantic strings, associative lists, etc.), but I'm comfortable with the statement that changing default values or explicitly setting default values when they weren't previously populated explicitly are non-backward-compatible changes within the same major version.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll raise my hand and say I don't know what's meant by "double simplex" here.

Just to establish baselines, can we all agree that @bgrant0607's statement of:

It shouldn't change once the resource has been created.

... is correct? If a resource has been created, and the server has populated the field, it mustn't change to some other value later. I suspect we're all happy with that.

My guess is that the other side of Brian's point 1 (the viability of "if the client doesn't specify a value for this field, the server may choose a default, and that default may vary over time) will be less unanimously agreed. Would it be okay if the server guaranteed to always use the same value if the client specifies the same API version? (Looking to future work.)

I don't quite follow the point around policy enforcement, but I do agree it would be nice to be able to access the defaults... although that could be tricky if "default for field X depends on field Y".

Putting everything client-side does feel problematic - if we can agree on that much, we don't need to investigate the details of that further.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

to summarize my stance: I don't agree with the principle, I just worry it's not defensible because this happens often enough to be a practice. e.g. default instance type for VM creation for compute.

That said it looks like GCE hard-codes disk types in the client. And instance types don't have defaults at all. I guess I was thinking of AWS.

I'm still fairly skeptical of enforcement but I agree with the rationale here. We can try enforcing this guidance for now and see how it goes.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

GCE doesn't follow the AIPs. They explicitly set defaults in the resources, more like Kubernetes.

Something like the machine type or disk interface is actually in the category of "dynamic defaults". It's dynamically selected, allocated, computed/generated, etc. The "default" may change slowly in practice, but it could be different across regions or set via project-level settings or something as well as changing over time.

"double simplex" was used in Greg's exploratory doc to describe the pattern of 2 fields, one input and one output. The output field would always return the effective value used by the system. The input value would respect the rules about not changing what was provided by the client.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And if the server could change a default value without a version change, then it MUST somehow persist the selected value in the resource, and return that value somewhere.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jskeet re:

Would it be okay if the server guaranteed to always use the same value if the client specifies the same API version?

Likely yes, but it may depend on the exact details of the API versioning scheme.

The double simplex idea is an interesting way to mitigate these problems going forward, but I don't think it has much bearing on this proposal which is targeted at preventing breaking changes in existing APIs. We do need to come up with a plan for better handling of default values overall for new APIs.

For existing APIs though we have limited options and I believe the best path is to stop changes in server-side default values for existing APIs. Clients can modify client-set default values as appropriate to accommodate changing preferences and more efficient instance types in a non-breaking way. APIs can also change defaults during a major version increment.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Having reread all of this, I'm pretty sure that:

  • We need a way of defaults changing over time (think encryption algorithms etc)
  • Some clients (e.g. Terraform) need a level of stability

Personally I'd be okay with having this guidance for now, with an understanding that when client-specific versioning happens, we change the guidance to "changing the default requires a new version, and the server will still honor the previous default when the client specifies the old version". Terraform can then consider "we've changed default" to require a new major Terraform provider version.

I think that's the best compromise we'll be able to reach, but I'd like to check that both @slevenick and @toumorokoshi are okay with it.


For example:

```proto
message Book {
string name = 1;
// The genre of the book
// If this is not set when the book is created, the field will be given a value of FICTION.
enum Genre {
UNSPECIFIED = 0;
FICTION = 1;
NONFICTION = 2;
}
}
```

Changing to:

```proto
message Book {
string name = 1;
// The genre of the book
// If this is not set when the book is created, the field will be given a value of NONFICTION.
enum Genre {
UNSPECIFIED = 0;
FICTION = 1;
NONFICTION = 2;
}
}
```

would constitute a breaking change.

#### Serializing defaults

APIs **must not** change the way a field with a default value is serialized. For
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems reasonable - I suspect it'll all get muddier when we've thought more about field presence, but I'm okay with this description as it is for now.

Changing it to "resource field" might be worth while though.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A lot of the difficulty here comes from inconsistent behavior for serializing a default value (default as in runtime-set that is not defined in the proto) so I actually think that field presence would fix it. In a world where field presence is meaningful the problem goes away because it's clear that what unset means.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And you can't have default values for fields where presence is meaningful. At least that's my understanding

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm... I can imagine a case where field presence is useful in create/update (so that the client can say whether they're trying to update it) but where you always want the server to specify a value when fetching, so the consumer can know what value is in effect. That makes it sort of "semi-optional" - which can't easily be expressed just in proto terms, of course.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If field presence is meaningful in create/update then declarative clients need to know what value was actually set (or unset) when they retrieve the resource. "semi-optional" is not a feasible solution, and must be prohibited from existing.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you go into more detail about this? Suppose we have a situation where the client doesn't specify a value, and the server populates it with a default and returns that value. So long as the result is exactly the same as if the client specified that value to start with, is there still a problem? I may well be missing something. If you could give an example of a problematic flow, I'll see whether we're talking about different things, or whether there's an aspect I was missing.

My understanding of this "semi-optional" field is as such:

If the field is specified by the client it works as normal.
If the field is unspecified by the client, a server-set default is returned

There are a couple potential behaviors for when a client sends an empty value during a PATCH request:

  1. Server ignores the field
  2. Server sets the field to the empty value
  3. Server sets the field to the default value it would have chosen if the resource was created with the field unspecified

1 is just a normal field where presence is not meaningful
2 is inconsistent with the behavior of the field on create, which doesn't make sense. In order to create a resource with an empty value for this field the client would need to create and then update immediately to remove the field value
3 is implementing a sort of "reset" functionality for the field, which doesn't seem to fit standard CRUD. It could make sense to have on a custom method if the functionality is needed for the API, but doesn't fit into standard resource behavior that should be implemented in clients

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

1 isn't quite the same as a normal field - because if the field has the optional keyword (i.e. presence is available) then the server can tell the difference between the client sending a value of 0 and not sending anything at all. I think it would be reasonable to treat not sending anything as "don't change this", if the field is always effectively present.

(There's a harder decision around how to send an update which explicitly wants to clear a field when that's a meaningful state.)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, yeah that's an odd one. It's similar to a normal field with a server-set default, but adds the zero value as a settable value. I'm concerned with how we would communicate this to a client, as the optional keyword will generally signify that a field can be unset, but in this situation sending unset would have an entirely different behavior.

What would the overlap be between the unset value meaning "don't change" and setting an updateMask for that field? What if these conflict, where a user sends unset but sets the updateMask

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What would the overlap be between the unset value meaning "don't change" and setting an updateMask for that field?

Not as well defined as I'd like, basically. (That tends to be the case for anything to do with FieldMask, unfortunately.)

What if these conflict, where a user sends unset but sets the updateMask

This could be the signal used for "please clear the field" - which would be invalid in the case I'm thinking of, where the field always must have a value, it's just it doesn't have to be specified by the user.

Taking a step back, "The user doesn't have to set this explicitly, but 0 is a valid value" sounds like a reasonable pair of requirements to me. It's unclear how important it is as a pair of requirements, mind you.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Taking a step back, "The user doesn't have to set this explicitly, but 0 is a valid value" sounds like a reasonable pair of requirements to me.

Yes I agree this is a reasonable behavior for a field. I think the problem is that we don't have a clear way to handle these types of fields currently. We would need a revamp of FieldMask & existing functionality before this can be widely adopted

example if a field does not appear in the response if the value is equal to the
default, the serialization **must not** change to include the field with the
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: I think the examples would be easier to read as bullets and not as several paragraphs. But I won't block this PR on that.

default. Clients may depend on the presence or absence of a field in a resource
as semantically meaningful, so a change to how serialization is done for absent
values **must not** occur in a minor version.

Consider the following proto, where the default value of `wheels` is `2`:

```proto
// A representation of an automobile
message Automobile {
// The name of the automobile.
string name = 1;

// The number of wheels on the automobile.
// The default value is 2, when no value is sent by the client.
int wheels = 2;
}
```

First the proto serializes to JSON when the value of `wheels` is `2` as follows:

```json
{
"name": "my-car"
}
```

Then, the API service changes the serialization to include `wheel` even if the
value is equal to the default value, `2` as follows:

```json
{
"name": "my-car",
"wheels": 2
}
```

This constitutes a change that is not backwards compatible within a major
version.

## Further reading

- For compatibility around pagination, see [AIP-158][].
Expand Down