KEP-2896: OpenAPI V3 #2898
- Support publishing and aggregating OpenAPI v3 for all kubernetes types
- Published v3 spec should be a lossless representation of the data and no fields should be stripped for both built-in types and CRDs
- Instead of serving the entire OpenAPI spec at one endpoint, separate the spec by group and only serve the resources required by each group
I guess this will come further down: by group is unfortunate as this means that the aggregator has to call out to each APIService and merge specs again. By group-version would remove this need.
I don't have any strong opinions about this, and will just list the benefits of each. It does seem like group/version reduces the complexity on the server side while potentially trading some performance on the client side.
Group:
- Fewer calls needed to download a subset of the schema
- Total size of schema downloaded is smaller if the entire schema needs to be aggregated client side
Group/version:
- APIService does not need any additional aggregation
- CRDs schemas do not need to be merged
@apelisse did a size analysis for group vs group/version for built-in resources.
| Split | Cardinality | Avg Size (stddev) | Aggregate Size |
|---|---|---|---|
| No split | 1 | 131kB | 131kB |
| Group | 22 | 14kB (12kB) | 308kB |
| Group/Version | 44 | 11kB (10kB) | 484kB |
Which clients do you have in mind that care about more or less calls? Is there a client we see that needs the whole spec? I could see some client generators that would, but they don't care about 22 or 44 requests as they are batch-like jobs.
I don't know of any latency-sensitive clients that would care about the actual number of calls; it seems more of a nice-to-have. Different specs also have overlap that would be reduced by splitting only per group (total aggregate size is 30% smaller), but given the advantages of omitting an aggregation step with group-version, let's adopt splitting by group-version.
Maybe late to the party, maybe not... I have been studying the API specification for several years and I want to dump my thoughts here for consideration. While I understand the benefits of generating API spec from source code, maybe we can/should switch to a different working model when we are migrating to OpenAPI v3. There are some OpenAPI spec features that are difficult to generate from source code. For example:
What I'd propose is a standalone API specification that is OpenAPI v3 format, SEPARATELY MAINTAINED rather than generated from the source code. We can use the 1.22 spec as a starting point, for example. The API validator logic validates API objects based on this specification. If we have some requirements that are difficult to express using OpenAPI spec, we either add some extensions to the spec, or we at least mention such constraints in the 'description'. API specification is the single source of truth. We do conformance tests based on the spec. Source code comments are for developers. They can put whatever comments they need. They don't need to worry about the fact that the comments are "user-facing". They can put 'TODO's there when needed. We can remove those special tags in the comments rather than pushing hard on adding more tags. I'm not quite aware of the drawbacks of this working model, so I'm open to all thoughts.
I don't think it is realistic to turn around the complete development model of Kube APIs and center it around OpenAPI, at least for native APIs, not without giant effort. It's definitely out of scope of this KEP. Also note that for CRDs we basically follow what you suggest in the schema definition, technically. In practice though, nobody defines CRD schemas OpenAPI-first and then checks compliance of the Golang types via testing, although CRDs are totally capable of this way of working. For me this is a strong sign that this is not more viable or elegant than the approach we have.
As we do today in CRD schemas, same approach.
They can today. We have two blocks of comments, separated by an empty line. If you see TODOs in API docs, it's a bug in the source code. Am not sure whether kubebuilder supports that, it is used everywhere in k/k types.
This sounds more like a comment about kubebuilder than about the mechanisms to publish OpenAPI v3 in this KEP.
The core drawback is that the proposal has no proof that it works better. I would suggest building a prototype of this model for CRDs, i.e. basically an alternative tool chain to kubebuilder that is centered around OpenAPI. I feel like this is a problem very orthogonal to this KEP. About the more specific questions from the top of the comment:
Why can't we?
I don't see how this is a problem of the approach of generating from code. We admittedly don't use example, but could. I don't think there is a deep philosophical reason that we don't in our specs.
I'll try to find some time slots for this, although I'm not confident the "shortcomings" I mentioned can be fixed if we don't switch the development model.
The PRR looks good. I am concerned about telling clients that they need to look at openapi/v3 and openapi/v2 to have a complete view of content. I think doing that means that we will be unable to remove openapi/v2. If instead we find a way to show all the data (even in a slightly lossy way) via openapi/v3, then we have the ability to deprecate and eventually remove the openapi/v2 endpoint because we'll be able to direct clients to consume all the schema information via openapi/v3.
v2->v3 is fine as a beta requirement. We will attempt that. I think it is ugly, but feasible.
Updated v2->v3 as a beta requirement.
@tengqm previous ideas to switch to some sort of IDL instead of go structs haven't gotten very far (e.g. see this one). I'm not sure how we could align on a specific IDL at this point. I am pretty sure, however, that there's little chance people would agree to hand-write OpenAPI and make that the source of truth.
/label tide/merge-method-squash holding to prevent instant merge. I think we've got all comments in, but I'll quickly solicit to be sure.
[APPROVALNOTIFIER] This PR is APPROVED. This pull-request has been approved by: deads2k, Jefftree
/hold cancel
Thanks everyone for getting this merged, I'm very excited about it!
* OpenAPI v3 * Add heuristics note
Based on the provided group, clients can then request `openapi/v3/apis/apps/v1`,
`/openapi/v3/apis/networking.k8s.io/v1` and etc. These leaf node specs are self
contained OpenAPI v3 specs and include all referenced types.
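The per-group-version layout quoted above can be sketched as a small path builder. This is only an illustration: `specPath` is a hypothetical helper, and the empty-group branch (the core API group served under `/openapi/v3/api/<version>`) is an assumption that the quoted KEP text does not spell out.

```go
package main

import (
	"fmt"
	"path"
)

// specPath returns the OpenAPI v3 path for one group-version, following the
// layout in the quoted KEP text. The empty-group branch is an assumption.
func specPath(group, version string) string {
	if group == "" {
		// Assumed location for the core (legacy) API group.
		return path.Join("/openapi/v3/api", version)
	}
	return path.Join("/openapi/v3/apis", group, version)
}

func main() {
	fmt.Println(specPath("apps", "v1"))              // /openapi/v3/apis/apps/v1
	fmt.Println(specPath("networking.k8s.io", "v1")) // /openapi/v3/apis/networking.k8s.io/v1
}
```

Because each leaf spec is self-contained, a client that only cares about one group-version needs exactly one such request.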
I know it is late, but why don't we publish under `/apis/networking.k8s.io/v1/openapi`? It will remove all wiring into the aggregator.
cc @liggitt @deads2k for opinions.
This feels like a natural step and considerably simplifies the wiring and architecture within kube-apiserver. Basically apiextensions-apiserver will serve the spec directly. kube-aggregator will stop internally downloading the json blob and caching it in the aggregation layer. In the same way, kube-aggregator will not need special handling for openapi because /apis/foo/v1 is proxied to the aggregated apiserver already. That was the original reason why we switched to per-group-version specs here in the enhancement.
`/apis/networking.k8s.io/v1/openapi` conflicts with the endpoints for an `openapi` resource
(and matches the pattern for a resource URL instead of a non-resource URL... it would get authorized as `verb=list,resource=openapi,group=networking.k8s.io,scope=cluster-wide`)
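To make the conflict concrete, here is a minimal sketch of how a GET path might be classified as a resource or non-resource request. It is a simplification under stated assumptions, not the actual kube-apiserver request parser: `attrs` and `classify` are hypothetical names, only GET requests are considered, and namespaced paths and the core `/api` prefix are ignored.

```go
package main

import (
	"fmt"
	"strings"
)

// attrs is a hypothetical, simplified stand-in for authorization attributes.
type attrs struct {
	ResourceRequest bool
	Verb            string
	Group           string
	Version         string
	Resource        string
}

// classify treats /apis/<group>/<version>/<resource>[/<name>] as a resource
// request and everything else as a non-resource URL. GET only; namespaced
// paths are deliberately not handled in this sketch.
func classify(p string) attrs {
	parts := strings.Split(strings.Trim(p, "/"), "/")
	if len(parts) >= 4 && parts[0] == "apis" {
		verb := "list" // GET on a collection
		if len(parts) >= 5 {
			verb = "get" // GET on a named object
		}
		return attrs{
			ResourceRequest: true,
			Verb:            verb,
			Group:           parts[1],
			Version:         parts[2],
			Resource:        parts[3],
		}
	}
	return attrs{Verb: "get"}
}

func main() {
	fmt.Printf("%+v\n", classify("/apis/networking.k8s.io/v1/openapi"))
	fmt.Printf("%+v\n", classify("/openapi/v3/apis/networking.k8s.io/v1"))
}
```

Under these assumptions the first path is read as a list of a cluster-scoped `openapi` resource in `networking.k8s.io`, while the `/openapi/v3/...` path stays a non-resource URL.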
We can still proxy the requests to the apiextensions server even if the URL is different?
We can proxy them, yes. Am fine with that approach. In the implementation PR, please make sure we avoid coupling apiextensions-apiserver with kube-aggregator (the downloader comes to mind). Rather serve the per-gv specs directly from the crd handler.