-
Notifications
You must be signed in to change notification settings - Fork 556
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
sketch of an extension manifest #1221
Conversation
Signed-off-by: Kuat Yessenov <[email protected]>
0cc15d9
to
2d9e2cd
Compare
}; | ||
|
||
// Extension description for a data plane filter. | ||
message FilterExtension { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@htuch : I'm curious to hear your thoughts on this API sketch to globally apply a filter across xDS configs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it would also be valuable to hear from @alexburnos here, as he has experience with similar systems.
// variable (ISTIO_META_ISTIO_VERSION) in the Istio proxy docker | ||
// image. Custom proxy implementations should provide this metadata | ||
// variable to take advantage of the Istio version check option. | ||
string proxy_version = 1; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should we consider making this semver instead of regex?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sounds good to me.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
some more context - back when we made this in EnvoyFilter we had versions like master-latest-daily. We now call these 1.5-dev.SHA
for example (technically its not semver without 3 numbers but close enough to work) so it should work
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This was a requests in the original envoyfilter API as well, but we did not have correct proxy version that time.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Semver in UDPA is just a numeric version. Don't we need to handle ASM qualifier in addition to semver?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess that may be an issue, may want to select on 1.4.2-custom
vs 1.4.2
. Or we could say that is overloading the version and should just set another metadata field and use that?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The Envoy BuildVersion
in envoyproxy/envoy#9301 includes metadata capable of expressing these labels.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ISTIO_VERSION is part of node metadata, so we would be regex matching on a field in metadata. We could express it as a map from a field to a string matcher, but that feels too level in istio, since it does not convey the purpose of using the version.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Essentially you need the version to check capabilities of the proxy itself (will it accept the extension or not). What is the reason not to rely on BuildVersion, which would make Istio more UDPA compliant, vs Istio-specific API such as using ISTIO_VERSION metadata?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think we have any reason other than that build version doesn't exist yet as far as I can tell? We would likely need to fallback to istio_version for old proxies even when it does though
@kyessenov can you update the design doc and discussion about the analysis of this API, otherwise the same question will have to be rehashed here. |
Design doc updated. Please comment there https://docs.google.com/document/d/1jFuZM4UCKw6AMDmXMIYqbdlyJk95ut2zMPLSNdrUtW8/edit# |
// variable to take advantage of the Istio version check option. | ||
string proxy_version = 1; | ||
|
||
// Match on the node metadata supplied by a proxy when connecting |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How does this interact with the match criteria enabled by envoyproxy/envoy#9301? Specifically the ability to match on the presence of a given filter?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure we need that match criterion in istio. We generally know what filters are packaged, and don't select on presence or absence of compiled filters. In the future, we expect filters to be dynamically loadable.
|
||
// Modification operation (function is optional in case of replacement or | ||
// removal). | ||
Operation operation = 6; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not an Istio API expert, but this strikes me as providing an imperative vs. a declarative API. I was under the impression we tend to describe what the system should be like, rather than how to get there.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This part is debatable. We have a choice here.
We can forgo these operations, and then leave “how to get there” to CI/CD
In that case you install a whole new named manifest in your namespace, so someone else must do copy and update operation outside the istio API.
Please take a look at the linked doc where we have listed the use cases this needs to support.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@htuch We really only need the removal operation since there is no other way to provide an override for a filter that disables it. Think of a per-namespace declaration that disables a root default. .
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it would be cleaner (and very simple) to add a per-filter disable override in Envoy. We've discussed this before and it just needs O(1 day) of effort to add if someone is inclined.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If this is a bool inside the filter proto, it would mean merging the default proto with the override. We want to avoid proto merges.
a “conditional” dispatch of filters is a useful feature, and we have an open feature request in envoy for that.
However wouldn’t it be better to not send a filter configuration at all instead of sending it with a disabled flag?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@htuch Are you thinking of adding a bool field to filter config protos to disable it? I think that might be useful with filter chain discovery, to be able to dynamically disable filters. If so, we could mirror it here with a bool field instead of operation enum.
Signed-off-by: Kuat Yessenov <[email protected]>
Updated to provide an override semantics based on the name of the filter as the primary key / URL of a filter config. This allows a more fine-grained selection instead of the whole manifest override, and is in fact, required to express config canarying (e.g. use config A if label is matched, but then fallback to config B). |
I like the direction of this PR. That said, I am not fully convinced that it warrants a new API. There are several incremental usability improvements that are directly applicable to the EnvoyFilter API [this needs renaming]. Secondly, the notion that
is short sighted. Most people who use some external authZ callout may also need a cluster to talk to the external service wont they? So, simply defining the extension without defining the cluster seems moot. Also,
In the current API, we do merge ONLY if the user wants to merge. We do a simple list append if the user wants to add/insert. So, I dont see this as an improvement. It feels like simply removing a feature as it is expensive. Thirdly, extension is IMO subjective: people who want to tweak XFF settings are also extending the system because Istio does not provide a way to do so. Same goes for people who are tweaking some stream timeout (one of the N different timeouts) in a route because Istio exposes just one single knob for the common case use. If we go with a dedicated API for just filters (as was the case with envoyFilter v1), then you still have the gap for people who want to control these basic parameters. Incidentally, the entire reason for a merge operation is to support this feature (ability to tweak the listener/cluster/hcm) - most people use merge only for this, not for filters. Finally, how will you handle unknown type URLs in Pilot? We need to pre-process all the extensions into their respective go protobuf structs so that we can insert into the list of filters. Is there some magic that allows us to read in the json config and ship it as is to envoy? |
They should be using native APIs to define services and service entries to add an endpoint. I don't think we need a third way to add clusters to Envoy.
Yes, it's a categorical rejection of merging functionality since it makes it imperative and difficult to write an admission controller for. It's hard to tell the effects of a merge without execution, so difficult for an admin to ensure that bad things are not allowed. We don't want to allow arbitrary scripting changes to xDS.
I think it's a known issue that HTTP connection manager is a monolith and is becoming unwieldy to configure (cc @htuch). The solution is to break it apart around features, and allow those features to be applied independently from one another. This API is not aimed to solving that problem, however. The use case to replace a HCM config or any other core config (tcp_proxy) seems out of scope.
Yes, it's called UDPA typed struct. |
Signed-off-by: Kuat Yessenov <[email protected]>
Operation operation = 6; | ||
|
||
// Purpose of the extension used to place the filter in a chain. | ||
Function function = 7; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
name: filter_phase
// Opaque configuration for the filter. | ||
oneof configuration { | ||
// Inlined extension configuration. | ||
Extension config = 4; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Supporting both ref and inline means that k8s rbac cannot be used. We need to use a more sophisitcated admission control.
UNSPECIFIED_FILTER_TYPE = 0; | ||
HTTP_FILTER = 1; | ||
NETWORK_FILTER = 2; | ||
UPSTREAM_NETWORK_FILTER = 3; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's the difference between NETWORK_FILTER and UPSTREAM_NETWORK_FILTER?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
NETWORK_FILTER is a downstream network filter, UPSTREAM_NETWORK_FILTER is an upstream network filter a.k.a cluster filter.
// Opaque configuration for an extension (wire and JSON compatible with UDPA). | ||
// The extension resource allows white-listing pre-defined configuration for | ||
// extensions. | ||
message Extension { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Naming nit: maybe "ExtensionConfig"?
|
||
// The required name of the filter instance. The name should generally be | ||
// unique in a filter chain. | ||
string name = 3; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I assume it is not a UDPA Filter name field (as it has uniqueness requirement)? In this case, where is UDPA filter name will be specified, in the Extension?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm working to change UDPA filter name to be free-form in envoy. UDPA filter type URL is sufficient to determine an extension.
// application of an extension in a chain of extensions. Istio recognizes the | ||
// following categories of extensions and applies them in the order specified | ||
// here. | ||
enum Function { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What would be the mechanism for two independent providers of filters to coordinate relative order?
I.e. I have filters A and B, is there a way to specify that B should go before A?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We don't want to allow specific total order of filters. A partial order constraint like this enum is all we want to support. That is we can express that authorization filters should come before post-processing filters, but no specific order between authorization filters.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure about that. Having at least a relative ordering within each category might help with inter-dependant filters. I know it sounds like a long shot at this point but limitations like this also cause problems with more complex setups of e.g. MutatingWebhooks or CNI plugins - as for CNI, we've hit this problem in Istio even.
Could be solved with a simple priority
integer, where if you have filters 1,2 with type==Authorization
and priority(1) > priority(2)
, filter 1 is guaranteed to be injected earlier in the filter chain. priority(1) == priority(2)
could be undefined order, caught using our ValidatingWebhook.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is no impl under way yet. We are still in the process of getting agreement.
Re: ordering, integers priority helps, but what is easier is to preserve the manifest order in the actual output within a filter type.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks, yeah I remember now that this was talked about in the last meeting. I'm not sure that's very user-friendly- it's non-obvious that this implicit order matters, as the filters could be matching on completely separate sidecars. It also leaves out how this would behave if different ExtensionManifests
add filters to the same Envoy config. Which one wins?
I am not fully convinced. This PR is effectively rolling back EnvoyFilter to its older incarnation (that allowed only add/remove of network and http filters). We improved it solely based on user feedback that such an API was insufficient for their needs (including the idea of adding clusters as many complained that using Service Entries to force cluster creation was unweildy and created unwanted listeners/routes, etc.). Editing HCM or tcp proxy should be well within the scope because they are also filters. Special casing these two filters will lead to a brittle solution as we would have to add more special cases in future (thrift, dubbo, etc.) with no end in sight. Nor is it useful to create one api for one set of filters and another API for a different set of configs in envoy. Fwiw, writing an admission controller to deny any EnvoyFilter with merge op is not that hard either. So, I dont see why it is necessary to remove it from the API when people who dont want it can write their own admission controllers to reject such objects.
LOL! So we are back to shipping structs. So much for all the work around using any typed filters. |
This PR is not trying to solve every problem with extensions. We can add additional extensions since the top-level field is a list of filter extensions. Each API needs a purpose, not an un-scoped fits-all solution.
I beg to disagree. It's not easy to write an admissions controller for operations that include protobuf merge. You would need to have a very solid idea what pilot generates (which is also changing per version of Istio). Merge operations are not commutative, so you would also need to think through sequences of operations which is against the principle of a declarative API. Finally, for core filters, it's simply not future-proof since we don't specify the number of HTTP connection managers (one per listener, one per chain, one across all listeners?) or how they are partitioned.
Struct is the only way to ensure that both Pilot and k8s can ingest config without a descriptor. We can add an optimization later for Pilot to load additional descriptors and encode struct as any before shipping to Envoy. |
This is just for the extension anyways, right? I don't think its horrible to have extensions as struct, but everything being a struct would be horrible |
When you start adding Clusters, Listeners and route edits to this extension manifest, you get the existing EnvoyFilter API. And there is no escaping from adding clusters and listener edits if you want to solve the basic use case of wanting to enable certain fields that Istio does not care about, or needing to add clusters used by the extensions with custom tls settings or dns settings. Put another way, if you move the spec.Filters under the existing EnvoyFilter and tweak some existing semantics like no editing filters in listeners, you get the same output. You can still write a very simple admission controller that says only admin namespaces can have listener edits while consumer namespaces can have extensions, along with the approved/banned extensions stuff talked about earlier. So the question again, is why a new CRD? I dont mind renaming the EnvoyFilter to Extensions, but I certainly do not want to different ways of editing the envoy config and all the associated confusion that it creates. |
// be in the namespace/name format. Use this field in conjunction with the | ||
// portNumber to accurately select the Envoy route configuration for a | ||
// specific HTTPS server within a gateway config object. | ||
string gateway = 5; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
a gateway can have many Servers with its own hosts. Each of which ends up in its own filter chain if using TLS. So you need a way to specify the hostnames here as well, so that you can match a filter chain based on the sni.
The match as it stands is insufficient as people have most commonly wanted to add a lua filter only for certain paths and not for all routes (like /user/.. and not for /index.html). I have seen this way too many times to keep track. This then requires route.PerFilterconfig in envoy. You cant do that with the way you have expressed the matches as it stand today. You would also now need to match on a specific route by name so that you can add this config to the perFilterConfig.
Finally, since we are in the land of designing for future, we also need to take into account the passthrough listeners. People will want to write loggers/trackers/enforces for the traffic that is passthrough while using standard rbac for the rest.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't per filter config be part of virtual service or other service/route-specific resource? This API is written from an administrator point of view (MUST enable telemetry and authorization) so opting out should be an override.
also, fwiw, here is the deprecated EnvoyFilter API https://github.com/istio/api/blob/1.2.10/networking/v1alpha3/envoy_filter.proto that is essentially doing the same thing as the extension stuff |
I don't understand why listeners edits are essential. Again, the purpose of this effort to define a declarative intent-based API for extensions. EnvoyFilter is a scripting language in disguise which I hope you agree is not amenable to static analysis (or you have some magic way to prove programs are safe). That's the primary reason for this new CRD, to make something more declarative that is also hopefully not entirely tied to a specific istio and envoy versions. I also don't understand about the different way to edit envoy config argument. Every single networking API is editing envoy config in some ways. The reasons why I'm restricting to extensions is because I'm not convinced we need to model cluster patches the same way. They more naturally belong to other istio resources, with some opaque extensions. Same for route per-filter config, it naturally belongs to virtual service extension similar to ingress v2. |
The previous iteration is close but it's not handling several issues:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks really good so far, is there already work underway on a demo implementation in pilot code?
// application of an extension in a chain of extensions. Istio recognizes the | ||
// following categories of extensions and applies them in the order specified | ||
// here. | ||
enum Function { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure about that. Having at least a relative ordering within each category might help with inter-dependant filters. I know it sounds like a long shot at this point but limitations like this also cause problems with more complex setups of e.g. MutatingWebhooks or CNI plugins - as for CNI, we've hit this problem in Istio even.
Could be solved with a simple priority
integer, where if you have filters 1,2 with type==Authorization
and priority(1) > priority(2)
, filter 1 is guaranteed to be injected earlier in the filter chain. priority(1) == priority(2)
could be undefined order, caught using our ValidatingWebhook.
|
my suggestions, derived from the problems outlined in my previous comment would therefore be:
|
This is stale. |
API to add envoy filters to istio:
cc @mandarjog @howardjohn @louiscryan @rshriram
Signed-off-by: Kuat Yessenov [email protected]