-
Notifications
You must be signed in to change notification settings - Fork 4.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Clean up default socket factory in TransportSocketMatcherImpl #19149
Comments
Can you provide a more complete Envoy config of what is causing this issue (no Istio details please). It's hard for me to understand the actual problem. |
Sure: e.g when applying below filter for enabling case preserve using auto_config vs explicit_http_config,
envoy continuously fails for error/rejecting any new changes seen from proxy logs below:
|
Can you put the full config dump in a public gist? |
Also can you describe the desired flow at a high level? I'm still confused about what "auto" functionality you are trying to achieve without using ALPN. |
Sure: dump-gist |
If we have n services where n/2 services are http1 and n/2 services are http2, need to skip http1.1 case preserve if the service is http2 automatically vs defining case preserve explicitly for all n/2 services . Hence, trying to leverage auto_config by creating just one envoy filter and let case preserve by default kick in for http1 outbound services, but it seems use_alpn is mandatory if auto_config is chosen. Also, envoy cluster match is exact string match and no regex. So will end up creating n/2 filters for n/2 http1 services using case preserve which can be cumbersome so want to know if we can optimize here. Also, not sure if we have limit on number of envoy filters that we can create/support ;e.g. if we have 1k services with http1 needing case preserve, will there be a big performance hit if we create 1k envoyfilters with cluster match on those services. Wanted to double confirm |
That config dump is huge. Can you provide some kind of simplified example of what you want to achieve? I still don't understand the goal here. "auto" config basically means to use ALPN to figure out what protocol to use. If you aren't doing auto how are you choosing whether to use H1 or H2? |
Sure. amended for a sample cluster/service in the same gist to shorten the dump file.
Hence, asked if using auto_config, why alpn is mandatory ? or rather shouldnt it not auto translate to select without alpn whether h1/h2. |
Because this is what auto does. It determiens what to use based on ALPN. Given that you are already deciding whether to use H1 or H2 based on some label, I think you just want to populate an explicit_http_config for h1 for h1 services. |
Agree as thats the current behavior. Issue is to configure explicit_http_config with case preserve, we will have to populate explicit_http_config for h1 services explicitly for all clusters. Also cluster match on name only support exact name and hence if we have n clusters from same namespaces, leveraging regex would be good too (but cluster match doesnt support regex) vs n filters for all those clustes. Can we add regex support too? e.g.
Hence, in case of gateway where it allows passing both h1 and h2 traffic, using explicit_http_config with preserve case disables h2 traffic so ask is to use something like
So if services that use h2, should auto ignore preserve_case as its explicitly for h1 services only. So is it doable or should we support this case and see further? |
You are already doing this for h2 services. Fundamentally, you have to tell envoy how to pick a protocol. Either it's prior knowledge or you have to use ALPN. |
Thanks. Yes thats how its done by istio for now based on protocol .Will take a note that autoconfig will solely rely on alpn and we cannot tweak it and rather keep it as is. So it seems may be we will have to define an option in istio/k8s service itself ; say if service is h1 and needs case preserve, add an option and it should generate relevant config with preserve_case option on top of h1. Lastly, can we consider extending cluster name match to regex as an add on in envoy ? I am not sure of the original intent on why it was kept as exact match and no regex being added. Would love to see if it can be done as an add on. |
I believe I am experiencing a similar issue: The config that is erroring out for me:
This config works fine with explicit http2 but changing it to auto breaks. I've tried removing the alpn_protocols but with no luck. Am I missing something? I tried digging into the code but it seems that the validation thinks my socket doesn't support alpn but it should. |
Changing from |
Hmm, it seems like that might be a bug but I would need to look at the code. The issue is probably that if there is a matching system the config can't verify what will end up getting matched so it can verify that it supports ALPN. In your case if you only have a single match I think that is the right thing to do. |
This is being generated from istio in our case when we define the filter as we dont specify transport_socket as it always selects socket match explicitly.
|
This issue also manifests itself when wrapping a TLS socket with PROXY protocol, and I think the tap socket also same issue. |
@mattklein123 Please advice; what is the proposal/suggestion here to move forward for changes/fix further when applying to multiple clusters? |
given that auto_config only works for transport sockets with ALPN I'm not convinced allowing it for transport socket match is going to be safe. We could arguably change the config validation to runtime failures, but I don't think that would be desireable for the common case. |
OK for the matcher case, I think the problem is actually the inverse - the matcher ALPN is not considered at all, only the "default socket config". I'll go fix this. This makes the validation more strict rather than more permissive and will not help. Reading ClusterImplBase though, it looks like the default socket matcher is always config.transport_socket() and if no transport socket is configured, it defaults to a raw buffer socket. Arguably if your matcher config has its own catch-all such that the default will never be used we could allow auto_config but that's complicated enough I'm not inclined to implement it. I think there ought to instead be a way for Istio to override the default socket if there isn't already (cc @lizan and @lambdai for that) |
Creating a passthrough factory for consistent handling of wrapped characteristics for proxy_proto, tap, and tcp_stats socket factories. Risk Level: low Testing: new unit testing Docs Changes: n/a Release Notes: inline Part of #19149 Signed-off-by: Alyssa Wilk <[email protected]>
Thanks. So it seems transportSocket under transportSocketMatches can now be read correctly with this fix at-least ; pending tests on istio for this. However for second match for raw socket ,
are you proposing to even skip/over-ride rawsocket/default injectiom from istio itself so that we can leverage auto_config transportsocket from transportmatches or rather set the ones under transportmatches as default from istio? |
I don't think Istio is likely to make changes to our core logic to support this use case as we explicitly plan to use HTTP2 everywhere in the very near future, which means no more header case preservation.
Please keep Istio discussions out of Envoy issues, it adds confusion and wastes Envoy maintainers time. |
Thanks @howardjohn : Noted so that we can plan accordingly on our side.
Ack. Yes. Its more generic/healthy conversation as to what layer to propagate the changes to as we have case preserve for http1 which is valid case and some services will continue to use so; more of how vs who/what layer should do it. |
For users who want to use both |
It's possible to use transport_socket_matches and auto_config today. If you have a reasonable use case I think we'd be up for supporting it but
I don't have cycles for (2) but if you find someone who would pick this up I can give them tips for how to implement and test. |
I believe this says otherwise or are you suggesting in addition to using
Sorry I wasn't clear enough but that isn't what I was suggesting. I don't have a use case for needing to use auto without an ALPN socket. |
Ah, gotcha. I had misunderstood, thanks for making your request more clear :-) If you look here: https://www.envoyproxy.io/docs/envoy/latest/api-v3/config/cluster/v3/cluster.proto "If an endpoint metadata’s value under envoy.transport_socket_match does not match any TransportSocketMatch, socket configuration fallbacks to use the tls_context or transport_socket specified in this cluster." If you don't configure a transport_socket, it defaults to raw_buffer, so the configuration is guarding you here against missing metadata, and your "ALPN" request going out in the clear where protocol can't be negotiated. Your "default" match is not really a default unless you clone the config into the transport_socket in that cluster, because transport_socket is always the default even if it's not explicitly configured :-( |
This is also from the docs :)
This makes it seem that if someone configures In the our use case, since we do supply a transport socket with an empty match to |
Was just thinking on this some more. We can move away from an empty match and rely on no matches defaulting to It does seems like a mistake to have two ways to default a transport socket and users should opt to not use the empty match and instead rely on the default being |
Thanks . Because
Hence, may be we can fix this bug in the first place. Since default raw socket is always there, if match entry as null{} under transportsocketmatches,
when using auto_config skip erroring (or may be warn vs exception) since non-empty match exists which has alpn details as per sample config dump. If alpn settings are not detected in any layer, error out. That way as per autoconfig usage in docs, upstream cluster with h2 will never react to case preserve as it doesnt exist for h2. |
oh interesting, I'd missed that (scanning too fast). All of which is to say I agree with the above - it's way confusing having two ways to configure defaults, especially given one of them will create an unconfigured and unnecessary socket factory which will never be used. I would say given all the above it would make sense for matchers with a default match to not consider the socket_factory's ALPN-status but really what should happen is that whole section of code should be refactored to not create the "default socket factory" in the case where 1) it's not the default and 2) it can't actually be used. |
(and/or clean up the APIS so there's not 2 ways to configure defaults) |
Making sure that socket matchers on alpn-required clusters support ALPN. Risk Level: Medium Testing: new unit tests Docs Changes: n/a Release Notes: inline Runtime guard: envoy.reloadable_features.correctly_validate_alpn Part of #19149 Signed-off-by: Alyssa Wilk <[email protected]>
Creating a passthrough factory for consistent handling of wrapped characteristics for proxy_proto, tap, and tcp_stats socket factories. Risk Level: low Testing: new unit testing Docs Changes: n/a Release Notes: inline Part of envoyproxy#19149 Signed-off-by: Alyssa Wilk <[email protected]> Signed-off-by: Josh Perry <[email protected]>
Making sure that socket matchers on alpn-required clusters support ALPN. Risk Level: Medium Testing: new unit tests Docs Changes: n/a Release Notes: inline Runtime guard: envoy.reloadable_features.correctly_validate_alpn Part of envoyproxy#19149 Signed-off-by: Alyssa Wilk <[email protected]> Signed-off-by: Josh Perry <[email protected]>
Hi @alyssawilk, i can work on this. |
Title: Allow skipping use_alpn when auto_config is set for httpprotocoloptions
Description:
For raw transport socket, if there is no alpn support where its simple tls for upstream, can we allow setting use_alpn to false as there is no over-ride option?
Goal is to ensure allowing http2 traffic to be passed via istio ingressgateway and at the same time cater to http1.1 case preserve for clusters using http1. Is there any trivial fix or suggestions to solve it optimally?
[optional Relevant Links:]
The text was updated successfully, but these errors were encountered: