-
Notifications
You must be signed in to change notification settings - Fork 6
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add envoy SO_KEEPALIVE settings #71
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for creating this PR @jackkleeman. I had one question for clarification. +1 for merging after resolving it.
@@ -285,6 +285,8 @@ data: | |||
lb_policy: CLUSTER_PROVIDED | |||
# assume http2 in upstream | |||
http2_protocol_options: {} | |||
upstream_connection_options: | |||
tcp_keepalive: {} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This will default to the OS settings for the TCP keepalive configuration, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How long would the keep alive timeout be? I found that by default it only kicks in after 2 hours. Won't that mean that we'll run into the TCP retransmission timeout before (after ~ 15 minutes) and therefore, the keepalive timeout won't have any effect?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, that is the default - but bear in mind that a tcp connection is fairly likely to be open for many hours before a failure occurs, so we would expect keepalives every 75s for most conns
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If I understand things correctly, then the keep alive will start after 2h of idleness. Hence, whenever there is a package arriving it will be reset. Therefore, I believe that the default settings won't help here and if we recommend enabling upstream tcp keepalive with envoy, then we should set concrete values here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Slightly related question: Francesco concluded on restatedev/restate#626 to use HTTP/2 keep alives instead TCP keep alives. He also mentioned that Envoy should be able to do the same. Should we rather recommend using this mechanism instead of TCP keep alive?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yep, sounds good to me, ill revert this and implement http2
No description provided.