fix(firehose): Set connection window size to the maximum #3818

Merged
merged 2 commits into from
Aug 8, 2022

Conversation

leoyvens
Collaborator

@leoyvens leoyvens commented Aug 5, 2022

We currently multiplex all firehose connections over a single HTTP/2 connection, and under load we're seeing what look like flow control issues, possibly a hyper bug. The default per-connection HTTP/2 window is small and equal to the per-stream window. Enabling adaptive windows should give us a much bigger connection window and possibly solve the issue at current loads. This sets the connection window to the maximum value, effectively disabling connection-level flow control.
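
To illustrate why a connection window equal to the per-stream window can stall multiplexed streams, here is a minimal, std-only toy model. It is a sketch under stated assumptions, not hyper's or tonic's implementation: the `Connection` type, its methods, and `second_stream_progress` are hypothetical names, and the model never sends `WINDOW_UPDATE` frames (as if the first stream's consumer were slow). The 65,535-byte default comes from the HTTP/2 specification.

```rust
/// Default HTTP/2 initial window size (RFC 7540): 65,535 bytes.
pub const DEFAULT_WINDOW: u32 = 65_535;

/// Hypothetical toy model of flow control on one multiplexed connection.
pub struct Connection {
    /// Bytes the peer may still send on the whole connection.
    conn_window: u32,
    /// Bytes the peer may still send on each individual stream.
    stream_windows: Vec<u32>,
}

impl Connection {
    pub fn new(streams: usize, conn_window: u32, stream_window: u32) -> Self {
        Connection {
            conn_window,
            stream_windows: vec![stream_window; streams],
        }
    }

    /// Returns how many of `want` bytes can be sent on `stream` right now.
    /// A DATA frame consumes both the stream window and the connection window.
    pub fn send(&mut self, stream: usize, want: u32) -> u32 {
        let allowed = want
            .min(self.stream_windows[stream])
            .min(self.conn_window);
        self.stream_windows[stream] -= allowed;
        self.conn_window -= allowed;
        allowed
    }
}

/// Fill stream 0 completely, then report how many of 1,000 bytes
/// stream 1 is still allowed to send on the same connection.
pub fn second_stream_progress(conn_window: u32) -> u32 {
    let mut conn = Connection::new(2, conn_window, DEFAULT_WINDOW);
    // Stream 0 uses its whole stream window; nothing is released afterwards.
    conn.send(0, DEFAULT_WINDOW);
    // Stream 1 has an untouched stream window, but shares the connection window.
    conn.send(1, 1_000)
}
```

In this model, `second_stream_progress(DEFAULT_WINDOW)` returns 0, because the first stream has spent the entire connection window, while `second_stream_progress((1 << 31) - 1)` returns 1000, because only the per-stream windows limit progress.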

@leoyvens leoyvens force-pushed the leo/firehose-http2-adaptive-window branch from d77f190 to 381781c August 5, 2022 22:16
-.connect_timeout(Duration::from_secs(10));
+.connect_timeout(Duration::from_secs(10))
+.http2_keep_alive_interval(Duration::from_secs(30))
+.http2_adaptive_window(true);
Contributor

This causes disconnections when connecting to firehose through a GCP Load Balancer (error reading a body from connection: unexpected end of file).

Collaborator Author

Thanks for testing that! I've removed this and taken a different approach: setting a very large window size.

@@ -56,7 +56,9 @@ impl FirehoseEndpoint {
 .expect("TLS config on this host is invalid"),
 _ => panic!("invalid uri scheme for firehose endpoint"),
 }
-.connect_timeout(Duration::from_secs(10));
+.connect_timeout(Duration::from_secs(10))
+.http2_keep_alive_interval(Duration::from_secs(30))
Contributor

This is too low IMHO. While I didn't experience disconnections with this setting, some backends are very restrictive about this (at 10 seconds, it DOES generate disconnections similar to those seen with http2_adaptive_window).

Collaborator Author

I've removed the keep-alive for now, to focus this PR on the stalling bug.

@leoyvens
Collaborator Author

leoyvens commented Aug 7, 2022

This now sets the connection window to the maximum value, effectively disabling connection-level flow control. I was able to reproduce locally that this fixes the block stream stalling issue. I'm also convinced this is not a hyper bug but HTTP/2 flow control working as designed; for our use case we should opt out of connection-level flow control.
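
For reference, the "maximum value" is bounded by the HTTP/2 specification: a flow-control window cannot exceed 2^31 - 1 bytes. The sketch below only computes and checks that bound with the standard library. The constant and function names are illustrative, not taken from this PR. With tonic, the value would presumably be passed to `Endpoint::initial_connection_window_size`, but the exact call used in the merged commit is not shown in this conversation, so it appears only as a comment.

```rust
/// Largest flow-control window HTTP/2 allows: 2^31 - 1 bytes (RFC 7540, section 6.9.1).
pub const MAX_WINDOW_SIZE: u32 = (1 << 31) - 1;

/// Default initial window size defined by HTTP/2: 65,535 bytes.
pub const DEFAULT_WINDOW_SIZE: u32 = 65_535;

/// Clamp a requested window size to the largest value HTTP/2 permits.
pub fn clamp_window(requested: u64) -> u32 {
    requested.min(MAX_WINDOW_SIZE as u64) as u32
}

/// How many streams could each fill a default-sized stream window
/// before a connection window of `conn_window` bytes is exhausted.
pub fn full_streams_before_stall(conn_window: u32) -> u32 {
    conn_window / DEFAULT_WINDOW_SIZE
}

// Assumed usage with tonic (needs the `tonic` crate, so left as a comment):
//     endpoint.initial_connection_window_size(MAX_WINDOW_SIZE)
```

With the default connection window, a single stream that fills its stream window is enough to exhaust the connection (`full_streams_before_stall(DEFAULT_WINDOW_SIZE)` is 1). At the maximum, the connection window stops being the practical limit and only the per-stream windows apply.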

@leoyvens leoyvens merged commit c75e0d7 into master Aug 8, 2022
@leoyvens leoyvens deleted the leo/firehose-http2-adaptive-window branch August 8, 2022 10:59
@leoyvens leoyvens changed the title from "fix(firehose): http2_adaptive_window" to "fix(firehose): Set connection window size to the maximum" Aug 8, 2022
3 participants