AsyncHTTPConnection for HTTP2 connections can lead to performance issues on deterministic connection failures #969
Unanswered
EldarSehayekZenity
asked this question in
Potential Issue
Replies: 0 comments
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
-
Looking at the implementation of
AsyncHTTPConnection
, I see the following:is_available
will return True if HTTP2 is enabled and connection not yet successfully established (self._connection is None
) and connection failure not yet detected (self._connect_failed is False
)handle_async_request
locks onself._request_lock
and in the lock attempts to connect if connection not yet successfully established (self._connection is None
). Note it doesn't checkself._connect_failed
to know if a connection was already attempted and failed.If the connection failure is deterministic, this behavior can lead to performance issue in case the connection pool attempts to use the connection for multiple requests before the connection failure is detected and the connection
is_available
starts returning False. This is definitely possible in the scenario where the connection pool creates a new connection it's immediately available for use by the next requests to send, and there's no limit on how many requests the connection pool can assign to this connection since it will always returnis_available = True
as long as the connection failure wasn't detected.If the connection pool decides to route N requests to the new connection not yet detected as failed, those N requests will be queued up in the new connection
self._request_lock
and they will fail one after the other attempting to connect and fail.If the connect timeout is X., then the overall time to wait for the N'th request if they all start in parallel is
N * X * 2
which can become very high. The reason to multiple by 2 is that the connect timeout is applied separately to the network stream establishment and to the TLS establishment, the timeout doesn't apply on both as a single operation.Note 1: if connect retries is set in the connection pool this performance issue becomes much worse due to the exponential backoff between retries which is applied inside the per-request lock, meaningful the N'th request will wait exponentially long for all previous queued requests to fail.
Note 2: one may think this is an edge case since the connection pool adds a new connection to the end of the connections list, and when picking available connections for future request the first matching one from start of the list is chosen. However, consider the connection pool is created and immediately requested to send N=1000 requests in parallel to some origin using AnyIO task group. A single HTTP2 connection will be created, and all N requests will be assigned to that connection since it's available without requests limit as long as it didn't establish an HTTP connection nor failed to establish one, which can take much longer if connection retries are enabled.
=> What do you think about changing AsyncHttpConnection behavior to remember a connection failure from first connection attempt and when
handle_async_request
is called, inside the request lock re-raise the connection failure if there was one without re-attempt to establish a connection?Alternatively, it can be decided that when a new connection is created, only the request leading to its creation will be assigned to it and it will not be considered available for additional requests until it either successfully establish HTTP connection or failed to do so. I assume this will require a new state that isn't available/idle/expired/closed, the new state could be named "initializing"
Beta Was this translation helpful? Give feedback.
All reactions