[Feature] Drop http keep-alive support in machine learning container #12064
rkojedzinszky started this conversation in Feature Request
-
By not using HTTP keep-alives, every TCP connection is closed after a request, so each new request goes through load balancing, if one exists. As ML processing may take time, the small overhead of establishing a new TCP connection per request is negligible.
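The effect described above can be demonstrated with a small, self-contained sketch (Python standard library only, not immich code): an HTTP/1.1 server that sends `Connection: close` on every response, and a client that makes two requests. Because each response closes the connection, the client opens a fresh TCP connection (new ephemeral port) per request, which is exactly what lets an L4 load balancer route each request independently.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

class NoKeepAliveHandler(BaseHTTPRequestHandler):
    # HTTP/1.1 defaults to persistent connections; we opt out per response.
    protocol_version = "HTTP/1.1"

    def do_GET(self):
        body = b"ok"
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        # BaseHTTPRequestHandler notices this header and closes the socket
        # after the response is written.
        self.send_header("Connection", "close")
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep test output quiet

server = ThreadingHTTPServer(("127.0.0.1", 0), NoKeepAliveHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()
port = server.server_address[1]

ports = []               # client-side ephemeral port used by each request
connection_headers = []  # Connection header seen in each response
for _ in range(2):
    conn = http.client.HTTPConnection("127.0.0.1", port)
    conn.request("GET", "/")
    # Capture the local port before getresponse(); on Connection: close
    # http.client tears the socket down once the response begins.
    ports.append(conn.sock.getsockname()[1])
    resp = conn.getresponse()
    connection_headers.append(resp.getheader("Connection"))
    resp.read()
    conn.close()

server.shutdown()
print(ports)  # two different ephemeral ports: one TCP connection per request
```

With keep-alive (no `Connection: close`), the same client would reuse one socket for both requests, and a connection-level balancer would keep sending every request to the same backend.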
-
I have searched the existing feature requests to make sure this is not a duplicate request.
The feature
I have deployed immich on a Kubernetes cluster and set up a horizontal pod autoscaler for the machine-learning component. However, since that container serves requests over HTTP/1.1 with keep-alive, connections remain open, so the newly created machine-learning pods don't receive any requests; only the old one does.
Thus, I suggest disabling HTTP keep-alive in the ML server. I've tested this in my setup and it works as expected: when the Kubernetes autoscaler starts new pods, they begin receiving requests, effectively speeding up processing.
Platform