We're currently running Knative (KN) v0.24 in production for our services. Unrelated to this issue: we tried to upgrade but weren't able to do so cleanly because of what seems to be a bug in the Go runtime when a request body is read by the `queue-proxy`, which is why we're "stuck" on this version for the time being. We will attempt another upgrade soon.
Some of our Java services are slow to start up, so we were hoping to use the target utilization percentage annotation on the revision to scale up earlier, before we hit peak traffic. Our `containerConcurrency` is set to 6 and min scale is set to 10, which gives us a capacity of 60 in-flight requests. As a test, we set the annotation `autoscaling.knative.dev/target-utilization-percentage` to `33`, hoping that we'd see the scale-up happen as our stable concurrency hit ~20. Our stable window is set to 120s.
Instead, while the service is running we don't see new replicas being created until much later. New replicas seem to be created only when the autoscaler panics, at which point the stable request concurrency is almost 40, which seems to correspond to a target utilization of roughly 66%.
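For reference, here is a rough sketch of the arithmetic behind our expectation. It assumes the desired replica count is the total stable concurrency divided by a per-pod target of `containerConcurrency` × utilization, rounded up and floored at min scale. That formula is our assumption about the behavior, not something we've confirmed against the autoscaler code, and the function names are made up for illustration.

```python
import math


def per_pod_target(container_concurrency: int, utilization_pct: float) -> float:
    """Per-pod concurrency target (assumed model: containerConcurrency * utilization)."""
    return container_concurrency * utilization_pct / 100.0


def expected_replicas(stable_concurrency: float, container_concurrency: int,
                      utilization_pct: float, min_scale: int) -> int:
    """Replica count we expected under the assumed model, never below min scale."""
    target = per_pod_target(container_concurrency, utilization_pct)
    return max(min_scale, math.ceil(stable_concurrency / target))


def observed_utilization_pct(stable_concurrency: float, replicas: int,
                             container_concurrency: int) -> float:
    """Share of total capacity (replicas * containerConcurrency) in use, in percent."""
    return 100.0 * stable_concurrency / (replicas * container_concurrency)


if __name__ == "__main__":
    # Values from our setup: containerConcurrency=6, min scale=10, target=33%.
    print(per_pod_target(6, 33))               # per-pod target, just under 2
    print(expected_replicas(20, 6, 33, 10))    # what we expected at ~20 in flight
    print(expected_replicas(40, 6, 33, 10))    # what we expected at ~40 in flight
    print(observed_utilization_pct(40, 10, 6)) # utilization when scaling actually starts
```

Under this model we'd expect an 11th replica as soon as stable concurrency passes roughly 19.8 (10 replicas × 1.98), whereas 40 in-flight requests on 10 replicas is about 66.7% of capacity.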
Are we misunderstanding how this param is supposed to work? Is there a reason why new replicas aren't being created even when the stable request concurrency exceeds our target threshold of 33%?