fix(doc): hpa api version (#1274)
Signed-off-by: Derek Wang <[email protected]>
whynowy authored Oct 26, 2023
1 parent c103427 commit 730552e
Showing 1 changed file with 10 additions and 10 deletions.
20 changes: 10 additions & 10 deletions docs/user-guide/reference/autoscaling.md
@@ -12,7 +12,7 @@ Numaflow is able to run with both `Horizontal Pod Autoscaling` and `Vertical Pod

### Numaflow Autoscaling

Numaflow provides `0 - N` autoscaling out of the box. It is available for all `UDF` and `Sink` vertices, and for most
[`Source`](../sources/overview.md) vertices (check each source for more details).

Numaflow autoscaling is enabled by default, and several parameters can be tuned to achieve better results.
@@ -40,23 +40,23 @@ spec:
- `disabled` - Whether to disable Numaflow autoscaling, defaults to `false`.
- `min` - Minimum replicas, must be an integer >= 0. Defaults to `0`, which means the vertex can be scaled down to 0.
- `max` - Maximum replicas, a positive integer which should not be less than `min`, defaults to `50`. If `max` and `min`
are the same, that will be the fixed replica number.
- `lookbackSeconds` - How many seconds to look back when calculating the vertex's average processing rate (TPS) and pending messages,
defaults to `120`. Rate and pending-message metrics are critical for autoscaling, so you might need to tune this parameter
to see better results. For example, suppose your data source only receives input for 1 minute in every 5 minutes, and you
don't want the vertices to be scaled down to `0`. In this case, increase `lookbackSeconds` to cover the whole
5 minutes, so that the calculated average rate and pending messages are not `0` during the silent period, which prevents
scaling down to `0`.
- `scaleUpCooldownSeconds` - After a scaling operation, how many seconds to wait for the same vertex if the follow-up
operation is a scale-up, defaults to `90`. Make sure this value is greater than the time a pod takes to reach `Running` and
start processing, because the autoscaling algorithm divides the TPS by the number of pods even if a pod is not yet `Running`.
- `scaleDownCooldownSeconds` - After a scaling operation, how many seconds to wait for the same vertex, if the follow-up
operation is a scaling down, defaults to `90`.
- `zeroReplicaSleepSeconds` - How many seconds to wait after scaling down to `0`, defaults to `120`.
The Numaflow autoscaler periodically scales up a vertex pod to "peek" at the incoming data; this is the period of time to wait before peeking.
- `targetProcessingSeconds` - Tunes the aggressiveness of autoscaling for source vertices by specifying how
fast you want the vertex to process all the pending messages, defaults to `20`. It is only effective for `Source` vertices that
support autoscaling. Typically, increasing the value leads to a lower processing rate and thus fewer replicas.
- `targetBufferAvailability` - Targeted buffer availability in percentage, defaults to `50`. It is only effective for `UDF`
and `Sink` vertices. It determines how aggressively to autoscale; increasing the value brings more replicas.
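
As a minimal sketch of how these parameters fit together, consider the bursty source described under `lookbackSeconds` (one minute of input in every five minutes). The fragment below assumes the `scale` block sits under a vertex in the Pipeline spec. The pipeline name, vertex name, and the `max` value are hypothetical, and the vertex's own source, UDF, or sink definition is omitted, so treat this as an illustration rather than a recommended configuration.

```yaml
apiVersion: numaflow.numaproj.io/v1alpha1
kind: Pipeline
metadata:
  name: my-pipeline # hypothetical name
spec:
  vertices:
    - name: my-vertex # hypothetical name; source/udf/sink definition omitted
      scale:
        disabled: false # keep Numaflow autoscaling enabled (the default)
        min: 0 # default; the vertex may still scale to 0 when truly idle
        max: 10 # illustrative cap, lower than the default of 50
        lookbackSeconds: 300 # cover the full 5-minute input cycle (default is 120)
        scaleUpCooldownSeconds: 90 # default; should exceed pod start-up time
        scaleDownCooldownSeconds: 90 # default
        zeroReplicaSleepSeconds: 120 # default
```

With `lookbackSeconds` set to the full five-minute cycle, the averaged rate and pending counts should stay above `0` during the quiet four minutes, which is the behavior the list above describes.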
@@ -90,7 +90,7 @@ Numaflow autoscaling does not apply to reduce vertices, and following source ver
[Kubernetes HPA](https://kubernetes.io/docs/tasks/run-application/horizontal-pod-autoscale/) is supported in Numaflow for any type of Vertex. To use HPA, remember to point the `scaleTargetRef` to the vertex as below, and disable Numaflow autoscaling in your Pipeline spec.

```yaml
-apiVersion: autoscaling/v2beta1
+apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: my-vertex-hpa