From 3e68d944e7dc41db3254914f09d068acbdc34d87 Mon Sep 17 00:00:00 2001 From: David Welsch Date: Fri, 10 May 2024 14:26:11 -0700 Subject: [PATCH] - Added Reference section to table of contents, issue #1366 - Moved some content from KEDA concept topics to Reference. - Added a glossary, issue #1367 - Removed non-inclusive language, issue #1373 https://github.com/kedacore/keda-docs/issues/1366 https://github.com/kedacore/keda-docs/issues/1367 https://github.com/kedacore/keda-docs/issues/1373 Umbrella issue for CNCF tech docs recommendations: https://github.com/kedacore/keda-docs/issues/1361 Signed-off-by: David Welsch --- content/docs/2.15/_index.md | 18 +- .../docs/2.15/authentication-providers/aws.md | 4 +- .../docs/2.15/concepts/scaling-deployments.md | 287 ++---------------- content/docs/2.15/concepts/scaling-jobs.md | 245 +-------------- content/docs/2.15/deploy.md | 6 +- content/docs/2.15/operate/_index.md | 4 +- .../operate/{events.md => cloud-events.md} | 50 +-- content/docs/2.15/operate/metrics-server.md | 2 +- content/docs/2.15/reference/_index.md | 13 + content/docs/2.15/reference/events.md | 28 ++ content/docs/2.15/{ => reference}/faq.md | 1 + content/docs/2.15/reference/glossary.md | 88 ++++++ content/docs/2.15/reference/scaledjob-spec.md | 236 ++++++++++++++ .../docs/2.15/reference/scaledobject-spec.md | 254 ++++++++++++++++ content/docs/2.15/scalers/aws-sqs.md | 4 +- content/docs/2.15/scalers/azure-pipelines.md | 2 +- .../docs/2.15/scalers/redis-sentinel-lists.md | 8 +- .../2.15/scalers/redis-sentinel-streams.md | 10 +- 18 files changed, 705 insertions(+), 555 deletions(-) rename content/docs/2.15/operate/{events.md => cloud-events.md} (50%) create mode 100644 content/docs/2.15/reference/_index.md create mode 100644 content/docs/2.15/reference/events.md rename content/docs/2.15/{ => reference}/faq.md (72%) create mode 100644 content/docs/2.15/reference/glossary.md create mode 100644 content/docs/2.15/reference/scaledjob-spec.md create mode 100644 
content/docs/2.15/reference/scaledobject-spec.md diff --git a/content/docs/2.15/_index.md b/content/docs/2.15/_index.md index a3c18da2f..4aa4b1b4b 100644 --- a/content/docs/2.15/_index.md +++ b/content/docs/2.15/_index.md @@ -1,8 +1,20 @@ +++ -title = "The KEDA Documentation" +title = "Getting Started" weight = 1 +++ -Welcome to the documentation for **KEDA**, the Kubernetes Event-driven Autoscaler. Use the navigation to the left to learn more about how to use KEDA and its components. +Welcome to the documentation for **KEDA**, the Kubernetes Event-driven Autoscaler. -Additions and contributions to these docs are managed on [the keda-docs GitHub repo](https://github.com/kedacore/keda-docs). +Use the navigation bar on the left to learn more about KEDA's architecture and how to deploy and use KEDA. + +Where to go +=========== + +What is your involvement with KEDA? + +| Role | Documentation | +| --- | --- | +| User | This documentation is for users who want to deploy KEDA to scale Kubernetes. | +| Core Contributor | To contribute to the core KEDA project see the [KEDA GitHub repo](https://github.com/kedacore/keda). | +| Documentation Contributor | To add or contribute to these docs, or to build and serve the documentation locally, see the [keda-docs GitHub repo](https://github.com/kedacore/keda-docs). | +| Other Contributor | See the [KEDA project on GitHub](https://github.com/kedacore/) for other KEDA repos, including project governance, testing, and external scalers. | diff --git a/content/docs/2.15/authentication-providers/aws.md b/content/docs/2.15/authentication-providers/aws.md index c78d64b10..bfc4958b5 100644 --- a/content/docs/2.15/authentication-providers/aws.md +++ b/content/docs/2.15/authentication-providers/aws.md @@ -35,7 +35,7 @@ If you would like to use the same IAM credentials as your workload is currently ## AssumeRole or AssumeRoleWithWebIdentity? 
-This authentication uses automatically both, doing a fallback from [AssumeRoleWithWebIdentity](https://docs.aws.amazon.com/STS/latest/APIReference/API_AssumeRoleWithWebIdentity.html) to [AssumeRole](https://docs.aws.amazon.com/STS/latest/APIReference/API_AssumeRole.html) if the first one fails. This extends the capabilities because KEDA doesn't need `sts:AssumeRole` permission if you are already working with [WebIdentities](https://docs.aws.amazon.com/IAM/latest/UserGuide/id_roles_providers_oidc.html), you just need to add KEDA service account to the trusted relations of the role. +This authentication automatically uses both mechanisms, falling back from [AssumeRoleWithWebIdentity](https://docs.aws.amazon.com/STS/latest/APIReference/API_AssumeRoleWithWebIdentity.html) to [AssumeRole](https://docs.aws.amazon.com/STS/latest/APIReference/API_AssumeRole.html) if the first one fails. This extends the capabilities because KEDA doesn't need the `sts:AssumeRole` permission if you are already working with [WebIdentities](https://docs.aws.amazon.com/IAM/latest/UserGuide/id_roles_providers_oidc.html); in this case, you only need to add the KEDA service account to the trusted relations of the role. ## Setting up KEDA role and policy @@ -43,7 +43,7 @@ The [official AWS docs](https://aws.amazon.com/es/blogs/opensource/introducing-f ### Using KEDA role to access infrastructure -This is the easiest case and you just need to attach to KEDA's role the desired policy/policies, granting the access permissions that you want to provide. +Attach the desired policies to KEDA's role, granting the access permissions that you want to provide.
For example, this could be a policy to use with SQS: ```json { diff --git a/content/docs/2.15/concepts/scaling-deployments.md b/content/docs/2.15/concepts/scaling-deployments.md index 94594d562..c72b42d1f 100644 --- a/content/docs/2.15/concepts/scaling-deployments.md +++ b/content/docs/2.15/concepts/scaling-deployments.md @@ -3,9 +3,11 @@ title = "Scaling Deployments, StatefulSets & Custom Resources" weight = 200 +++ -## Overview +This page describes the deployment scaling behavior of KEDA. See the [Scaled Object specification](../reference/scaledobject-spec.md) for details on how to set the behaviors described below. -### Scaling of Deployments and StatefulSets +# Scaling objects + +## Scaling Deployments and StatefulSets Deployments and StatefulSets are the most common way to scale workloads with KEDA. @@ -21,270 +23,30 @@ For example, if you wanted to use KEDA with an Apache Kafka topic as event sourc * As more messages arrive at the Kafka Topic, KEDA can feed this data to the HPA to drive scale out. * Each replica of the deployment is actively processing messages. Very likely, each replica is processing a batch of messages in a distributed manner. -### Scaling of Custom Resources +## Scaling Custom Resources With KEDA you can scale any workload defined as any `Custom Resource` (for example `ArgoRollout` [resource](https://argoproj.github.io/argo-rollouts/)). The scaling behaves the same way as scaling for arbitrary Kubernetes `Deployment` or `StatefulSet`. The only constraint is that the target `Custom Resource` must define `/scale` [subresource](https://kubernetes.io/docs/tasks/extend-kubernetes/custom-resources/custom-resource-definitions/#scale-subresource). -## ScaledObject spec - -This specification describes the `ScaledObject` Custom Resource definition which is used to define how KEDA should scale your application and what the triggers are. The `.spec.ScaleTargetRef` section holds the reference to the target resource, ie. 
`Deployment`, `StatefulSet` or `Custom Resource`. - -[`scaledobject_types.go`](https://github.com/kedacore/keda/blob/main/apis/keda/v1alpha1/scaledobject_types.go) - -```yaml -apiVersion: keda.sh/v1alpha1 -kind: ScaledObject -metadata: - name: {scaled-object-name} - annotations: - scaledobject.keda.sh/transfer-hpa-ownership: "true" # Optional. Use to transfer an existing HPA ownership to this ScaledObject - validations.keda.sh/hpa-ownership: "true" # Optional. Use to disable HPA ownership validation on this ScaledObject - autoscaling.keda.sh/paused: "true" # Optional. Use to pause autoscaling of objects explicitly -spec: - scaleTargetRef: - apiVersion: {api-version-of-target-resource} # Optional. Default: apps/v1 - kind: {kind-of-target-resource} # Optional. Default: Deployment - name: {name-of-target-resource} # Mandatory. Must be in the same namespace as the ScaledObject - envSourceContainerName: {container-name} # Optional. Default: .spec.template.spec.containers[0] - pollingInterval: 30 # Optional. Default: 30 seconds - cooldownPeriod: 300 # Optional. Default: 300 seconds - idleReplicaCount: 0 # Optional. Default: ignored, must be less than minReplicaCount - minReplicaCount: 1 # Optional. Default: 0 - maxReplicaCount: 100 # Optional. Default: 100 - fallback: # Optional. Section to specify fallback options - failureThreshold: 3 # Mandatory if fallback section is included - replicas: 6 # Mandatory if fallback section is included - advanced: # Optional. Section to specify advanced options - restoreToOriginalReplicaCount: true/false # Optional. Default: false - horizontalPodAutoscalerConfig: # Optional. Section to specify HPA related options - name: {name-of-hpa-resource} # Optional. Default: keda-hpa-{scaled-object-name} - behavior: # Optional. 
Use to modify HPA's scaling behavior - scaleDown: - stabilizationWindowSeconds: 300 - policies: - - type: Percent - value: 100 - periodSeconds: 15 - triggers: - # {list of triggers to activate scaling of the target resource} -``` - -### Details -```yaml - scaleTargetRef: - apiVersion: {api-version-of-target-resource} # Optional. Default: apps/v1 - kind: {kind-of-target-resource} # Optional. Default: Deployment - name: {name-of-target-resource} # Mandatory. Must be in the same namespace as the ScaledObject - envSourceContainerName: {container-name} # Optional. Default: .spec.template.spec.containers[0] -``` - -The reference to the resource this ScaledObject is configured for. This is the resource KEDA will scale up/down and setup an HPA for, based on the triggers defined in `triggers:`. - -To scale Kubernetes Deployments only `name` is needed to be specified, if one wants to scale a different resource such as StatefulSet or Custom Resource (that defines `/scale` subresource), appropriate `apiVersion` (following standard Kubernetes convention, ie. `{api}/{version}`) and `kind` need to be specified. - -`envSourceContainerName` is an optional property that specifies the name of container in the target resource, from which KEDA should try to get environment properties holding secrets etc. If it is not defined, KEDA will try to get environment properties from the first Container, ie. from `.spec.template.spec.containers[0]`. - -**Assumptions:** Resource referenced by `name` (and `apiVersion`, `kind`) is in the same namespace as the ScaledObject - ---- -#### pollingInterval -```yaml - pollingInterval: 30 # Optional. Default: 30 seconds -``` - -This is the interval to check each trigger on. By default, KEDA will check each trigger source on every ScaledObject every 30 seconds. - -**Example:** in a queue scenario, KEDA will check the queueLength every `pollingInterval`, and scale the resource up or down accordingly. 
- ---- -#### cooldownPeriod -```yaml - cooldownPeriod: 300 # Optional. Default: 300 seconds -``` - -The period to wait after the last trigger reported active before scaling the resource back to 0. By default, it's 5 minutes (300 seconds). - -The `cooldownPeriod` only applies after a trigger occurs; when you first create your `Deployment` (or `StatefulSet`/`CustomResource`), KEDA will immediately scale it to `minReplicaCount`. Additionally, the KEDA `cooldownPeriod` only applies when scaling to 0; scaling from 1 to N replicas is handled by the [Kubernetes Horizontal Pod Autoscaler](https://kubernetes.io/docs/tasks/run-application/horizontal-pod-autoscale/#support-for-cooldowndelay). - -**Example:** wait 5 minutes after the last time KEDA checked the queue and it was empty. (this is obviously dependent on `pollingInterval`) - ---- -#### initialCooldownPeriod -```yaml - initialCooldownPeriod: 120 # Optional. Default: 0 seconds -``` -The delay before the `cooldownPeriod` starts after the initial creation of the `ScaledObject`. By default, this is 0 seconds, meaning the `cooldownPeriod` begins immediately upon creation. If set to a value such as 120 seconds, the `cooldownPeriod` will only start after the `ScaledObject` has been active for that duration. - -This parameter is particularly useful for managing the scale-down behavior during the initial phase of a `ScaledObject`. For instance, if `initialCooldownPeriod` is set to 120 seconds, KEDA will not scale the resource back to 0 until 120 seconds have passed since the `ScaledObject` creation, regardless of the activity triggers. This allows for a grace period in situations where immediate scaling down after creation is not desirable. - -**Example:** Wait 120 seconds after the `ScaledObject` is created before starting the `cooldownPeriod`. 
For instance, if the `initialCooldownPeriod` is set to 120 seconds, KEDA will not initiate the cooldown process until 120 seconds have passed since the `ScaledObject` was first created, regardless of the triggers' activity. This ensures a buffer period where the resource won’t be scaled down immediately after creation. (Note: This setting is independent of the `pollingInterval`.) - ---- -#### idleReplicaCount - -```yaml - idleReplicaCount: 0 # Optional. Default: ignored, must be less than minReplicaCount -``` - -> 💡 **NOTE:** Due to limitations in HPA controller the only supported value for this property is 0, it will not work correctly otherwise. See this [issue](https://github.com/kedacore/keda/issues/2314) for more details. -> -> In some cases, you always need at least `n` pod running. Thus, you can omit this property and set `minReplicaCount` to `n`. -> -> **Example** You set `minReplicaCount` to 1 and `maxReplicaCount` to 10. If there’s no activity on triggers, the target resource is scaled down to `minReplicaCount` (1). Once there are activities, the target resource will scale base on the HPA rule. If there’s no activity on triggers, the resource is again scaled down to `minReplicaCount` (1). - -If this property is set, KEDA will scale the resource down to this number of replicas. If there's some activity on target triggers KEDA will scale the target resource immediately to `minReplicaCount` and then will be scaling handled by HPA. When there is no activity, the target resource is again scaled down to `idleReplicaCount`. This setting must be less than `minReplicaCount`. - -**Example:** If there's no activity on triggers the target resource is scaled down to `idleReplicaCount` (0), once there is an activity the target resource is immediately scaled to `minReplicaCount` (10) and then up to `maxReplicaCount` (100) as needed. If there's no activity on triggers the resource is again scaled down to `idleReplicaCount` (0). 
- ---- -#### minReplicaCount -```yaml - minReplicaCount: 1 # Optional. Default: 0 -``` - -Minimum number of replicas KEDA will scale the resource down to. By default, it's scale to zero, but you can use it with some other value as well. - ---- -#### maxReplicaCount -```yaml - maxReplicaCount: 100 # Optional. Default: 100 -``` - -This setting is passed to the HPA definition that KEDA will create for a given resource and holds the maximum number of replicas of the target resource. - ---- -#### fallback -```yaml - fallback: # Optional. Section to specify fallback options - failureThreshold: 3 # Mandatory if fallback section is included - replicas: 6 # Mandatory if fallback section is included -``` - -The `fallback` section is optional. It defines a number of replicas to fall back to if a scaler is in an error state. +# Features -KEDA will keep track of the number of consecutive times each scaler has failed to get metrics from its source. Once that value passes the `failureThreshold`, instead of not propagating a metric to the HPA (the default error behaviour), the scaler will, instead, return a normalised metric using the formula: -``` -target metric value * fallback replicas -``` -Due to the HPA metric being of type `AverageValue` (see below), this will have the effect of the HPA scaling the deployment to the defined number of fallback replicas. +## Caching Metrics -**Example:** When my instance of prometheus is unavailable 3 consecutive times, KEDA will change the HPA metric such that the deployment will scale to 6 replicas. +This feature enables caching of metric values during polling interval (as specified in `.spec.pollingInterval`). Kubernetes (HPA controller) asks for a metric every few seconds (as defined by `--horizontal-pod-autoscaler-sync-period`, usually 15s), then this request is routed to KEDA Metrics Server, that by default queries the scaler and reads the metric values. 
Enabling this feature changes this behavior such that KEDA Metrics Server tries to read the metric from the cache first. This cache is updated periodically during the polling interval. -There are a few limitations to using a fallback: - It only supports scalers whose target is an `AverageValue` metric. Thus, it is **not** supported by the CPU & memory scalers, or by scalers whose metric target type is `Value`. In these cases, it will assume that fallback is disabled. - It is only supported by `ScaledObjects` **not** `ScaledJobs`. - --- -#### advanced -```yaml -advanced: - restoreToOriginalReplicaCount: true/false # Optional. Default: false -``` - -This property specifies whether the target resource (`Deployment`, `StatefulSet`,...) should be scaled back to original replicas count, after the `ScaledObject` is deleted. -Default behavior is to keep the replica count at the same number as it is in the moment of `ScaledObject's` deletion. - -For example a `Deployment` with `3 replicas` is created, then `ScaledObject` is created and the `Deployment` is scaled by KEDA to `10 replicas`. Then `ScaledObject` is deleted: - 1. if `restoreToOriginalReplicaCount = false` (default behavior) then `Deployment` replicas count is `10` - 2. if `restoreToOriginalReplicaCount = true` then `Deployment` replicas count is set back to `3` (the original value) - --- - -```yaml -advanced: - horizontalPodAutoscalerConfig: # Optional. Section to specify HPA related options - name: {name-of-hpa-resource} # Optional. Default: keda-hpa-{scaled-object-name} - behavior: # Optional. Use to modify HPA's scaling behavior - scaleDown: - stabilizationWindowSeconds: 300 - policies: - - type: Percent - value: 100 - periodSeconds: 15 -``` - -##### `horizontalPodAutoscalerConfig:` - -###### `horizontalPodAutoscalerConfig.name` - -The name of the HPA resource KEDA will create.
By default, it's `keda-hpa-{scaled-object-name}` - -###### `horizontalPodAutoscalerConfig.behavior` - -Starting from Kubernetes v1.18 the autoscaling API allows scaling behavior to be configured through the HPA behavior field. This way one can directly affect scaling of 1<->N replicas, which is internally being handled by HPA. KEDA would feed values from this section directly to the HPA's `behavior` field. Please follow [Kubernetes documentation](https://kubernetes.io/docs/tasks/run-application/horizontal-pod-autoscale/#configurable-scaling-behavior) for details. - -**Assumptions:** KEDA must be running on Kubernetes cluster v1.18+, in order to be able to benefit from this setting. - ---- - -```yaml -advanced: - scalingModifiers: # Optional. Section to specify scaling modifiers - target: {target-value-to-scale-on} # Mandatory. New target if metrics are anyhow composed together - activationTarget: {activation-target-value-to-scale-on} # Optional. New activation target if metrics are anyhow composed together - metricType: {metric-tipe-for-the-modifier} # Optional. Metric type to be used if metrics are anyhow composed together - formula: {formula-for-fetched-metrics} # Mandatory. Formula for calculation -``` - -##### `scalingModifiers` - -The `scalingModifiers` is optional and **experimental**. If defined, both `target` and `formula` are mandatory. Using this structure creates `composite-metric` for the HPA that will replace all requests for external metrics and handle them internally. With `scalingModifiers` each trigger used in the `formula` **must** have a name defined. - -###### `scalingModifiers.target` - -`target` defines new target value to scale on for the composed metric. - -###### `scalingModifiers.activationTarget` - -`activationTarget` defines new [activation target value](./scaling-deployments.md#activating-and-scaling-thresholds) to scale on for the composed metric. 
(Default: `0`, Optional) - -###### `scalingModifiers.metricType` - -`metricType` defines metric type used for this new `composite-metric`. (Values: `AverageValue`, `Value`, Default: `AverageValue`, Optional) - -###### `scalingModifiers.formula` - - `formula` composes metrics together and allows them to be modified/manipulated. It accepts mathematical/conditional statements using [this external project](https://github.com/antonmedv/expr). If the `fallback` scaling feature is in effect, the `formula` will NOT modify its metrics (therefore it modifies metrics only when all of their triggers are healthy). Complete language definition of `expr` package can be found [here](https://expr.medv.io/docs/Language-Definition). Formula must return a single value (not boolean). - -For examples of this feature see section [Scaling Modifiers](#scaling-modifiers-experimental) below. - ---- -#### triggers -```yaml - triggers: - # {list of triggers to activate scaling of the target resource} -``` - -> 💡 **NOTE:** You can find all supported triggers [here](/scalers). - -Trigger fields: -- **type**: The type of trigger to use. (Mandatory) -- **metadata**: The configuration parameters that the trigger requires. (Mandatory) -- **name**: Name for this trigger. This value can be used to easily distinguish this specific trigger and its metrics when consuming [Prometheus metrics](../operate/prometheus.md). By default, the name is generated from the trigger type. (Optional) -- **useCachedMetrics**: Enables caching of metric values during polling interval (as specified in `.spec.pollingInterval`). For more information, see ["Caching Metrics"](#caching-metrics). (Values: `false`, `true`, Default: `false`, Optional) -- **authenticationRef**: A reference to the `TriggerAuthentication` or `ClusterTriggerAuthentication` object that is used to authenticate the scaler with the environment. - - More details can be found [here](./authentication). 
(Optional) -- **metricType**: The type of metric that should be used. (Values: `AverageValue`, `Value`, `Utilization`, Default: `AverageValue`, Optional) - - Learn more about how the [Horizontal Pod Autoscaler (HPA) calculates `replicaCount`](https://kubernetes.io/docs/tasks/run-application/horizontal-pod-autoscale-walkthrough/) based on metric type and value. - - To show the differences between the metric types, let's assume we want to scale a deployment with 3 running replicas based on a queue of messages: - - With `AverageValue` metric type, we can control how many messages, on average, each replica will handle. If our metric is the queue size, the threshold is 5 messages, and the current message count in the queue is 20, HPA will scale the deployment to 20 / 5 = 4 replicas, regardless of the current replica count. - - The `Value` metric type, on the other hand, can be used when we don't want to take the average of the given metric across all replicas. For example, with the `Value` type, we can control the average time of messages in the queue. If our metric is average time in the queue, the threshold is 5 milliseconds, and the current average time is 20 milliseconds, HPA will scale the deployment to 3 * 20 / 5 = 12. - -> ⚠️ **NOTE:** All scalers, except CPU and Memory, support metric types `AverageValue` and `Value` while CPU and Memory scalers both support `AverageValue` and `Utilization`. +Enabling this feature can significantly reduce the load on the scaler service. -### Caching Metrics +This feature is not supported for `cpu`, `memory` or `cron` scaler. -This feature enables caching of metric values during polling interval (as specified in `.spec.pollingInterval`). Kubernetes (HPA controller) asks for a metric every few seconds (as defined by `--horizontal-pod-autoscaler-sync-period`, usually 15s), then this request is routed to KEDA Metrics Server, that by default queries the scaler and reads the metric values. 
Enabling this feature changes this behavior, KEDA Metrics Server tries to read metric from the cache first. This cache is being updated periodically during the polling interval. -Enabling this feature can significantly reduce the load on the scaler service. -This feature is not supported for `cpu`, `memory` or `cron` scaler. -### Pause autoscaling +## Pausing autoscaling -It can be useful to instruct KEDA to pause autoscaling of objects, if you want to do to cluster maintenance or you want to avoid resource starvation by removing non-mission-critical workloads. You can enable this by adding the below annotation to your `ScaledObject` definition: +It can be useful to instruct KEDA to pause the autoscaling of objects, to perform cluster maintenance or to avoid resource starvation by removing non-mission-critical workloads. +This is preferable to deleting the resource because it takes the running instances out of operation without touching the applications themselves. When ready, you can then reenable scaling. +You can pause autoscaling by adding this annotation to your `ScaledObject` definition: ```yaml metadata: @@ -299,10 +61,10 @@ The annotation `autoscaling.keda.sh/paused` will pause scaling immediately and u Typically, either one or the other is being used given they serve a different purpose/scenario. However, if both `paused` and `paused-replicas` are set, KEDA will scale your current workload to the number specified count in `paused-replicas` and then pause autoscaling. -To enable/unpause autoscaling again, simply remove all paused annotations from the `ScaledObject` definition. If you paused with `autoscaling.keda.sh/paused`, you can also set the annotation to `false` to unpause. +To unpause (reenable) autoscaling, remove all paused annotations from the `ScaledObject` definition. If you paused with `autoscaling.keda.sh/paused`, you can unpause by setting the annotation to `false`.
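+As a minimal sketch, the `paused-replicas` variant of the annotation discussed above looks like this (the replica value is illustrative, not a recommended setting):

```yaml
metadata:
  annotations:
    # Scale the workload to exactly this fixed count, then pause autoscaling.
    # "0" is illustrative; any fixed replica count can be used.
    autoscaling.keda.sh/paused-replicas: "0"
```

+Removing the annotation returns the `ScaledObject` to normal autoscaling, as described above.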
-### Scaling Modifiers (Experimental) +## Scaling Modifiers (Experimental) **Example: compose average value** @@ -370,15 +132,14 @@ Conditions can be used within another condition as well. If value of `trig_one` is less than 2 AND `trig_one`+`trig_two` is at least 2 then return 5, if only the first is true return 10, if the first condition is false then return 0. Complete language definition of `expr` package can be found [here](https://expr.medv.io/docs/Language-Definition). Formula must return a single value (not boolean). All formulas are internally wrapped with float cast. -### Activating and Scaling thresholds -To give a consistent solution to this problem, KEDA has 2 different phases during the autoscaling process. +## Activating and Scaling thresholds + +KEDA has 2 different phases during the autoscaling process. - **Activation phase:** The activating (or deactivating) phase is the moment when KEDA (operator) has to decide if the workload should be scaled from/to zero. KEDA takes responsibility for this action based on the result of the scaler `IsActive` function and only applies to 0<->1 scaling. There are use-cases where the activating value (0-1 and 1-0) is totally different than 0, such as workloads scaled with the Prometheus scaler where the values go from -X to X. - **Scaling phase:** The scaling phase is the moment when KEDA has decided to scale out to 1 instance and now it is the HPA controller who takes the scaling decisions based on the configuration defined in the generated HPA (from ScaledObject data) and the metrics exposed by KEDA (metrics server). This phase applies the to 1<->N scaling. -#### Managing Activation & Scaling Thresholds - KEDA allows you to specify different values for each scenario: - **Activation:** Defines when the scaler is active or not and scales from/to 0 based on it. 
@@ -397,7 +158,7 @@ There are some important topics to take into account: > ⚠️ **NOTE:** If a scaler doesn't define "activation" parameter (a property that starts with `activation` prefix), then this specific scaler doesn't support configurable activation value and the activation value is always 0. -## Transfer ownership of an existing HPA +## Transferring ownership of an existing HPA If your environment already operates using kubernetes HPA, you can transfer the ownership of this resource to a new ScaledObject: @@ -413,7 +174,7 @@ spec: > ⚠️ **NOTE:** You need to specify a custom HPA name in your ScaledObject matching the existing HPA name you want it to manage. -## Disable validations on an existing HPA +## Disabling validations on an existing HPA You are allowed to disable admission webhooks validations with the following snippet. It grants you better flexibility but also brings vulnerabilities. Do it **at your own risk**. @@ -423,13 +184,13 @@ metadata: validations.keda.sh/hpa-ownership: "true" ``` -## Long-running executions +### Long-running executions One important consideration to make is how this pattern can work with long-running executions. Imagine a deployment triggers on a RabbitMQ queue message. Each message takes 3 hours to process. It's possible that if many queue messages arrive, KEDA will help drive scaling out to many replicas - let's say 4. Now the HPA makes a decision to scale down from 4 replicas to 2. There is no way to control which of the 2 replicas get terminated to scale down. That means the HPA may attempt to terminate a replica that is 2.9 hours into processing a 3 hour queue message. There are two main ways to handle this scenario. -### Leverage the container lifecycle +#### Leverage the container lifecycle Kubernetes provides a few [lifecycle hooks](https://kubernetes.io/docs/concepts/containers/container-lifecycle-hooks/) that can be leveraged to delay termination. 
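A `preStop` hook along these lines could delay termination until in-flight work finishes. This is a hedged sketch, not part of the KEDA docs: the container name, image, sentinel file, and 3-hour grace period are hypothetical placeholders for whatever signalling mechanism the worker actually uses.

```yaml
spec:
  template:
    spec:
      # Give the pod up to 3 hours (hypothetical) to finish the current message.
      terminationGracePeriodSeconds: 10800
      containers:
        - name: worker                  # hypothetical container name
          image: example/worker:latest  # hypothetical image
          lifecycle:
            preStop:
              exec:
                # Hypothetical: block until the worker removes its "busy" marker,
                # so Kubernetes only proceeds with termination once processing is done.
                command: ["/bin/sh", "-c", "while [ -f /tmp/processing ]; do sleep 5; done"]
```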
Imagine a replica is scheduled for termination and is 2.9 hours into processing a 3 hour message. Kubernetes will send a [`SIGTERM`](https://www.gnu.org/software/libc/manual/html_node/Termination-Signals.html) to signal the intent to terminate. Rather than immediately terminating, a deployment can delay termination until processing the current batch of messages has completed. Kubernetes will wait for a `SIGTERM` response or the `terminationGracePeriodSeconds` before killing the replica. @@ -437,6 +198,6 @@ Kubernetes provides a few [lifecycle hooks](https://kubernetes.io/docs/concepts/ Using this method can preserve a replica and enable long-running executions. However, one downside of this approach is while delaying termination, the pod phase will remain in the `Terminating` state. That means a pod that is delaying termination for a very long duration may show `Terminating` during that entire period of delay. -### Run as jobs +#### Run as jobs The other alternative to handling long-running executions is by running the event driven code in Kubernetes Jobs instead of Deployments or Custom Resources. This approach is discussed [in the next section](../scaling-jobs). diff --git a/content/docs/2.15/concepts/scaling-jobs.md b/content/docs/2.15/concepts/scaling-jobs.md index 9a5c58595..77d96a871 100644 --- a/content/docs/2.15/concepts/scaling-jobs.md +++ b/content/docs/2.15/concepts/scaling-jobs.md @@ -3,8 +3,10 @@ title = "Scaling Jobs" weight = 300 +++ +This page describes the job scaling behavior of KEDA. See the [Scaled Job specification](../reference/scaledjob-spec.md) for details on how to set the behaviors described below. -## Overview + +# Overview As an alternate to [scaling event-driven code as deployments](../scaling-deployments) you can also run and scale your code as Kubernetes Jobs. The primary reason to consider this option is to handle processing long-running executions. 
Rather than processing multiple events within a deployment, for each detected event a single Kubernetes Job is scheduled. That job will initialize, pull a single event from the message source, and process to completion and terminate. @@ -16,250 +18,33 @@ For example, if you wanted to use KEDA to run a job for each message that lands 1. As additional messages arrive, additional jobs are created. Each job processes a single message to completion. 1. Periodically remove completed/failed job by the `SuccessfulJobsHistoryLimit` and `FailedJobsHistoryLimit.` -## ScaledJob spec - -This specification describes the `ScaledJob` custom resource definition which is used to define how KEDA should scale your application and what the triggers are. - -[`scaledjob_types.go`](https://github.com/kedacore/keda/blob/main/apis/keda/v1alpha1/scaledjob_types.go) - -```yaml -apiVersion: keda.sh/v1alpha1 -kind: ScaledJob -metadata: - name: {scaled-job-name} - labels: - my-label: {my-label-value} # Optional. ScaledJob labels are applied to child Jobs - annotations: - autoscaling.keda.sh/paused: true # Optional. Use to pause autoscaling of Jobs - my-annotation: {my-annotation-value} # Optional. ScaledJob annotations are applied to child Jobs -spec: - jobTargetRef: - parallelism: 1 # [max number of desired pods](https://kubernetes.io/docs/concepts/workloads/controllers/job/#controlling-parallelism) - completions: 1 # [desired number of successfully finished pods](https://kubernetes.io/docs/concepts/workloads/controllers/job/#controlling-parallelism) - activeDeadlineSeconds: 600 # Specifies the duration in seconds relative to the startTime that the job may be active before the system tries to terminate it; value must be positive integer - backoffLimit: 6 # Specifies the number of retries before marking this job failed. Defaults to 6 - template: - # describes the [job template](https://kubernetes.io/docs/concepts/workloads/controllers/job) - pollingInterval: 30 # Optional. 
Default: 30 seconds - successfulJobsHistoryLimit: 5 # Optional. Default: 100. How many completed jobs should be kept. - failedJobsHistoryLimit: 5 # Optional. Default: 100. How many failed jobs should be kept. - envSourceContainerName: {container-name} # Optional. Default: .spec.JobTargetRef.template.spec.containers[0] - minReplicaCount: 10 # Optional. Default: 0 - maxReplicaCount: 100 # Optional. Default: 100 - rolloutStrategy: gradual # Deprecated: Use rollout.strategy instead (see below). - rollout: - strategy: gradual # Optional. Default: default. Which Rollout Strategy KEDA will use. - propagationPolicy: foreground # Optional. Default: background. Kubernetes propagation policy for cleaning up existing jobs during rollout. - scalingStrategy: - strategy: "custom" # Optional. Default: default. Which Scaling Strategy to use. - customScalingQueueLengthDeduction: 1 # Optional. A parameter to optimize custom ScalingStrategy. - customScalingRunningJobPercentage: "0.5" # Optional. A parameter to optimize custom ScalingStrategy. - pendingPodConditions: # Optional. A parameter to calculate pending job count per the specified pod conditions - - "Ready" - - "PodScheduled" - - "AnyOtherCustomPodCondition" - multipleScalersCalculation : "max" # Optional. Default: max. Specifies how to calculate the target metrics when multiple scalers are defined. - triggers: - # {list of triggers to create jobs} -``` - -You can find all supported triggers [here](../scalers). - -## Details - -```yaml - jobTargetRef: - parallelism: 1 # Optional. Max number of desired instances ([docs](https://kubernetes.io/docs/concepts/workloads/controllers/job/#controlling-parallelism)) - completions: 1 # Optional. Desired number of successfully finished instances ([docs](https://kubernetes.io/docs/concepts/workloads/controllers/job/#controlling-parallelism)) - activeDeadlineSeconds: 600 # Optional. 
Specifies the duration in seconds relative to the startTime that the job may be active before the system tries to terminate it; value must be positive integer - backoffLimit: 6 # Optional. Specifies the number of retries before marking this job failed. Defaults to 6 -``` - -The `jobTargetRef` is a batch/v1 `JobSpec` object; refer to the Kubernetes API for [more details](https://kubernetes.io/docs/reference/kubernetes-api/workload-resources/job-v1/#JobSpec) about the fields. The `template` field is required. - ---- - -```yaml - pollingInterval: 30 # Optional. Default: 30 seconds -``` - -This is the interval to check each trigger on. By default, KEDA will check each trigger source on every ScaledJob every 30 seconds. - ---- - -```yaml - successfulJobsHistoryLimit: 5 # Optional. Default: 100. How many completed jobs should be kept. - failedJobsHistoryLimit: 5 # Optional. Default: 100. How many failed jobs should be kept. -``` - -The `successfulJobsHistoryLimit` and `failedJobsHistoryLimit` fields are optional. These fields specify how many completed and failed jobs should be kept. By default, they are set to 100. - -This concept is similar to [Jobs History Limits](https://kubernetes.io/docs/tasks/job/automated-tasks-with-cron-jobs/#jobs-history-limits) allowing you to learn what the outcomes of your jobs are. - -The actual number of jobs could exceed the limit in a short time. However, it is going to resolve in the cleanup period. Currently, the cleanup period is the same as the Polling interval. - ---- - - -```yaml - envSourceContainerName: {container-name} # Optional. Default: .spec.JobTargetRef.template.spec.containers[0] -``` - -This optional property specifies the name of container in the Job, from which KEDA should try to get environment properties holding secrets etc. If it is not defined it, KEDA will try to get environment properties from the first Container, ie. from `.spec.JobTargetRef.template.spec.containers[0]`. 
- -___ -```yaml - minReplicaCount: 10 # Optional. Default: 0 -``` - -The min number of jobs that is created by default. This can be useful to avoid bootstrapping time of new jobs. If minReplicaCount is greater than maxReplicaCount, minReplicaCount will be set to maxReplicaCount. - -New messages may create new jobs - within the limits imposed by maxReplicaCount - in order to reach the state where minReplicaCount jobs are always running. For example, if one sets minReplicaCount to 2 then there will be 2 jobs running permanently. Using a targetValue of 1, if 3 new messages are sent, 2 of those messages will be processed on the already running jobs but another 3 jobs will be created in order to fulfill the desired state dictated by the minReplicaCount parameter that is set to 2. -___ - ---- - -```yaml - maxReplicaCount: 100 # Optional. Default: 100 -``` - -The max number of pods that is created within a single polling period. If there are running jobs, the number of running jobs will be deducted. This table is an example of the scaling logic. - -| Queue Length | Max Replica Count | Target Average Value | Running Job Count | Number of the Scale | -| ------- | ------ | ------- | ------ | ----- | -| 10 | 3 | 1 | 0 | 3 | -| 10 | 3 | 2 | 0 | 3 | -| 10 | 3 | 1 | 1 | 2 | -| 10 | 100 | 1 | 0 | 10 | -| 4 | 3 | 5 | 0 | 1 | - -* **Queue Length:** The number of items in the queue. -* **Target Average Value:** The number of messages that will be consumed on a job. It is defined on the scaler side. e.g. `queueLength` on `Azure Storage Queue` scaler. -* **Running Job Count:** How many jobs are running. -* **Number of the Scale:** The number of the job that is created. - ---- - -```yaml - rollout: - strategy: gradual # Optional. Default: default. Which Rollout Strategy KEDA will use. - propagationPolicy: foreground # Optional. Default: background. 
Kubernetes propagation policy for cleaning up existing jobs during -``` - -The optional property rollout.strategy specifies the rollout strategy KEDA will use while updating an existing ScaledJob. -Possible values are `default` or `gradual`. \ -When using the `default` rolloutStrategy, KEDA will terminate existing Jobs whenever a ScaledJob is being updated. Then, it will recreate those Jobs with the latest specs. The order in which this termination happens can be configured via the rollout.propagationPolicy property. By default, the kubernetes background propagation is used. To change this behavior specify set propagationPolicy to `foreground`. For further information see [Kubernetes Documentation](https://kubernetes.io/docs/tasks/administer-cluster/use-cascading-deletion/#use-foreground-cascading-deletion). -On the `gradual` rolloutStartegy, whenever a ScaledJob is being updated, KEDA will not delete existing Jobs. Only new Jobs will be created with the latest specs. - - ---- - -```yaml -scalingStrategy: - strategy: "default" # Optional. Default: default. Which Scaling Strategy to use. -``` -Select a Scaling Strategy. Possible values are `default`, `custom`, or `accurate`. The default value is `default`. +# Pausing autoscaling -> 💡 **NOTE:** -> ->`maxScale` is not the running Job count. It is measured as follows: - >```go - >maxScale = min(scaledJob.MaxReplicaCount(), divideWithCeil(queueLength, targetAverageValue)) - >``` - >That means it will use the value of `queueLength` divided by `targetAvarageValue` unless it is exceeding the `MaxReplicaCount`. -> ->`RunningJobCount` represents the number of jobs that are currently running or have not finished yet. -> ->It is measured as follows: ->```go ->if !e.isJobFinished(&job) { -> runningJobs++ ->} ->``` ->`PendingJobCount` provides an indication of the amount of jobs that are in pending state. 
Pending jobs can be calculated in two ways:
-> - Default behavior - Job that have not finished yet **and** the underlying pod is either not running or has not been completed yet
-> - Setting `pendingPodConditions` - Job that has not finished yet **and** all specified pod conditions of the underlying pod mark as `true` by kubernetes.
->
->It is measured as follows:
->```go
->if !e.isJobFinished(&job) {
->  if len(scaledJob.Spec.ScalingStrategy.PendingPodConditions) > 0 {
->    if !e.areAllPendingPodConditionsFulfilled(&job, scaledJob.Spec.ScalingStrategy.PendingPodConditions) {
->      pendingJobs++
->    }
->  } else {
->    if !e.isAnyPodRunningOrCompleted(&job) {
->      pendingJobs++
->    }
->  }
->}
->```
+It can be useful to instruct KEDA to pause the autoscaling of objects, to perform cluster maintenance or to avoid resource starvation by removing non-mission-critical workloads.
-**default**
-This logic is the same as Job for V1. The number of the scale will be calculated as follows.
+This is preferable to deleting the resource because it takes the running instances out of operation without touching the applications themselves. When ready, you can then re-enable scaling.
-_The number of the scale_
-
-```go
-maxScale - runningJobCount
-```
-
-**custom**
-You can customize the default scale logic. You need to configure the following parameters. If you don't configure it, then the strategy will be `default.`
+You can pause autoscaling by adding this annotation to your `ScaledJob` definition:
 ```yaml
-customScalingQueueLengthDeduction: 1 # Optional. A parameter to optimize custom ScalingStrategy.
-``` - -_The number of the scale_ - -```go -min(maxScale-int64(*s.CustomScalingQueueLengthDeduction)-int64(float64(runningJobCount)*(*s.CustomScalingRunningJobPercentage)), maxReplicaCount) -``` - -**accurate** -If the scaler returns `queueLength` (number of items in the queue) that does not include the number of locked messages, this strategy is recommended. `Azure Storage Queue` is one example. You can use this strategy if you delete a message once your app consumes it. - -```go -if (maxScale + runningJobCount) > maxReplicaCount { - return maxReplicaCount - runningJobCount - } - return maxScale - pendingJobCount -``` -For more details, you can refer to [this PR](https://github.com/kedacore/keda/pull/1227). - ---- - -```yaml -scalingStrategy: - multipleScalersCalculation : "max" # Optional. Default: max. Specifies how to calculate the target metrics (`queueLength` and `maxScale`) when multiple scalers are defined. +metadata: + annotations: + autoscaling.keda.sh/paused: true ``` -Select a behavior if you have multiple triggers. Possible values are `max`, `min`, `avg`, or `sum`. The default value is `max`. - -* **max:** - Use metrics from the scaler that has the max number of `queueLength`. (default) -* **min:** - Use metrics from the scaler that has the min number of `queueLength`. -* **avg:** - Sum up all the active scalers metrics and divide by the number of active scalers. -* **sum:** - Sum up all the active scalers metrics. -### Pause autoscaling - -It can be useful to instruct KEDA to pause the autoscaling of objects, if you want to do to cluster maintenance or you want to avoid resource starvation by removing non-mission-critical workloads. - -This is a great alternative to deleting the resource, because we do not want to touch the applications themselves but simply remove the instances it is running from an operational perspective. Once everything is good to go, we can enable it to scale again. 
-
-You can enable this by adding the below annotation to your `ScaledJob` definition:
+To re-enable autoscaling, remove the annotation from the `ScaledJob` definition or set the value to `false`.
 
 ```yaml
 metadata:
   annotations:
-    autoscaling.keda.sh/paused: true
+    autoscaling.keda.sh/paused: false
 ```
 
-The above annotation will pause autoscaling. To enable autoscaling again, simply remove the annotation from the `ScaledJob` definition or set the value to `false`.
-# Sample
+## Example
+
+The following example configures job autoscaling with a RabbitMQ scaler.
 
 ```yaml
 apiVersion: v1
diff --git a/content/docs/2.15/deploy.md b/content/docs/2.15/deploy.md
index 0ebec231f..f2527a555 100644
--- a/content/docs/2.15/deploy.md
+++ b/content/docs/2.15/deploy.md
@@ -16,7 +16,7 @@ Don't see what you need? Feel free to [create an issue](https://github.com/kedac
 
 ### Install
 
-Deploying KEDA with Helm is very simple:
+To deploy KEDA with Helm:
 
 1. Add Helm repo
 
@@ -147,7 +147,7 @@ VERSION=2.15.0 make undeploy
 
 ### Install
 
-If you want to try KEDA v2 on [MicroK8s](https://microk8s.io/) from `1.20` channel, KEDA is included into MicroK8s addons.
+If you want to try KEDA v2 on [MicroK8s](https://microk8s.io/), KEDA is included in the MicroK8s add-ons from the `1.20` channel.
 
 ```sh
 microk8s enable keda
@@ -155,7 +155,7 @@ microk8s enable keda
 
 ### Uninstall
 
-To uninstall KEDA in MicroK8s, simply disable the addon as shown below.
+To uninstall KEDA in MicroK8s, disable the add-on as shown below.
```sh microk8s disable keda diff --git a/content/docs/2.15/operate/_index.md b/content/docs/2.15/operate/_index.md index e7604a0cf..eaae2df34 100644 --- a/content/docs/2.15/operate/_index.md +++ b/content/docs/2.15/operate/_index.md @@ -1,10 +1,10 @@ +++ title = "Operate" -description = "Guidance & requirements for operating KEDA" +description = "Guidance and requirements for operating KEDA" weight = 1 +++ -We provide guidance & requirements around various areas to operate KEDA: +We provide guidance and requirements around various areas to operate KEDA: - Admission Webhooks ([link](./admission-webhooks)) - Cluster ([link](./cluster)) diff --git a/content/docs/2.15/operate/events.md b/content/docs/2.15/operate/cloud-events.md similarity index 50% rename from content/docs/2.15/operate/events.md rename to content/docs/2.15/operate/cloud-events.md index a4bf08ff1..c022bf1c9 100644 --- a/content/docs/2.15/operate/events.md +++ b/content/docs/2.15/operate/cloud-events.md @@ -1,38 +1,10 @@ +++ -title = "Events" -description = "Kubernetes Events emitted by KEDA" +title = "CloudEvent Support" +description = "Experimental support for cloud events" weight = 100 +++ -## Kubernetes Events emitted by KEDA - -KEDA emits the following [Kubernetes Events](https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.19/#event-v1-core): - -| Event | Type | Description | -| ------------------------------------- | --------- | --------------------------------------------------------------------------------------------------------------------------- | -| `ScaledObjectReady` | `Normal` | On the first time a ScaledObject is ready, or if the previous ready condition status of the object was `Unknown` or `False` | -| `ScaledJobReady` | `Normal` | On the first time a ScaledJob is ready, or if the previous ready condition status of the object was `Unknown` or `False` | -| `ScaledObjectCheckFailed` | `Warning` | If the check validation for a ScaledObject fails | | -| `ScaledJobCheckFailed` 
| `Warning` | If the check validation for a ScaledJob fails | | -| `ScaledObjectDeleted` | `Normal` | When a ScaledObject is deleted and removed from KEDA watch | | -| `ScaledJobDeleted` | `Normal` | When a ScaledJob is deleted and removed from KEDA watch | | -| `KEDAScalersStarted` | `Normal` | When Scalers watch loop have started for a ScaledObject or ScaledJob | | -| `KEDAScalersStopped` | `Normal` | When Scalers watch loop have stopped for a ScaledObject or a ScaledJob | | -| `KEDAScalerFailed` | `Warning` | When a Scaler fails to create or check its event source| | -| `KEDAScaleTargetActivated` | `Normal` | When the scale target (Deployment, StatefulSet, etc) of a ScaledObject is scaled to 1, triggered by {scalers1;scalers2;...}| | -| `KEDAScaleTargetDeactivated` | `Normal` | When the scale target (Deployment, StatefulSet, etc) of a ScaledObject is scaled to 0 | | -| `KEDAScaleTargetActivationFailed` | `Warning` | When KEDA fails to scale the scale target of a ScaledObject to 1| | -| `KEDAScaleTargetDeactivationFailed` | `Warning` | When KEDA fails to scale the scale target of a ScaledObject to 0| | -| `KEDAJobsCreated` | `Normal` | When KEDA creates jobs for a ScaledJob | | -| `TriggerAuthenticationAdded` | `Normal` | When a new TriggerAuthentication is added| | -| `TriggerAuthenticationDeleted` | `Normal` | When a TriggerAuthentication is deleted| | -| `ClusterTriggerAuthenticationAdded` | `Normal` | When a new ClusterTriggerAuthentication is added| | -| `ClusterTriggerAuthenticationDeleted` | `Normal` | When a ClusterTriggerAuthentication is deleted| | - - -## CloudEvent Support (Experimental) - -### Subscribing to events with `CloudEventSource` +## Subscribing to events with `CloudEventSource` `CloudEventSource` resource can be used in KEDA for subscribing to events that are emitted to the user's defined CloudEvent sink. > 📝 Event will be emitted to both Kubernetes Events and CloudEvents Destination if CloudEventSource resource is created. 
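As a sketch, a minimal `CloudEventSource` pointing at an HTTP sink might look like the following. The `apiVersion`, resource names, and the optional `clusterName` field are assumptions here; the `destination` block mirrors the sink examples on this page:

```yaml
# Hedged sketch of a CloudEventSource that forwards KEDA events to an HTTP sink.
apiVersion: eventing.keda.sh/v1alpha1   # assumed API group/version
kind: CloudEventSource
metadata:
  name: keda-cloudevents                # illustrative name
  namespace: keda
spec:
  clusterName: my-cluster               # optional; identifies the emitting cluster
  destination:
    http:
      uri: http://foo.bar               # an HTTP endpoint that can receive CloudEvents
```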
@@ -79,7 +51,7 @@ In general, an event emitted by KEDA would fundamentally come down to the follow } ``` -### Event Sinks +## Event Sinks There will be multiple types of destination to emit KEDA events to. @@ -88,14 +60,14 @@ Here is an overview of the supported destinations: - [HTTP endpoint](#http-endpoint). - [Azure Event Grid endpoint](#azure-event-grid). -#### HTTP endpoint +### HTTP endpoint ```yaml destination: http: uri: http://foo.bar #An http endpoint that can receive cloudevent ``` -#### Azure Event Grid +### Azure Event Grid ```yaml destination: @@ -107,11 +79,11 @@ Authentication information must be provided by using `authenticationRef` which a Here is an overview of the supported authentication types: -##### Connection String Authentication +#### Connection String Authentication - `accessKey` - Access key string for the Azure Event Grid connection auth. -##### Pod identity based authentication +#### Pod identity based authentication [Azure AD Workload Identity](https://azure.github.io/azure-workload-identity/docs/) providers can be used. ```yaml @@ -125,7 +97,7 @@ spec: provider: azure-workload ``` -### Event Filter +## Event Filter You can include filter(s) to define what event types you are interested in, or want to ignore. This is done by using `includedEventTypes` or `excludedEventTypes` respectively for a given sink. @@ -137,8 +109,8 @@ eventSubscription: #Optional. 
Submit included/excluded event types will filter e - keda.scaledobject.ready.v1 ``` -### Supported Event List +## Supported Event List | Event Type | Scenario Description | | ------------------------------------- | --------------------------------------------------------------------------------------------------------------------------- | | `keda.scaledobject.ready.v1` | On the first time a ScaledObject is ready, or if the previous ready condition status of the object was `Unknown` or `False` | -| `keda.scaledobject.failed.v1` | If the check validation for a ScaledObject fails | \ No newline at end of file +| `keda.scaledobject.failed.v1` | If the check validation for a ScaledObject fails | diff --git a/content/docs/2.15/operate/metrics-server.md b/content/docs/2.15/operate/metrics-server.md index 6977e990e..45e9699fa 100644 --- a/content/docs/2.15/operate/metrics-server.md +++ b/content/docs/2.15/operate/metrics-server.md @@ -11,7 +11,7 @@ The metrics exposed by KEDA Metrics Server can be queried directly using `kubect kubectl get --raw "/apis/external.metrics.k8s.io/v1beta1" ``` -This will return a json with the list of metrics exposed by KEDA (just an external metric): +This will return a json with the list of metrics exposed by KEDA (external metrics only): ```json { "kind": "APIResourceList", diff --git a/content/docs/2.15/reference/_index.md b/content/docs/2.15/reference/_index.md new file mode 100644 index 000000000..459669eaa --- /dev/null +++ b/content/docs/2.15/reference/_index.md @@ -0,0 +1,13 @@ ++++ +title = "Reference" +weight = 2 ++++ + +Reference information for the KEDA autoscaler. 
+
+- [ScaledObject specification](./scaledobject-spec)
+- [ScaledJob specification](./scaledjob-spec)
+- [Events](./events)
+- [Firewall requirements]
+- [FAQ](./faq)
+- [Glossary](./glossary)
diff --git a/content/docs/2.15/reference/events.md b/content/docs/2.15/reference/events.md
new file mode 100644
index 000000000..ab1bd2b25
--- /dev/null
+++ b/content/docs/2.15/reference/events.md
@@ -0,0 +1,28 @@
++++
+title = "Events reference"
+description = "Kubernetes Events emitted by KEDA"
+weight = 2500
++++
+
+KEDA emits the following [Kubernetes Events](https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.19/#event-v1-core):
+
+| Event | Type | Description |
+| ------------------------------------- | --------- | --------------------------------------------------------------------------------------------------------------------------- |
+| `ScaledObjectReady` | `Normal` | On the first time a ScaledObject is ready, or if the previous ready condition status of the object was `Unknown` or `False` |
+| `ScaledJobReady` | `Normal` | On the first time a ScaledJob is ready, or if the previous ready condition status of the object was `Unknown` or `False` |
+| `ScaledObjectCheckFailed` | `Warning` | If the check validation for a ScaledObject fails |
+| `ScaledJobCheckFailed` | `Warning` | If the check validation for a ScaledJob fails |
+| `ScaledObjectDeleted` | `Normal` | When a ScaledObject is deleted and removed from KEDA watch |
+| `ScaledJobDeleted` | `Normal` | When a ScaledJob is deleted and removed from KEDA watch |
+| `KEDAScalersStarted` | `Normal` | When the scaler watch loop has started for a ScaledObject or ScaledJob |
+| `KEDAScalersStopped` | `Normal` | When the scaler watch loop has stopped for a ScaledObject or a ScaledJob |
+| `KEDAScalerFailed` | `Warning` | When a Scaler fails to create or check its event source |
+| `KEDAScaleTargetActivated` | `Normal` | When the scale target (Deployment, StatefulSet, etc.) of a
ScaledObject is scaled to 1, triggered by {scalers1;scalers2;...} |
+| `KEDAScaleTargetDeactivated` | `Normal` | When the scale target (Deployment, StatefulSet, etc.) of a ScaledObject is scaled to 0 |
+| `KEDAScaleTargetActivationFailed` | `Warning` | When KEDA fails to scale the scale target of a ScaledObject to 1 |
+| `KEDAScaleTargetDeactivationFailed` | `Warning` | When KEDA fails to scale the scale target of a ScaledObject to 0 |
+| `KEDAJobsCreated` | `Normal` | When KEDA creates jobs for a ScaledJob |
+| `TriggerAuthenticationAdded` | `Normal` | When a new TriggerAuthentication is added |
+| `TriggerAuthenticationDeleted` | `Normal` | When a TriggerAuthentication is deleted |
+| `ClusterTriggerAuthenticationAdded` | `Normal` | When a new ClusterTriggerAuthentication is added |
+| `ClusterTriggerAuthenticationDeleted` | `Normal` | When a ClusterTriggerAuthentication is deleted |
diff --git a/content/docs/2.15/faq.md b/content/docs/2.15/reference/faq.md
similarity index 72%
rename from content/docs/2.15/faq.md
rename to content/docs/2.15/reference/faq.md
index d012d6767..df75bbe78 100644
--- a/content/docs/2.15/faq.md
+++ b/content/docs/2.15/reference/faq.md
@@ -1,5 +1,6 @@
 +++
 title = "FAQ"
+weight = 2000
 +++
 
 {{< faq20 >}}
diff --git a/content/docs/2.15/reference/glossary.md b/content/docs/2.15/reference/glossary.md
new file mode 100644
index 000000000..fa8ecf689
--- /dev/null
+++ b/content/docs/2.15/reference/glossary.md
@@ -0,0 +1,88 @@
++++
+title = "Glossary"
+weight = 1000
++++
+
+This glossary defines the terms you need to understand the documentation and to set up and use KEDA.
+
+## Admission Webhook
+
+[In Kubernetes](https://kubernetes.io/docs/reference/access-authn-authz/extensible-admission-controllers/), an HTTP callback that handles admission requests. KEDA uses an admission webhook to validate and mutate ScaledObject resources.
+
+## Agent
+
+A primary role held by the KEDA operator.
The Agent activates and deactivates Kubernetes Deployments to scale to and from zero.
+
+## Cluster
+
+[In Kubernetes](https://kubernetes.io/docs/reference/glossary/?fundamental=true#term-cluster), a set of one or more nodes that run containerized applications.
+
+## CRD
+
+Custom Resource Definition. [In Kubernetes](https://kubernetes.io/docs/reference/glossary/?fundamental=true#term-CustomResourceDefinition), a mechanism for extending the Kubernetes API with custom resources, such as ScaledObjects, that have custom fields and behavior.
+
+## Event
+
+A notable occurrence captured by an event source that KEDA may use as a trigger to scale a container or deployment.
+
+## Event Source
+
+An external system, such as Kafka or RabbitMQ, that generates events that KEDA can monitor using a scaler.
+
+## Grafana
+
+An open-source monitoring platform that can visualize metrics collected by KEDA.
+
+## gRPC
+
+gRPC Remote Procedure Calls. An open-source, high-performance remote procedure call framework that KEDA components use to communicate.
+
+## HPA
+
+Horizontal Pod Autoscaler. The built-in Kubernetes autoscaler. By default, it scales based on CPU and memory usage. KEDA uses the HPA to scale workloads such as Deployments and StatefulSets.
+
+## KEDA
+
+Kubernetes Event-Driven Autoscaling. A single-purpose, lightweight autoscaler that can scale a Kubernetes workload based on event metrics.
+
+## Metric
+
+A measurement of an event source, such as queue length or response lag, that KEDA uses to determine scaling.
+
+## OpenTelemetry
+
+An observability framework that KEDA can use to export metrics.
+
+## Operator
+
+The core KEDA component that monitors event sources and scales workloads accordingly.
+
+## Prometheus
+
+An open-source monitoring system that can scrape and store metrics from KEDA.
+
+## Scaled Object
+
+A custom resource that defines how KEDA should scale a workload based on events.
+
+## Scaled Job
+
+A custom resource that defines how KEDA should create and scale Kubernetes Jobs based on events.
+
+## Scaler
+
+A component that integrates KEDA with a specific event source to collect metrics.
+
+## Stateful Set
+
+A Kubernetes workload resource for applications that require persistent state. KEDA can scale stateful sets.
+
+## TLS
+
+Transport Layer Security. KEDA uses TLS to encrypt communications between KEDA components.
+
+## Webhook
+
+[In Kubernetes](https://kubernetes.io/docs/reference/access-authn-authz/webhook/), an HTTP callback used as an event notification mechanism. External event sources can use webhooks to notify KEDA of events.
diff --git a/content/docs/2.15/reference/scaledjob-spec.md b/content/docs/2.15/reference/scaledjob-spec.md
new file mode 100644
index 000000000..1b31fc641
--- /dev/null
+++ b/content/docs/2.15/reference/scaledjob-spec.md
@@ -0,0 +1,236 @@
++++
+title = "ScaledJob specification"
+weight = 4000
++++
+
+## Overview
+
+This specification describes the `ScaledJob` custom resource definition that defines the triggers and scaling behaviors used by KEDA to scale jobs. The `.spec.jobTargetRef` section holds the reference to the job; the resource is defined in [_scaledjob_types.go_](https://github.com/kedacore/keda/blob/main/apis/keda/v1alpha1/scaledjob_types.go).
+
+```yaml
+apiVersion: keda.sh/v1alpha1
+kind: ScaledJob
+metadata:
+  name: {scaled-job-name}
+  labels:
+    my-label: {my-label-value} # Optional. ScaledJob labels are applied to child Jobs
+  annotations:
+    autoscaling.keda.sh/paused: true # Optional. Use to pause autoscaling of Jobs
+    my-annotation: {my-annotation-value} # Optional.
ScaledJob annotations are applied to child Jobs +spec: + jobTargetRef: + parallelism: 1 # [max number of desired pods](https://kubernetes.io/docs/concepts/workloads/controllers/job/#controlling-parallelism) + completions: 1 # [desired number of successfully finished pods](https://kubernetes.io/docs/concepts/workloads/controllers/job/#controlling-parallelism) + activeDeadlineSeconds: 600 # Specifies the duration in seconds relative to the startTime that the job may be active before the system tries to terminate it; value must be positive integer + backoffLimit: 6 # Specifies the number of retries before marking this job failed. Defaults to 6 + template: + # describes the [job template](https://kubernetes.io/docs/concepts/workloads/controllers/job) + pollingInterval: 30 # Optional. Default: 30 seconds + successfulJobsHistoryLimit: 5 # Optional. Default: 100. How many completed jobs should be kept. + failedJobsHistoryLimit: 5 # Optional. Default: 100. How many failed jobs should be kept. + envSourceContainerName: {container-name} # Optional. Default: .spec.JobTargetRef.template.spec.containers[0] + minReplicaCount: 10 # Optional. Default: 0 + maxReplicaCount: 100 # Optional. Default: 100 + rolloutStrategy: gradual # Deprecated: Use rollout.strategy instead (see below). + rollout: + strategy: gradual # Optional. Default: default. Which Rollout Strategy KEDA will use. + propagationPolicy: foreground # Optional. Default: background. Kubernetes propagation policy for cleaning up existing jobs during rollout. + scalingStrategy: + strategy: "custom" # Optional. Default: default. Which Scaling Strategy to use. + customScalingQueueLengthDeduction: 1 # Optional. A parameter to optimize custom ScalingStrategy. + customScalingRunningJobPercentage: "0.5" # Optional. A parameter to optimize custom ScalingStrategy. + pendingPodConditions: # Optional. 
A parameter to calculate pending job count per the specified pod conditions + - "Ready" + - "PodScheduled" + - "AnyOtherCustomPodCondition" + multipleScalersCalculation : "max" # Optional. Default: max. Specifies how to calculate the target metrics when multiple scalers are defined. + triggers: + # {list of triggers to create jobs} +``` + +You can find all supported triggers [here](../scalers). + +## jobTargetRef + +```yaml + jobTargetRef: + parallelism: 1 # Optional. Max number of desired instances ([docs](https://kubernetes.io/docs/concepts/workloads/controllers/job/#controlling-parallelism)) + completions: 1 # Optional. Desired number of successfully finished instances ([docs](https://kubernetes.io/docs/concepts/workloads/controllers/job/#controlling-parallelism)) + activeDeadlineSeconds: 600 # Optional. Specifies the duration in seconds relative to the startTime that the job may be active before the system tries to terminate it; value must be positive integer + backoffLimit: 6 # Optional. Specifies the number of retries before marking this job failed. Defaults to 6 +``` + +The `jobTargetRef` is a batch/v1 `JobSpec` object; refer to the Kubernetes API for [more details](https://kubernetes.io/docs/reference/kubernetes-api/workload-resources/job-v1/#JobSpec) about the fields. The `template` field is required. + + +## pollingInterval + +```yaml + pollingInterval: 30 # Optional. Default: 30 seconds +``` + +This is the interval to check each trigger on. By default, KEDA will check each trigger source on every ScaledJob every 30 seconds. + + +## successfulJobsHistoryLimit, failedJobsHistoryLimit + +```yaml + successfulJobsHistoryLimit: 5 # Optional. Default: 100. How many completed jobs should be kept. + failedJobsHistoryLimit: 5 # Optional. Default: 100. How many failed jobs should be kept. +``` + +The `successfulJobsHistoryLimit` and `failedJobsHistoryLimit` fields are optional. These fields specify how many completed and failed jobs should be kept. 
By default, they are set to 100.
+
+This concept is similar to [Jobs History Limits](https://kubernetes.io/docs/tasks/job/automated-tasks-with-cron-jobs/#jobs-history-limits), allowing you to learn what the outcomes of your jobs are.
+
+The actual number of jobs could exceed the limit for a short time. However, this resolves itself during the cleanup period. Currently, the cleanup period is the same as the polling interval.
+
+
+## envSourceContainerName
+
+```yaml
+  envSourceContainerName: {container-name} # Optional. Default: .spec.JobTargetRef.template.spec.containers[0]
+```
+
+This optional property specifies the name of the container in the Job from which KEDA should try to get environment properties holding secrets etc. If it is not defined, KEDA will try to get environment properties from the first container, i.e. from `.spec.JobTargetRef.template.spec.containers[0]`.
+
+## minReplicaCount
+
+```yaml
+  minReplicaCount: 10 # Optional. Default: 0
+```
+
+The minimum number of jobs that are created by default. This can be useful to avoid the bootstrapping time of new jobs. If `minReplicaCount` is greater than `maxReplicaCount`, `minReplicaCount` is set to `maxReplicaCount`.
+
+New messages may create new jobs - within the limits imposed by `maxReplicaCount` - in order to reach the state where `minReplicaCount` jobs are always running. For example, if you set `minReplicaCount` to 2, there will always be 2 jobs running. With a `targetValue` of 1, if 3 new messages are sent, 2 of those messages are processed by the already running jobs, but another 3 jobs are created in order to fulfill the desired state dictated by the `minReplicaCount` parameter that is set to 2.
+
+## maxReplicaCount
+
+```yaml
+  maxReplicaCount: 100 # Optional. Default: 100
+```
+
+The maximum number of jobs created within a single polling period. If there are running jobs, the number of running jobs is deducted. The following table shows an example of the scaling logic.
+
+| Queue Length | Max Replica Count | Target Average Value | Running Job Count | Number of the Scale |
+| ------- | ------ | ------- | ------ | ----- |
+| 10 | 3 | 1 | 0 | 3 |
+| 10 | 3 | 2 | 0 | 3 |
+| 10 | 3 | 1 | 1 | 2 |
+| 10 | 100 | 1 | 0 | 10 |
+| 4 | 3 | 5 | 0 | 1 |
+
+* **Queue Length:** The number of items in the queue.
+* **Target Average Value:** The number of messages consumed by a single job. It is defined on the scaler side, e.g. `queueLength` on the `Azure Storage Queue` scaler.
+* **Running Job Count:** How many jobs are currently running.
+* **Number of the Scale:** The number of jobs that will be created.
+
+
+## rollout
+
+```yaml
+  rollout:
+    strategy: gradual # Optional. Default: default. Which Rollout Strategy KEDA will use.
+    propagationPolicy: foreground # Optional. Default: background. Kubernetes propagation policy for cleaning up existing jobs during rollout.
+```
+
+The optional property `rollout.strategy` specifies the rollout strategy KEDA will use while updating an existing ScaledJob. Possible values are `default` or `gradual`.
+
+When using the `default` rollout strategy, KEDA terminates existing Jobs whenever a ScaledJob is updated, then recreates those Jobs with the latest specs. The order in which this termination happens can be configured via the `rollout.propagationPolicy` property. By default, the Kubernetes background propagation is used; to change this behavior, set `propagationPolicy` to `foreground`. For further information, see the [Kubernetes Documentation](https://kubernetes.io/docs/tasks/administer-cluster/use-cascading-deletion/#use-foreground-cascading-deletion).
+
+With the `gradual` rollout strategy, KEDA does not delete existing Jobs when a ScaledJob is updated. Only new Jobs are created with the latest specs.
+
+
+## scalingStrategy
+
+```yaml
+scalingStrategy:
+  strategy: "default" # Optional. Default: default. Which Scaling Strategy to use.
+```
+
+Select a Scaling Strategy.
Possible values are `default`, `custom`, or `accurate`. The default value is `default`.
+
+> 💡 **NOTE:**
+>
+>`maxScale` is not the running Job count. It is measured as follows:
+>```go
+>maxScale = min(scaledJob.MaxReplicaCount(), divideWithCeil(queueLength, targetAverageValue))
+>```
+>That means it will use the value of `queueLength` divided by `targetAverageValue` unless it exceeds `MaxReplicaCount`.
+>
+>`RunningJobCount` represents the number of jobs that are currently running or have not finished yet.
+>
+>It is measured as follows:
+>```go
+>if !e.isJobFinished(&job) {
+>    runningJobs++
+>}
+>```
+>`PendingJobCount` provides an indication of the number of jobs that are in a pending state. Pending jobs can be calculated in two ways:
+> - Default behavior - Jobs that have not finished yet **and** the underlying pod is either not running or has not been completed yet
+> - Setting `pendingPodConditions` - Jobs that have not finished yet **and** all specified pod conditions of the underlying pod are marked as `true` by Kubernetes.
+>
+>It is measured as follows:
+>```go
+>if !e.isJobFinished(&job) {
+>    if len(scaledJob.Spec.ScalingStrategy.PendingPodConditions) > 0 {
+>        if !e.areAllPendingPodConditionsFulfilled(&job, scaledJob.Spec.ScalingStrategy.PendingPodConditions) {
+>            pendingJobs++
+>        }
+>    } else {
+>        if !e.isAnyPodRunningOrCompleted(&job) {
+>            pendingJobs++
+>        }
+>    }
+>}
+>```
+
+**default**
+This logic is the same as the Job scaling in V1. The scale number is calculated as follows.
+
+_The number of the scale_
+
+```go
+maxScale - runningJobCount
+```
+
+**custom**
+You can customize the default scale logic. You need to configure the following parameters. If you don't configure them, the strategy falls back to `default`.
+
+```yaml
+customScalingQueueLengthDeduction: 1 # Optional. A parameter to optimize custom ScalingStrategy.
+customScalingRunningJobPercentage: "0.5" # Optional. A parameter to optimize custom ScalingStrategy.
+```
+
+_The number of the scale_
+
+```go
+min(maxScale-int64(*s.CustomScalingQueueLengthDeduction)-int64(float64(runningJobCount)*(*s.CustomScalingRunningJobPercentage)), maxReplicaCount)
+```
+
+**accurate**
+This strategy is recommended if the scaler returns a `queueLength` (number of items in the queue) that does not include the number of locked messages. `Azure Storage Queue` is one example: you can use this strategy if you delete a message once your app consumes it.
+
+```go
+if (maxScale + runningJobCount) > maxReplicaCount {
+    return maxReplicaCount - runningJobCount
+}
+return maxScale - pendingJobCount
+```
+For more details, you can refer to [this PR](https://github.com/kedacore/keda/pull/1227).
+
+
+### multipleScalersCalculation
+
+```yaml
+scalingStrategy:
+  multipleScalersCalculation: "max" # Optional. Default: max. Specifies how to calculate the target metrics (`queueLength` and `maxScale`) when multiple scalers are defined.
+```
+Select a behavior if you have multiple triggers. Possible values are `max`, `min`, `avg`, or `sum`. The default value is `max`.
+
+* **max:** Use metrics from the scaler with the maximum `queueLength`. (default)
+* **min:** Use metrics from the scaler with the minimum `queueLength`.
+* **avg:** Sum the metrics of all active scalers and divide by the number of active scalers.
+* **sum:** Sum the metrics of all active scalers.
diff --git a/content/docs/2.15/reference/scaledobject-spec.md b/content/docs/2.15/reference/scaledobject-spec.md
new file mode 100644
index 000000000..e0dd4e562
--- /dev/null
+++ b/content/docs/2.15/reference/scaledobject-spec.md
@@ -0,0 +1,254 @@
+
++++
+title = "ScaledObject specification"
+weight = 3000
++++
+
+## Overview
+
+This specification describes the `ScaledObject` Custom Resource definition that defines the triggers and scaling behaviors used by KEDA to scale `Deployment`, `StatefulSet` and `Custom Resource` target resources.
The `.spec.ScaleTargetRef` section holds the reference to the target resource, defined in [_scaledobject_types.go_](https://github.com/kedacore/keda/blob/main/apis/keda/v1alpha1/scaledobject_types.go). + +```yaml +apiVersion: keda.sh/v1alpha1 +kind: ScaledObject +metadata: + name: {scaled-object-name} + annotations: + scaledobject.keda.sh/transfer-hpa-ownership: "true" # Optional. Use to transfer an existing HPA ownership to this ScaledObject + validations.keda.sh/hpa-ownership: "true" # Optional. Use to disable HPA ownership validation on this ScaledObject + autoscaling.keda.sh/paused: "true" # Optional. Use to pause autoscaling of objects explicitly +spec: + scaleTargetRef: + apiVersion: {api-version-of-target-resource} # Optional. Default: apps/v1 + kind: {kind-of-target-resource} # Optional. Default: Deployment + name: {name-of-target-resource} # Mandatory. Must be in the same namespace as the ScaledObject + envSourceContainerName: {container-name} # Optional. Default: .spec.template.spec.containers[0] + pollingInterval: 30 # Optional. Default: 30 seconds + cooldownPeriod: 300 # Optional. Default: 300 seconds + idleReplicaCount: 0 # Optional. Default: ignored, must be less than minReplicaCount + minReplicaCount: 1 # Optional. Default: 0 + maxReplicaCount: 100 # Optional. Default: 100 + fallback: # Optional. Section to specify fallback options + failureThreshold: 3 # Mandatory if fallback section is included + replicas: 6 # Mandatory if fallback section is included + advanced: # Optional. Section to specify advanced options + restoreToOriginalReplicaCount: true/false # Optional. Default: false + horizontalPodAutoscalerConfig: # Optional. Section to specify HPA related options + name: {name-of-hpa-resource} # Optional. Default: keda-hpa-{scaled-object-name} + behavior: # Optional. 
Use to modify HPA's scaling behavior
+        scaleDown:
+          stabilizationWindowSeconds: 300
+          policies:
+            - type: Percent
+              value: 100
+              periodSeconds: 15
+  triggers:
+  # {list of triggers to activate scaling of the target resource}
+```
+
+## scaleTargetRef
+
+```yaml
+  scaleTargetRef:
+    apiVersion: {api-version-of-target-resource} # Optional. Default: apps/v1
+    kind: {kind-of-target-resource} # Optional. Default: Deployment
+    name: {name-of-target-resource} # Mandatory. Must be in the same namespace as the ScaledObject
+    envSourceContainerName: {container-name} # Optional. Default: .spec.template.spec.containers[0]
+```
+
+The reference to the resource this ScaledObject is configured for. This is the resource KEDA will scale up/down and set up an HPA for, based on the triggers defined in `triggers:`.
+
+To scale a Kubernetes Deployment, only `name` needs to be specified. To scale a different resource such as a StatefulSet or a Custom Resource (one that defines the `/scale` subresource), the appropriate `apiVersion` (following the standard Kubernetes convention, i.e. `{api}/{version}`) and `kind` need to be specified.
+
+`envSourceContainerName` is an optional property that specifies the name of the container in the target resource from which KEDA should try to get environment properties holding secrets etc. If it is not defined, KEDA will try to get environment properties from the first container, i.e. from `.spec.template.spec.containers[0]`.
+
+**Assumptions:** The resource referenced by `name` (and `apiVersion`, `kind`) is in the same namespace as the ScaledObject.
+
+
+## pollingInterval
+```yaml
+  pollingInterval: 30 # Optional. Default: 30 seconds
+```
+
+This is the interval to check each trigger on. By default, KEDA will check each trigger source on every ScaledObject every 30 seconds.
+
+**Example:** In a queue scenario, KEDA will check the queueLength every `pollingInterval`, and scale the resource up or down accordingly.
+
+
+## cooldownPeriod
+```yaml
+  cooldownPeriod: 300 # Optional.
Default: 300 seconds
+```
+
+The period to wait after the last trigger reported active before scaling the resource back to 0, in seconds. By default, it's 300 (5 minutes).
+
+The `cooldownPeriod` only applies after a trigger occurs; when you first create your `Deployment` (or `StatefulSet`/`CustomResource`), KEDA will immediately scale it to `minReplicaCount`. Additionally, the KEDA `cooldownPeriod` only applies when scaling to 0; scaling from 1 to N replicas is handled by the [Kubernetes Horizontal Pod Autoscaler](https://kubernetes.io/docs/tasks/run-application/horizontal-pod-autoscale/).
+
+**Example:** Wait 5 minutes after the last time KEDA checked the queue and it was empty. (This is, of course, dependent on `pollingInterval`.)
+
+
+## initialCooldownPeriod
+```yaml
+  initialCooldownPeriod: 120 # Optional. Default: 0 seconds
+```
+The delay before the `cooldownPeriod` starts after the initial creation of the `ScaledObject`, in seconds. By default, it's 0, meaning the `cooldownPeriod` begins immediately upon creation. If set to a value such as 120 seconds, the `cooldownPeriod` will only start after the `ScaledObject` has been active for that duration.
+
+This parameter is particularly useful for managing the scale-down behavior during the initial phase of a `ScaledObject`. For instance, if `initialCooldownPeriod` is set to 120 seconds, KEDA will not scale the resource back to 0 until 120 seconds have passed since the `ScaledObject` creation, regardless of the activity triggers. This allows for a grace period in situations where immediate scaling down after creation is not desirable.
+
+**Example:** Wait 120 seconds after the `ScaledObject` is created before starting the `cooldownPeriod`.
(This setting is independent of the `pollingInterval`.)
+
+
+## idleReplicaCount
+
+```yaml
+  idleReplicaCount: 0 # Optional. Default: ignored, must be less than minReplicaCount
+```
+
+> 💡 **NOTE:** Due to limitations in the HPA controller, the only supported value for this property is 0; it will not work correctly otherwise. See this [issue](https://github.com/kedacore/keda/issues/2314) for more details.
+>
+> In some cases, you always need at least `n` pods running. In that case, you can omit this property and set `minReplicaCount` to `n`.
+>
+> **Example:** You set `minReplicaCount` to 1 and `maxReplicaCount` to 10. If there's no activity on triggers, the target resource is scaled down to `minReplicaCount` (1). Once there are activities, the target resource scales based on the HPA rules. If there's no activity on triggers, the resource is again scaled down to `minReplicaCount` (1).
+
+If this property is set, KEDA will scale the resource down to this number of replicas. If there is activity on the target triggers, KEDA immediately scales the target resource to `minReplicaCount`, after which scaling is handled by the HPA. When there is no activity, the target resource is again scaled down to `idleReplicaCount`. This setting must be less than `minReplicaCount`.
+
+**Example:** If there's no activity on triggers, the target resource is scaled down to `idleReplicaCount` (0). Once there is activity, the target resource is immediately scaled to `minReplicaCount` (10) and then up to `maxReplicaCount` (100) as needed. If there's no activity on triggers, the resource is again scaled down to `idleReplicaCount` (0).
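+
+Putting the replica-count fields together, a minimal `ScaledObject` using idle scaling could look like the following sketch (the `Deployment` name and the trigger placeholder are illustrative, not part of the spec):
+
+```yaml
+apiVersion: keda.sh/v1alpha1
+kind: ScaledObject
+metadata:
+  name: queue-worker-scaler        # illustrative name
+spec:
+  scaleTargetRef:
+    name: queue-worker             # illustrative Deployment in the same namespace
+  idleReplicaCount: 0              # scale to 0 while all triggers are inactive
+  minReplicaCount: 10              # scale here as soon as any trigger is active
+  maxReplicaCount: 100             # upper bound enforced by the HPA
+  triggers:
+    # {list of triggers to activate scaling of the target resource}
+```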
+
+
+## minReplicaCount
+
+```yaml
+  minReplicaCount: 1 # Optional. Default: 0
+```
+
+The minimum number of replicas KEDA will scale the resource down to. By default, KEDA scales to zero, but you can use another value as well.
+
+## maxReplicaCount
+
+```yaml
+  maxReplicaCount: 100 # Optional. Default: 100
+```
+This setting is passed to the HPA definition that KEDA will create for a given resource and holds the maximum number of replicas of the target resource.
+
+
+## fallback
+```yaml
+  fallback: # Optional. Section to specify fallback options
+    failureThreshold: 3 # Mandatory if fallback section is included
+    replicas: 6 # Mandatory if fallback section is included
+```
+
+The `fallback` section is optional. It defines a number of replicas to fall back to if a scaler is in an error state.
+
+KEDA will keep track of the number of consecutive times each scaler has failed to get metrics from its source. Once that value passes the `failureThreshold`, instead of not propagating a metric to the HPA (the default error behavior), the scaler will instead return a normalized metric using the formula:
+```
+target metric value * fallback replicas
+```
+Because the HPA metric is of type `AverageValue` (see below), this has the effect of the HPA scaling the deployment to the defined number of fallback replicas.
+
+**Example:** When your instance of Prometheus is unavailable 3 consecutive times, KEDA will change the HPA metric such that the deployment scales to 6 replicas.
+
+There are a few limitations to using a fallback:
+ - It only supports scalers whose target is an `AverageValue` metric. Thus, it is **not** supported by the CPU & memory scalers, or by scalers whose metric target type is `Value`. In these cases, it will assume that fallback is disabled.
+ - It is only supported by `ScaledObjects`, **not** `ScaledJobs`.
+
+
+## advanced
+
+### restoreToOriginalReplicaCount
+
+```yaml
+advanced:
+  restoreToOriginalReplicaCount: true/false # Optional.
Default: false
+```
+
+This property specifies whether the target resource (`Deployment`, `StatefulSet`, ...) should be scaled back to the original replica count after the `ScaledObject` is deleted.
+The default behavior is to keep the replica count at the value it has at the moment of the `ScaledObject`'s deletion.
+
+For example, a `Deployment` with `3 replicas` is created, then a `ScaledObject` is created and KEDA scales the `Deployment` to `10 replicas`. When the `ScaledObject` is then deleted:
+ 1. if `restoreToOriginalReplicaCount = false` (default behavior), the `Deployment` replica count is `10`
+ 2. if `restoreToOriginalReplicaCount = true`, the `Deployment` replica count is set back to `3` (the original value)
+
+
+### horizontalPodAutoscalerConfig
+
+```yaml
+advanced:
+  horizontalPodAutoscalerConfig: # Optional. Section to specify HPA related options
+    name: {name-of-hpa-resource} # Optional. Default: keda-hpa-{scaled-object-name}
+    behavior: # Optional. Use to modify HPA's scaling behavior
+      scaleDown:
+        stabilizationWindowSeconds: 300
+        policies:
+          - type: Percent
+            value: 100
+            periodSeconds: 15
+```
+
+#### horizontalPodAutoscalerConfig.name
+
+The name of the HPA resource KEDA will create. By default, it's `keda-hpa-{scaled-object-name}`.
+
+#### horizontalPodAutoscalerConfig.behavior
+
+Starting with Kubernetes v1.18, the autoscaling API allows scaling behavior to be configured through the HPA `behavior` field. This lets you directly affect the scaling of 1<->N replicas, which is handled internally by the HPA. KEDA feeds the values from this section directly to the HPA's `behavior` field. See the [Kubernetes documentation](https://kubernetes.io/docs/tasks/run-application/horizontal-pod-autoscale/#configurable-scaling-behavior) for details.
+
+**Assumptions:** KEDA must be running on a Kubernetes cluster v1.18+ in order to benefit from this setting.
+
+
+
+```yaml
+advanced:
+  scalingModifiers: # Optional.
Section to specify scaling modifiers
+    target: {target-value-to-scale-on} # Mandatory. New target value for the composed metric
+    activationTarget: {activation-target-value-to-scale-on} # Optional. New activation target value for the composed metric
+    metricType: {metric-type-for-the-modifier} # Optional. Metric type to be used for the composed metric
+    formula: {formula-for-fetched-metrics} # Mandatory. Formula for calculation
+```
+
+### scalingModifiers
+
+The `scalingModifiers` section is optional and **experimental**. If defined, both `target` and `formula` are mandatory. Using this structure creates a `composite-metric` for the HPA that will replace all requests for external metrics and handle them internally. With `scalingModifiers`, each trigger used in the `formula` **must** have a name defined.
+
+#### scalingModifiers.target
+
+`target` defines the new target value to scale on for the composed metric.
+
+#### scalingModifiers.activationTarget
+
+`activationTarget` defines a new [activation target value](../concepts/scaling-deployments.md#activating-and-scaling-thresholds) to scale on for the composed metric. (Default: `0`, Optional)
+
+#### scalingModifiers.metricType
+
+`metricType` defines the metric type used for this new `composite-metric`. (Values: `AverageValue`, `Value`, Default: `AverageValue`, Optional)
+
+#### scalingModifiers.formula
+
+`formula` composes metrics together and allows them to be modified/manipulated. It accepts mathematical/conditional statements using [this external project](https://github.com/antonmedv/expr). If the `fallback` scaling feature is in effect, the `formula` will NOT modify its metrics (it therefore modifies metrics only when all of their triggers are healthy). The complete language definition of the `expr` package can be found [here](https://expr.medv.io/docs/Language-Definition). The formula must return a single value (not a boolean).
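+
+As an illustration, two named triggers can be averaged into one composite metric. This is a sketch: the trigger names (`trig_one`, `trig_two`), scaler types, and metadata values are illustrative, not required by the spec:
+
+```yaml
+advanced:
+  scalingModifiers:
+    formula: "(trig_one + trig_two)/2"   # average of the two named triggers
+    target: "2"
+    activationTarget: "2"
+    metricType: "AverageValue"
+triggers:
+  - type: kubernetes-workload            # illustrative scaler
+    name: trig_one                       # name referenced in the formula
+    metadata:
+      podSelector: "app=backend"
+  - type: metrics-api                    # illustrative scaler
+    name: trig_two                       # name referenced in the formula
+    metadata:
+      url: "https://example.com/api/tasks"   # hypothetical endpoint
+      valueLocation: "tasks"
+```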
+ +For examples of this feature see section [Scaling Modifiers](../concepts/scaling-deployments.md#scaling-modifiers-experimental). + + +## triggers + +```yaml + triggers: + # {list of triggers to activate scaling of the target resource} +``` + +> 💡 **NOTE:** You can find all supported triggers [here](/scalers). + +Trigger fields: +- **type**: The type of trigger to use. (Mandatory) +- **metadata**: The configuration parameters that the trigger requires. (Mandatory) +- **name**: Name for this trigger. This value can be used to easily distinguish this specific trigger and its metrics when consuming [Prometheus metrics](../operate/prometheus.md). By default, the name is generated from the trigger type. (Optional) +- **useCachedMetrics**: Enables caching of metric values during polling interval (as specified in `.spec.pollingInterval`). For more information, see ["Caching Metrics"](../concepts/scaling-deployments.md#caching-metrics). (Values: `false`, `true`, Default: `false`, Optional) +- **authenticationRef**: A reference to the `TriggerAuthentication` or `ClusterTriggerAuthentication` object that is used to authenticate the scaler with the environment. + - More details can be found [here](./authentication). (Optional) +- **metricType**: The type of metric that should be used. (Values: `AverageValue`, `Value`, `Utilization`, Default: `AverageValue`, Optional) + - Learn more about how the [Horizontal Pod Autoscaler (HPA) calculates `replicaCount`](https://kubernetes.io/docs/tasks/run-application/horizontal-pod-autoscale-walkthrough/) based on metric type and value. + - To show the differences between the metric types, let's assume we want to scale a deployment with 3 running replicas based on a queue of messages: + - With `AverageValue` metric type, we can control how many messages, on average, each replica will handle. 
If our metric is the queue size, the threshold is 5 messages, and the current message count in the queue is 20, HPA will scale the deployment to 20 / 5 = 4 replicas, regardless of the current replica count. + - The `Value` metric type, on the other hand, can be used when we don't want to take the average of the given metric across all replicas. For example, with the `Value` type, we can control the average time of messages in the queue. If our metric is average time in the queue, the threshold is 5 milliseconds, and the current average time is 20 milliseconds, HPA will scale the deployment to 3 * 20 / 5 = 12. + +> ⚠️ **NOTE:** All scalers, except CPU and Memory, support metric types `AverageValue` and `Value` while CPU and Memory scalers both support `AverageValue` and `Utilization`. diff --git a/content/docs/2.15/scalers/aws-sqs.md b/content/docs/2.15/scalers/aws-sqs.md index aba7bbec6..3ce3c8297 100644 --- a/content/docs/2.15/scalers/aws-sqs.md +++ b/content/docs/2.15/scalers/aws-sqs.md @@ -29,8 +29,8 @@ triggers: **Parameter list:** -- `queueURL` - Full URL for the SQS Queue. The simple name of the queue can be used in case there's no ambiguity. (Optional, You can use this instead of `queueURLFromEnv` parameter) -- `queueURLFromEnv` - Name of the environment variable on the scale target to read the queue URL from. (Optional, You can use this instead of `queueURL` parameter) +- `queueURL` - Full URL for the SQS Queue. The short name of the queue can be used if there's no ambiguity. (Optional. Only one of `queueURL` and `queueURLFromEnv` is required. If both are provided, `queueURL` is used.) +- `queueURLFromEnv` - Name of the environment variable on the scale target to read the queue URL from. (Optional. Only one of `queueURL` and `queueURLFromEnv` is required.) - `queueLength` - Target value for queue length passed to the scaler. Example: if one pod can handle 10 messages, set the queue length target to 10. 
If the actual number of messages in the SQS queue is 30, the scaler scales to 3 pods. (Default: 5)
- `activationQueueLength` - Target value for activating the scaler. Learn more about activation [here](./../concepts/scaling-deployments.md#activating-and-scaling-thresholds). (Default: `0`, Optional)
diff --git a/content/docs/2.15/scalers/azure-pipelines.md b/content/docs/2.15/scalers/azure-pipelines.md
index af6d1818f..dbcf3a949 100644
--- a/content/docs/2.15/scalers/azure-pipelines.md
+++ b/content/docs/2.15/scalers/azure-pipelines.md
@@ -78,7 +78,7 @@ Finally, it is also possible get the pool ID from the response of a HTTP request
### Supporting demands in agents
-By default, if you do not wish to use demands in your agent scaler then it will scale based simply on the pool's queue length.
+By default, if you do not use demands in your agent scaler, it scales based on the pool's queue length.
Demands (Capabilities) are useful when you have multiple agents with different capabilities existing within the same pool, for instance in a kube cluster you may have an agent supporting dotnet5, dotnet6, java or maven;
diff --git a/content/docs/2.15/scalers/redis-sentinel-lists.md b/content/docs/2.15/scalers/redis-sentinel-lists.md
index 852ab00d7..b548402b0 100644
--- a/content/docs/2.15/scalers/redis-sentinel-lists.md
+++ b/content/docs/2.15/scalers/redis-sentinel-lists.md
@@ -41,7 +41,7 @@ triggers:
- Both the hostname, username and password fields need to be set to the names of the environment variables in the target deployment that contain the host name, username and password respectively.
- `sentinelUsernameFromEnv` - Environment variable to read the authentication username from to authenticate with the Redis Sentinel server.
- `sentinelPasswordFromEnv` - Environment variable to read the authentication password from to authenticate with the Redis Sentinel server.
-- `sentinelMaster` - The name of the master in Sentinel to get the Redis server address for.
+- `sentinelMaster` - The name of the primary (still referred to as the 'master' in Sentinel) to get the Redis server address for.
- `listName` - Name of the Redis List that you want to monitor.
- `listLength` - Average target value to trigger scaling actions.
- `activationListLength` - Target value for activating the scaler. Learn more about activation [here](./../concepts/scaling-deployments.md#activating-and-scaling-thresholds). (Default: `0`, Optional)
@@ -54,7 +54,7 @@ Some parameters could be provided using environmental variables, instead of sett
- `addressesFromEnv` - The hosts and their respective ports of the Redis Sentinel nodes, similar to `addresses`, but reads it from an environment variable on the scale target.
- `hostsFromEnv` - The hosts of the Redis Sentinel nodes, similar to `hosts`, but reads it from an environment variable on the scale target.
- `portsFromEnv` - The corresponding ports for the hosts of the Redis Sentinel nodes, similar to `ports`, but reads it from an environment variable on the scale target.
-- `sentinelMasterFromEnv` - The name of the master in Sentinel to get the Redis server address for, similar to `sentinelMaster`, but reads it from an environment variable on the scale target.
+- `sentinelMasterFromEnv` - The name of the primary (still referred to as the 'master' in Sentinel) to get the Redis server address for; similar to `sentinelMaster`, but reads it from an environment variable on the scale target.
### Authentication Parameters
@@ -65,7 +65,7 @@ You can authenticate by using a password.
- `addresses` - Comma separated list of host:port format.
- `hosts` - Comma separated list of hostname of the Redis Sentinel nodes. If specified, the `ports` should also be specified.
- `ports` - Comma separated list of ports of the Redis Sentinel nodes. If specified, the `hosts` should also be specified.
-- `sentinelMaster` - The name of the master in Sentinel to get the Redis server address for.
+- `sentinelMaster` - The name of the primary (still referred to as the 'master' in Sentinel) to get the Redis server address for.
**Authentication:**
@@ -133,7 +133,7 @@ spec:
      addresses: node1:26379, node2:26379, node3:26379
      listName: mylist
      listLength: "10"
-      sentinelMaster: "mymaster"
+      sentinelMaster: "myprimary"
      authenticationRef:
        name: keda-trigger-auth-redis-secret
```
diff --git a/content/docs/2.15/scalers/redis-sentinel-streams.md b/content/docs/2.15/scalers/redis-sentinel-streams.md
index e8fb32351..1c9d737d0 100644
--- a/content/docs/2.15/scalers/redis-sentinel-streams.md
+++ b/content/docs/2.15/scalers/redis-sentinel-streams.md
@@ -63,7 +63,7 @@ triggers:
- `sentinelUsernameFromEnv` - Name of the environment variable your deployment uses to get the Redis Sentinel username. (Optional)
- `sentinelPasswordFromEnv` - Name of the environment variable your deployment uses to get the Redis Sentinel password. (Optional)
-- `sentinelMaster` - The name of the master in Sentinel to get the Redis server address for.
+- `sentinelMaster` - The name of the primary (still referred to as the 'master' in Sentinel) to get the Redis server address for.
- `stream` - Name of the Redis Stream.
- `consumerGroup` - Name of the Consumer group associated with Redis Stream.
> Setting the `consumerGroup` causes the scaler to operate on `pendingEntriesCount`. Lack of `consumerGroup` will cause the scaler to be based on `streamLength`
@@ -80,7 +80,7 @@ Some parameters could be provided using environmental variables, instead of sett
- `addressesFromEnv` - The hosts and corresponding ports of Redis Sentinel nodes, similar to `addresses`, but reads it from an environment variable on the scale target. Name of the environment variable your deployment uses to get the URLs of Redis Sentinel nodes. The resolved hosts should follow a format like `node1:26379, node2:26379, node3:26379 ...`.
- `hostsFromEnv` - The hosts of the Redis Sentinel nodes, similar to `hosts`, but reads it from an environment variable on the scale target.
- `portsFromEnv` - The corresponding ports for the hosts of Redis Sentinel nodes, similar to `ports`, but reads it from an environment variable on the scale target.
-- `sentinelMasterFromEnv` - The name of the master in Sentinel to get the Redis server address for, similar to `sentinelMaster`, but reads it from an environment variable on the scale target.
+- `sentinelMasterFromEnv` - The name of the primary (still referred to as the 'master' in Sentinel) to get the Redis server address for, similar to `sentinelMaster`, but reads it from an environment variable on the scale target.
### Authentication Parameters
@@ -116,7 +116,7 @@ spec:
      stream: my-stream
      consumerGroup: consumer-group-1
      pendingEntriesCount: "10"
-      sentinelMaster: "mymaster"
+      sentinelMaster: "myprimary"
```
#### Using `TriggerAuthentication`
@@ -196,7 +196,7 @@ spec:
      stream: my-stream
      consumerGroup: consumer-group-1
      pendingEntriesCount: "10"
-      sentinelMaster: "mymaster"
+      sentinelMaster: "myprimary"
      authenticationRef:
        name: keda-redis-stream-triggerauth # name of the TriggerAuthentication resource
```
@@ -226,7 +226,7 @@ spec:
      passwordFromEnv: REDIS_PASSWORD # name of the environment variable in the Deployment
      stream: my-stream
      streamLength: "50"
-      sentinelMaster: "mymaster"
+      sentinelMaster: "myprimary"
```
#### Using `lagCount`