feat: add activation feature for CPU/Memory scaler #6231

kunwooy · 2024-10-11T01:34:39Z

Currently, the CPU and memory scalers lack an activation feature because they delegate scaling to the built-in Kubernetes HPA controller. As a result, when other KEDA scalers that use External Metrics are combined with the cpu/memory scaler, the scale target is deactivated (and thus scaled to zero) as soon as all of those External Metrics scalers are deactivated, even if it is currently scaled out because the cpu/memory metric is above its threshold value.

Hence, my proposal is to introduce a way to check the activation of the cpu/memory scaler. Since the scaling behavior is still handled by the HPA controller, the cpu/memory scaler only needs to report its activation state to the scaled object controller from its GetMetricsAndActivity() method. To enable this, I introduce an activationValue field in the cpu/memory trigger's metadata.
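The activation check described above can be sketched as follows. This is a minimal illustration, not the code in this PR: the activationMetadata struct and the isActive helper are hypothetical names, and the strictly-greater-than comparison is an assumption based on how activation values behave in other KEDA scalers.

```go
package main

// activationMetadata is a hypothetical, trimmed-down stand-in for the
// cpu/memory trigger metadata; the real struct has more fields.
type activationMetadata struct {
	// ActivationValue mirrors the proposed activationValue trigger field.
	ActivationValue float64
}

// isActive reports whether the observed average metric is above the
// activation threshold. The scaler would return this flag alongside its
// metrics; the HPA controller still makes the actual scaling decision.
// Assumption: a value equal to the threshold does not activate the scaler.
func isActive(meta activationMetadata, observedAverage float64) bool {
	return observedAverage > meta.ActivationValue
}
```

With an activationValue of 50, an observed average of 60 would report the trigger as active, while 50 or below would not.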

(I have accidentally closed the last pull request: #6174)

Checklist

When introducing a new scaler, I agree with the scaling governance policy
I have verified that my change is according to the deprecations & breaking changes policy
Tests have been added
Changelog has been updated and is aligned with our changelog requirements
A PR is opened to update the documentation on (repo) (if applicable)
Commits are signed with Developer Certificate of Origin (DCO - learn more)

Fixes #6057

Relates to #

kunwooy · 2024-10-11T01:35:32Z

@JorTurFer You suggested that I query the metrics server directly instead of querying HPA. I have modified the code (#6174 (comment)).

If approved, the Helm chart also needs to be modified, since the operator's service account needs permission on the pods.metrics.k8s.io APIs.

pkg/scalers/cpu_memory_scaler.go

SpiritZhou · 2024-10-16T05:44:43Z

Do you mind updating the CHANGELOG.md as well?

kunwooy · 2024-10-16T08:29:06Z

@SpiritZhou I updated the CHANGELOG.md too.

zroubalik

Thanks, I quickly checked the code, but I haven't fully read the getAverage... functions.

But I would like to see more e2e tests (covering cases like scale 0->1, 1->0, activation with just a single cpu/mem scaler, etc., to cover all possible cases).

I also think that this feature should be released first as an experimental one.

pkg/k8s/metricsclient.go

zroubalik · 2024-11-05T21:26:17Z

pkg/scalers/cpu_memory_scaler.go

-	err := config.TypedConfig(&meta)
+func getScaleTarget(scalableObjectName, scalableObjectNamespace string, kubeClient client.Client) (string, string, error) {
+	scaledObject := &kedav1alpha1.ScaledObject{}
+	err := kubeClient.Get(context.Background(), types.NamespacedName{


is there a specific reason to use context.Background()? If not, then we should pass ctx from the top level here.

I used context.Background() because neither of its top-level callers, parseResourceMetadata() and NewCPUMemoryScaler, has a context. If there is another way, I'd be happy to know.

Feel free to add the param there; you can see that ctx has also been added to a few other scalers.
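A minimal sketch of what passing ctx from the top level looks like. To stay self-contained it does not use the controller-runtime client: objectGetter and fakeGetter are hypothetical stand-ins, and this getScaleTarget has a simplified signature compared to the one in the diff. The point is only that the lookup receives the caller's context, so a cancellation upstream reaches the API call.

```go
package main

import (
	"context"
	"errors"
)

// objectGetter is a hypothetical stand-in for a Kubernetes client; only the
// context-first shape of Get matters for this sketch.
type objectGetter interface {
	Get(ctx context.Context, namespace, name string) (string, error)
}

// fakeGetter honors cancellation the way a real API client would.
type fakeGetter struct{}

func (fakeGetter) Get(ctx context.Context, namespace, name string) (string, error) {
	if err := ctx.Err(); err != nil {
		return "", err
	}
	return namespace + "/" + name, nil
}

// getScaleTarget takes ctx from its caller instead of creating
// context.Background() internally.
func getScaleTarget(ctx context.Context, getter objectGetter, namespace, name string) (string, error) {
	return getter.Get(ctx, namespace, name)
}

// lookupWithCanceledContext shows that a context canceled by the caller
// stops the lookup and surfaces context.Canceled.
func lookupWithCanceledContext() error {
	ctx, cancel := context.WithCancel(context.Background())
	cancel()
	_, err := getScaleTarget(ctx, fakeGetter{}, "default", "my-scaledobject")
	return err
}

// isCanceled reports whether err wraps context.Canceled.
func isCanceled(err error) bool {
	return errors.Is(err, context.Canceled)
}
```

With context.Background() hard-coded inside the function, the cancellation in lookupWithCanceledContext would have no effect on the lookup.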

zroubalik · 2024-11-05T21:33:36Z

pkg/scalers/cpu_memory_scaler.go

-func parseResourceMetadata(config *scalersconfig.ScalerConfig, logger logr.Logger) (cpuMemoryMetadata, error) {
-	meta := cpuMemoryMetadata{}
-	err := config.TypedConfig(&meta)
+func getScaleTarget(scalableObjectName, scalableObjectNamespace string, kubeClient client.Client) (string, string, error) {


Please add a comment explaining this function.

zroubalik · 2024-11-05T21:34:11Z

pkg/scalers/cpu_memory_scaler.go

 	default:
 		return meta, fmt.Errorf("unknown metric type: %s, allowed values are 'Utilization' or 'AverageValue'", string(meta.MetricType))
 	}

+	if config.ScalableObjectType == "ScaledObject" {


we should fail for other types imho

Are you suggesting that I do not check for config.ScalableObjectType in the current code, and instead return an error for other types? If so, should I parse the error string to see if the error is related to the type? Because config.ScalableObjectType not being ScaledObject should not cause parseResourceMetadata() to return an error.

Basically, add an else branch and, if the type is different, return an error saying that the type is not supported.
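The suggestion above could look roughly like this. It is a sketch under assumptions: checkScalableObjectType is a hypothetical helper, and the error wording is illustrative. The "else" is written as an early return, which is the usual Go idiom and has the same effect.

```go
package main

import "fmt"

// checkScalableObjectType accepts only ScaledObject and returns a
// descriptive error for any other type instead of silently skipping it.
func checkScalableObjectType(scalableObjectType string) error {
	if scalableObjectType == "ScaledObject" {
		return nil
	}
	return fmt.Errorf("scalable object type %q is not supported by the cpu/memory scaler activation feature", scalableObjectType)
}
```

Because the caller gets a typed Go error back, there is no need to parse an error string to find out that the type was the problem.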

zroubalik · 2024-11-05T21:40:52Z

pkg/scalers/cpu_memory_scaler.go

+
+		labelSelector = labels.SelectorFromSet(statefulSet.Spec.Selector.MatchLabels)
+	default:
+		return nil, nil, nil


What about other types?

See if this can be reused:

keda/pkg/scaling/resolver/scale_resolvers.go

Line 96 in 4eb7149

gvk := obj.Status.ScaleTargetGVKR.GroupVersionKind()

I added GVK as the means to reference the scale target. However, in the default case there is still no way to fetch the MatchLabels field, which is required to select the corresponding PodMetrics.

You can check whether we can create a duck type that would describe the resource, similar to what is done in the referenced code for podspec or here: https://github.com/kedacore/keda/blob/main/apis/keda/v1alpha1/withtriggers_types.go

If we are not able to do that, then we should check for supported scale targets when we are creating the scaler and fail in that case.
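A minimal sketch of the fallback described above: validating the scale target kind when the scaler is created and failing for anything else, instead of returning nil values from the default case. validateScaleTargetKind is a hypothetical helper, and the supported set (Deployment and StatefulSet) is taken from the kinds discussed in this PR.

```go
package main

import "fmt"

// validateScaleTargetKind fails fast for scale target kinds whose label
// selector the scaler does not know how to resolve.
func validateScaleTargetKind(kind string) error {
	switch kind {
	case "Deployment", "StatefulSet":
		return nil
	default:
		return fmt.Errorf("scale target kind %q is not supported for cpu/memory activation", kind)
	}
}
```

Running this check at scaler creation surfaces the problem on the ScaledObject right away rather than leaving the activation check silently disabled.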

kunwooy · 2024-11-05T23:45:53Z

@zroubalik I have added comments and made some changes you pointed out.

However, regarding the e2e tests, note that this feature does not enable CPU/Memory scalers to activate themselves and scale from zero, because it is impossible to collect CPU/Memory data when no pod is running. So the only difference this commit makes is in the case where a CPU/Memory scaler is used together with another external trigger scaler, and that external trigger is deactivated while the cpu/memory trigger is activated. When that happens, the deployment/statefulset should not scale to zero (previously there was no activation feature for the cpu/memory trigger, so it was scaled to zero). You can check out the e2e test for the above case here:
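The scenario above can be sketched as a simplified model of how per-trigger activity combines. This is an illustration only, not KEDA's actual controller logic: triggerState and anyActive are hypothetical names.

```go
package main

// triggerState is a hypothetical record of one trigger's activation result.
type triggerState struct {
	Name   string
	Active bool
}

// anyActive reports whether at least one trigger is active. In this
// simplified model the scale target may only go to zero when it is false.
func anyActive(triggers []triggerState) bool {
	for _, t := range triggers {
		if t.Active {
			return true
		}
	}
	return false
}
```

With an external trigger reporting Active: false and the cpu trigger reporting Active: true, anyActive returns true, so the workload stays above zero. Before this change the cpu trigger contributed no activity at all, which is the same as it always reporting false in this model.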

https://github.com/kunwooy/keda/blob/d5ad0944617d9a4e1efac5685dbcb3ed9cf61507/tests/scalers/cpu/cpu_test.go?plain=1#L255

Also, how do I move the feature to an experimental feature?

Signed-off-by: kunwooy <[email protected]>

zroubalik

Re: experimental feature: this is just a matter of documentation, and we should also print an info log message when a scaler with this feature is created.

zroubalik · 2024-11-07T11:20:27Z

pkg/scalers/cpu_memory_scaler.go

 	default:
 		return meta, fmt.Errorf("unknown metric type: %s, allowed values are 'Utilization' or 'AverageValue'", string(meta.MetricType))
 	}

+	if config.ScalableObjectType == "ScaledObject" {


Basically, add an else branch and, if the type is different, return an error saying that the type is not supported.

zroubalik · 2024-11-07T11:24:25Z

pkg/scalers/cpu_memory_scaler.go

+
+		labelSelector = labels.SelectorFromSet(statefulSet.Spec.Selector.MatchLabels)
+	default:
+		return nil, nil, nil


You can check whether we can create a duck type that would describe the resource, similar to what is done in the referenced code for podspec or here: https://github.com/kedacore/keda/blob/main/apis/keda/v1alpha1/withtriggers_types.go

If we are not able to do that, then we should check for supported scale targets when we are creating the scaler and fail in that case.

kunwooy · 2024-11-10T10:26:27Z

@zroubalik I'm unavailable right now. I'll make the changes and let you know within the next week.

kunwooy requested a review from a team as a code owner October 11, 2024 01:34

semgrep-app bot reviewed Oct 11, 2024
