Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Auto-scale vtgate with HPA #598

Open
wants to merge 7 commits into
base: main
Choose a base branch
from

Conversation

siadat
Copy link

@siadat siadat commented Aug 30, 2024

Problem

We'd like to be able to auto-scale number of vtgate replicas using HorizontalPodAutoscaler.

Solution

In order to make a CRD scalable by HPA, it should define the Scale subresource.

We also make sure if a vtgate is being autoscaled, the replicas value is not copied from VitessCluster. Otherwise, autoscaled values will be overwritten.

Context

https://vitess.slack.com/archives/CNE9WP677/p1725015568336609

Signed-off-by: Sina Siadat <[email protected]>
@siadat siadat changed the title Make VitessCell scalable by HPA Scale vtgate with HPA Sep 10, 2024
@siadat siadat changed the title Scale vtgate with HPA Auto-scale vtgate with HPA Sep 10, 2024
Copy link
Member

@frouioui frouioui left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the contribution 🚀 Would it be possible to add an end-to-end test for this? We have examples of other features being covered by E2E tests under ./test/endtoend/.

@@ -0,0 +1,68 @@
/*
Copyright 2019 PlanetScale Inc.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: should be 2024

Suggested change
Copyright 2019 PlanetScale Inc.
Copyright 2024 PlanetScale Inc.

Comment on lines +65 to +68
autoscaler:
properties:
behavior:
properties:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The ./test/endtoend/operator/operator-latest.yaml file we use for our tests must be updated with the newly compiled CRDs.

To do this, you can use the following command: kustomize build ./deploy > build/_output/operator.yaml. The output found in ./build/_output/operator.yamlcan be copied over the test file.

The changes made to the Deployment named vitess-operator will have to be reverted before pushing, as the new file you'll copy will override some defaults we use for our tests.

Finally, the changes made to ./test/endtoend/operator/operator-latest.yaml will have to be copied to the Vitess repository in a separate PR, into the file ./examples/operator/operator.yaml.

Comment on lines 27 to 28
// Spec specifies all the internal parameters needed to deploy vtgate,
// as opposed to the API type planetscalev2.VitessCellGatewaySpec, which is the public API.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit picking here too, the comment should begin with HpaSpec, it throws a warning in the IDE otherwise

Comment on lines +193 to +206
err = r.reconciler.ReconcileObject(ctx, vtc, key, labels, wantHpa, reconciler.Strategy{
Kind: &autoscalingv2.HorizontalPodAutoscaler{},

New: func(key client.ObjectKey) runtime.Object {
return vtgate.NewHorizontalPodAutoscaler(key, hpaSpec)
},
UpdateInPlace: func(key client.ObjectKey, obj runtime.Object) {
newObj := obj.(*autoscalingv2.HorizontalPodAutoscaler)
vtgate.UpdateHorizontalPodAutoscaler(newObj, hpaSpec)
},
})
if err != nil {
resultBuilder.Error(err)
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The HorizontalPodAutoscaler resource is not something that the operator or even the ReconcileVitessCell object know about. It leads to the error I attached below and that can be found in vitess-operator's logs.

 pkg/mod/k8s.io/[email protected]/tools/cache/reflector.go:229: Failed to watch *v2.HorizontalPodAutoscaler: failed to list *v2.HorizontalPodAutoscaler: horizontalpodautoscalers.autoscaling is forbidden: User "system:serviceaccount:default:vitess-operator" cannot list resource "horizontalpodautoscalers" in API group "autoscaling" in the namespace "default"

We must give the ReconcileVitessCell object the proper information to start watching this resource: this can be done by adding *v2.HorizontalPodAutoscaler to the watchResources slice in the vitesscell pkg.

Moreover, the K8S roles needs to be updated to give the proper permission for K8S to communicate with the API, this can be done by adding the following snippet to the ./deploy/role.yaml file.

- apiGroups:
  - autoscaling
  resources:
  - horizontalpodautoscalers
  verbs:
  - '*'

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Without this change, creating a cluster does not fail as the vitess-operator can still run. But no HPA is created:

$ kubectl get hpa
No resources found in default namespace.

With this change however, it looks good, vitess-operator is healthy and we can see the HPA:

$ kubectl get hpa
NAME                            REFERENCE                                  TARGETS         MINPODS   MAXPODS   REPLICAS   AGE
example-zone1-vtgate-bc6cde92   VitessCell/example-zone1-vtgate-bc6cde92   <unknown>/10%   1         10        0          33s

Note that the target is unknown because my metrics-server was disabled at the time.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants