Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Snapshotter sidecar is deployed by Helm chart even if enableVolumeSnapshot == false #942

Closed
judgeaxl opened this issue Jun 17, 2021 · 7 comments · Fixed by #960
Closed
Labels
kind/bug Categorizes issue or PR as related to a bug.

Comments

@judgeaxl
Copy link

/kind bug

What happened?

When disabling snapshots, the ebs-snapshot-controller is no longer deployed, but the snapshot sidecar in the ebs-csi-controller is still deployed, causing repeated errors in the logs if the snapshot CRDs have not been deployed.

Example:

E0617 04:39:20.714430       1 reflector.go:127] github.com/kubernetes-csi/external-snapshotter/client/v3/informers/externalversions/factory.go:117: Failed to watch *v1beta1.VolumeSnapshotClass: failed to list *v1beta1.VolumeSnapshotClass: the server could not find the requested resource (get volumesnapshotclasses.snapshot.storage.k8s.io)

What you expected to happen?

In addition to not deploying the snapshot controller itself, it should not deploy the snapshot sidecar for the ebs-csi-controller pod, or at least disable it so that I don't get errors about a feature I don't need.

How to reproduce it (as minimally and precisely as possible)?

On a 1.19 or 1.20 cluster without the SnapshotVolumeContent and SnapshotVolumeClass CRDs, deploy the EBS-CSI Helm chart 1.2.0-1.2.2 with enableVolumeSnapshot = false and tail the ebs-csi-controller pod logs.

Anything else we need to know?:

I believe this worked in an earlier version of the chart, but I'm afraid I don't remember in what version.

From what I can tell, the snapshot controller stateful set checks the enableVolumeSnapshot flag, but the ebs-csi-controller does not check it for the csi-snapshotter sidecar container.

Environment

  • Kubernetes version (use kubectl version):
Client Version: version.Info{Major:"1", Minor:"21", GitVersion:"v1.21.1", GitCommit:"5e58841cce77d4bc13713ad2b91fa0d961e69192", GitTreeState:"clean", BuildDate:"2021-05-12T14:18:45Z", GoVersion:"go1.16.4", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"20+", GitVersion:"v1.20.4-eks-6b7464", GitCommit:"6b746440c04cb81db4426842b4ae65c3f7035e53", GitTreeState:"clean", BuildDate:"2021-03-19T19:33:03Z", GoVersion:"go1.15.8", Compiler:"gc", Platform:"linux/amd64"}
  • Driver version: From Helm chart 1.2.0-1.2.2
@k8s-ci-robot k8s-ci-robot added the kind/bug Categorizes issue or PR as related to a bug. label Jun 17, 2021
@krmichel
Copy link
Contributor

@wongma7 Do we want to make the snapshot functionality something that can be disabled again? We removed the option from the controller since it moved out of alpha but if it is going to cause issues without the controller then maybe we need the flag still. Did was there ever a decision about whether this chart would continue to install the snapshot controller or expect it to already be installed? It seems like even if we add a toggle back it should be different or work in conjunction with the one to install the controller. Someone who wants the sidecar might have already installed the controller and we don't want to install a second one, right?

@wongma7
Copy link
Contributor

wongma7 commented Jun 17, 2021

If we do offer again an option to disable it, and I'm not sure if we should yet, then I would like to name it differently to disambiguate installing the sidecar (installSnapshotSidecar?) from installing the snapshot controller+CRDs (installSnapshotController?).

This PR is adding functionality to install the CRDs https://github.com/kubernetes-sigs/aws-ebs-csi-driver/pull/938/files#diff-1d5fa767c508a62e96682164464f2a7a08d37010ba05aaaa1b59ec98578fb6feR1, let's discuss further there. IMO it makes sense to install the snapshot controller and CRDs by default, because there are users who expect the EBS helm chart to "just work" i.e. to install the controller and CRDs. It's too early to expect the snapshot controller to already be installed, kops has started to install the controller and CRDs only recently with 1.20. And if the CRDs are already installed, the helm install command fails with a message saying that, so there's no risk of having an extra install. So IMO installSnapshotController would satisfy all use-cases.

@wongma7
Copy link
Contributor

wongma7 commented Jun 17, 2021

Another idea:

snapshotController.enable = true/false
sidecars.snapshotterImage.enable = true/false

@judgeaxl
Copy link
Author

I've got to say I'm happy with any solution that gets rid of the errors.

I think the main issue with baking the CRDs into the deployment would be if they conflict with anything else also installing them, e.g. if the cluster managers start to deploy them by default.

I get the impression though that volume snapshots is an optional feature, and if that's the case, then I think it makes sense to allow the sidecar to be optional too.

@judgeaxl
Copy link
Author

If this project decides the snapshotting feature should be mandatory, and I don't have an opinion either way, then I think installing the CRDs and snapshot controller should be listed as a prerequisite, rather than being sort of half, or even fully installed by this chart. After all, I might have installed them from somewhere else for another reason, they're on a different release cycle, etc.

@wongma7
Copy link
Contributor

wongma7 commented Jul 1, 2021

I think the snapshotter sidecar spitting out errors shouldn't affect the operation of the rest of the driver, but the Pod will show 5/6 containers unhealthy, so I understand obviously it would be better without the errors.

Yes the feature is optional, it's just that with the 3 moving pieces involved (CRDs, snapshot controller, snapshot sidecar) all having to be compatible/updated in lockstep, it's not trivial to provide a single switch to turn it on and off, and clearly our helm chart has failed to do* this since if they are already installed it fails installs a duplicate controller ?, if they are not already installed the user has to know to set enable to true, and even the nit only installs half (the controller).

Baking the CRDs into helm isn't an issue, helm should figure it out for us: https://helm.sh/docs/chart_best_practices/custom_resource_definitions/. So with the fix from #938, if enableVolumeSnapshot is true we'll install the CRDs ,not just the controller.

then I think installing the CRDs and snapshot controller should be listed as a prerequisite, rather than being sort of half, or even fully installed by this chart.

I'm now inclined to agree after a lot of waffling on this issue. originally I was sure we should not install the snapshot controller at all, see #635, because we are essentially duplicating the distribution efforts by e.g. kubernetes distros like kops or the snapshot controller repo itself which provides its own templates. But, at the time kops didn't actually install them by default and the feature (i.e. the CRDS) were still beta, so as you can see the issue closed and we continued maintaining the snapshot controller installation til now.

As of kops 1.20, they install the snapshot controller and CRDs by default, so I think it would be fair for us to in the next helm release (2.0.0?) just stop installing the CRDs and snapshot controller altogether and list it as prerequisite, with a link to snapshot-controller repo...

@wongma7
Copy link
Contributor

wongma7 commented Jul 1, 2021

#635 (comment)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind/bug Categorizes issue or PR as related to a bug.
Projects
None yet
4 participants