Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add validation of Kubernetes feature gates #280

Merged
merged 2 commits into from
Jun 9, 2021

Conversation

stoyanr
Copy link
Contributor

@stoyanr stoyanr commented Jun 3, 2021

How to categorize this PR?

/area usability
/area ops-productivity
/kind enhancement
/platform gcp

What this PR does / why we need it:
Adds validation of Kubernetes feature gates (in CloudControllerManager field of the ControlPlaneConfig custom resource), as requested in gardener/gardener#3987 and #263.

Which issue(s) this PR fixes:
Fixes #263

Special notes for your reviewer:

  • Also vendors g/g v1.24.1-0.20210608063816-580010846480 (and k8s v0.20.7) to be able to use the ValidateFeatureGates function.

Release note:

When creating or updating shoots, any Kubernetes feature gates mentioned are validated against the Kubernetes version. If any feature gates are unknown or not supported in the Kubernetes version, the validation fails.

@stoyanr stoyanr requested review from a team as code owners June 3, 2021 07:51
@gardener-robot gardener-robot added area/ops-productivity Operator productivity related (how to improve operations) area/usability Usability related kind/enhancement Enhancement, improvement, extension platform/gcp Google cloud platform/infrastructure needs/review Needs review size/m Size of pull request is medium (see gardener-robot robot/bots/size.py) labels Jun 3, 2021
@stoyanr stoyanr marked this pull request as draft June 3, 2021 07:52
@gardener-robot-ci-3 gardener-robot-ci-3 added the reviewed/ok-to-test Has approval for testing (check PR in detail before setting this label because PR is run on CI/CD) label Jun 3, 2021
@gardener-robot-ci-1 gardener-robot-ci-1 added needs/ok-to-test Needs approval for testing (check PR in detail before setting this label because PR is run on CI/CD) and removed reviewed/ok-to-test Has approval for testing (check PR in detail before setting this label because PR is run on CI/CD) labels Jun 3, 2021
@stoyanr
Copy link
Contributor Author

stoyanr commented Jun 3, 2021

/invite @rfranzke @BeckerMax @voelzmo

Similar PR should be opened for all other provider extensions. I will do this after gardener/gardener#4149 and this one are merged.

@stoyanr stoyanr force-pushed the add-feature-gates-validation branch from 8a955b8 to 4a482b0 Compare June 3, 2021 08:00
@gardener-robot-ci-3 gardener-robot-ci-3 added the reviewed/ok-to-test Has approval for testing (check PR in detail before setting this label because PR is run on CI/CD) label Jun 3, 2021
@gardener-robot-ci-1 gardener-robot-ci-1 removed the reviewed/ok-to-test Has approval for testing (check PR in detail before setting this label because PR is run on CI/CD) label Jun 3, 2021
@voelzmo
Copy link
Member

voelzmo commented Jun 4, 2021

Hey @stoyanr, thanks for the ping! I'll set aside some time to really look at this and the corresponding gardener PR later today. Looking at the sheer numbers of the PR, I'm already feeling kind of anxious to look at ~50k changed lines within a single commit. I'll trust that there's a lot of things going on that I don't really need to care about in detail (e.g. updating vendored dependencies) – would it be possible to split these actions out to a separate commit in the future, such that I can easily select what to focus the review on? Thanks!

@stoyanr
Copy link
Contributor Author

stoyanr commented Jun 4, 2021

@voelzmo The ~50k changed lines come from the revendor. This is really quite normal, nothing to worry about. You shouldn't look into all the vendor files at all. No, it's not possible (or not at all required) to split this commit into multiple ones, without the revendor I wouldn't be able to use the ValidateFeatureGates function that's only introduced in a newer gardener (that's based on a newer kubernetes).

When you ignore go.sum and all the vendor files, it's really a rather small PR.

@voelzmo
Copy link
Member

voelzmo commented Jun 4, 2021

Hey @stoyanr, thanks for your perspective! Just to clarify: I'm not asking to make this two separate PRs, it is just about committing things separately if they can be separated – some people call this working mode micro commits. I understand that this is not required at all, it is just that for me as a reviewer having an easy means to separate things to look at during a review makes my life much easier (e.g. by using the 'only look at changes from a single commit' dropdown during the review).

TL;DR
you, as the code author, say

When you ignore go.sum and all the vendor files, it's really a rather small PR.

I, as a reviewer, ask

please make it easy for me to understand that this is a rather small PR by separating the stuff you "just had to do" (e.g. vendoring, formatting, renaming variables that were already there) from the things your change is about (introducing the additional validation logic)

I'm not sure if this is a thing y'all have been discussing in the team already, I just wanted to share my feelings when opening a PR tab with big numbers of changed lines and a very low number of commits ;)

@stoyanr
Copy link
Contributor Author

stoyanr commented Jun 4, 2021

@voelzmo Please understand that what I am doing here is a standard practice. Micro commits is what we do as well, no need to discuss it here. This is a micro commit. You simply have to ignore vendor in this case.

Once again: I need to use a function from a newer gardener, which is itself based on a newer kubernetes. There is no other way to do it besides vendoring this new gardener / kubernetes. This brings all these 50k changes you are worried about, and which you basically should ignore. In extension projects, this happens all the time, you will find a number of such PRs already merged.

I would have agreed it makes sense to have the revendor itself in a different commit if after the revendor a lot of changes were needed to the project (due to incompatible changes, etc.). However, this is not the case. I therefore see no value in this (vendor is very easy to ignore) and since it adds effort, I wouldn't do it.

If you still have concerns about it, let's discuss it privately first before continuing the discussion here.

@stoyanr stoyanr force-pushed the add-feature-gates-validation branch from 4a482b0 to c95ec3c Compare June 4, 2021 14:09
@gardener-robot-ci-2 gardener-robot-ci-2 added the reviewed/ok-to-test Has approval for testing (check PR in detail before setting this label because PR is run on CI/CD) label Jun 4, 2021
@gardener-robot-ci-1 gardener-robot-ci-1 removed the reviewed/ok-to-test Has approval for testing (check PR in detail before setting this label because PR is run on CI/CD) label Jun 4, 2021
@stoyanr
Copy link
Contributor Author

stoyanr commented Jun 4, 2021

@voelzmo I split it into 2 commits after all, I realized it might be a bit more challenging if you are not used to working with such commits. Hopefully it's easier for you now.

@voelzmo
Copy link
Member

voelzmo commented Jun 4, 2021

Highly appreciated, thanks!

@gardener-robot
Copy link

@rfranzke, @BeckerMax You have pull request review open invite, please check

@gardener-robot gardener-robot removed the needs/review Needs review label Jun 8, 2021
@stoyanr
Copy link
Contributor Author

stoyanr commented Jun 8, 2021

Infrastructure tests fail with "Invalid service account email (@sap-gcp-k8s-canary-custom.iam.gserviceaccount.com)." (nothing to do with my change). Any idea how to fix this?

@rfranzke
Copy link
Member

rfranzke commented Jun 8, 2021

@dguendisch @schrodit Can you check, please?

@schrodit
Copy link

schrodit commented Jun 8, 2021

@stoyanr we had a look and found the issue:

In fact you did one change to the tests.
This change passes the infra extension to the WaitUntilExtensionObjectDeleted function. Later that same object is used in the verifyDeletion function to calculate the service account name.
see

serviceAccountName := getServiceAccountName(project, infra.Namespace)

We suspect that the infra object is cleared in WaitUntilExtensionObjectDeleted so that the namespace is empty.
With that, the service name could not be correctly resolved.
So you should be able to fix the issue by creating a deep copy of the infra object and pass the copy to the verifyDeletion func.

@stoyanr stoyanr force-pushed the add-feature-gates-validation branch from ae6cef4 to 63385dd Compare June 8, 2021 14:42
@gardener-robot gardener-robot added the needs/review Needs review label Jun 8, 2021
@gardener-robot-ci-1 gardener-robot-ci-1 added the reviewed/ok-to-test Has approval for testing (check PR in detail before setting this label because PR is run on CI/CD) label Jun 8, 2021
@stoyanr
Copy link
Contributor Author

stoyanr commented Jun 8, 2021

@schrodit Thanks, I fixed it now.

@gardener-robot-ci-1 gardener-robot-ci-1 removed the reviewed/ok-to-test Has approval for testing (check PR in detail before setting this label because PR is run on CI/CD) label Jun 8, 2021
@vpnachev
Copy link
Member

vpnachev commented Jun 8, 2021

/test

@testmachinery
Copy link

testmachinery bot commented Jun 8, 2021

Testrun: e2e-gfds4
Workflow: e2e-gfds4-wf
Phase: Succeeded

+---------------------+---------------------+-----------+----------+
|        NAME         |        STEP         |   PHASE   | DURATION |
+---------------------+---------------------+-----------+----------+
| infrastructure-test | infrastructure-test | Succeeded | 8m21s    |
+---------------------+---------------------+-----------+----------+

@stoyanr
Copy link
Contributor Author

stoyanr commented Jun 9, 2021

/cc @rfranzke @ialidzhikov Tests are passing now, could you please take another look?

Copy link
Member

@rfranzke rfranzke left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/lgtm

@gardener-robot gardener-robot added reviewed/lgtm Has approval for merging and removed needs/review Needs review labels Jun 9, 2021
@timebertt
Copy link
Member

FYI:

We suspect that the infra object is cleared in WaitUntilExtensionObjectDeleted so that the namespace is empty.

Yes that's the case. I fixed it in gardener/gardener#4191 (namely the first commit).
So once that change is vendored, there shouldn't be the need to DeepCopy the object before passing it down anymore.

@ialidzhikov
Copy link
Member

/invite @ialidzhikov

Sorry for the long delay, I will try to check this PR later today. :(

Copy link
Member

@ialidzhikov ialidzhikov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/lgtm

Generally I had some doubts for OOT CCMs that they accept the same set of feature gates, but I now checked the --help of OOT CCMs that we currently use (alicloud and openstack) and it seems that even the OOT CCMs accept the same set of feature gates. So, lgtm.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/ops-productivity Operator productivity related (how to improve operations) area/usability Usability related kind/enhancement Enhancement, improvement, extension kind/test Test needs/ok-to-test Needs approval for testing (check PR in detail before setting this label because PR is run on CI/CD) platform/gcp Google cloud platform/infrastructure reviewed/lgtm Has approval for merging size/m Size of pull request is medium (see gardener-robot robot/bots/size.py)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

ControllerManager Feature Gate CustomResourceValidation no longer exists