Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

🐛 MachineSet should allow scale down operations to proceed when templates don't exist #10913

Merged
merged 1 commit into from
Jul 24, 2024

Conversation

vincepri
Copy link
Member

@vincepri vincepri commented Jul 19, 2024

We're currently always trying to check that the controller can access and set owner references on MachineSets, even if no operations are ongoing. If a MachineSet is scaling down, we should not block operations if the external reference isn't present anymore (was deleted) and we should let the controller proceed.

In most gitops scenarios, it's preferable that any change being rolled out it's done by setting new template references to new resources. For example, let's say we have a MachineDeployment/MachineSet with a boostrap and infrastructure reference set. To apply a rollout and change to all Machines under management, new templates need to be created, and the references changed to the new templates, which will cause a rollout.

With gitops flows, usually new resources replacing old ones have the side effect that old resources are cleaned up almost immediately; this causes the MachineSet controller to stop scale down (and any rollout) operations due to the references returning a not found error.

What this PR does / why we need it:

Which issue(s) this PR fixes (optional, in fixes #<issue number>(, fixes #<issue_number>, ...) format, will close the issue(s) when PR gets merged):
Related to #6588 (comment)

/area machineset

@k8s-ci-robot k8s-ci-robot added area/machineset Issues or PRs related to machinesets cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. labels Jul 19, 2024
@vincepri vincepri changed the title 🐛 MachineSet should try to check on the MachineSet templates only… 🐛 MachineSet should try to check on the MachineSet templates only during scale up Jul 19, 2024
@k8s-ci-robot k8s-ci-robot added the size/S Denotes a PR that changes 10-29 lines, ignoring generated files. label Jul 19, 2024
@vincepri vincepri changed the title 🐛 MachineSet should try to check on the MachineSet templates only during scale up 🐛 MachineSet should try to check on templates only during scale up Jul 19, 2024
@k8s-ci-robot k8s-ci-robot added size/M Denotes a PR that changes 30-99 lines, ignoring generated files. and removed size/S Denotes a PR that changes 10-29 lines, ignoring generated files. labels Jul 19, 2024
Copy link
Member

@sbueringer sbueringer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One nit

@vincepri vincepri changed the title 🐛 MachineSet should try to check on templates only during scale up 🐛 MachineSet should allow scale down operations to proceed when templates don't exist Jul 22, 2024
@vincepri vincepri added the tide/merge-method-squash Denotes a PR that should be squashed by tide when it merges. label Jul 22, 2024
@sbueringer
Copy link
Member

/lgtm

/assign @fabriziopandini

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jul 22, 2024
@k8s-ci-robot
Copy link
Contributor

LGTM label has been added.

Git tree hash: ba4848aad06fd02e84885eb39ad07e6e54103205

@fabriziopandini
Copy link
Member

/test ?

@k8s-ci-robot
Copy link
Contributor

@fabriziopandini: The following commands are available to trigger required jobs:

  • /test pull-cluster-api-build-main
  • /test pull-cluster-api-e2e-blocking-main
  • /test pull-cluster-api-e2e-conformance-ci-latest-main
  • /test pull-cluster-api-e2e-conformance-main
  • /test pull-cluster-api-e2e-main
  • /test pull-cluster-api-e2e-mink8s-main
  • /test pull-cluster-api-e2e-upgrade-1-30-1-31-main
  • /test pull-cluster-api-test-main
  • /test pull-cluster-api-test-mink8s-main
  • /test pull-cluster-api-verify-main

The following commands are available to trigger optional jobs:

  • /test pull-cluster-api-apidiff-main

Use /test all to run the following jobs that were automatically triggered:

  • pull-cluster-api-apidiff-main
  • pull-cluster-api-build-main
  • pull-cluster-api-e2e-blocking-main
  • pull-cluster-api-test-main
  • pull-cluster-api-verify-main

In response to this:

/test ?

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@k8s-ci-robot k8s-ci-robot removed the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jul 22, 2024
@k8s-ci-robot k8s-ci-robot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels Jul 22, 2024
@vincepri
Copy link
Member Author

/test pull-cluster-api-e2e-main

@sbueringer
Copy link
Member

Thx!

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jul 23, 2024
@k8s-ci-robot
Copy link
Contributor

LGTM label has been added.

Git tree hash: f446ddb2df6d4e960060392a243b15f087bef92d

@sbueringer
Copy link
Member

/test pull-cluster-api-e2e-main

Copy link
Member

@fabriziopandini fabriziopandini left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/lgtm
/approve

@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: fabriziopandini

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jul 23, 2024
@sbueringer
Copy link
Member

sbueringer commented Jul 23, 2024

/hold

I think second time in a row this test case is failing. As far as I know it's not (this) flaky on main

@k8s-ci-robot k8s-ci-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Jul 23, 2024
@sbueringer
Copy link
Member

/test pull-cluster-api-e2e-main

@sbueringer
Copy link
Member

sbueringer commented Jul 24, 2024

Also saw it on other PRs I think. Let's check periodics and then merge (just trying to get tests stable so we should be careful to not introduce new flakes)

@sbueringer
Copy link
Member

@k8s-ci-robot k8s-ci-robot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Jul 24, 2024
@k8s-ci-robot k8s-ci-robot merged commit 0908bfc into kubernetes-sigs:main Jul 24, 2024
22 checks passed
@k8s-ci-robot k8s-ci-robot added this to the v1.8 milestone Jul 24, 2024
@vincepri
Copy link
Member Author

/cherry-pick release-1.7

@k8s-infra-cherrypick-robot

@vincepri: #10913 failed to apply on top of branch "release-1.7":

Applying: 🐛 MachineSet should allow scale down operations to proceed when templates don't exist
Using index info to reconstruct a base tree...
M	internal/controllers/machineset/machineset_controller.go
M	internal/controllers/machineset/machineset_controller_test.go
Falling back to patching base and 3-way merge...
Auto-merging internal/controllers/machineset/machineset_controller_test.go
Auto-merging internal/controllers/machineset/machineset_controller.go
CONFLICT (content): Merge conflict in internal/controllers/machineset/machineset_controller.go
error: Failed to merge in the changes.
hint: Use 'git am --show-current-patch=diff' to see the failed patch
Patch failed at 0001 🐛 MachineSet should allow scale down operations to proceed when templates don't exist
When you have resolved this problem, run "git am --continue".
If you prefer to skip this patch, run "git am --skip" instead.
To restore the original branch and stop patching, run "git am --abort".

In response to this:

/cherry-pick release-1.7

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. area/machineset Issues or PRs related to machinesets cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. lgtm "Looks good to me", indicates that a PR is ready to be merged. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. tide/merge-method-squash Denotes a PR that should be squashed by tide when it merges.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants