Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: check isScalingEvent only on stable and newRS #3883

Merged
merged 8 commits into from
Oct 17, 2024

Conversation

zachaller
Copy link
Collaborator

@zachaller zachaller commented Oct 9, 2024

This bug can cause rollouts to get stuck in a tight loop of thinking it's constantly in a scaling event. Argo rollouts does not ever update the /desired-replicas annotation on ReplicaSets that are not stable or desired (newRs) there can be races where a ReplicaSet will get marked as old with a value in /desired-replicas that dose not match spec.replicas on the rollout this could either be due to a scaling event right after that rs gets moved to old or a code path that does not update the rs before marking it as old.

This change makes this loop not possible because we will not check all replicasets but instead only check stable and newRs which are the only ones we scale on scaling events.

Example state of issue below: We have an old replicaset that does not match spec.replicas of the rollout this causes isScalingEvent to get into a loop.

Rollout:

apiVersion: argoproj.io/v1alpha1
kind: Rollout
metadata:
spec:
  replicas: 10

Desired/Canary

apiVersion: apps/v1
kind: ReplicaSet
metadata:
  annotations:
    rollout.argoproj.io/desired-replicas: '10'

Old

apiVersion: apps/v1
kind: ReplicaSet
metadata:
  annotations:
    rollout.argoproj.io/desired-replicas: '9'

Stable

apiVersion: apps/v1
kind: ReplicaSet
metadata:
  annotations:
    rollout.argoproj.io/desired-replicas: '10'

Signed-off-by: Zach Aller <[email protected]>
Copy link
Contributor

github-actions bot commented Oct 9, 2024

Published E2E Test Results

  4 files    4 suites   3h 18m 19s ⏱️
113 tests 101 ✅  7 💤 5 ❌
458 runs  424 ✅ 28 💤 6 ❌

For more details on these failures, see this check.

Results for commit 0e92e32.

♻️ This comment has been updated with latest results.

Copy link
Contributor

github-actions bot commented Oct 9, 2024

Published Unit Test Results

2 276 tests   2 276 ✅  2m 59s ⏱️
  128 suites      0 💤
    1 files        0 ❌

Results for commit 0e92e32.

♻️ This comment has been updated with latest results.

Copy link

codecov bot commented Oct 9, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 83.85%. Comparing base (50300e5) to head (0e92e32).
Report is 11 commits behind head on master.

Additional details and impacted files
@@            Coverage Diff             @@
##           master    #3883      +/-   ##
==========================================
- Coverage   83.88%   83.85%   -0.04%     
==========================================
  Files         163      163              
  Lines       18564    18564              
==========================================
- Hits        15573    15567       -6     
- Misses       2119     2122       +3     
- Partials      872      875       +3     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

Signed-off-by: Zach Aller <[email protected]>
@zachaller zachaller changed the title chore: do not merge chore: check scaling only on stable and newRS Oct 9, 2024
@zachaller zachaller changed the title chore: check scaling only on stable and newRS fix: check scaling only on stable and newRS Oct 9, 2024
Signed-off-by: Zach Aller <[email protected]>
Signed-off-by: Zach Aller <[email protected]>
@zachaller zachaller changed the title fix: check scaling only on stable and newRS fix: check isScalingEvent only on stable and newRS Oct 11, 2024
@zachaller zachaller marked this pull request as ready for review October 11, 2024 21:21
rollout/sync.go Outdated Show resolved Hide resolved
Copy link
Contributor

@leoluz leoluz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please check my comment

rollout/sync_test.go Outdated Show resolved Hide resolved
Signed-off-by: Zach Aller <[email protected]>
Signed-off-by: Zach Aller <[email protected]>
Signed-off-by: Zach Aller <[email protected]>
Copy link

sonarcloud bot commented Oct 16, 2024

Copy link
Contributor

@leoluz leoluz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@zachaller zachaller added this to the v1.8 milestone Oct 17, 2024
@zachaller zachaller merged commit de827d4 into argoproj:master Oct 17, 2024
25 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants