🐛 Adjust machinepool helper e2e timeout #8739
Conversation
@chrischdi Maybe this is the cause of the flake. What I'm not certain about is how this ever really worked, given that we call this function right after the patch call; I don't understand how the NodeRefs are updated that quickly in a passing test.
Force-pushed from 17f0f9d to a618773
Woah that's a very ugly timing bug!
👍 Huge thanks for digging more into it!
/lgtm
LGTM label has been added. Git tree hash: 5afd629d9bf973857452bdf686b8bae8a7cbae97
/test pull-cluster-api-e2e-full-main
Let's please merge the CR bump first
/hold
To merge the Controller Runtime bump first. Thanks for the heads up @sbueringer
Force-pushed from a618773 to 5269466
Force-pushed from 5269466 to c0884ca
/retest
lgtm pending the rebase in a bit
Signed-off-by: killianmuldoon <[email protected]>
Force-pushed from c0884ca to a26fe0e
/lgtm
Feel free to hold cancel obviously :)
LGTM label has been added. Git tree hash: 90117fb38f2388abfdd93125685f8e3ccf22aad9
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: sbueringer
The full list of commands accepted by this bot can be found here. The pull request process is described here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing
/hold cancel
Just saw this flake again in the CI - hopefully this gets ahead of it 😄
/cherry-pick release-1.3
@killianmuldoon: once the present PR merges, I will cherry-pick it on top of release-1.3 in a new PR and assign it to you. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
/cherry-pick release-1.4
@killianmuldoon: once the present PR merges, I will cherry-pick it on top of release-1.4 in a new PR and assign it to you. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
We should hold the cherry-picks until we have some signal that this works - but I'd prefer to have them in the queue as a reminder.
@killianmuldoon: #8739 failed to apply on top of branch "release-1.3":
In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
@killianmuldoon: #8739 failed to apply on top of branch "release-1.4":
In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
/area machinepool
area e2e-testing |
/area e2e-testing
Adjust the timeout in the PollImmediate call in getMachinePoolInstanceVersions.
In this function we don't get the MachinePool, so the nodeRefs stay the same on each call. The timeout for this function is 3 minutes per Node, while the timeout of the wrapping Eventually call in our end-to-end test is set at 5 minutes; with two nodes each getting a 3-minute timeout, the end result is that we only ever run one get request for the MachinePool. If upgrades aren't finished, or things are out of sync when the function is initialized, the nodes being looked for are never updated.
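To make the timing math concrete, here is a minimal stand-alone sketch of the interaction (not the actual cluster-api code: `pollImmediate` and `getInstanceVersions` are simplified stand-ins for wait.PollImmediate and getMachinePoolInstanceVersions, plain loops stand in for Gomega's Eventually, and the durations are scaled down so it runs instantly). The point is that one inner attempt costs more than the whole outer budget, so the outer loop never gets a second chance to re-fetch the MachinePool:

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// pollImmediate approximates wait.PollImmediate: call fn every interval
// until it reports done, errors, or the timeout elapses.
func pollImmediate(interval, timeout time.Duration, fn func() (bool, error)) error {
	deadline := time.Now().Add(timeout)
	for {
		done, err := fn()
		if err != nil {
			return err
		}
		if done {
			return nil
		}
		if time.Now().After(deadline) {
			return errors.New("timed out waiting for the condition")
		}
		time.Sleep(interval)
	}
}

func main() {
	// Scale "minutes" down to milliseconds so the sketch finishes quickly.
	const minute = 10 * time.Millisecond
	staleNodeRefs := []string{"node-1", "node-2"} // captured once, never refreshed

	// Stand-in for getMachinePoolInstanceVersions: polls each nodeRef with a
	// 3-minute budget. A stale ref never resolves, so each one burns its full
	// budget: two nodes means a single attempt costs roughly 6 minutes.
	getInstanceVersions := func(refs []string) error {
		var lastErr error
		for _, ref := range refs {
			if err := pollImmediate(minute/10, 3*minute, func() (bool, error) {
				return false, nil // stale ref: the node never shows up
			}); err != nil {
				lastErr = fmt.Errorf("node %s: %w", ref, err)
			}
		}
		return lastErr
	}

	// Stand-in for the wrapping Eventually with a 5-minute timeout. Each new
	// attempt is where the MachinePool would be re-fetched and the nodeRefs
	// refreshed, but a new attempt only starts while the budget is unspent.
	start := time.Now()
	attempts := 0
	for time.Since(start) < 5*minute {
		attempts++
		if err := getInstanceVersions(staleNodeRefs); err == nil {
			break
		}
	}
	// One attempt (~6 min) already exceeds the 5-minute outer budget, so
	// this prints 1: the MachinePool is only ever fetched once.
	fmt.Printf("outer attempts: %d\n", attempts)
}
```

Shrinking the per-node poll budget so that several inner attempts fit inside the outer Eventually budget restores the intended behavior: the outer loop gets to re-fetch the MachinePool and pick up fresh nodeRefs.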
Also added some better logging, improving on #8728, so that if this doesn't fix the issue, or if there are additional flakes in future, we get more information from the logs.
Fixes (Hopefully) #8718