Check machine status and log details if it is not running #3887
LGTM (after fixing the pointer dereference in the log statement) - I think this functionality could do with some more robust unit tests but that can be a follow-up effort.
3f8d33d to 25b7a44
/azp run ci,e2e
Azure Pipelines successfully started running 2 pipeline(s).
2f9c36c to 89c0c33
/azp run ci,e2e
Azure Pipelines successfully started running 2 pipeline(s).
89c0c33 to 8f6ca27
/azp run ci,e2e
Azure Pipelines successfully started running 2 pipeline(s).
8f6ca27 to 8d8c47e
/azp run ci,e2e
Azure Pipelines successfully started running 2 pipeline(s).
6f6519a to 407e4f3
LGTM except for one spot where I think it's worth checking for nil pointers.
407e4f3 to 9ac41f2
/azp run ci,e2e
Azure Pipelines successfully started running 2 pipeline(s).
lgtm
* Check machine status and log details if it is not running
* Resolve comments from review
Which issue this PR addresses:
Provides more data for ARO-4247
What this PR does / why we need it:
It is hard to troubleshoot worker node creation failures for new cluster installation. We wait for worker nodes, but sometimes they never show up because there was an Azure failure creating the VM. CAPI logs for machine creation happen too early and don't make it to Kusto.
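The check described above could be sketched roughly as follows. This is only an illustration, not the code in this PR: the `machine` type, its fields, and the helpers `deref`, `strPtr` and `describeNotRunning` are hypothetical stand-ins, and the phase value "Running" is an assumption. It shows the general idea of reporting details for any machine that is not running while guarding against nil pointers, as raised in review.

```go
package main

import (
	"fmt"
	"strings"
)

// machine is a hypothetical, simplified stand-in for a machine object.
// Optional status fields are pointers, so they may be nil.
type machine struct {
	Name         string
	Phase        *string
	ErrorReason  *string
	ErrorMessage *string
}

// deref returns the pointed-to string, or a placeholder when p is nil,
// so that log statements never dereference a nil pointer.
func deref(p *string) string {
	if p == nil {
		return "<unset>"
	}
	return *p
}

// strPtr is a small helper for building test data.
func strPtr(s string) *string { return &s }

// describeNotRunning returns one log line per machine whose phase is
// missing or is anything other than "Running" (assumed phase name).
func describeNotRunning(machines []machine) []string {
	var lines []string
	for _, m := range machines {
		if m.Phase != nil && *m.Phase == "Running" {
			continue
		}
		lines = append(lines, fmt.Sprintf(
			"machine %s is not running: phase=%s reason=%s message=%s",
			m.Name, deref(m.Phase), deref(m.ErrorReason), deref(m.ErrorMessage)))
	}
	return lines
}

func main() {
	machines := []machine{
		{Name: "worker-0", Phase: strPtr("Running")},
		{Name: "worker-1", Phase: strPtr("Failed"), ErrorMessage: strPtr("example failure text")},
		{Name: "worker-2"}, // no status reported yet
	}
	fmt.Println(strings.Join(describeNotRunning(machines), "\n"))
}
```

Returning the lines instead of logging them directly keeps the sketch easy to unit test, which is in line with the reviewer's request for more robust tests as a follow-up.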
Test plan for issue:
Deploy test clusters and confirm that we get the desired logs. Centraluseuap would be a good place to start if the error creating availability sets reproduces there.
Happy-path logs look like the following: