-
Notifications
You must be signed in to change notification settings - Fork 144
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
install fails if enroll fails and surface errors #3554
Conversation
* surface errors that might occur during enroll * fail install command if agent cannot be restarted * do not print success message if there was an enroll error. Print an error message and the error instead * add logs to show the different enroll attempts * add more context t errors * refactor internal/pkg/agent/install/perms_unix.go and add more context to errors restore main version * ignore agent restart error on enroll tests as there is no agent to be restarted * daemonReloadWithBackoff does not retry on context deadline exceeded
This pull request is now in conflicts. Could you fix it? 🙏
|
This pull request does not have a backport label. Could you fix it @AndersonQ? 🙏
NOTE: |
/test |
🌐 Coverage report
|
Pinging @elastic/elastic-agent (Team:Elastic-Agent) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Aside from the usual backoff awkwardness (not the point of the PR) there are just a couple of small nitpicks.
I agree with @ycombinator that, since we squash commits it would be better to separate small typo corrections or refactors in a dedicated PR
if err != nil && | ||
// There is no agent running, therefore nothing to be restarted. | ||
// However, this will cause the Enroll command to return an error | ||
// which we'll ignore here. | ||
!strings.Contains(err.Error(), | ||
"could not reload agent daemon, unable to trigger restart") { | ||
t.Fatalf("enrrol coms returned and unexpected error: %v", err) | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
From the description it seems that we expect the error about not being able to trigger a restart, we can simplify with:
if err != nil && | |
// There is no agent running, therefore nothing to be restarted. | |
// However, this will cause the Enroll command to return an error | |
// which we'll ignore here. | |
!strings.Contains(err.Error(), | |
"could not reload agent daemon, unable to trigger restart") { | |
t.Fatalf("enrrol coms returned and unexpected error: %v", err) | |
} | |
// There is no agent running, therefore nothing to be restarted. | |
// However, this will cause the Enroll command to return an error | |
// which we'll ignore here. | |
require.ErrorContainsf(t, err, "could not reload agent daemon, unable to trigger restart", "enroll command returned an unexpected error: %v", err) |
SonarQube Quality Gate |
* fix install/enroll cmd not failing when agent restart fails * surface errors that might occur during enroll * fail install command if agent cannot be restarted * do not print success message if there was an enroll error. Print an error message and the error instead * add logs to show the different enroll attempts * add more context t errors * refactor internal/pkg/agent/install/perms_unix.go and add more context to errors restore main version * ignore agent restart error on enroll tests as there is no agent to be restarted * daemonReloadWithBackoff does not retry on context deadline exceeded and context cancelled * fix typos (cherry picked from commit f7e558f)
* fix install/enroll cmd not failing when agent restart fails * surface errors that might occur during enroll * fail install command if agent cannot be restarted * do not print success message if there was an enroll error. Print an error message and the error instead * add logs to show the different enroll attempts * add more context t errors * refactor internal/pkg/agent/install/perms_unix.go and add more context to errors restore main version * ignore agent restart error on enroll tests as there is no agent to be restarted * daemonReloadWithBackoff does not retry on context deadline exceeded and context cancelled * fix typos (cherry picked from commit f7e558f) Co-authored-by: Anderson Queiroz <[email protected]>
This reverts commit f7e558f.
For the agent to be actually enrolled it needs to restart after the enroll process is completed, so it'll pickup the new config and "connect" to fleet-server.
This change makes the enroll command to fail if it cannot restart the agent after enrolling on fleet
What does this PR do?
This change makes the enroll command to fail if it cannot restart the agent after enrolling on fleet
Why is it important?
For the agent to be actually enrolled it needs to restart after the enroll process is completed, so it'll pickup the new config and "connect" to fleet-server.
Checklist
[ ] I have commented my code, particularly in hard-to-understand areas[ ] I have made corresponding changes to the documentation[ ] I have made corresponding change to the default configuration files[ ] I have added tests that prove my fix is effective or that my feature works[ ] I have added an entry in./changelog/fragments
using the changelog tool[ ] I have added an integration test or an E2E testRelated issues
Questions to ask yourself