Implement osqueryd health check #141

Closed
zwass opened this issue Sep 15, 2017 · 0 comments

zwass commented Sep 15, 2017

Perhaps by periodically issuing a `select 1` query to verify that osqueryd is still responding to messages over the extension socket. If this check fails, we could initiate a restart of osqueryd or of the entire launcher.

zwass added a commit that referenced this issue Oct 7, 2017
- Correctly detect when error channel is closed (potential fix for #134).
  Previously the logic was inverted for whether the channel was closed, so
  recovery was not initiated. Unit test TestOsqueryDies repros the suspected
  issue.
- Allow logger to be set properly.
- Add logging around recovery scenarios.
- Check communication with both osquery and extension server in health check
  (previously only the extension server was checked).
- Add healthcheck on interval that causes recovery on failure (Closes #141).
zwass added a commit that referenced this issue Oct 7, 2017
- Correctly detect when error channel is closed (potential fix for #134).
  Previously the logic was inverted for whether the channel was closed, so
  recovery was not initiated. Unit test TestOsqueryDies repros the suspected
  issue.
- Allow logger to be set properly.
- Add logging around recovery scenarios.
- Check communication with both osquery and extension server in health check
  (previously only the extension server was checked).
- Add healthcheck on interval that causes recovery on failure (Closes #141).
- Do not set cmd output to ioutil.Discard. Causes a bug with cmd.Wait (see
  golang/go#20730)
zwass added a commit that referenced this issue Oct 12, 2017
- Correctly detect when error channel is closed (potential fix for #134).
  Previously the logic was inverted for whether the channel was closed, so
  recovery was not initiated. Unit test TestOsqueryDies repros the suspected
  issue.
- Allow logger to be set properly.
- Add logging around recovery scenarios.
- Check communication with both osquery and extension server in health check
  (previously only the extension server was checked).
- Add healthcheck on interval that causes recovery on failure (Closes #141).
- Do not set cmd output to ioutil.Discard. Causes a bug with cmd.Wait (see
  golang/go#20730)
marpaia pushed a commit that referenced this issue Oct 17, 2017
- Changes to public API to better reflect actual usage and ease implementation.
- Use errgroup for coordination of process management/cleanup. This helps
  prevent leaking of goroutines (relative to existing implementation).
- Fix bug in which osquery process was not restarted after failure.
- Allow logger to be set properly.
- Add logging around recovery scenarios.
- Check communication with both osquery and extension server in health check
  (previously only the extension server was checked).
- Add healthcheck on interval that initiates recovery on failure (Closes #141).
- Do not set cmd output to ioutil.Discard. Causes a bug with cmd.Wait (see
  golang/go#20730).