-
Notifications
You must be signed in to change notification settings - Fork 0
Implement Health Checks w/ Rollback #29
base: master
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I cannot find the health-checker.sh
file, I think it was not commited by mistake (you may need to change .gitignore
(?)).
I have not yet had time to review the changes and test some specificities in Docker/Bash behaviour myself, but am already leaving some general comments to get the ball rolling.
Also, it seems that CircleCI did not run for this PR, what gives? It should have ran shellcheck which would have caught some of the issues highlighted as comments. I guess we'll have to look into that.
Right, I'm still new to this git/github stuff, not sure why the file is missing 🙃 I'll try to add it back later |
debc517
to
c5691de
Compare
Added the missing file. |
Added health checks for nijobs-be on deployment
c5691de
to
f1788e6
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some small changes, overall is looking good and almost ready to merge.
However, there are two issues, IMO:
- CI did not run for this PR, and there are a lot of issues that shellcheck flags (I noticed a half-dozen variable quoting issues, at least), so they should be fixed;
- We should consider the effects of a rollback in regards to affecting state that is not controlled by this deploy (e.g. external databases). Are there corner cases that should be considered? Is it possible to run, the app will not be healthy, and rolling back would result in dirty(er?) state?
Finally, these are kind of big changes so they should be thoroughly tested. @imnotteixeira did you test this locally? I can test the healthcheck part locally at least, I think, as I don't currently have the setup for the fully integrated test. Can anyone help with this?
deployments/health-checker.sh
Outdated
|
||
# According to 1 retry every 10 seconds, this will try for 5 minutes | ||
local MAX_ATTEMPTS=26 | ||
local RETRY_INTERVAL_SECONDS=1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oops, was testing this manually, so I wanted to to see it faster
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This was not updated to use declare -r
nor readonly
. We should do that, but this is not reason enough to block the PR any longer if this is the only issue
a3c9da2
to
9741b75
Compare
9741b75
to
ab268d0
Compare
I did test the health checker locally, but when trying to deploy something, it changes to the master branch, and loses the changes in this PR, so not sure how to do it |
That should not happen. Are you trying to use this to deploy itself? Make sure to clone the repositories to deploy locally into the deployments folder, and then run the deploy script with the correct project name and branch. |
@miguelpduarte Finally managed to test this, and it is working as expected! Since I had some problem setting the env up, I got to verify it rolls back on service launch failure (cleans up new container, and in case there is an old one, reboots it) Also checked it succeeds the health check when service actually works. I think we're green to merge this! |
Closes #28
Created a function to health check any web based service (via cURL)
This will try the configured endpoint for 5 minutes, on 10 second intervals. If the checks never return 200, the check fails and the new deployment rolls back (new container is removed, and old one is restarted)
Onboarded
nijobs-be
for this new feature in the project configs.