add an st2canary pod that validates st2.packs.volumes #323
Conversation
@armab I want to get this in the next release, as it should help isolate issues with
Considering what this Pod does (verifying write access to the filesystem), can we move the logic into the existing tests dir, maybe even under the BATS tests? It could be the same pod:
https://github.com/StackStorm/stackstorm-k8s/blob/master/templates/tests/st2tests-pod.yaml
or perhaps a new one dedicated to volume testing. That way it could be run with the same testing mechanism, helm test, exactly when it's needed for troubleshooting, instead of running on every Helm deployment, which feels a bit off for production.
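For what it's worth, the core of such a check could be a simple write probe. A minimal sketch, assuming a POSIX shell in the test pod; the function name and the packs path are illustrative placeholders, not anything from the chart:

```shell
# check_writable DIR: probe write access by creating and removing a file.
# The /opt/stackstorm/packs path mentioned below is an assumption for
# illustration, not necessarily the chart's actual mount path.
check_writable() {
  dir="$1"
  probe="${dir}/.write-probe-$$"
  if touch "$probe" 2>/dev/null; then
    rm -f "$probe"
    echo "OK: ${dir} is writable"
    return 0
  fi
  echo "FAIL: cannot write to ${dir}" >&2
  return 1
}

# In a test pod this would run against each packs mount, e.g.:
# check_writable /opt/stackstorm/packs
```

A BATS test or the canary pod's entrypoint could call this against each mount derived from st2.packs.volumes and exit non-zero on the first failure, which is exactly what helm test would surface.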
I've never thought of using
@mamercad What do you think of this pod? We often get questions about getting the volumes working, but the errors are very hard to debug by the time st2 is running. So I'm trying to prevent people from installing this chart if their values for volumes do not lead to a valid setup. And if someone tries to upgrade from an install without volumes enabled to one with them enabled, I want the upgrade to fail early if the volumes settings are not working. There are several sources of volume issues:
I really want a way to push people to figure that out before the chart finishes installing something that is broken. A key component of this is that it runs BEFORE the chart gets installed or upgraded. @armab doesn't like the idea of running this against production clusters, but that's explicitly one of my goals. Especially with production installs, volume issues should be found and fixed as early as possible, which is before the rest of the components get installed or upgraded. Thoughts?
I'm very much a fan of this.
I'm not overly familiar with StackStorm HA at this point in time (I'm not running it in production yet), but I will say that, as an operator of other Kubernetes applications, I'd personally rather have an application never get off the ground than have it break while getting off the ground, if that's a possibility. Adding simple RW testing to a Pod whose purpose and name imply that it's for testing (heh, maybe it should be renamed) seems like a good fit. In short, I'm in the "fail as early as possible" camp as well.
Ooh. I like the idea of calling it that.
Left a few more comments.
Can we make it a Job, same as the others we already have, and name it more explicitly for what it does, like st2-verify-pack-write-access (or something close to that)?
The jobs we already have:
https://github.com/StackStorm/stackstorm-k8s/blob/master/templates/jobs.yaml
And as I remember, if a job like register-content fails, the entire Helm deployment reports a failure too. So logically that follows what you want to do.
That would be a bit more consistent with the existing mechanics.
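For illustration, a hook Job along those lines might look roughly like this. This is a sketch under assumptions, not the chart's actual template: the Job name, image, command, mount path, and volume stanza are all placeholders.

```yaml
# Sketch only: a pre-install/pre-upgrade hook Job that fails the Helm
# release early if the packs volume is not writable. Names, image, and
# mount path are illustrative assumptions, not the chart's real values.
apiVersion: batch/v1
kind: Job
metadata:
  name: st2-verify-pack-write-access
  annotations:
    "helm.sh/hook": pre-install,pre-upgrade
    "helm.sh/hook-delete-policy": before-hook-creation,hook-succeeded
spec:
  backoffLimit: 0          # fail fast instead of retrying a broken volume
  template:
    spec:
      restartPolicy: Never
      containers:
        - name: verify-packs-volume
          image: busybox
          command:
            - sh
            - -c
            - touch /opt/stackstorm/packs/.probe && rm /opt/stackstorm/packs/.probe
          volumeMounts:
            - name: st2-packs-vol
              mountPath: /opt/stackstorm/packs
      volumes:
        - name: st2-packs-vol
          # in the real chart this would be filled in from st2.packs.volumes
          emptyDir: {}
```

With backoffLimit: 0 and the pre-install/pre-upgrade hook annotations, a failed probe fails the helm install or helm upgrade before the release's other resources are applied, which matches the "fail as early as possible" goal discussed above.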
Yeah, that makes sense. I'll work on:
Thanks for moving it to the jobs; it looks more consistent.
Left a few more comments and observations.
Thanks a lot! 👍
Closes #321