adding specific logic for dealing with validation jobs #599
Is the issue that steps with If so, can we leverage the methods in here to simplify this logic?
I filed #600 so that the harness can help us write a test for this. In the meantime, we should write one in Go.
Maybe I am just very confused about how delete should work, but why don't we have that as two steps? I understand delete as so why do we have
and not two steps, one running the validation and the other running the delete?
I hope this clarifies: https://www.youtube.com/watch?v=eW0qfhEVTTY&feature=youtu.be&t=1432
@fabianbaier thanks, that helped a bit. What I heard there (correct me if I am wrong):
- nobody really understands what delete is supposed to do, which types of objects it applies to, or what problems it is supposed to solve (it should not be used instead of regular garbage collection on the cluster)
- it should be two steps: one step runs the job, the second deletes it
- we definitely need more docs to address the first point I made here
I still don't feel like I have enough context to approve this, and we should probably address the points above as part of this PR or in other PRs.
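To make the "two steps" idea from the list above concrete, here is a minimal sketch in Go. It does not use KUDO's real types: `Step`, `Cluster`, `RunStep` and `RunPlan` are hypothetical names invented for illustration, and the cluster is a toy in-memory map in which an applied Job is assumed to complete immediately.

```go
package main

import (
	"errors"
	"fmt"
)

// Step is a hypothetical stand-in for a plan step; it is NOT KUDO's real type.
type Step struct {
	Name   string
	Job    string // name of the Job this step acts on
	Delete bool   // when true, the step deletes the Job instead of applying it
}

// Cluster is a toy in-memory model: job name -> whether the job has completed.
type Cluster struct {
	jobs map[string]bool
}

// NewCluster returns an empty toy cluster.
func NewCluster() *Cluster { return &Cluster{jobs: map[string]bool{}} }

// RunStep applies or deletes the step's Job.
// Assumption: in this toy model an applied Job completes immediately.
func (c *Cluster) RunStep(s Step) error {
	if !s.Delete {
		c.jobs[s.Job] = true
		return nil
	}
	done, ok := c.jobs[s.Job]
	if !ok {
		return nil // nothing to delete; treat as already cleaned up
	}
	if !done {
		return errors.New("job " + s.Job + " has not completed yet")
	}
	delete(c.jobs, s.Job)
	return nil
}

// RunPlan executes the steps strictly in order and stops at the first error.
func (c *Cluster) RunPlan(steps []Step) error {
	for _, s := range steps {
		if err := c.RunStep(s); err != nil {
			return fmt.Errorf("step %q: %w", s.Name, err)
		}
	}
	return nil
}

func main() {
	c := NewCluster()
	plan := []Step{
		{Name: "validate", Job: "zk-validation"},               // step 1: run the job
		{Name: "cleanup", Job: "zk-validation", Delete: true}, // step 2: delete it
	}
	if err := c.RunPlan(plan); err != nil {
		fmt.Println("plan failed:", err)
		return
	}
	fmt.Println("jobs left:", len(c.jobs))
}
```

Splitting the work this way keeps each step's outcome unambiguous: the delete step refuses to act on a Job that has not finished, rather than one step trying to both run and remove the same object.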
- adding restartPolicy to Job template
- creating integration-tests
- creating e2e test and added events to the planexecution controller
- added e2e test for step-delete with more verbose assertion
ac0e3ba to 32368a1
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: fabianbaier The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
I've rebased this one as it solves an issue that I'm having with the MySQL operator. But I agree with @alenkacz that we need to think through the logic here a lot better. This will still be broken for all other non-Job types after merging. We should likely document the desired behavior in a KEP to ensure we're all on the same page. I think @runyontr also has some ideas on a better fix.
@fabianbaier: PR needs rebase. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
As this has been stale for a while now, and as mentioned ( https://kubernetes.slack.com/archives/CG3HTFCMV/p1568656474021300 ), I am happy to close this out if it has been or will be addressed in other PRs or if there is a plan to change the logic or
PE does not exist anymore and we'll be changing how delete is implemented in Tasker -> closing
What type of PR is this?
/component operator
/kind bug
What this PR does / why we need it:
This PR is intended as a short-term fix for some business logic in planexecution_controller.go, in particular when dealing with jobs in general. It looks like we currently don't have any Operator that actually uses the Delete boolean, nor do we have a test that makes sure a validation job was successfully created and completed. Also, I think the zookeeper test in the current kudobuilder/operators repo seems to cover all the test cases, but we might want to break it down into individual tests, e.g. one that only waits for the validation phase to finish and one that only checks that a new validation job comes up after parameter changes are applied.
Which issue(s) this PR fixes:
Fixes #586
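The description above notes that no test currently confirms a validation job was created and completed. As a rough sketch of what such a check could look at, the Go snippet below inspects a Job status for a `Complete` condition. `JobStatus` and `JobCondition` here are simplified local stand-ins that only mirror the shape of the Kubernetes batch/v1 types; `IsJobComplete` is a hypothetical helper, not something taken from this PR.

```go
package main

import "fmt"

// JobCondition is a simplified stand-in for the batch/v1 JobCondition type.
type JobCondition struct {
	Type   string // e.g. "Complete" or "Failed"
	Status string // "True", "False" or "Unknown"
}

// JobStatus is a simplified stand-in for the batch/v1 JobStatus type.
type JobStatus struct {
	Succeeded  int32
	Conditions []JobCondition
}

// IsJobComplete reports whether the status carries a Complete condition
// whose status is "True". (Hypothetical helper for illustration.)
func IsJobComplete(s JobStatus) bool {
	for _, c := range s.Conditions {
		if c.Type == "Complete" && c.Status == "True" {
			return true
		}
	}
	return false
}

func main() {
	running := JobStatus{}
	done := JobStatus{
		Succeeded:  1,
		Conditions: []JobCondition{{Type: "Complete", Status: "True"}},
	}
	fmt.Println(IsJobComplete(running), IsJobComplete(done))
}
```

A real test would fetch the Job from the cluster and poll until this check passes or a timeout expires; that part is left out here because it depends on the test harness.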
Special notes for your reviewer:
Run a clean Minikube and install KUDO from this branch:
minikube start --vm-driver=hyperkit --cpus=6 --memory=9216 --disk-size=10g
kubectl apply -f config/crds && kubectl apply -f config/rbac
Make sure you have the most recent kudobuilder/operators repo cloned. In my case I can find it under /Users/fabianbaier/go/src/github.com/kudobuilder/operators.
Make sure you have the most recent Zookeeper operator installed (the one with a validation phase and the Delete bool set to true; at the moment this should be as easy as kubectl kudo install zookeeper --skip-instance).
Now you can try it out via (make sure you adjust your path to your operator repo):
You can see it in action by watching the output of your go run ./cmd/kubectl-kudo/main.go test terminal from above, while also having a new terminal open that runs: kubectl get pods -w --all-namespaces.
The output for the test in the go run ./cmd/kubectl-kudo/main.go test terminal should look like:
Which means that not only did a validation job run, but changing its parameters and applying them also worked. The flow should in general be:
validation phase and create a validation job
true
cpu value of 200m) to the Zookeeper operator
The output in the 2nd terminal that watches what happens to our pods looks promising:
As you can see, there are actually two validation jobs created. The second one comes right after the parameter change, which is intended. Right now there is no test that actually checks whether this job has the new parameter (so this could be added too); however, checking manually showed that the new parameter was reflected in this job. That should do it for now.
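The missing check mentioned above (that the second validation job carries the changed parameter) could be sketched roughly as follows. The `Job` and `Container` types are simplified, hypothetical stand-ins rather than the Kubernetes API types, and `HasCPURequest` is an invented helper name; the `200m` value is the cpu value used in this walkthrough.

```go
package main

import "fmt"

// Container is a hypothetical, simplified model of a container spec.
type Container struct {
	Name     string
	Requests map[string]string // resource name -> quantity, e.g. "cpu" -> "200m"
}

// Job is a hypothetical, simplified model of a validation job.
type Job struct {
	Name       string
	Containers []Container
}

// HasCPURequest reports whether every container in the job requests exactly
// the wanted cpu quantity. The comparison is a literal string match, so
// equivalent spellings such as "200m" and "0.2" are NOT treated as equal here.
func HasCPURequest(j Job, want string) bool {
	if len(j.Containers) == 0 {
		return false
	}
	for _, c := range j.Containers {
		if c.Requests["cpu"] != want {
			return false
		}
	}
	return true
}

func main() {
	updated := Job{
		Name: "zk-validation",
		Containers: []Container{
			{Name: "kubernetes-zookeeper", Requests: map[string]string{"cpu": "200m"}},
		},
	}
	fmt.Println(HasCPURequest(updated, "200m"))
}
```

In a real test the job would be read back from the cluster after the parameter update, and the quantities would be better compared with a proper resource-quantity parser than with string equality.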
Overall, I think there are a lot of anti-patterns and nested if statements in this logic that make things very hard to debug, and it probably needs some refactoring.
Does this PR introduce a user-facing change?: