2 deployer pods are present for a brief moment #16870

Closed
tnozicka opened this issue Oct 13, 2017 · 7 comments
Labels
component/apps, kind/bug, kind/test-flake, priority/P0, sig/master

Comments

@tnozicka
Contributor

This is probably because we don't wait for the old deployer pods to be deleted. We allow a newer RC to be deployed as soon as the old one is marked cancelled, but that doesn't mean its deployer pod has been deleted yet, because deletion is handled asynchronously by the deployer controller.

@smarterclayton this is the issue you pinged me about
cc: @mfojtik

Seen in https://ci.openshift.redhat.com/jenkins/job/zz_origin_gce_image/502/testReport/junit/(root)/Extended/_Feature_DeploymentConfig__deploymentconfigs_when_run_iteratively__Conformance__should_only_deploy_the_last_deployment__Suite_openshift_conformance_parallel_/

logs.txt

@tnozicka added the component/apps, kind/bug, kind/test-flake, priority/P1, and sig/master labels on Oct 13, 2017
@tnozicka self-assigned this on Oct 13, 2017
@smarterclayton
Contributor

This was probably pushed over the edge into continuous failures by #16913 - we clean up pods faster, so the controller is more likely to race.

@smarterclayton
Contributor

Bumping priority since this is breaking most jobs - @tnozicka said he had recreated and is working on a fix.

@tnozicka
Contributor Author

@smarterclayton yes, that seems probable, because the pod now waits for about 3s in the terminating state, and in the meantime we create a new RC and a new deployer pod.

@tnozicka
Contributor Author

Temporarily disabling the check: #16956

@tnozicka
Contributor Author

I think this is just the check seeing 2 deployer pods in a non-terminating state even though they aren't both actually running. There seems to be a bug in the kubelet that updates the pod phase from Succeeded back to Pending: #17011
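
To illustrate how such a check could be fooled, here is a rough sketch in Python. It is not the actual test code: the `Pod` type, the `active_deployer_pods` helper, and the pod names are hypothetical, and it assumes the check counts deployer pods that have no deletion timestamp and are not in a terminal phase.

```python
from dataclasses import dataclass
from typing import List, Optional

# Phases after which a pod is finished and should never run again.
TERMINAL_PHASES = {"Succeeded", "Failed"}


@dataclass
class Pod:
    """Hypothetical, simplified stand-in for a pod object."""
    name: str
    phase: str  # "Pending", "Running", "Succeeded" or "Failed"
    deletion_timestamp: Optional[str] = None  # set once deletion is requested


def active_deployer_pods(pods: List[Pod]) -> List[Pod]:
    """Return pods that are neither terminating nor in a terminal phase."""
    return [
        p for p in pods
        if p.deletion_timestamp is None and p.phase not in TERMINAL_PHASES
    ]


# Expected case: the old deployer has finished, the new one is running.
healthy = [Pod("app-1-deploy", "Succeeded"), Pod("app-2-deploy", "Running")]

# Suspected case: the old pod's phase is wrongly reported as Pending again,
# so it looks active although it is not actually running.
regressed = [Pod("app-1-deploy", "Pending"), Pod("app-2-deploy", "Running")]

print(len(active_deployer_pods(healthy)))    # 1
print(len(active_deployer_pods(regressed)))  # 2
```

Under these assumptions, a phase that moves from Succeeded back to Pending is enough for the check to count 2 deployer pods, even though only one deployment is in progress.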

@tnozicka
Contributor Author

This has been confirmed to be caused by #17011, which is now fixed.

3 participants