[release-3.11] UPSTREAM: 68678: tighten maximum retry loop for aggregate api availability #21012

openshift-cherrypick-robot

This is an automated cherry-pick of #20999

/assign deads2k

@sdodson
Member

sdodson commented Sep 18, 2018

/retest

@derekwaynecarr
Member

we can queue this up for first z release.

@sdodson
Member

sdodson commented Sep 19, 2018

> we can queue this up for first z release.

Multiple other bugs are suspected of being tied back to this. Even if this doesn't resolve the issue, it leads to upgrades stalling for up to 1000s. Other non-aggregated APIs are unresponsive around the same time; could this potentially cause a wider API degradation?

@deads2k
Contributor

deads2k commented Sep 19, 2018

> Multiple other bugs are suspected of being tied back to this. Even if this doesn't resolve the issue, it leads to upgrades stalling for up to 1000s. Other non-aggregated APIs are unresponsive around the same time; could this potentially cause a wider API degradation?

No. It is strictly related to remote servers.

@knobunc
Contributor

knobunc commented Sep 19, 2018

That e2e test failure is https://bugzilla.redhat.com/show_bug.cgi?id=1630537

@derekwaynecarr
Member

@sdodson @deads2k is there a BZ that we can link to this change?

@sdodson
Member

sdodson commented Sep 19, 2018

This is poorly tracked because it's one of many problems in the constellation of upgrade issues where the API becomes unreliable.

https://bugzilla.redhat.com/show_bug.cgi?id=1623571
https://bugzilla.redhat.com/show_bug.cgi?id=1628961
These are bugs where storage migration fails on aggregated APIs, which David and I worked through by adding discovery and waits for aggregated API availability.

https://bugzilla.redhat.com/show_bug.cgi?id=1628881 is a bug assigned to the networking team where @squeed suggests that it now looks like it's failing due to API aggregation.

There are probably others but I've lost track of them.

@derekwaynecarr
Member

thanks @sdodson

since we have had to take some other prs, i am going to take this now as well.

/approve
/lgtm

openshift-ci-robot added the lgtm label Sep 19, 2018
@openshift-ci-robot

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: derekwaynecarr, openshift-cherrypick-robot

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

openshift-ci-robot added the approved label Sep 19, 2018
@dmage
Contributor

dmage commented Sep 19, 2018

/retest

openshift-merge-robot merged commit 340f7a8 into openshift:release-3.11 Sep 19, 2018
Labels: approved, lgtm, size/XS