[release-v3.27] Auto pick #8913: updating the logic for stale endpoint management #9026
Cherry pick of #8913 on release-v3.27.
#8913: updating the logic for stale endpoint management
Original PR Body below
Description
Background:
This issue surfaces when running the Kubernetes e2e test for Hyper-V containers: https://github.com/kubernetes/kubernetes/blob/6381e6504ac210297e382c22029791267d440d9e/test/e2e/windows/service.go#L51. To reproduce it, set up a test cluster using capz (windows-testing/capz/readme.md at master · kubernetes-sigs/windows-testing (github.com)) and run this specific test. Alternatively, see the test-grid logs for details of the failure: sig-windows-experimental Test Grid (kubernetes.io)
In the test, Calico fails to recognize that a container endpoint is in the ready state and treats the endpoint as not attached. The check is based on the HNSEndpoint sharedContainers field, which is empty for Hyper-V containers. https://github.com/microsoft/hcsshim/blob/8beabacfc2d21767a07c20f8dd5f9f3932dbf305/internal/hns/hnsendpoint.go#L146
Calico's logic for deciding that a container endpoint is stale therefore needs to be updated. This PR does so by using the HNS endpoint state attribute instead.
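To illustrate the difference between the two checks, here is a minimal sketch. It is not the code from this PR: the `Endpoint` type, the `EndpointState` values, and both helper functions are hypothetical stand-ins, and the real HNS state values are not listed in this description. It only shows why a check based on the shared-containers list misclassifies a Hyper-V endpoint while a state-based check does not.

```go
package main

import "fmt"

// EndpointState is a hypothetical stand-in for the HNS endpoint state attribute.
// The values below are illustrative only; they are not taken from hcsshim.
type EndpointState int

const (
	StateCreated EndpointState = iota
	StateAttached
)

// Endpoint is a hypothetical, trimmed-down view of an HNS endpoint.
type Endpoint struct {
	Name             string
	SharedContainers []string // empty for Hyper-V containers, per the description above
	State            EndpointState
}

// isStaleBySharedContainers models the old approach: an endpoint with no
// shared containers is treated as not attached, and therefore stale.
func isStaleBySharedContainers(ep Endpoint) bool {
	return len(ep.SharedContainers) == 0
}

// isStaleByState models the updated approach: only the endpoint state decides.
func isStaleByState(ep Endpoint) bool {
	return ep.State != StateAttached
}

func main() {
	endpoints := []Endpoint{
		{Name: "process-isolated", SharedContainers: []string{"c1"}, State: StateAttached},
		{Name: "hyperv", State: StateAttached},
		{Name: "leftover", State: StateCreated},
	}
	for _, ep := range endpoints {
		fmt.Printf("%s: staleBySharedContainers=%v staleByState=%v\n",
			ep.Name, isStaleBySharedContainers(ep), isStaleByState(ep))
	}
}
```

In this sketch the two checks disagree only on the Hyper-V endpoint, which is the case the failing test exercises.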
PR for hcsshim: microsoft/hcsshim#2177
The change has been backported to hcsshim 0.11 and 0.12, and this PR picks up the latest tag to ingest the dependent changes.
This change was tested by running the Kubernetes e2e tests locally for both Hyper-V containers and process-isolated containers. All tests pass.
Related issues/PRs
Todos
Release Note
Reminder for the reviewer
Make sure that this PR has the correct labels and milestone set.
Every PR needs one docs-* label.
docs-pr-required: This change requires a change to the documentation that has not been completed yet.
docs-completed: This change has all necessary documentation completed.
docs-not-required: This change has no user-facing impact and requires no docs.
Every PR needs one release-note-* label.
release-note-required: This PR has user-facing changes. Most PRs should have this label.
release-note-not-required: This PR has no user-facing changes.
Other optional labels:
cherry-pick-candidate: This PR should be cherry-picked to an earlier release. For bug fixes only.
needs-operator-pr: This PR is related to install and requires a corresponding change to the operator.