Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adding documentation for Topology Aware Hints #27032

Merged
merged 1 commit into from
Mar 31, 2021

Conversation

robscott
Copy link
Member

This adds documentation for the new Topology Aware Hints feature and also notes that Service Topology is deprecated.

/cc @andrewsykim @sftim
/sig network

@k8s-ci-robot k8s-ci-robot added this to the 1.21 milestone Mar 13, 2021
@k8s-ci-robot k8s-ci-robot added sig/network Categorizes an issue or PR as relevant to SIG Network. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. labels Mar 13, 2021
@k8sio-netlify-preview-bot
Copy link
Collaborator

k8sio-netlify-preview-bot commented Mar 13, 2021

Deploy preview for kubernetes-io-vnext-staging processing.

Building with commit b9ec368

https://app.netlify.com/sites/kubernetes-io-vnext-staging/deploys/6064f7b4d270fc0007f201ff

@k8s-ci-robot k8s-ci-robot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. language/en Issues or PRs related to English language sig/docs Categorizes an issue or PR as relevant to SIG Docs. labels Mar 13, 2021
@reylejano
Copy link
Member

reylejano commented Mar 13, 2021

/assign @PI-Victor

@robscott robscott force-pushed the topology-aware-hints branch 2 times, most recently from 87e4cef to e158206 Compare March 15, 2021 21:38

## Constraints

* Topology Aware Hints are not compatible with `internalTrafficPolicy=Local`,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it invalid to spec eTP: Cluster and iTP: Local ? I don't think that's disallowed? In that case, topology hints modulate the meaning of "Cluster", no?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're completely right. There's a bug in the implementation that makes these exclusive, but that should not be the case.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if my understanding is correct, before the fix for kubernetes/kubernetes#100313, it's not compatible with externalTrafficPolicy=Local as well?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated, thanks!

use both features in the same cluster on different Services, just not on the
same Service.

* This approach will not work well for Services that have a large proportion of
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The APPROACH might, but the implementation will need to adapt to handle it, right?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For an alpha feature, it's OK to document current shortcomings without making a commitment about future improvements to address them (we don't want to make promises unless we're sure of delivering on them).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is worth further discussion, but the only ways I can think to adapt to this would be new metrics that could enable feedback-based balancing or extra configuration.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What I meant is that the implementation could evolve to consider feedback and the hints API would still be appropriate

Copy link
Contributor

@sftim sftim left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @robscott

Thanks for this PR.
The feature looks really useful, I can see myself wanting that next week actually.

This will need some changes before we can merge it I'm afraid. Any feedback I've marked as a nit is optional; the other stuff pretty much does need fixing though.

use both features in the same cluster on different Services, just not on the
same Service.

* This approach will not work well for Services that have a large proportion of
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For an alpha feature, it's OK to document current shortcomings without making a commitment about future improvements to address them (we don't want to make promises unless we're sure of delivering on them).

@robscott robscott force-pushed the topology-aware-hints branch 2 times, most recently from 3e49aff to f319635 Compare March 20, 2021 00:42
Comment on lines 22 to 24
Service Topology is deprecated as of Kubernetes v1.21, and will be removed in
v1.22. Topology Aware Hints was introduced in Kubernetes v1.21 and provides
similar functionality.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a good question. I think that specific doc is focused on deprecations of API versions, not fields. I know we'll have a deprecation notice in the 1.21 release notes, but maybe we need something more visible in the docs?

/cc @liggitt

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, that page is currently focused on entirely deprecated API versions that require migration to another API version. Since v1beta1 EndpointSlices will stop being served in 1.25, and the v1 replacement is ready, they should be marked as deprecated, and added to that doc, along with a mention of the topology field not being writeable in v1.

@PI-Victor
Copy link
Member

Hi @robscott , thank you for having your Doc PR ready for review, friendly reminder about the upcoming doc related dates for the 1.21 release:
Upcoming doc related dates for the 1.21 release:

  • Docs Ready for Review deadline is March 24
  • Docs Ready to Merge deadline is March 31

@sftim
Copy link
Contributor

sftim commented Mar 26, 2021

@kubernetes/sig-network-pr-reviews how's this looking?

@reylejano
Copy link
Member

Hi @andrewsykim @bowei @dcbw @thockin @wojtek-t , since this documentation PR is related to SIG Network KEP 2433 Topology Aware Hints , as reviewers or approvers of the KEP please provide a technical review for this PR by March 31 to get this into the release. Thank you!

@sftim
Copy link
Contributor

sftim commented Mar 30, 2021

As this is documenting an alpha feature, it's OK to have a few documentation rough edges.

We do want:

  • factual accuracy, no obvious incorrect statements etc
  • a clear indication that the feature is alpha (DONE!)
  • enough detail that someone can decide whether or not that feature is relevant; if they find it's on in their cluster, should they worry, etc.
  • accurate feature gate & related details

I think that's the minimum. Obviously going beyond that is better and welcome.

We will consider PRs against dev-1.21 even after the initial docs merge, and because this is website is continuously delivered, we'll also consider the same equivalent commits post-release if you rebase against the primary branch.

TL;DR: So we don't need this PR to be completely finished feature docs, so long as the minimum is done well.

@sftim
Copy link
Contributor

sftim commented Mar 30, 2021

BTW @robscott I do think this documentation is already good and well beyond the minimum, thanks for being so thorough!

@robscott
Copy link
Member Author

I think this PR is just waiting for technical signoff at this point. @thockin or @andrewsykim can you add a technical LGTM to this?

real-time feedback. It is still possible for individual endpoints to become
overloaded.

3. **One or more Nodes has insufficient information:** If any node does not have
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this true even when a single node may be missing topology labels for a short period of time?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In other words, during that short period of time, would kube-proxy flap between topolgoy-aware hints and all endpoints?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that is what is is for now.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, the current approach is very quick to bail out. Would like to try to find some kind of acceptable threshold for missing data, but that's been hard to quantify so far.

## Using Topology Aware Hints

You can enable Topology Aware Hints for a Service by setting the
`service.kubernetes.io/topology-aware-hints` annotation to `auto`. This tells
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry if this was covered in the PR review, I was unable to review it due to other priorities :(

Do we "drop annotations" like we do field when the feature gate is off, or is the annotation just ignored in this case?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ignored

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, the annotation is a way for users to opt into this feature on a Service, so dropping that opt-in would not be good here.

@reylejano
Copy link
Member

@robscott please review and address andrewsykim's comments. I also made a few suggestions.
Thank you!

Copy link
Member

@thockin thockin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall good, some smaller comments

## Using Topology Aware Hints

You can enable Topology Aware Hints for a Service by setting the
`service.kubernetes.io/topology-aware-hints` annotation to `auto`. This tells
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ignored

## Using Topology Aware Hints

You can enable Topology Aware Hints for a Service by setting the
`service.kubernetes.io/topology-aware-hints` annotation to `auto`. This tells
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"Auto" or "auto"? It should be "Auto" probably

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unfortunately it is "auto", should I try to fix that in upstream? https://github.com/kubernetes/kubernetes/blob/master/pkg/controller/endpointslice/utils.go#L395

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably, and now you get to handle both!

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

real-time feedback. It is still possible for individual endpoints to become
overloaded.

3. **One or more Nodes has insufficient information:** If any node does not have
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that is what is is for now.

@thockin
Copy link
Member

thockin commented Mar 31, 2021

/assign

@robscott robscott force-pushed the topology-aware-hints branch 2 times, most recently from 1f2f840 to 851610b Compare March 31, 2021 20:59
@robscott
Copy link
Member Author

Thanks for the reviews @andrewsykim @thockin and @reylejano! This is ready for another round of review when you have time.

Copy link
Member

@thockin thockin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Mar 31, 2021
@k8s-ci-robot
Copy link
Contributor

LGTM label has been added.

Git tree hash: 82c662891624c9cc186ea3775704212ecfd8028c

@reylejano
Copy link
Member

/approve

@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: reylejano

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Mar 31, 2021
@k8s-ci-robot k8s-ci-robot merged commit bfa3449 into kubernetes:dev-1.21 Mar 31, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. language/en Issues or PRs related to English language lgtm "Looks good to me", indicates that a PR is ready to be merged. sig/docs Categorizes an issue or PR as relevant to SIG Docs. sig/network Categorizes an issue or PR as relevant to SIG Network. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.