Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update Topology Manager documentation to include the scope feature #24781

Merged
merged 1 commit into from
Nov 24, 2020

Conversation

k-wiatrzyk
Copy link
Contributor

@k-wiatrzyk k-wiatrzyk commented Oct 29, 2020

@k8s-ci-robot k8s-ci-robot added do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. labels Oct 29, 2020
@k8sio-netlify-preview-bot
Copy link
Collaborator

k8sio-netlify-preview-bot commented Oct 29, 2020

Deploy preview for kubernetes-io-vnext-staging processing.

Building with commit 4255393

https://app.netlify.com/sites/kubernetes-io-vnext-staging/deploys/5fbd06620af54a0008a770e9

@k8s-ci-robot
Copy link
Contributor

Welcome @k-wiatrzyk!

It looks like this is your first PR to kubernetes/website 🎉. Please refer to our pull request process documentation to help your PR have a smooth ride to approval.

You will be prompted by a bot to use commands during the review process. Do not be afraid to follow the prompts! It is okay to experiment. Here is the bot commands documentation.

You can also check if kubernetes/website has its own contribution guidelines.

You may want to refer to our testing guide if you run into trouble with your tests not passing.

If you are having difficulty getting your pull request seen, please follow the recommended escalation practices. Also, for tips and tricks in the contribution process you may want to read the Kubernetes contributor cheat sheet. We want to make sure your contribution gets all the attention it needs!

Thank you, and welcome to Kubernetes. 😃

@k8s-ci-robot k8s-ci-robot added the size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. label Oct 29, 2020
@k8s-ci-robot k8s-ci-robot added language/en Issues or PRs related to English language sig/docs Categorizes an issue or PR as relevant to SIG Docs. labels Oct 29, 2020
@annajung
Copy link
Contributor

/milestone 1.20
/sig node
/hold
pending merge of kubernetes/kubernetes#92967

@k8s-ci-robot k8s-ci-robot added sig/node Categorizes an issue or PR as relevant to SIG Node. do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. labels Oct 29, 2020
@k8s-ci-robot k8s-ci-robot added this to the 1.20 milestone Oct 29, 2020
@k8s-ci-robot k8s-ci-robot added size/S Denotes a PR that changes 10-29 lines, ignoring generated files. and removed size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. labels Oct 30, 2020
@annajung
Copy link
Contributor

based on the comment kubernetes/enhancements#693 (comment)

this PR will be on /hold until an exception has been filed/approved.

@annajung
Copy link
Contributor

annajung commented Nov 4, 2020

exception approved
hold for k/k PR

@annajung
Copy link
Contributor

annajung commented Nov 5, 2020

/assign

@annajung
Copy link
Contributor

Hi @k-wiatrzyk
Friendly reminder that docs ready for review deadline is coming up on Nov 23rd, right after Kubecon NA. We would appreciate it if you could provide some/all docs content for this feature early so that it can be reviewed by Sig Docs before Kubecon and other events happening in November.

If it's ready to be reviewed, could you also remove WIP from the title? Thank you!

@k-wiatrzyk
Copy link
Contributor Author

Hi @annajung
Thanks! And I understand, time is crucial. I will give a final update to the docs tomorrow. The code has been merged so we are on the right track.

@annajung
Copy link
Contributor

/hold cancel
k/k pr merged

@k8s-ci-robot k8s-ci-robot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Nov 12, 2020
@k8s-ci-robot k8s-ci-robot added size/M Denotes a PR that changes 30-99 lines, ignoring generated files. and removed size/S Denotes a PR that changes 10-29 lines, ignoring generated files. labels Nov 13, 2020
@k-wiatrzyk k-wiatrzyk changed the title [WIP] Update Topology Manager documentation to include the scope feature Update Topology Manager documentation to include the scope feature Nov 13, 2020
Copy link
Contributor

@sftim sftim left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @k-wiatrzyk

LGTM!

Here's some optional feedback. What I mean is, I'd be happy to have this feedback accepted, but it's also OK to merge without it.

These changes need technical review; can you recommend someone who can provide that?

Comment on lines 63 to 65
### container scope

This scope was available before any other scope was implemented, and it remains as the default setting in the kubelet.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
### container scope
This scope was available before any other scope was implemented, and it remains as the default setting in the kubelet.
### container scope
The `container` scope is used by default.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 to suggested change

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

Comment on lines +71 to +90
### pod scope

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
### pod scope
### pod scope
To select the `pod` scope, start the kubelet with the command line option `--topology-manager-scope=pod`.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

* all containers can be and are allocated to a single NUMA node;
* all containers can be and are allocated to a shared set of NUMA nodes.

The total amount of particular resource demanded for the entire pod is calculated according to [effective requests/limits](/docs/concepts/workloads/pods/init-containers/#resources) formula, and thus, this total value is equal to the maximum of:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure about the link to the init container concept - topology scope is relevant even if you never use init containers. Instead, I recommend duplicating the text here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To be honest, I'm not sure if duplicating here is needed. The init container link also has other details (i.e. regarding scheduling).

Below the selected part there are two bullet points that describe the formula shortly. Init containers are taken into consideration if they exist in the pod. If not, only the maximum of app containers is considered.

...this total value is equal to the maximum of:
* the sum of all app container requests,
* the maximum of init container requests,

* the maximum of init container requests,
for a resource.

The scope in tandem with `single-numa-node` Topology Manager Policy is specifically valuable for high-performance applications. By combining both options, we are able to place all containers in a pod on a single NUMA node; hence, the inter-NUMA communication overhead can be eliminated.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
The scope in tandem with `single-numa-node` Topology Manager Policy is specifically valuable for high-performance applications. By combining both options, we are able to place all containers in a pod on a single NUMA node; hence, the inter-NUMA communication overhead can be eliminated.
Using the `pod` scope in tandem with `single-numa-node` Topology Manager policy is specifically valuable for workloads that are latency sensitive or for high-throughput applications that perform IPC. By combining both options, you are able to place all containers in a pod onto a single NUMA node; hence, the inter-NUMA communication overhead can be eliminated for that pod.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(this is a bunch of suggestions in one - some might not be technically accurate)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done


Within this scope, the Topology Manager performs a number of sequential resource alignments, i.e., for each container (in a pod) a separate alignment is computed. In other words, there is no notion of grouping the containers to a specific set of NUMA nodes, for this particular scope. In effect, the Topology Manager performs an arbitrary alignment of individual containers to NUMA nodes.

The notion of grouping the containers was endorsed and implemented on purpose in the following scope, i.e. the Pod Scope.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: s/i.e./for example/
nit: What do you think about removing the capitalization of:
... for example, the pod scope.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, capitalization was inconsistent with the rest. Done

@kbhawkey
Copy link
Contributor

Hi @k-wiatrzyk .
There are a few things to clean up, otherwise the docs look good. LGTM.

@irvifa
Copy link
Member

irvifa commented Nov 19, 2020

/label tide/merge-method-squash

@k8s-ci-robot k8s-ci-robot added the tide/merge-method-squash Denotes a PR that should be squashed by tide when it merges. label Nov 19, 2020
@sftim
Copy link
Contributor

sftim commented Nov 19, 2020

I only see 1 commit
/remove-label tide/merge-method-squash

@k8s-ci-robot k8s-ci-robot removed the tide/merge-method-squash Denotes a PR that should be squashed by tide when it merges. label Nov 19, 2020
@k-wiatrzyk
Copy link
Contributor Author

@kbhawkey @sftim
Thanks for the reviews and LGTMs.

Pinged @klueska for a tech review

Comment on lines +54 to +80
### Topology Manager Scopes

The Topology Manager can deal with the alignment of resources in a couple of distinct scopes:

* `container` (default)
* `pod`

Either option can be selected at a time of the kubelet startup, with `--topology-manager-scope` flag.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would probably combine the Scope and Policy sections into 1 (i.e. Topology Manager Scopes and Policies). That way you can have a quick intro into what both of them are and how they interact before diving into the details of them.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, I will prepare such an update today.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think after this is addressed then this is good to go(?)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I hope so. I won't change any content here, only rearrange stuff.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@irvifa @klueska
I grouped scope and policies in the common section (## Topology Manager Scopes and Policies) and added a short description for it.

@k-wiatrzyk k-wiatrzyk force-pushed the dev-1.20-tm-scope branch 3 times, most recently from 3b54845 to de37944 Compare November 23, 2020 16:00
@annajung
Copy link
Contributor

Hi @klueska @kubernetes/sig-node-pr-reviews can you please provide tech review / lgtm?

@@ -51,6 +51,52 @@ The hint is then stored in the Topology Manager for use by the *Hint Providers*

Support for the Topology Manager requires `TopologyManager` [feature gate](/docs/reference/command-line-tools-reference/feature-gates/) to be enabled. It is enabled by default starting with Kubernetes 1.18.

## Topology Manager Scopes and Policies

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you move the intro text from the policy section to here and then reword what you currently have here to the following:

The Topology Manager currently:
 - Aligns Pods of all QoS classes.
 - Aligns the requested resources that Hint Provider provides topology hints for.

If these conditions are met, the Topology Manager will align the requested resources.

In order to customise how this alignment is carried out, the Topology Manager provides two distinct knobs: 'scope` and `policy`.

The `scope` defines the granularity at which you would like resource alignment to be performed (e.g. at the `pod` or `container` level). And the `policy` defines the the actual strategy used to carry out the alignment (e.g. `best-effort`, `restricted`, `single-numa-node`, etc.).

Details on the various `scopes` and `policies` available today can be found below. 

{{< note >}}
To align CPU resources with other requested resources in a Pod Spec, the CPU Manager should be enabled and proper CPU Manager policy should be configured on a Node. See [control CPU Management Policies](/docs/tasks/administer-cluster/cpu-management-policies/).
{{< /note >}}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, applying now.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done. Thanks!

@k-wiatrzyk k-wiatrzyk force-pushed the dev-1.20-tm-scope branch 2 times, most recently from 064d178 to eaa0810 Compare November 24, 2020 13:07
@klueska
Copy link
Contributor

klueska commented Nov 24, 2020

/approve
/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Nov 24, 2020
@k8s-ci-robot
Copy link
Contributor

LGTM label has been added.

Git tree hash: d2fef34dcb088b0cee486253b9857e3d3a441d1b

@annajung
Copy link
Contributor

/approve

@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: annajung, klueska

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Nov 24, 2020
@k8s-ci-robot k8s-ci-robot merged commit 6fa3e91 into kubernetes:dev-1.20 Nov 24, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. language/en Issues or PRs related to English language lgtm "Looks good to me", indicates that a PR is ready to be merged. sig/docs Categorizes an issue or PR as relevant to SIG Docs. sig/node Categorizes an issue or PR as relevant to SIG Node. size/M Denotes a PR that changes 30-99 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants