From 11e9cfea2a3e76dbc6041acccb1f00e51a17556f Mon Sep 17 00:00:00 2001
From: Connor Doyle
Date: Wed, 6 Sep 2017 21:10:58 -0700
Subject: [PATCH 1/3] Noted limitation of alpha static cpumanager.

---
 docs/tasks/administer-cluster/cpu-management-policies.md | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/docs/tasks/administer-cluster/cpu-management-policies.md b/docs/tasks/administer-cluster/cpu-management-policies.md
index 3faf7de653e0d..d6eb932ed1f99 100644
--- a/docs/tasks/administer-cluster/cpu-management-policies.md
+++ b/docs/tasks/administer-cluster/cpu-management-policies.md
@@ -49,6 +49,10 @@ using the [cpuset cgroup controller](https://www.kernel.org/doc/Documentation/cg
 **Note:** System services such as the container runtime and the kubelet itself can continue to run on these exclusive CPUs.  The exclusivity only extends to other pods.
 {: .note}
 
+**Note:** The alpha version of this policy does not guarantee static
+exclusive allocations across Kubelet restarts.
+{: .note}
+
 This policy manages a shared pool of CPUs that initially contains all CPUs in the
 node minus any reservations by the kubelet `--kube-reserved` or
 `--system-reserved` options. CPUs reserved by these options are taken, in

From 63215910fe1f427f9466b08964a221e0ed2661a7 Mon Sep 17 00:00:00 2001
From: Connor Doyle
Date: Wed, 6 Sep 2017 20:57:48 -0700
Subject: [PATCH 2/3] Updated CPU manager docs to match implementation.

- Removed references to CPU pressure node condition and evictions.
- Added note about new --cpu-manager-reconcile-period flag.
- Added note about node allocatable requirements for static policy.
- Noted limitation of alpha static cpumanager.

---
 .../cpu-management-policies.md                | 54 ++++++++++---------
 1 file changed, 29 insertions(+), 25 deletions(-)

diff --git a/docs/tasks/administer-cluster/cpu-management-policies.md b/docs/tasks/administer-cluster/cpu-management-policies.md
index d6eb932ed1f99..0a9d08be71c96 100644
--- a/docs/tasks/administer-cluster/cpu-management-policies.md
+++ b/docs/tasks/administer-cluster/cpu-management-policies.md
@@ -15,7 +15,7 @@ directives.
 ## CPU Management Policies
 
 By default, the kubelet uses [CFS quota](https://en.wikipedia.org/wiki/Completely_Fair_Scheduler)
-to enforce pod CPU limits.  When the node runs many CPU bound pods,
+to enforce pod CPU limits.  When the node runs many CPU-bound pods,
 the workload can move to different CPU cores depending on
 whether the pod is throttled and which CPU cores are available at
 scheduling time.  Many workloads are not sensitive to this migration and thus
@@ -25,13 +25,25 @@ However, in workloads where CPU cache affinity and scheduling latency
 significantly affect workload performance, the kubelet allows alternative CPU
 management policies to determine some placement preferences on the node.
 
-Enable these management policies with the `--cpu-manager-policy` kubelet
-option.  There are two supported policies:
+### Configuration
 
-* `none`: the default, which represents the existing scheduling behavior
+The CPU Manager is introduced as an alpha feature in Kubernetes v1.8. It
+must be explicitly enabled in the kubelet feature gates:
+`--feature-gates=CPUManager=true`.
+
+The CPU Manager policy is set with the `--cpu-manager-policy` kubelet
+option. There are two supported policies:
+
+* `none`: the default, which represents the existing scheduling behavior.
 * `static`: allows pods with certain resource characteristics to be granted
   increased CPU affinity and exclusivity on the node.
 
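+For example, the static policy might be enabled by starting the kubelet with
+`--feature-gates=CPUManager=true --cpu-manager-policy=static`, together with a
+nonzero CPU reservation such as `--kube-reserved=cpu=1` (an illustrative
+value; see the note about CPU reservations below).
+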
+The CPU manager periodically writes resource updates through the CRI in
+order to reconcile in-memory CPU assignments with cgroupfs. The reconcile
+frequency is set through a new Kubelet configuration value
+`--cpu-manager-reconcile-period`. If not specified, it defaults to the same
+duration as `--node-status-update-frequency`.
+
 ### None policy
 
 The `none` policy explicitly enables the existing default CPU
@@ -54,7 +66,8 @@ exclusive allocations across Kubelet restarts.
 {: .note}
 
 This policy manages a shared pool of CPUs that initially contains all CPUs in the
-node minus any reservations by the kubelet `--kube-reserved` or
+node. The amount of exclusively allocatable CPUs is equal to the total
+number of CPUs in the node minus any CPU reservations by the kubelet `--kube-reserved` or
 `--system-reserved` options. CPUs reserved by these options are taken, in
 integer quantity, from the initial shared pool in ascending order by physical
 core ID.  This shared pool is the set of CPUs on which any containers in
@@ -63,26 +76,21 @@ cpu `requests` also run on CPUs in the shared pool. Only containers that are
 both part of a `Guaranteed` pod and have integer CPU `requests` are assigned
 exclusive CPUs.
 
-**Note:** When reserving CPU with `--kube-reserved` or `--system-reserved` options, it is advised to use *integer* CPU quantities.
+**Note:** The kubelet requires that a CPU reservation greater than zero be
+made using `--kube-reserved` and/or `--system-reserved` when the static
+policy is enabled. This is because a zero CPU reservation would allow the
+shared pool to become empty.
 {: .note}
 
 As `Guaranteed` pods whose containers fit the requirements for being statically
 assigned are scheduled to the node, CPUs are removed from the shared pool and
-placed in the cpuset for the container.  CFS quota is not used to bound
+placed in the cpuset for the container. CFS quota is not used to bound
 the CPU usage of these containers as their usage is bound by the scheduling
 domain itself. In other words, the number of CPUs in the container cpuset is equal to the integer
-CPU `limit` specified in the pod spec.  This static assignment increases CPU
-affinity and decreases context switches due to throttling for the CPU bound
+CPU `limit` specified in the pod spec. This static assignment increases CPU
+affinity and decreases context switches due to throttling for the CPU-bound
 workload.
 
-In the event that the shared pool is depleted the kubelet takes two actions:
-
-* Evict all pods that include a container that does not specify a `cpu`
-  quantity in `requests` as those pods now have no CPUs on which to run.
-* Set a `NodeCPUPressure` node condition to `true` in the node status. When
-  this condition is true, the scheduler will not assign any pod to the node
-  that has a container which lacks a `cpu` quantity in `requests`.
-
 Consider the containers in the following pod specs:
 
 ```yaml
@@ -93,8 +101,7 @@ spec:
 ```
 
 This pod runs in the `BestEffort` QoS class because no resource `requests` or
-`limits` are specified. It is evicted if shared pool is depleted. It runs
-in the shared pool.
+`limits` are specified. It runs in the shared pool.
 
 ```yaml
 spec:
@@ -109,9 +116,8 @@ spec:
 ```
 
 This pod runs in the `Burstable` QoS class because resource `requests` do not
-equal `limits` and the `cpu` quantity is not specified. It is
-evicted if shared pool is depleted. It runs in the shared
-pool.
-
+equal `limits` and the `cpu` quantity is not specified. It runs in the shared
+pool.
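+
+None of the pods shown so far are granted exclusive CPUs. As noted above, only
+containers that are both part of a `Guaranteed` pod and have integer CPU
+`requests` are assigned exclusive CPUs.
+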
```yaml spec: @@ -128,9 +134,7 @@ spec: ``` This pod runs in the `Burstable` QoS class because resource `requests` do not -equal `limits`. The non-zero `cpu` quantity in `requests` prevents the -shared pool from depleting. It runs in the shared pool. - +equal `limits`. It runs in the shared pool. ```yaml spec: From 9ff91c7c708da4a5b271b10408a743f2fdc83329 Mon Sep 17 00:00:00 2001 From: Connor Doyle Date: Mon, 11 Sep 2017 09:14:44 -0700 Subject: [PATCH 3/3] Move cpu-manager task link to rsc mgmt section. --- _data/tasks.yml | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/_data/tasks.yml b/_data/tasks.yml index 866279e34ca54..e2ccd4c90ce1c 100644 --- a/_data/tasks.yml +++ b/_data/tasks.yml @@ -124,6 +124,7 @@ toc: - docs/tasks/administer-cluster/quota-pod-namespace.md - docs/tasks/administer-cluster/quota-api-object.md - docs/tasks/administer-cluster/opaque-integer-resource-node.md + - docs/tasks/administer-cluster/cpu-management-policies.md - docs/tasks/administer-cluster/access-cluster-api.md - docs/tasks/administer-cluster/access-cluster-services.md - docs/tasks/administer-cluster/securing-a-cluster.md @@ -140,7 +141,6 @@ toc: - docs/tasks/administer-cluster/cpu-memory-limit.md - docs/tasks/administer-cluster/out-of-resource.md - docs/tasks/administer-cluster/reserve-compute-resources.md - - docs/tasks/administer-cluster/cpu-management-policies.md - docs/tasks/administer-cluster/guaranteed-scheduling-critical-addon-pods.md - docs/tasks/administer-cluster/declare-network-policy.md - title: Install Network Policy Provider
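
As a closing illustration of the static policy documented in these patches, here is a sketch of a pod spec whose container would be granted exclusive CPUs (the container name and image are illustrative, following the style of the examples in the updated page):

```yaml
spec:
  containers:
  - name: app        # illustrative container name
    image: nginx     # illustrative image
    resources:
      requests:
        memory: "200Mi"
        cpu: "2"
      limits:
        memory: "200Mi"
        cpu: "2"
```

This pod runs in the `Guaranteed` QoS class because `requests` equal `limits` and the `cpu` quantity is an integer. Its container is assigned two exclusive CPUs taken from the shared pool, and CFS quota is not used to bound its CPU usage, since the cpuset itself provides that bound.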