Add --cpus context #267

Merged Jun 29, 2022 (13 commits)
51 changes: 35 additions & 16 deletions accepted/2019/support-for-memory-limits.md
@@ -2,14 +2,16 @@

**Owner** [Rich Lander](https://github.com/richlander)

.NET Core has support for [control groups](https://en.wikipedia.org/wiki/Cgroups) (cgroups), which is the basis of [Docker limits](https://docs.docker.com/config/containers/resource_constraints/). We found that the algorithm we use to honor cgroups works well for larger memory size limits (for example, >500MB), but that it is not possible to configure a .NET Core application to run indefinitely at lower memory levels. This document proposes an approach to support low memory size limits, <100MB.
.NET Core has support for [control groups](https://en.wikipedia.org/wiki/Cgroups) (cgroups), which is the basis of [Docker limits](https://docs.docker.com/config/containers/resource_constraints/). We found that the algorithm we use to honor cgroups works well for larger memory size limits (for example, >500MB), but that it is not possible to configure a .NET Core application to run indefinitely at lower memory levels. This document proposes an approach to support low memory size limits, for example <100MB.

Note: Windows has a concept similar to cgroups called [job objects](https://docs.microsoft.com/windows/desktop/ProcThread/job-objects). .NET Core should honor job objects in the same way as cgroups, as appropriate. This document will focus on cgroups throughout.
Note: Windows has a concept similar to cgroups called [job objects](https://docs.microsoft.com/windows/desktop/ProcThread/job-objects). .NET 6+ correctly honors job objects in the same way as cgroups. This document will focus on cgroups throughout.

It is critical to provide effective and well defined experiences when .NET Core applications are run within memory-limited cgroups. An application should run indefinitely given a sensible configuration for that application. We considered relying on orchestators to manage failing applications (that can no longer satisfy the configuration of a cgroup), but believe this to be antithetical as a primary solution for building reliable systems. We also expect that there are scenarios where orchestrators will be unavailable or primitive or hardware will be constrainted, and therefore not tolerant of frequently failing applications. As a result, we need a better tuned algorithm for cgroup support to the end of running reliable software within constrained environments.
It is critical to provide effective and well-defined experiences when .NET Core applications are run within memory-limited cgroups. An application should run indefinitely given a sensible configuration for that application. We considered relying on orchestrators to manage failing applications (that can no longer satisfy the configuration of a cgroup), but believe this to be antithetical as a primary solution for building reliable systems. We also expect that there are scenarios where orchestrators will be unavailable or primitive or hardware will be constrained, and therefore not tolerant of frequently failing applications. As a result, we need a better tuned algorithm for cgroup support to the end of running reliable software within constrained environments.

See [implementing hard limit for GC heap dotnet/coreclr #22180](https://github.com/dotnet/coreclr/pull/22180).

See [Validate container improvements with .NET 6](https://github.com/dotnet/runtime/issues/53149).

## GC Heap Hard Limit

The following configuration knobs will be exposed to enable developers to configure their applications:
@@ -29,29 +31,46 @@ The GC will more aggressive perform GCs as the GC heap grows closer to the `GCHe

The GC will throw an `OutOfMemoryException` for allocations that would cause the committed heap size to exceed the `GCHeapHardLimit` memory size, even after a full compacting GC.

## GC Heap Heap Minimum Size
## GC Heap Minimum Size

Using Server GC, there are multiple GC heaps created, up to one per core. This model doesn't scale well when a small memory limit is set on a machine with many cores.

The minimum _reserved_ segment size per heap: 16mb
The minimum _reserved_ segment size per heap: `16 MiB`

Example -- CPU unconstrained:

Example:
```bash
docker run --rm -m 256mb mcr.microsoft.com/dotnet/samples
```

* 48 core machine
* cgroup has a 200MB memory limit
* cgroup has a 256MiB memory limit
* cgroup has no CPU/core limit
* 160MB `GCHeapHardLimit`
* Server GC will create 10 GC heaps
* All 48 cores can be used by the application
* 192MiB `GCHeapHardLimit`
* Server GC will create 12 GC heaps, with 16 MiB reserved memory
* All 48 cores can be used by the application, per [container policy](https://docs.docker.com/config/containers/resource_constraints/#cpu)

Example:
`heaps = (256 * .75) / 16`
`heaps = 12`

Example -- CPU constrained:

```bash
docker run --rm -m 256mb --cpus 2 mcr.microsoft.com/dotnet/samples
```

* 48 core machine
* cgroup has a 200MB memory limit
* cgroup has 4 CPU/core limit
* 160MB `GCHeapHardLimit`
* Server GC will create 4 GC heaps
* Only 4 cores can be used by the application
* cgroup has a 256MiB memory limit
* cgroup has 2 CPU/core limit
* 192MiB `GCHeapHardLimit`
* Server GC will create 2 GC heaps, with 16 MiB reserved memory
* Only 2 cores can be used by the application

There are other scenarios, like using `--cpuset-cpus` (CPU affinity) but they all follow from these two examples.
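
The heap counts in the two 256MiB examples can be reproduced with a short shell sketch. It assumes, as those examples do, that `GCHeapHardLimit` is 75% of the cgroup memory limit and that each heap reserves at least 16 MiB. The function name `server_gc_heaps` and the floor of one heap are illustrative assumptions, not part of the runtime, and the GC's real logic may differ in detail.

```bash
# Sketch only: mirrors the arithmetic in the examples, not the GC's actual code.
# Usage: server_gc_heaps MEMORY_LIMIT_MIB CPU_LIMIT
server_gc_heaps() {
  hard_limit_mib=$(( $1 * 75 / 100 ))           # 75% of the cgroup limit, per the examples
  heaps=$(( hard_limit_mib / 16 ))              # one heap per 16 MiB reserved segment
  if [ "$heaps" -lt 1 ]; then heaps=1; fi       # assumption: always at least one heap
  if [ "$2" -lt "$heaps" ]; then heaps=$2; fi   # never more heaps than usable cores
  echo "$heaps"
}

server_gc_heaps 256 48   # CPU unconstrained: limited by memory
server_gc_heaps 256 2    # --cpus 2: limited by the CPU limit
```

The first call is bounded by memory (192 / 16 = 12 heaps) and the second by the CPU limit (2 heaps), matching the two examples.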

If [`DOTNET_PROCESSOR_COUNT`](https://github.com/dotnet/runtime/issues/48094) is set, the GC uses its value to determine the maximum number of heaps to create, even if that value differs from `--cpus`.
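
A minimal sketch of that precedence, assuming the environment variable simply replaces the `--cpus` value as the upper bound on heap count. `max_heap_processors` is a hypothetical helper for illustration, not a runtime API.

```bash
# Sketch only: which processor count bounds the number of GC heaps.
# Usage: max_heap_processors CPUS_LIMIT
max_heap_processors() {
  # Prefer DOTNET_PROCESSOR_COUNT when it is set and non-empty;
  # otherwise fall back to the --cpus value passed as $1.
  echo "${DOTNET_PROCESSOR_COUNT:-$1}"
}

( unset DOTNET_PROCESSOR_COUNT; max_heap_processors 2 )   # falls back to --cpus 2
( DOTNET_PROCESSOR_COUNT=4; max_heap_processors 2 )       # variable takes precedence
```

With a container, the variable could be supplied through Docker's `-e` flag, for example `docker run --rm -m 256mb --cpus 2 -e DOTNET_PROCESSOR_COUNT=4 mcr.microsoft.com/dotnet/samples`.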

Note: .NET Framework has the same behavior, but `COMPlus_RUNNING_IN_CONTAINER` must be set. The processor count is likewise affected by `COMPlus_PROCESSOR_COUNT`.

## Previous behavior
