Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

libvirt: remove default cputune shares value #56

Merged
merged 1 commit into from
Nov 3, 2023

Conversation

jovial
Copy link

@jovial jovial commented Nov 3, 2023

Previously, the libvirt driver defaulted to 1024 * (# of CPUs) for the
value of domain/cputune/shares in the libvirt XML. This value is then
passed directly by libvirt to the cgroups API. Cgroups v2 imposes a
maximum value of 10000 that can be passed in. This makes Nova
unable to launch instances with more than 9 CPUs on hosts that run
cgroups v2, like Ubuntu Jammy or RHEL 9.

Fix this by just removing the default entirely. Because there is no
longer a guarantee that domain/cputune will contain at least a shares
element, we can stop always generating the former, and only generate
it if it will actually contain something.

We can also make operators's lives easier by leveraging the fact that
we update the XML during live migration, so this patch also adds a
method to remove the shares value from the live migration XML if one
was not set as the quota:cpu_shares flavor extra spec.

For operators that have set this extra spec to something greater
than 10000, their flavors will have to get updates, and their
instances resized.

Partial-bug: 1978489
Change-Id: I49d757f5f261b3562ada27e6cf57284f615ca395
(cherry picked from commit f77a9fe)
(cherry picked from commit 888e837bb71464cd1c2179964ac3e853ac18db52)

@jovial jovial requested a review from a team as a code owner November 3, 2023 13:05
Previously, the libvirt driver defaulted to 1024 * (# of CPUs) for the
value of domain/cputune/shares in the libvirt XML. This value is then
passed directly by libvirt to the cgroups API. Cgroups v2 imposes a
maximum value of 10000 that can be passed in. This makes Nova
unable to launch instances with more than 9 CPUs on hosts that run
cgroups v2, like Ubuntu Jammy or RHEL 9.

Fix this by just removing the default entirely. Because there is no
longer a guarantee that domain/cputune will contain at least a shares
element, we can stop always generating the former, and only generate
it if it will actually contain something.

We can also make operators's lives easier by leveraging the fact that
we update the XML during live migration, so this patch also adds a
method to remove the shares value from the live migration XML if one
was not set as the quota:cpu_shares flavor extra spec.

For operators that *have* set this extra spec to something greater
than 10000, their flavors will have to get updates, and their
instances resized.

Partial-bug: 1978489
Change-Id: I49d757f5f261b3562ada27e6cf57284f615ca395
(cherry picked from commit f77a9fe)
(cherry picked from commit 888e837bb71464cd1c2179964ac3e853ac18db52)
@jovial jovial force-pushed the bugfix/yoga-live-migrate-rocky-9 branch from 1ef0270 to fa3c95b Compare November 3, 2023 14:39
@jovial jovial closed this Nov 3, 2023
@jovial jovial reopened this Nov 3, 2023
@jovial jovial enabled auto-merge November 3, 2023 15:00
@jovial jovial merged commit 22d06e3 into stackhpc/yoga Nov 3, 2023
6 checks passed
@jovial jovial deleted the bugfix/yoga-live-migrate-rocky-9 branch November 3, 2023 15:20
jovial added a commit to stackhpc/stackhpc-kayobe-config that referenced this pull request Nov 3, 2023
jovial added a commit to stackhpc/stackhpc-kayobe-config that referenced this pull request Nov 6, 2023
* Build nova from StackHPC fork

We need to include some fixes that haven't landed upstream yet.

* Bump nova tag

Includes:
- stackhpc/nova#55
- stackhpc/nova#56

* Add release notes
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants