MCO-708: Extract and merge kernel arguments from /proc/cmdline #3856
@ori-amizur: This pull request references MCO-708 which is a valid jira issue. In response to this:
/assign @sinnykumari
/uncc @djoshy
/retest
This logic won't be reliable and can create undesired behavior in certain cases. For example, suppose someone applied the kargs hugepagesz=1G hugepages=4 hugepagesz=2M hugepages=6 through the MCO (very common through PAO/NTO). For such kargs, ordering matters: in this example we are requesting 4 hugepages of size 1G and 6 hugepages of size 2M. Now suppose /proc/cmdline contains hugepagesz=1G hugepages=6. With the proposed logic, the MCO will consider that two of the kargs are already applied, skip them, and apply only the remaining ones.
For context, see past bug https://bugzilla.redhat.com/show_bug.cgi?id=1866546#c10 .
From past experience, handling kargs reliably is tricky, and we want to stick with what the user has provided, respecting the order.
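To make the ordering concern concrete, here is a minimal sketch of order-insensitive matching applied to the hugepages example. The function naiveUnapplied is a hypothetical stand-in for that style of matching, not the code in this PR.

```go
package main

import "fmt"

// naiveUnapplied is a hypothetical helper: it returns the requested kargs that
// do not appear anywhere in the booted kargs, ignoring order and position.
func naiveUnapplied(requested, booted []string) []string {
	seen := make(map[string]bool, len(booted))
	for _, b := range booted {
		seen[b] = true
	}
	var missing []string
	for _, r := range requested {
		if !seen[r] {
			missing = append(missing, r)
		}
	}
	return missing
}

func main() {
	requested := []string{"hugepagesz=1G", "hugepages=4", "hugepagesz=2M", "hugepages=6"}
	booted := []string{"hugepagesz=1G", "hugepages=6"}

	// Prints [hugepages=4 hugepagesz=2M]: the two kargs treated as "already
	// applied" came from different size/count pairs, so the pairing is lost.
	fmt.Println(naiveUnapplied(requested, booted))
}
```

Appending only the two "missing" kargs to the booted command line would no longer express "4 pages of 1G and 6 pages of 2M", which is the failure mode described above.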
What I am proposing, instead of merging un-applied kargs into oldConfig, is an extra comparison in compareMachineConfig() for the case where the diff is only in kargs: join all kargs supplied in newConfig with spaces and check whether that complete merged string is present in /proc/cmdline. If it is, compareMachineConfig would return true. If not, something didn't match, and we fall back to performing the update with a reboot so that the MCO applies the MCO-managed kargs present in newConfig.
After an offline conversation with Ori, I realized that on firstboot we always perform the update from a nil MC: err = dn.update(nil, &mc, false). So any changes made to oldConfig are mainly used for the comparison in compareMachineConfig(), and it should be fine.
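The comparison proposed in this comment could be sketched as follows. The name kargsPresentInOrder is hypothetical, and matching on whole whitespace-separated tokens (rather than a raw substring search) is an assumption made here so that a karg cannot match part of a longer one.

```go
package main

import (
	"fmt"
	"strings"
)

// kargsPresentInOrder reports whether the requested kargs appear in cmdline
// as one contiguous run of tokens in exactly the requested order.
// Hypothetical helper; not part of the MCO code base.
func kargsPresentInOrder(requested []string, cmdline string) bool {
	// A single requested entry may itself hold several space-separated kargs.
	want := strings.Fields(strings.Join(requested, " "))
	have := strings.Fields(cmdline)
	if len(want) == 0 {
		return true
	}
	for start := 0; start+len(want) <= len(have); start++ {
		match := true
		for i, w := range want {
			if have[start+i] != w {
				match = false
				break
			}
		}
		if match {
			return true
		}
	}
	return false
}

func main() {
	requested := []string{"hugepagesz=1G", "hugepages=4", "hugepagesz=2M", "hugepages=6"}

	// All four kargs are present, in order: reboot could be skipped.
	fmt.Println(kargsPresentInOrder(requested,
		"BOOT_IMAGE=/vmlinuz hugepagesz=1G hugepages=4 hugepagesz=2M hugepages=6 rw")) // true

	// Only a partial, re-paired subset is present: fall back to update with reboot.
	fmt.Println(kargsPresentInOrder(requested,
		"BOOT_IMAGE=/vmlinuz hugepagesz=1G hugepages=6 rw")) // false
}
```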
pkg/daemon/update.go
Outdated
@@ -122,6 +122,24 @@ func (dn *Daemon) performPostConfigChangeAction(postConfigChangeActions []string
	return dn.triggerUpdateWithMachineConfig(state.currentConfig, state.desiredConfig, true)
}

func mergeRunningKargs(config *mcfgv1.MachineConfig, requestedKargs []string) error {
A comment explaining what we are doing here and why would be helpful.
It would also be good to have a unit test for this.
Force-pushed from 7fa9f39 to 279be2d.
/retest
pkg/daemon/update.go
Outdated
}

func setRunningKargs(config *mcfgv1.MachineConfig, requestedKargs []string) error {
	b, err := os.ReadFile("/proc/cmdline")
There's very similar code in validateKernelArguments, which parses rpm-ostree kargs (which is basically the same thing as /proc/cmdline in practice).
But we also already have a parseKernelArguments function here.
(BTW, I do think longer term we need to drop this kernel argument handling out of the MCO; I think containers/bootc#22 (comment) may be the best path to do it)
parseKernelArguments parses the kernel arguments from the machine config, not from the cmdline. It could, however, be used to split the cmdline if the cmdline is passed as the first and only element of the provided slice.
I am not sure how validateKernelArguments can help with comparing the existing kernel arguments against the desired ones, since it validates against the kargs from rpm-ostree, not from the cmdline.
In addition, the function doesn't preserve the order, as requested by @sinnykumari.
It seems that parseKernelArguments attempts to parse each of the kernel arguments provided in the machine config. On the other hand, the KernelArguments that were not parsed in the machine configs are compared in the function compareMachineConfig.
@cgwalters what are you suggesting here?
Basically I think we shouldn't have one part of the MCO code reading /proc/cmdline and another running rpm-ostree kargs. They're not the same thing, but...what we're doing is aiming to be close, and the quotation bug below shows why it's good to only have one path.
pkg/daemon/update.go
Outdated
	config.Spec.KernelArguments = nil
	for _, split := range splits {
		for _, reqKarg := range requestedKargs {
			if strings.ReplaceAll(reqKarg, "\"", "") == strings.ReplaceAll(split, "\"", "") {
I don't understand this line, what's the ReplaceAll doing here?
I would say we should try to align with validateKernelArguments.
In /proc/cmdline the arguments appear without quotes even if the quotes were provided to ostree as kernel arguments. To align them, the quotes are removed here.
> even if the quotes were provided to ostree as kernel arguments
What quotes given to ostree? The ostree side just passes through these things. Where exactly are the quotes coming from?
In 4.14, if we take the kernelArguments in the file /etc/ignition-machine-config-encapsulated.json, we get:
[
"systemd.unified_cgroup_hierarchy=1",
"cgroup_no_v1=\"all\"",
"psi=1"
]
But after they are applied, in the cmdline file, it looks like (partial):
systemd.unified_cgroup_hierarchy=1 cgroup_no_v1=all psi=1
So the second one appears without quotes in the cmdline file.
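The quote-insensitive comparison used in the diff above can be isolated into a small sketch. The helper name sameKargIgnoringQuotes is hypothetical; the body mirrors the strings.ReplaceAll comparison shown in the reviewed line.

```go
package main

import (
	"fmt"
	"strings"
)

// sameKargIgnoringQuotes compares two kargs after dropping every double
// quote, so cgroup_no_v1="all" (machine config) equals cgroup_no_v1=all
// (as seen in /proc/cmdline). Hypothetical helper for illustration.
func sameKargIgnoringQuotes(a, b string) bool {
	return strings.ReplaceAll(a, `"`, "") == strings.ReplaceAll(b, `"`, "")
}

func main() {
	fmt.Println(sameKargIgnoringQuotes(`cgroup_no_v1="all"`, "cgroup_no_v1=all")) // true
	fmt.Println(sameKargIgnoringQuotes("psi=1", "psi=0"))                         // false
}
```

Note that this drops quotes anywhere in the argument, which is broader than removing only the quotes that wrap a value.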
OK, let's be clear then: we're talking about cgroup_no_v1="all" as the weird outlier here. Looks like that comes from this code:
kernelArgsv2 = []string{"systemd.unified_cgroup_hierarchy=1", "cgroup_no_v1=\"all\"", "psi=1"}
Which...what the heck is that? (a bit of time passes) OK, I found it in https://docs.kernel.org/admin-guide/cgroup-v2.html
But I think whoever wrote that code was likely confused...it certainly seems to me like the Linux kernel is intending to parse cgroup_no_v1=all, not cgroup_no_v1="all". (Though I bet I see why the code author made that mistake, because the kernel docs seem to include the quotes in one case but not another.)
Compounding the fun here, also AFAICS there are zero warnings for unknown/unhandled parameters here. So specifying cgroup_no_v1="all" just silently does nothing (edit: Nope, see below).
Now the next large problem here is that something in the stack is "eating" these quotes.
Playing with this using e.g.:
$ rpm-ostree kargs --append='cgroup_no_v1="all"'
$ reboot
...
$ cat /proc/cmdline
... cgroup_no_v1=all
However, notice the bootloader entry does have those quotes:
$ grep options /boot/loader/entries/ostree-2-rhcos.conf
options ignition.platform.id=qemu ... cgroup_no_v1="all"
$
If I was a betting person here (and I am) then my money would be on grub doing the wrong thing.
Wait, wait no perhaps not: I think it's pretty clear, check out this kernel code:
https://github.com/torvalds/linux/blob/7ba2090ca64ea1aa435744884124387db1fac70f/lib/cmdline.c#L224
It's clearly implementing a minimal subset of "shell like" parsing...but can't escape "
Oh man, what an ad-hoc ill-specified mess!
Offhand I think this logic dates back to the Linux kernel pre-history torvalds/linux@1da177e
So here's what I'd say:
- Factor out a helper function named e.g. parseKernelArgumentsLikeLinux that performs the same steps as the Linux kernel in "parsing" command line arguments (and I'm now realizing that we probably need to do this on the ostree side too)
- Cross-reference the Linux kernel code
- Consider changing the kubelet controller code to drop the unnecessary quotes, because we've just painted ourselves into an unfortunate corner case with this
Alternatively - notice that rpm-ostree kargs outputs what it wrote into the bootloader config (i.e. it includes quotes). So instead of parsing /proc/cmdline, we could parse that (which again is what the other MCO code is doing)
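As a rough illustration of the first suggestion in this comment, a helper along these lines might split on whitespace outside double quotes and then drop the quotes that wrap an argument or its value. Only the name parseKernelArgumentsLikeLinux comes from the comment; the body is an approximation loosely modeled on the linked kernel parsing code, not a faithful port, and the helpers stripOuterQuotes and normalizeArg are hypothetical.

```go
package main

import (
	"fmt"
	"strings"
	"unicode"
)

// stripOuterQuotes removes a leading double quote and, if present, a trailing
// one. Strings that do not start with a quote are returned unchanged.
func stripOuterQuotes(s string) string {
	if !strings.HasPrefix(s, `"`) {
		return s
	}
	return strings.TrimSuffix(strings.TrimPrefix(s, `"`), `"`)
}

// normalizeArg drops quotes wrapping the whole argument or wrapping its value.
func normalizeArg(arg string) string {
	arg = stripOuterQuotes(arg)
	eq := strings.Index(arg, "=")
	if eq < 0 {
		return arg
	}
	return arg[:eq+1] + stripOuterQuotes(arg[eq+1:])
}

// parseKernelArgumentsLikeLinux splits cmdline on whitespace that is not
// inside double quotes and normalizes each resulting argument.
// There is no escape mechanism for a literal double quote.
func parseKernelArgumentsLikeLinux(cmdline string) []string {
	var args []string
	var cur strings.Builder
	inQuote := false
	flush := func() {
		if cur.Len() > 0 {
			args = append(args, normalizeArg(cur.String()))
			cur.Reset()
		}
	}
	for _, r := range cmdline {
		if unicode.IsSpace(r) && !inQuote {
			flush()
			continue
		}
		if r == '"' {
			inQuote = !inQuote
		}
		cur.WriteRune(r)
	}
	flush()
	return args
}

func main() {
	line := `systemd.unified_cgroup_hierarchy=1 cgroup_no_v1="all" psi=1`
	for _, arg := range parseKernelArgumentsLikeLinux(line) {
		fmt.Println(arg)
	}
}
```

With a helper of this kind, both the machine config value cgroup_no_v1="all" and the booted value cgroup_no_v1=all would normalize to the same string before comparison.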
Using rpm-ostree kargs sounds like a good solution.
Force-pushed from dc7301f to 67d9448.
/retest
@@ -122,6 +122,28 @@ func (dn *Daemon) performPostConfigChangeAction(postConfigChangeActions []string
	return dn.triggerUpdateWithMachineConfig(state.currentConfig, state.desiredConfig, true)
}

func setRunningKargsWithCmdline(config *mcfgv1.MachineConfig, requestedKargs []string, cmdline []byte) error {
	splits := splitKernelArguments(strings.TrimSpace(string(cmdline)))
	config.Spec.KernelArguments = nil
this shouldn't be necessary?
Wanted to make sure that calling twice wouldn't cause it to behave badly.
LGTM. Thanks for adding unit test and comment.
In firstboot MCO checks if reboot can be skipped. In order for reboot to be skipped, the kernel arguments of the current (booted) system and the expected system need to match. Currently, in firstboot the list of the current kargs is assumed to be empty. To reflect the actual list of arguments the system was booted with, this change extracts the set of booted kargs from /proc/cmdline to be used for comparison. Only kargs that appear both in the requested kargs and /proc/cmdline are used for comparison.
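The filtering described in the last sentence of this commit message can be sketched as below. The function runningKargs is hypothetical and works on plain strings rather than a MachineConfig; exact string equality and whitespace splitting are simplifying assumptions. The loop order (booted arguments outside, requested arguments inside) follows the diff hunks shown earlier in the review.

```go
package main

import (
	"fmt"
	"strings"
)

// runningKargs returns the booted kargs that were also requested, in the
// order they appear on the booted command line. Hypothetical sketch.
func runningKargs(requested []string, cmdline string) []string {
	var running []string
	for _, booted := range strings.Fields(cmdline) {
		for _, req := range requested {
			if booted == req {
				running = append(running, req)
				break
			}
		}
	}
	return running
}

func main() {
	requested := []string{"systemd.unified_cgroup_hierarchy=1", "psi=1", "nosmt"}
	cmdline := "BOOT_IMAGE=/vmlinuz psi=1 systemd.unified_cgroup_hierarchy=1 rw"

	// Prints [psi=1 systemd.unified_cgroup_hierarchy=1]: nosmt was requested
	// but not booted, and the unrelated booted arguments are ignored.
	fmt.Println(runningKargs(requested, cmdline))
}
```

The result is what would stand in for the "current" kargs during comparison; because nosmt is missing from it, the configs would still differ and the update would not be skipped.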
Force-pushed from c8e1191 to f2877d6.
@ori-amizur: The following test failed, say
Thanks!
[APPROVALNOTIFIER] This PR is APPROVED. This pull-request has been approved by: cgwalters, ori-amizur, sinnykumari
}

func setRunningKargs(config *mcfgv1.MachineConfig, requestedKargs []string) error {
	rpmostreeKargsBytes, err := runGetOut("rpm-ostree", "kargs")
This is a very neat and consistent recommendation!
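For readers outside the MCO code base, reading the arguments through rpm-ostree kargs could look roughly like this. The runner type, bootedKargs, and execRunner are hypothetical names introduced for illustration; the MCO's own runGetOut helper is not reproduced here, and the plain whitespace split assumes no karg value contains a quoted space.

```go
package main

import (
	"fmt"
	"os/exec"
	"strings"
)

// runner abstracts command execution so the parsing can be exercised on
// machines where rpm-ostree is not installed. Hypothetical type.
type runner func(name string, args ...string) ([]byte, error)

// bootedKargs runs "rpm-ostree kargs" through run and splits its output on
// whitespace. Quotes written to the bootloader config are kept as-is.
func bootedKargs(run runner) ([]string, error) {
	out, err := run("rpm-ostree", "kargs")
	if err != nil {
		return nil, fmt.Errorf("running rpm-ostree kargs: %w", err)
	}
	return strings.Fields(string(out)), nil
}

// execRunner executes the command on the local host.
func execRunner(name string, args ...string) ([]byte, error) {
	return exec.Command(name, args...).Output()
}

func main() {
	kargs, err := bootedKargs(execRunner)
	if err != nil {
		fmt.Println("could not read kargs:", err)
		return
	}
	fmt.Println(kargs)
}
```

Injecting the runner keeps the command invocation in one place, which matches the reviewers' preference for a single path to the kernel arguments.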
- What I did
Extracted the kernel arguments from /proc/cmdline to be used for comparison with the expected kernel arguments
- How to verify it
- Description for the changelog
Extract and merge kernel arguments from /proc/cmdline