OCPBUGS-16921: daemon: Make binary writing idempotent #3825
@sinnykumari: This pull request references Jira Issue OCPBUGS-16921, which is valid. The bug has been moved to the POST state. 3 validation(s) were run on this bug
Requesting review from QA contact: The bug has been updated to refer to the pull request using the external bug tracker. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
/lgtm
@@ -466,7 +466,8 @@ func ReexecuteForTargetRoot(target string) error {
// Otherwise, we assume that there's no suffixing needed. Hopefully
// by RHEL10 the MCD will have fundamentally changed and we won't be doing the
// chroot() thing anymore.
-klog.Info("not chrooting for source=rhel-%s target=rhel-%s", sourceMajor, targetMajor)
+klog.Infof("not chrooting for source=rhel-%s target=rhel-%s", sourceMajor, targetMajor)
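For readers wondering why the switch from Info to Infof matters: klog documents Info as handling its arguments in the manner of fmt.Print, and Infof in the manner of fmt.Printf. The sketch below uses fmt.Sprint and fmt.Sprintf as stand-ins so it runs without klog; printStyle and formatStyle are hypothetical helper names, not part of the MCD code.

```go
package main

import "fmt"

// msg is the format string from the hunk above.
const msg = "not chrooting for source=rhel-%s target=rhel-%s"

// printStyle mimics Print-style logging: the operands are concatenated
// and the %s verbs are left in the output untouched.
func printStyle(format, source, target string) string {
	return fmt.Sprint(format, source, target)
}

// formatStyle mimics Printf-style logging: the %s verbs are substituted.
func formatStyle(format, source, target string) string {
	return fmt.Sprintf(format, source, target)
}

func main() {
	fmt.Println(printStyle(msg, "8", "9"))
	fmt.Println(formatStyle(msg, "8", "9"))
}
```

With Print-style handling the verbs survive verbatim and the two values are simply appended to the end of the message, which is why the original log line was unreadable.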
Actually sorry, hold on a second...I think I was wrong in our chat. We do need to chroot right? We just don't want to re-exec...
Something more like
diff --git a/pkg/daemon/daemon.go b/pkg/daemon/daemon.go
index 6a9f2079c..2e18bf5da 100644
--- a/pkg/daemon/daemon.go
+++ b/pkg/daemon/daemon.go
@@ -471,35 +471,38 @@ func ReexecuteForTargetRoot(target string) error {
} else {
klog.Info("assuming we can use container binary chroot() to host")
}
- sourceBinary := "/usr/bin/machine-config-daemon" + sourceBinarySuffix
- src, err := os.Open(sourceBinary)
- if err != nil {
- return fmt.Errorf("opening %s: %w", sourceBinary, err)
- }
- defer src.Close()
+ var targetBin string
+ if sourceBinarySuffix != "" {
+ sourceBinary := "/usr/bin/machine-config-daemon" + sourceBinarySuffix
+ src, err := os.Open(sourceBinary)
+ if err != nil {
+ return fmt.Errorf("opening %s: %w", sourceBinary, err)
+ }
+ defer src.Close()
- targetBinBase := "run/bin/machine-config-daemon"
- targetBin := filepath.Join(target, targetBinBase)
- targetBinDir := filepath.Dir(targetBin)
- if _, err := os.Stat(targetBinDir); err != nil {
- if err := os.Mkdir(targetBinDir, 0o755); err != nil {
- return fmt.Errorf("mkdir %s: %w", targetBinDir, err)
+ targetBinBase := "run/bin/machine-config-daemon"
+ targetBin = filepath.Join(target, targetBinBase)
+ targetBinDir := filepath.Dir(targetBin)
+ if _, err := os.Stat(targetBinDir); err != nil {
+ if err := os.Mkdir(targetBinDir, 0o755); err != nil {
+ return fmt.Errorf("mkdir %s: %w", targetBinDir, err)
+ }
}
- }
- f, err := os.Create(targetBin)
- if err != nil {
- return fmt.Errorf("writing %s: %w", targetBin, err)
- }
- if _, err := io.Copy(f, src); err != nil {
+ f, err := os.Create(targetBin)
+ if err != nil {
+ return fmt.Errorf("writing %s: %w", targetBin, err)
+ }
+ if _, err := io.Copy(f, src); err != nil {
+ f.Close()
+ return fmt.Errorf("writing %s: %w", targetBin, err)
+ }
+ if err := f.Chmod(0o755); err != nil {
+ return err
+ }
+ // Must close our writable fd
f.Close()
- return fmt.Errorf("writing %s: %w", targetBin, err)
}
- if err := f.Chmod(0o755); err != nil {
- return err
- }
- // Must close our writable fd
- f.Close()
if err := syscall.Chroot(target); err != nil {
return fmt.Errorf("failed to chroot to %s: %w", target, err)
@@ -509,6 +512,9 @@ func ReexecuteForTargetRoot(target string) error {
return fmt.Errorf("failed to change directory to /: %w", err)
}
+ if targetBin == "" {
+ return nil
+ }
// Now we will see the binary in the target root
targetBin = "/" + targetBinBase
// We have a "belt and suspenders" approach for detecting the case where
?
But there's still something weird going on here because I don't understand how we can be recursing here...IOW why are we getting that error of having the text file being busy?
You are right, we don't need this new binary-copy stuff, but we do need to chroot. This is what happens when doing things in a hurry :/
/lgtm cancel
Fixes OCPBUGS-16921
/retest-required
not sure why tests are in pending state
For the (async, public) record we had a live chat about this and we figured out that what is likely happening here is that in this scenario the MCD is running as a pod (from the binary in So then your question is...wait, how can it possibly work to have two instances of the MCD running? Well, I don't know...this goes to #1592
Looks good to go in
overall should be good, 2 minor comments inline
if _, err := io.Copy(f, src); err != nil {
targetBinDir := filepath.Dir(targetBin)
if _, err := os.Stat(targetBinDir); err != nil {
if err := os.Mkdir(targetBinDir, 0o755); err != nil {
Are we making the assumption that the error of the above is errnotfound? Might be helpful to make it explicit
Hmm, I am not sure an additional error check here would add much value. Whatever the error is, we return it anyway, with no further action needed.
Sorry, I meant the error returned from os.Stat(), much like my other comment below, not the one from os.Mkdir.
fixed
pkg/daemon/daemon.go
Outdated
return fmt.Errorf("mkdir %s: %w", targetBinDir, err)

// Be idempotent
if _, err := os.Stat(targetBin); err != nil {
Ah so we skip the write if the targetBin exists? Might be good to also make that explicit here
good idea, updated
We skip the MCD binary write when the binary already exists. Today this is the case for RHEL worker nodes during upgrade. Explicitly make sure that we only write because the file doesn't exist, rather than on any os.Stat() error.
/lgtm
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: sinnykumari, yuqi-zhang The full list of commands accepted by this bot can be found here. The pull request process is described here
@sinnykumari: all tests passed!
@sinnykumari: Jira Issue OCPBUGS-16921: All pull requests linked via external trackers have merged: Jira Issue OCPBUGS-16921 has been moved to the MODIFIED state. In response to this: