machined.run function blocks forever if c.V1Alpha2().Run fails without ever running controllers.
#8263
Comments
This code is ugly, as it combines the older sequence-based initialization and the newer controller-based approach. But if we want to fix it (which I believe we don't really need to), we just need to wait with some timeout and fail the sequencer. This works well. If we need this for debugging purposes, we can add a log line.
The thing is, under the current implementation you will never see
We would also need to add a timeout for
While we decide what to do with siderolabs#8263 and siderolabs#8256 this quickfix at least allows us to see what went wrong
Signed-off-by: Dmitriy Matrenichev <[email protected]>
Created #8264 as a quick fix.
That's my point: it doesn't happen unless it's a developer error, so this is a non-issue for our users and rather an improvement for developers. It wouldn't be bad to fix this, but I would rather try to find the least intrusive way to do so, and in the end this code will most probably go away.
The main culprit is here:
talos/internal/app/machined/main.go
Lines 208 to 214 in 474fa04
If it exits too soon, the
talos/internal/app/machined/main.go
Lines 227 to 233 in 474fa04
will block forever, because EnforceKSPPRequirements (specifically runtime.KernelParamsSetCondition) depends on the runtime running. But even if we somehow exit from this function, we will block forever here:
talos/internal/app/machined/main.go
Lines 197 to 199 in 474fa04
What is interesting here is that the passed ctx will essentially act as context.Background(), because the deferred cancellation will happen AFTER we finish with system.Singleton.Shutdown(...). And thus, if the runtime is not running, Shutdown will block forever too.
I try to fix this in #8256, but I don't think we should pass a canceled context to Shutdown(...). Instead we create a new context with a timeout of 60 seconds (stopServices tries to wait for about 30 seconds, so we make the overall context twice as long). But I'm unsure whether it's a proper fix or not.