-
Notifications
You must be signed in to change notification settings - Fork 158
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
35coreos-live: stop overriding NetworkManager-wait-online timeout to 5s #1088
35coreos-live: stop overriding NetworkManager-wait-online timeout to 5s #1088
Conversation
Requires: coreos/fedora-coreos-pinger#41 (or alternatively, disable or nuke fedora-coreos-pinger as per coreos/fedora-coreos-tracker#770). |
Wow - really nice work and investigation here @jlebon @bengal
Might it be nice to have a test that tries to detect if any units (including ones we don't own) pull in |
I still support adding coreos/coreos-installer#565 as it just helps us make things more resilient in the future. |
Started a backport for the pinger patch in https://src.fedoraproject.org/rpms/rust-fedora-coreos-pinger/pull-request/4 so we can unblock this.
Yeah, I think that makes sense. |
We originally did this in coreos#326 because we wanted to support booting the live ISO without networking. This was solved on the initramfs side by the conditional networking work (coreos#426). But for the real root, this was still useful because if booting the ISO interactively on a system without any network, or a non-DHCP network, we didn't want the user to have to wait until the service timed out before getting a shell. The core issue however is that we're requesting `network-online.target` at all. It's an "active unit" which means that it's only pulled in the transaction, possibly delaying boot, if another systemd unit needs it. And ideally, no service would need it as per: https://www.freedesktop.org/wiki/Software/systemd/NetworkTarget/ In our case, this unit was fedora-coreos-pinger. We drop that requirement here: coreos/fedora-coreos-pinger#41 With that, we no longer pull in `network-online.target` and so no longer delay reaching the console even if NetworkManager isn't able to get an active connection for whatever reason. This matches how it works on traditional Fedora as well. Having a short timeout actually also had a counterproductive effect in the automated install case. There, `coreos-installer.service` does pull in `network-online.target` (which with coreos/coreos-installer#565 we could consider dropping as advised by systemd, though we probably should bump the number of retries some more in that case), but because of the short timeout, we genuinely may not yet have the network fully up before we run (see https://bugzilla.redhat.com/show_bug.cgi?id=1967483).
a672d84
to
a0a785f
Compare
Updated this now with a test and fast-tracking the fixed fedora-coreos-pinger! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
The `Wants=network-online.target` predates the addition of fetch retries. Now that we retry HTTP requests indefinitely in the automated flow, let's follow best practices and stop pulling in `network-online.target`. Still keep it as an `After` though; *if* something pulls in `network-online.target`, then we might as well be smarter and start our retries after we know we're online. Also add `network.target` so that if `network-online.target` *isn't* pulled in, we still have a reasonable lower bound on when we start retrying. Related: coreos/fedora-coreos-config#1088 Related: coreos#565 (comment) Closes: https://github.com/coreos/coreos-installer/issues/1334
The `Wants=network-online.target` predates the osmet work which enabled the now default fully offline install flow. It also predates the addition of fetch retries. So even in an online flow, now that we retry HTTP requests indefinitely, we don't really need this. Let's follow best practices and stop pulling in `network-online.target`. Still keep it as an `After` though; *if* something pulls in `network-online.target`, then we might as well be nicer and start our retries after we know we're online. Also add `network.target` so that if `network-online.target` *isn't* pulled in, we still have a reasonable lower bound on when we start retrying. Related: coreos/fedora-coreos-config#1088 Related: coreos#565 (comment) Related: https://www.freedesktop.org/wiki/Software/systemd/NetworkTarget/ Closes: https://github.com/coreos/coreos-installer/issues/1334
The `Wants=network-online.target` predates the osmet work which enabled the now default fully offline install flow. It also predates the addition of fetch retries. So even in an online flow, now that we retry HTTP requests indefinitely, we don't really need this. Let's follow best practices and stop pulling in `network-online.target`. We still need to keep pulling in `NetworkManager.service` though. It's enabled by default in `multi-user.target` but not `coreos-installer-post.target` (which is what we boot to in an automated install). NetworkManager is capable of handling offline environments just fine and won't block the install just because a connection isn't available (yet or ever). Related: coreos/fedora-coreos-config#1088 Related: coreos#565 (comment) Related: https://www.freedesktop.org/wiki/Software/systemd/NetworkTarget/ Closes: https://github.com/coreos/coreos-installer/issues/1334
The `Wants=network-online.target` predates the osmet work which enabled the now default fully offline install flow. It also predates the addition of fetch retries. So even in an online flow, now that we retry HTTP requests indefinitely, we don't really need this. Let's follow best practices and stop pulling in `network-online.target`. We still need to keep pulling in `NetworkManager.service` though. It's enabled by default in `multi-user.target` but not `coreos-installer-post.target` (which is what we boot to in an automated install). NetworkManager is capable of handling offline environments just fine and won't block the install just because a connection isn't available (yet or ever). Related: coreos/fedora-coreos-config#1088 Related: coreos#565 (comment) Related: https://www.freedesktop.org/wiki/Software/systemd/NetworkTarget/ Closes: https://github.com/coreos/coreos-installer/issues/1334
We originally did this in #326 because we wanted to support booting the
live ISO without networking. This was solved on the initramfs side by
the conditional networking work (#426). But for the real root, this was
still useful because if booting the ISO interactively on a system
without any network, or a non-DHCP network, we didn't want the user to
have to wait until the service timed out before getting a shell.
The core issue however is that we're requesting
network-online.target
at all. It's an "active unit" which means that it's only pulled in the
transaction, possibly delaying boot, if another systemd unit needs it.
And ideally, no service would need it as per:
https://www.freedesktop.org/wiki/Software/systemd/NetworkTarget/
In our case, this unit was fedora-coreos-pinger. We drop that
requirement here:
coreos/fedora-coreos-pinger#41
With that, we no longer pull in
network-online.target
and so no longerdelay reaching the console even if NetworkManager isn't able to get an
active connection for whatever reason. This matches how it works on
traditional Fedora as well.
Having a short timeout actually also had a counterproductive effect in
the automated install case. There,
coreos-installer.service
does pullin
network-online.target
(which withcoreos/coreos-installer#565 we could consider
dropping as advised by systemd, though we probably should bump the
number of retries some more in that case), but because of the short
timeout, we genuinely may not yet have the network fully up before we
run (see https://bugzilla.redhat.com/show_bug.cgi?id=1967483).