[RFE] failure during ignition first_boot causes loop, no way to capture logs #454

ericb-summit · 2021-07-29T06:06:34Z

Current situation

I am deploying Flatcar using Terraform, vSphere and Ignition. If first_boot fails (for example, due to a bad Ignition config, or any other error during first_boot), errors scroll rapidly on the console. The boot then pauses for a few seconds and the scripts appear to retry, over and over, in a loop, with no way to see the underlying cause. (I should point out that if I power off and start the VM again, it boots fine.)

Impact

It is very difficult to troubleshoot the root cause, as I have no logs (only the virtual console).

Ideal future situation

It would be useful to have some way to tell the boot process not to retry on failure (e.g. an additional GRUB parameter), or some other mechanism to abort the boot so that I can access journalctl.

t-lo · 2021-07-29T07:42:41Z

Options are limited (and we have roadmap items to address this in the future), but have you had a look at Mayday reports?

tormath1 · 2021-07-29T07:55:10Z

If you have access to a large console buffer, you can also edit the kernel command line to add these options: systemd.journald.max_level_console=debug console=ttyS0. This will increase the Ignition verbosity.
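As a rough illustration of the kernel command line edit suggested above, here is a minimal Python sketch that appends the two options to an existing command line string. The helper name add_kernel_args and the starting command line "root=LABEL=ROOT" are placeholders for illustration only, not something Flatcar provides; in practice you would make the same edit by hand at the GRUB prompt.

```python
# Options quoted in the comment above; everything else here is hypothetical.
DEBUG_ARGS = ["systemd.journald.max_level_console=debug", "console=ttyS0"]


def add_kernel_args(cmdline, extra):
    """Return cmdline with each argument from extra appended once.

    Arguments already present verbatim are not duplicated.
    """
    tokens = cmdline.split()
    for arg in extra:
        if arg not in tokens:
            tokens.append(arg)
    return " ".join(tokens)


if __name__ == "__main__":
    # "root=LABEL=ROOT" is a placeholder for whatever the boot entry already contains.
    print(add_kernel_args("root=LABEL=ROOT", DEBUG_ARGS))
```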

In case you missed it, Ignition V3 support is still in progress (see #387), so make sure your configuration uses a spec version below V3.

Let us know how it goes!

ericb-summit · 2021-07-29T13:26:51Z

Hi guys. I've solved my original problem. For context, it was caused by the Terraform partition directive instructing Ignition to create a new partition at each boot.

However, I didn't solve it by identifying the cause from the logs. It was pretty much trial and error.

I read about Mayday, but I could never actually get to a shell console to run it.

This is what I mean -- in some cases it seems Ignition failures are treated as temporary. It keeps retrying forever, no login prompt ever appears, and I have no way of collecting logs other than scrolling back on the virtual console (Shift+PageUp). And even that isn't really an option, because as soon as new console output appears, the vty scrolls to the end.

Aaaand I fat finger closed the issue, that wasn't intentional. I've run into this many times and probably will again. What is the proper strategy for collecting first boot journalctl logs for a first boot that never completes?

pothos · 2021-07-29T13:38:56Z

As you said, it's a problem that Ignition failures result in a loop. The way forward is to do something like failing the boot so that it drops to a dracut rescue shell prompt, as with other initramfs boot errors.

jepio · 2021-07-29T13:50:00Z

A similar issue was reported in #434.

ericb-summit · 2021-07-29T14:44:51Z

Yes, I had read #434. And in fact, if I specify an Ignition file pointing to a remote URI but provide no pre-boot IP or DHCP, I get the same behaviour I describe here.

Basically, any failure and I'm in Barney with ignition.

So, I take it there's no way to collect logs? Is there some magical param I can pass to GRUB to bypass first_boot, like, say, init=/bin/sh as I would with other OSes, so that I can then scour /var/log for the logs?

ericb-summit closed this as completed Jul 29, 2021

pothos reopened this Jul 29, 2021

jepio mentioned this issue Sep 7, 2021

sys-kernel/bootengine: prevent boot loop on ignition failure flatcar-archive/coreos-overlay#1262

Merged

jepio closed this as completed in flatcar-archive/coreos-overlay#1262 Sep 10, 2021
