feat: introduce talos.exp.wipe kernel param to wipe system disk
Fixes: #4399

Signed-off-by: Artem Chernyshev <[email protected]>
Unix4ever committed Dec 29, 2021
1 parent c079eb3 commit da0b36e
Showing 5 changed files with 38 additions and 0 deletions.
9 changes: 9 additions & 0 deletions hack/release.toml
@@ -19,6 +19,14 @@ preface = """\
title = "Component Updates"
description="""\
* Linux: 5.15.11
"""

[notes.wipe]
title = "Wipe System Kernel Parameter"
description="""\
Added a new kernel parameter, `talos.experimental.wipe=system`, which resets the machine's system disk
so that you can start over with a fresh installation.
See [Resetting a Machine](https://www.talos.dev/docs/v0.15/guides/resetting-a-machine/#kernel-parameter) on how to use it.
"""

[make_deps]
@@ -34,3 +42,4 @@ preface = """\
[make_deps.extras]
variable = "EXTRAS"
repository = "github.com/talos-systems/extras"

@@ -5,9 +5,12 @@
package v1alpha1

import (
	"github.com/talos-systems/go-procfs/procfs"

	"github.com/talos-systems/talos/internal/app/machined/pkg/runtime"
	machineapi "github.com/talos-systems/talos/pkg/machinery/api/machine"
	"github.com/talos-systems/talos/pkg/machinery/config/types/v1alpha1/machine"
	"github.com/talos-systems/talos/pkg/machinery/constants"
)

// Sequencer implements the sequencer interface.
@@ -164,6 +167,11 @@ func (*Sequencer) Install(r runtime.Runtime) []runtime.Phase {
func (*Sequencer) Boot(r runtime.Runtime) []runtime.Phase {
	phases := PhaseList{}

	wipe := procfs.ProcCmdline().Get(constants.KernelParamWipe).First()
	if wipe != nil && *wipe == "system" {
		return phases.Append("wipeSystemDisk", ResetSystemDisk).Append("reboot", Reboot)
	}

	phases = phases.AppendWhen(
		r.State().Platform().Mode() != runtime.ModeContainer,
		"saveStateEncryptionConfig",
4 changes: 4 additions & 0 deletions pkg/machinery/constants/constants.go
@@ -38,6 +38,10 @@ const (
	// kernel log delivery destination.
	KernelParamLoggingKernel = "talos.logging.kernel"

	// KernelParamWipe is the kernel parameter name for specifying the
	// disk to wipe on the next boot; the machine reboots after the wipe.
	KernelParamWipe = "talos.experimental.wipe"

	// BoardNone indicates that the install is not for a specific board.
	BoardNone = "none"

10 changes: 10 additions & 0 deletions website/content/docs/v0.15/Guides/resetting-a-machine.md
@@ -7,6 +7,8 @@ From time to time, it may be beneficial to reset a Talos machine to its "origina
Bear in mind that this is a destructive action for the given machine.
Doing this means removing the machine from Kubernetes and Etcd (if applicable), and clearing any data on the machine that would normally persist across a reboot.

## CLI

> WARNING: Running `talosctl reset` on cloud VMs might leave the VM unable to boot, as this wipes the entire disk.
It might be more useful to just wipe the `STATE` and `EPHEMERAL` partitions on a cloud VM if not booting via `iPXE`.
`talosctl reset --system-labels-to-wipe STATE --system-labels-to-wipe EPHEMERAL`
@@ -25,3 +27,11 @@ The `graceful` flag is especially important when considering HA vs. non-HA Talos
If the machine is part of an HA cluster, a normal, graceful reset should work just fine right out of the box as long as the cluster is in a good state.
However, if this is a single node cluster being used for testing purposes, a graceful reset is not an option since Etcd cannot be "left" if there is only a single member.
In this case, reset should be used with `--graceful=false` to skip performing checks that would normally block the reset.

## Kernel Parameter

Another way to reset a machine is to specify the `talos.experimental.wipe=system` kernel parameter.
If the machine is stuck in a boot loop and you have access to the console, you can use GRUB to specify this kernel argument.
The next time Talos boots, it resets the system disk and reboots.
You can then install Talos again, either by PXE booting or by mounting an ISO.
7 changes: 7 additions & 0 deletions website/content/docs/v0.15/Reference/kernel.md
@@ -115,3 +115,10 @@ Several of these are enforced by the Kernel Self Protection Project [KSPP](https
DHCP server, this can add significant boot delays.

This option may be specified multiple times for multiple network interfaces.

#### `talos.experimental.wipe`

Resets the disk before starting up the system.

Valid options are:
- `system`: resets the system disk.
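
For readers following the `Boot` change in the sequencer hunk, the decision it makes can be sketched in isolation. This is a minimal standalone sketch and not the Talos implementation: `firstValue` and `shouldWipeSystem` are hypothetical helpers standing in for `procfs.ProcCmdline().Get(...).First()`, and splitting the command line on whitespace is a simplifying assumption that ignores quoted values.

```go
package main

import (
	"fmt"
	"strings"
)

// kernelParamWipe mirrors constants.KernelParamWipe from the commit.
const kernelParamWipe = "talos.experimental.wipe"

// firstValue is a hypothetical helper: it returns the first value given for
// key in a kernel command line, and false if the key is absent.
// Assumption: parameters are whitespace-separated and never quoted.
func firstValue(cmdline, key string) (string, bool) {
	for _, field := range strings.Fields(cmdline) {
		k, v, _ := strings.Cut(field, "=")
		if k == key {
			return v, true
		}
	}

	return "", false
}

// shouldWipeSystem is a hypothetical helper that reports whether boot should
// take the "wipe system disk, then reboot" path instead of the normal phases.
// Only the exact value "system" triggers it, matching the check in the diff.
func shouldWipeSystem(cmdline string) bool {
	v, ok := firstValue(cmdline, kernelParamWipe)

	return ok && v == "system"
}

func main() {
	// Example command lines; the extra parameters are illustrative only.
	cmdlines := []string{
		"console=tty0 talos.experimental.wipe=system",
		"console=tty0",
		"console=tty0 talos.experimental.wipe=other",
	}

	for _, c := range cmdlines {
		fmt.Printf("%q -> wipe system disk: %t\n", c, shouldWipeSystem(c))
	}
}
```

As in the commit, any value other than `system` (or a missing parameter) falls through to the regular boot sequence, so the parameter has no effect unless it is set explicitly.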
