
machine-specific machineconfigs #1720

Open
cgwalters opened this issue May 8, 2020 · 12 comments
Labels
lifecycle/frozen Indicates that an issue or PR should not be auto-closed due to staleness.

Comments

@cgwalters
Member

cgwalters commented May 8, 2020

Ironically, today MachineConfig objects target a pool, not a specific machine.

Managing machine-specific configuration today

Follow-up to this comment. Today, if you want to manage per-machine configuration such as a statically set hostname, the best approach is to take the worker.ign file generated by openshift-installer, create copies such as worker-foo.ign and worker-bar.ign, modify each copy to include the configuration specific to that machine, and provide that Ignition to the node.
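A minimal sketch of that derivation, assuming an Ignition spec 3.x pointer config (the file names and the data-URL injection here are illustrative, not the installer's actual output):

```python
import json
from urllib.parse import quote


def derive_machine_ign(pointer_path, out_path, hostname):
    """Copy a pointer Ignition config and inject a static /etc/hostname.

    Everything else in the pointer config (notably the merge directive
    pointing back at the MCS) is left untouched.
    """
    with open(pointer_path) as f:
        cfg = json.load(f)
    # Append a file entry writing the hostname; mode 0644, URL-encoded
    # contents per the Ignition data: URL convention.
    files = cfg.setdefault("storage", {}).setdefault("files", [])
    files.append({
        "path": "/etc/hostname",
        "overwrite": True,
        "mode": 0o644,
        "contents": {"source": "data:," + quote(hostname)},
    })
    with open(out_path, "w") as f:
        json.dump(cfg, f, indent=2)
```

You would run this once per machine (e.g. for worker-foo and worker-bar) and feed each output to the corresponding node.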

For the "provide that Ignition to the node" phase, if you're using MachineSets via MachineAPI, then one would need to edit the machineset object to point to a new user data secret. Note this will eventually conflict with having the MCO manage userdata but that enhancement was reverted. But when we get there, we can teach the MCO to retain any additional config it finds in the machineset perhaps?

The OpenShift 4.8 documentation will describe how to use https://github.com/coreos/butane which is a high level tool for managing Ignition, but it isn't yet ergonomic to "backconvert" that pointer Ignition to butane, then output a new Ignition config.

Using the Live ISO

Additionally, one can use the Live ISO which can be programmed to have its own Ignition configuration that e.g. pulls a container which performs hardware inspection, and then dynamically generates a configuration which is passed to coreos-installer.
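As an illustration of the inspection step, a container running from the Live ISO might identify the machine by its primary NIC's MAC address and pick a matching Ignition file (the sysfs heuristic and the MAC-to-config table are hypothetical, not part of any shipped tooling):

```python
import glob
import os


def primary_nic_mac(sys_net="/sys/class/net"):
    """Return the MAC of the first interface backed by a physical device.

    Heuristic: the sysfs entry has a 'device' link, which skips lo,
    bridges, and other virtual interfaces.
    """
    for path in sorted(glob.glob(os.path.join(sys_net, "*"))):
        if os.path.exists(os.path.join(path, "device")):
            with open(os.path.join(path, "address")) as f:
                return f.read().strip()
    return None


def ignition_for_machine(mac, table, default="default.ign"):
    """Map the inspected MAC to a per-machine Ignition file.

    The table would ship inside the inspection container (hypothetical).
    """
    return table.get(mac, default)
```

The selected file would then be handed to coreos-installer, e.g. via its --ignition-file option.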

Background information

If we had a way to provide "machine specific machineconfigs" then the admins could provide those MCs as additional manifests and it'd all Just Work.

But...this gets into the "node identity" problem. We'd need to define a way for the MCS to identify the node it's serving the config to.

Perhaps the simplest thing is to assume anyone who wants this is statically assigning hostnames, and we change Ignition to include a header with the node's hostname.

(I guess we could do something like a reverse lookup of the requester's IP address too)

One messy aspect of this is that we can't include these per-machine configs in the main rendered-$pool configs, which means the MCS needs to generate them internally. (I guess we could create a separate rendered-node-$x config too?)

@michaelgugino
Contributor

> Perhaps the simplest thing is to assume anyone who wants this is statically assigning hostnames, and we change Ignition to include a header with the node's hostname.

Definitely we should do this with or without machine-specific machineconfigs.

> (I guess we could do something like a reverse lookup of the requester's IP address too)

This is not guaranteed to work in all environments, and I think we discovered while troubleshooting the 4.4 release that the IPs that show up in the MCS during first boot are VIPs, not instance IPs.

Including the hostname and IP on every request to the MCS would greatly aid in determining when a machine failed to boot/request an ignition vs failing to join cluster after getting ignition file.
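A sketch of how the MCS could consume such a header (the header name and the selection logic are hypothetical; no such header exists today):

```python
def configs_for_request(headers, per_node_fragments):
    """Select node-specific Ignition fragments for one MCS request.

    Keyed on a hypothetical X-Ignition-Hostname request header; falls
    back to an empty list when the header is missing or unknown, so
    pool-level rendering is unaffected.
    """
    hostname = headers.get("X-Ignition-Hostname", "").strip().lower()
    return per_node_fragments.get(hostname, [])
```

The returned fragments would be merged into the pool's rendered config before the MCS serves it.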

@cgwalters
Member Author

One way to do this today: for each machine you want a custom config for, manually set things up to pass it a custom derived pointer config.

For example in a PXE install scenario, take the Ignition output from openshift-install create ignition-configs and edit that, rather than trying to customize the full Ignition config returned from the MCS.

@openshift-bot
Contributor

Issues go stale after 90d of inactivity.

Mark the issue as fresh by commenting /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.
Exclude this issue from closing by commenting /lifecycle frozen.

If this issue is safe to close now please do so with /close.

/lifecycle stale

@openshift-ci-robot openshift-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Oct 24, 2020
@dhellmann

/remove-lifecycle stale

@openshift-ci-robot openshift-ci-robot removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Oct 28, 2020
@dhellmann

/lifecycle frozen

@openshift-ci-robot openshift-ci-robot added the lifecycle/frozen Indicates that an issue or PR should not be auto-closed due to staleness. label Nov 5, 2020
@cgwalters
Member Author

Perhaps one simple approach here is:

  • user injects /etc/hostname in the pointer Ignition config
  • Ignition detects this and adds it as a header when doing requests for merged configs (like the MCO)
  • MCO reads this header and adds in to the rendered Ignition (from all machineconfig objects with a label for that hostname)
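The last step could look something like this, assuming a hypothetical machineconfiguration.openshift.io/hostname label (the MCO recognizes no such label today):

```python
def node_specific_machineconfigs(machineconfigs, hostname):
    """Filter MachineConfig-like dicts down to those labeled for one node.

    The label key is hypothetical; objects without it (i.e. ordinary
    pool-targeted MachineConfigs) are never selected here.
    """
    label = "machineconfiguration.openshift.io/hostname"
    return [
        mc for mc in machineconfigs
        if mc.get("metadata", {}).get("labels", {}).get(label) == hostname
    ]
```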

However, we could simplify things significantly if we said the MCD wouldn't be aware of this - we wouldn't support "day 2" reconfiguration for machine-specific configs. In other words, if e.g. you want to change the static IP address for a node, you fully reprovision it.

Given that I think most of this per-node configuration wouldn't change often, that could be a useful balance.

@larsks

larsks commented Jun 23, 2022

@cgwalters what's the state of this issue? I need to configure bonding on the primary interface on a small openshift (4.10) cluster, and the interface names on the nodes aren't identical...so I need a couple of different machineconfig resources, applying to different subsets of nodes.

I was hoping I could create node-specific pools, but if I try something like...

apiVersion: machineconfiguration.openshift.io/v1
kind: MachineConfigPool
metadata:
  name: ctl-0
spec:
  machineConfigSelector:
    matchLabels:
      kubernetes.io/hostname: ctl-0
  nodeSelector:
    matchLabels:
      kubernetes.io/hostname: ctl-0
  paused: false

...then MCO refuses to apply because:

W0623 23:44:20.353529       1 node_controller.go:827] can't get pool for node "ctl-0": node ctl-0 has both master role and custom role ctl-0

I guess the workaround is to bundle the desired configurations into a shell script that does something like...

#!/bin/bash

if [[ $HOSTNAME = ctl-0 ]]; then
  cp /etc/files/config-for-host0 /etc/actual/path/config
elif [[ $HOSTNAME = ctl-1 ]]; then
  cp /etc/files/config-for-host1 /etc/actual/path/config
fi

Etc.

Is there a better way?

@cgwalters
Member Author

@larsks See the top comment #1720 (comment) (which I reworked to clarify the best solution today)

(Networking specifically also touches on nmstate, which is another thing)

@larsks

larsks commented Jun 24, 2022

> (Networking specifically also touches on nmstate, which is another thing)

Right, and for this I would love to be able to use nmstate (in particular because it's easy to implement host-specific configs), but that explicitly can't be used to configure the primary host interface. The recommendation is to use machineconfigs :).

@larsks

larsks commented Jun 24, 2022

Reading through the top comment...

For the "provide that Ignition to the node" phase, if you're using MachineSets via MachineAPI, then one would need to edit the machineset object to point to a new user data secret...

For a cluster installed using e.g. the assisted installer, there is no MachineSet that covers the controllers. For a three-node cluster:

$ oc get -A machineset
NAMESPACE               NAME                            DESIRED   CURRENT   READY   AVAILABLE   AGE
openshift-machine-api   nerc-ocp-infra-5xv2p-worker-0   0         0                             28d

I also see the implication there that you're describing an install-time procedure, rather than something that could be a post-install configuration task.

My reading of that comment is that "today, it is generally not possible to apply machine-specific machineconfig resources". It sounds like I'm going to need to go with my hacky shell script (although even that plan was complicated by the fact that I can't use the directories section of an ignition config to create directories, so I need to find pre-existing directories in /etc into which I can copy files in a non-crazy fashion...)

larsks added a commit to larsks/nerc-ocp-config that referenced this issue Jun 24, 2022
We want to use bonded interface pairs on these systems. The nodes
aren't yet wired for it, but setting this up now will allow us to
refer to the `bond0` interface in e.g. VLAN configurations (and means
we won't have to re-work those later).

Because we're using OVNKubernetes, we can't use nmstate [1] to enact
the configuration. The recommendation is to apply the configuration
using a MachineConfig [2] resource, but this is complicated by the
fact that our nodes don't all have the same interface names, and it's
not possible to apply node-specific machineconfigs [3].

We work around this limitation by:

1. Copying nmconnection files for *all hosts* to *every host*, but
  placing them in `/etc/mco` (just because that's a convenient
  available directory, it seems relatively topical, and it's not
  possible to create new directories using the `directories` section
  of an ignition config).

2. Installing a systemd unit that runs a shell script at boot that
  copies the host-specific configs from `/etc/mco` into
  `/etc/NetworkManager/system-connections`.

[1]: https://docs.openshift.com/container-platform/4.10/networking/k8s_nmstate/k8s-nmstate-about-the-k8s-nmstate-operator.html
[2]: https://docs.openshift.com/container-platform/4.10/post_installation_configuration/machine-configuration-tasks.html
[3]: openshift/machine-config-operator#1720

x-branch: feature/bond0
@cgwalters
Member Author

Today, nothing stops you from writing persistent files into /etc outside of MachineConfigs. So, one approach today is to make the change directly live, via ssh or a privileged container. We're unlikely to make a change that would break that anytime soon without an opt-in. But then, in order to ensure your system is reprovisionable, it'd be good to aim to also make the change in the Ignition config provided to each node.

> For a cluster installed using e.g. the assisted installer, there is no MachineSet that covers the controllers.

Yeah, though I think the assisted installer can and should expose a way to customize the Ignition provided to each node...I thought it does something like this internally.

@jlebon
Member

jlebon commented Jul 13, 2022

This came up yet again internally. Would another hackaround for this be to add a URL to an Ignition config in ignition.config.merge[] (in a MachineConfig, of course) which can serve a different config based on IP or MAC? (Obviously this then requires setting up that secondary service.)
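A sketch of what such a MachineConfig might look like (the URL and the secondary service behind it are entirely hypothetical):

```yaml
apiVersion: machineconfiguration.openshift.io/v1
kind: MachineConfig
metadata:
  labels:
    machineconfiguration.openshift.io/role: worker
  name: 99-worker-merge-per-node
spec:
  config:
    ignition:
      version: 3.2.0
      config:
        merge:
          # Hypothetical service that returns different Ignition
          # content keyed on the requesting node's IP or MAC.
          - source: https://node-config.example.com/ignition
```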
