
machine-specific machineconfigs #1720

Open
cgwalters opened this issue May 8, 2020 · 12 comments
Labels
lifecycle/frozen Indicates that an issue or PR should not be auto-closed due to staleness.

Comments

@cgwalters
Member

cgwalters commented May 8, 2020

Ironically, today MachineConfig objects target a pool, not a specific machine.

Managing machine-specific configuration today

Follow-up to this comment. Today, if you want to manage per-machine configuration such as a statically set hostname, the best approach is to take the worker.ign file generated by openshift-installer, create copies such as worker-foo.ign and worker-bar.ign, modify each copy to include the configuration specific to that machine, and provide that Ignition to the node.
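A minimal sketch of that derivation, assuming an Ignition spec 3.x pointer config (the file names and the data-URL injection here are illustrative, not the installer's actual output):

```python
import json
from urllib.parse import quote


def derive_machine_ign(pointer_path, out_path, hostname):
    """Copy a pointer Ignition config and inject a static /etc/hostname.

    Everything else in the pointer config (notably the merge directive
    pointing back at the MCS) is left untouched.
    """
    with open(pointer_path) as f:
        cfg = json.load(f)
    # Append a file entry writing the hostname; mode 0644, URL-encoded
    # contents per the Ignition data: URL convention.
    files = cfg.setdefault("storage", {}).setdefault("files", [])
    files.append({
        "path": "/etc/hostname",
        "overwrite": True,
        "mode": 0o644,
        "contents": {"source": "data:," + quote(hostname)},
    })
    with open(out_path, "w") as f:
        json.dump(cfg, f, indent=2)
```

You would run this once per machine (e.g. for worker-foo and worker-bar) and feed each output to the corresponding node.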

For the "provide that Ignition to the node" phase, if you're using MachineSets via MachineAPI, then one would need to edit the machineset object to point to a new user data secret. Note this will eventually conflict with having the MCO manage userdata but that enhancement was reverted. But when we get there, we can teach the MCO to retain any additional config it finds in the machineset perhaps?

The OpenShift 4.8 documentation will describe how to use https://github.com/coreos/butane which is a high level tool for managing Ignition, but it isn't yet ergonomic to "backconvert" that pointer Ignition to butane, then output a new Ignition config.

Using the Live ISO

Additionally, one can use the Live ISO which can be programmed to have its own Ignition configuration that e.g. pulls a container which performs hardware inspection, and then dynamically generates a configuration which is passed to coreos-installer.
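As an illustration of the inspection step, a container running from the Live ISO might identify the machine by its primary NIC's MAC address and pick a matching Ignition file (the sysfs heuristic and the MAC-to-config table are hypothetical, not part of any shipped tooling):

```python
import glob
import os


def primary_nic_mac(sys_net="/sys/class/net"):
    """Return the MAC of the first interface backed by a physical device.

    Heuristic: the sysfs entry has a 'device' link, which skips lo,
    bridges, and other virtual interfaces.
    """
    for path in sorted(glob.glob(os.path.join(sys_net, "*"))):
        if os.path.exists(os.path.join(path, "device")):
            with open(os.path.join(path, "address")) as f:
                return f.read().strip()
    return None


def ignition_for_machine(mac, table, default="default.ign"):
    """Map the inspected MAC to a per-machine Ignition file.

    The table would ship inside the inspection container (hypothetical).
    """
    return table.get(mac, default)
```

The selected file would then be handed to coreos-installer, e.g. via its --ignition-file option.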

Background information

If we had a way to provide "machine specific machineconfigs" then the admins could provide those MCs as additional manifests and it'd all Just Work.

But...this gets into the "node identity" problem. We'd need to define a way for the MCS to identify the node it's serving the config to.

Perhaps the simplest thing is to assume anyone who wants this is statically assigning hostnames, and we change Ignition to include a header with the node's hostname.

(I guess we could do something like a reverse lookup of the requester's IP address too)

One messy aspect of this is that we can't include these per-machine configs in the main rendered-$pool configs, which means the MCS needs to generate them internally. (I guess we could create a separate rendered-node-$x config too?)

@michaelgugino
Contributor

> Perhaps the simplest thing is to assume anyone who wants this is statically assigning hostnames, and we change Ignition to include a header with the node's hostname.

Definitely we should do this with or without machine-specific machineconfigs.

> (I guess we could do something like a reverse lookup of the requester's IP address too)

This is not guaranteed to work in all environments, and I think we discovered while troubleshooting the 4.4 release that the IPs that show up in the MCS during first boot are VIPs, not instance IPs.

Including the hostname and IP on every request to the MCS would greatly aid in determining when a machine failed to boot/request an ignition vs failing to join cluster after getting ignition file.
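A sketch of how the MCS could consume such a header (the header name and the selection logic are hypothetical; no such header exists today):

```python
def configs_for_request(headers, per_node_fragments):
    """Select node-specific Ignition fragments for one MCS request.

    Keyed on a hypothetical X-Ignition-Hostname request header; falls
    back to an empty list when the header is missing or unknown, so
    pool-level rendering is unaffected.
    """
    hostname = headers.get("X-Ignition-Hostname", "").strip().lower()
    return per_node_fragments.get(hostname, [])
```

The returned fragments would be merged into the pool's rendered config before the MCS serves it.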

@cgwalters
Member Author

One way to do this today: for each machine you want a custom config for, manually set things up to pass it a custom derived pointer config.

For example in a PXE install scenario, take the Ignition output from openshift-install create ignition-configs and edit that, rather than trying to customize the full Ignition config returned from the MCS.

@openshift-bot
Contributor

Issues go stale after 90d of inactivity.

Mark the issue as fresh by commenting /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.
Exclude this issue from closing by commenting /lifecycle frozen.

If this issue is safe to close now please do so with /close.

/lifecycle stale

@openshift-ci-robot openshift-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Oct 24, 2020
@dhellmann

/remove-lifecycle stale

@openshift-ci-robot openshift-ci-robot removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Oct 28, 2020
@dhellmann

/lifecycle frozen

@openshift-ci-robot openshift-ci-robot added the lifecycle/frozen Indicates that an issue or PR should not be auto-closed due to staleness. label Nov 5, 2020
@cgwalters
Member Author

Perhaps one simple approach here is:

  • user injects /etc/hostname in the pointer Ignition config
  • Ignition detects this and adds it as a header when doing requests for merged configs (like the MCO)
  • MCO reads this header and adds in to the rendered Ignition (from all machineconfig objects with a label for that hostname)
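The last step could look something like this, assuming a hypothetical machineconfiguration.openshift.io/hostname label (the MCO recognizes no such label today):

```python
def node_specific_machineconfigs(machineconfigs, hostname):
    """Filter MachineConfig-like dicts down to those labeled for one node.

    The label key is hypothetical; objects without it (i.e. ordinary
    pool-targeted MachineConfigs) are never selected here.
    """
    label = "machineconfiguration.openshift.io/hostname"
    return [
        mc for mc in machineconfigs
        if mc.get("metadata", {}).get("labels", {}).get(label) == hostname
    ]
```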

However, we could simplify things significantly if we said the MCD wouldn't be aware of this - we wouldn't support "day 2" reconfiguration for machine-specific configs. In other words, if e.g. you want to change the static IP address for a node, you fully reprovision it.

Given that I think most of this per-node configuration wouldn't change often, that could be a useful balance.

@larsks

larsks commented Jun 23, 2022

@cgwalters what's the state of this issue? I need to configure bonding on the primary interface on a small openshift (4.10) cluster, and the interface names on the nodes aren't identical...so I need a couple of different machineconfig resources, applying to different subsets of nodes.

I was hoping I could create node-specific pools, but if I try something like...

apiVersion: machineconfiguration.openshift.io/v1
kind: MachineConfigPool
metadata:
  name: ctl-0
spec:
  machineConfigSelector:
    matchLabels:
      kubernetes.io/hostname: ctl-0
  nodeSelector:
    matchLabels:
      kubernetes.io/hostname: ctl-0
  paused: false

...then MCO refuses to apply because:

W0623 23:44:20.353529       1 node_controller.go:827] can't get pool for node "ctl-0": node ctl-0 has both master role and custom role ctl-0

I guess the workaround is to bundle the desired configurations into a shell script that does something like...

#!/bin/bash

if [[ $HOSTNAME = ctl-0 ]]; then
  cp /etc/files/config-for-host0 /etc/actual/path/config
elif [[ $HOSTNAME = ctl-1 ]]; then
  cp /etc/files/config-for-host1 /etc/actual/path/config
fi

Etc.

Is there a better way?

@cgwalters
Member Author

@larsks See the top comment #1720 (comment) (which I reworked to clarify the best solution today)

(Networking specifically also touches on nmstate, which is another thing)

@larsks

larsks commented Jun 24, 2022

> (Networking specifically also touches on nmstate, which is another thing)

Right, and for this I would love to be able to use nmstate (in particular because it's easy to implement host-specific configs), but that explicitly can't be used to configure the primary host interface. The recommendation is to use machineconfigs :).

@larsks

larsks commented Jun 24, 2022

Reading through the top comment...

For the "provide that Ignition to the node" phase, if you're using MachineSets via MachineAPI, then one would need to edit the machineset object to point to a new user data secret...

For a cluster installed using e.g. the assisted installer, there is no MachineSet that covers the controllers. For a three-node cluster:

$ oc get -A machineset
NAMESPACE               NAME                            DESIRED   CURRENT   READY   AVAILABLE   AGE
openshift-machine-api   nerc-ocp-infra-5xv2p-worker-0   0         0                             28d

I also see the implication there that you're describing an install-time procedure, rather than something that could be a post-install configuration task.

My reading of that comment is that "today, it is generally not possible to apply machine-specific machineconfig resources". It sounds like I'm going to need to go with my hacky shell script (although even that plan was complicated by the fact that I can't use the directories section of an ignition config to create directories, so I need to find pre-existing directories in /etc into which I can copy files in a non-crazy fashion...)

larsks added a commit to larsks/nerc-ocp-config that referenced this issue Jun 24, 2022
We want to use bonded interface pairs on these systems. The nodes
aren't yet wired for it, but setting this up now will allow us to
refer to the `bond0` interface in e.g. VLAN configurations (and means
we won't have to re-work those later).

Because we're using OVNKubernetes, we can't use nmstate [1] to enact
the configuration. The recommendation is to apply the configuration
using a MachineConfig [2] resource, but this is complicated by the
fact that our nodes don't all have the same interface names, and it's
not possible to apply node-specific machineconfigs [3].

We work around this limitation by:

1. Copying nmconnection files for *all hosts* to *every host*, but
  placing them in `/etc/mco` (just because that's a convenient
  available directory, it seems relatively topical, and it's not
  possible to create new directories using the `directories` section
  of an ignition config).

2. Installing a systemd unit that runs a shell script at boot that
  copies the host-specific configs from `/etc/mco` into
  `/etc/NetworkManager/system-connections`.

[1]: https://docs.openshift.com/container-platform/4.10/networking/k8s_nmstate/k8s-nmstate-about-the-k8s-nmstate-operator.html
[2]: https://docs.openshift.com/container-platform/4.10/post_installation_configuration/machine-configuration-tasks.html
[3]: openshift/machine-config-operator#1720

x-branch: feature/bond0
@cgwalters
Member Author

Today, nothing stops you from writing persistent files into /etc outside of MachineConfigs. So, one approach today is to make the change directly live, via ssh or a privileged container. We're unlikely to make a change that would break that anytime soon without an opt-in. But then, in order to ensure your system is reprovisionable, it'd be good to aim to also make the change in the Ignition config provided to each node.

> For a cluster installed using e.g. the assisted installer, there is no MachineSet that covers the controllers.

Yeah, though I think the assisted installer can and should expose a way to customize the Ignition provided to each node...I thought it does something like this internally.

@jlebon
Member

jlebon commented Jul 13, 2022

This came up yet again internally. Would another hackaround for this be to add a URL to an Ignition config in ignition.config.merge[] (in a MachineConfig, of course) which can serve a different config based on IP or MAC? (Obviously this then requires setting up that secondary service.)
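A sketch of what such a MachineConfig might look like (the URL and the secondary service behind it are entirely hypothetical):

```yaml
apiVersion: machineconfiguration.openshift.io/v1
kind: MachineConfig
metadata:
  labels:
    machineconfiguration.openshift.io/role: worker
  name: 99-worker-merge-per-node
spec:
  config:
    ignition:
      version: 3.2.0
      config:
        merge:
          # Hypothetical service that returns different Ignition
          # content keyed on the requesting node's IP or MAC.
          - source: https://node-config.example.com/ignition
```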
