Installer/MCO: store pointer ignition customizations in MachineConfig #540

hardys · 2020-11-16T17:54:51Z

This is an alternative to #467

hardys · 2020-11-16T17:56:28Z

/cc @kirankt @runcom @yuqi-zhang @cgwalters @stbenjam

enhancements/machine-config/custom-ignition-machineconfig.md

cgwalters · 2020-11-18T19:31:06Z

enhancements/machine-config/custom-ignition-machineconfig.md

+
+#### Story 2
+
+As a baremetal IPI user, I need to deploy in an environment where network


Yes but...as discussed ideally we move baremetal IPI to using the Live ISO flow which already has techniques to address this by effectively having two Ignition runs.

xref coreos/ignition#979

(OK this is covered below, thanks)

Yes that's the long term plan and discussions are in-progress about how we may integrate the IPI baremetal automation with the assisted-install flow (which uses the live iso) - the idea with this enhancement is an interim solution so we can solve the problem for customers in 4.7 (as bond+vlans is a common requirement on baremetal)

Also note that if you embed the pointer into the live iso that still doesn't solve the problem in the case where the MCS URL is on a tagged network, so you need to wire in keyfiles to coreos-install --copy-network - which then aren't managed via the MCO.

So it may be preferable in the eventual IPI+ISO flow to generate an iso on the fly with the rendered config embedded (or provided via a config drive alongside the ISO) - in which case the changes made here are still relevant.

metal3-io/metal3-docs#150 is some early exploratory work in that direction FYI, but it's something for 4.8+

enhancements/machine-config/custom-ignition-machineconfig.md

cgwalters · 2020-11-18T19:35:03Z

enhancements/machine-config/custom-ignition-machineconfig.md

+config where config-drive size limits allow.  However this results in the [same
+issue with losing any user customizations](https://bugzilla.redhat.com/show_bug.cgi?id=1833483) provided directly via the pointer config.
+
+Discussions indicate we may want to deprecate/remove this ignition-config method of


I don't see a way to do this without solving openshift/machine-config-operator#1720
People definitely rely on customizing the pointer config with networking in UPI scenarios (as well as controlling the kernel cmdline).

Has the machine-specific problem been addressed in some other way in baremetal IPI?

No we don't have a solution for the machine-specific problem in IPI - I'm hoping we can make progress on the MCO issue you mentioned to resolve that 👍

The aim of this enhancement is mostly to solve the bond+vlan case for IPI, where you don't need per-machine configs (like static ips), but you do need to avoid the pointer ignition download, because the vlan isn't configured yet.

cgwalters · 2020-11-19T12:47:22Z

Overall if all we're getting out of this is "baremetal IPI can do not-machine-specific bonding"...that's not very much. What if we went for a much more tactical short term fix like having baremetal IPI support injecting network configuration in the same way that it works with the CoreOS ISO? In the end we just want NM configs injected into the target system's /boot - BMIPI could accept that as a tarball even.

hardys · 2020-11-19T16:07:06Z

Overall if all we're getting out of this is "baremetal IPI can do not-machine-specific bonding"...that's not very much

@cgwalters that's not all it gives us, it also unblocks the previously reverted work to manage the pointer ignition config in the MCO?

@runcom what are your thoughts, IMO one of the advantages of this approach is it also solves the problems that caused that work to be reverted? That's a benefit to all platforms AFAICT?

Also, this means side-channel customizations performed via the installer in the IPI flow become visible (on all platforms) at the MachineConfig level, and in the cases where the MCD supports it those configs can even be updated on day 2?

What if we went for a much more tactical short term fix like having baremetal IPI support injecting network configuration in the same way that it works with the CoreOS ISO? In the end we just want NM configs injected into the target system's /boot - BMIPI could accept that as a tarball even.

The interfaces to do that don't currently exist in Ironic, there's no way to inject arbitrary files into the image unless we implement some custom hook in the deploy ramdisk (and even if we do that, it's not clear where the data should come from, e.g do we invent some new configuration interface when users are already familiar with ignition/MachineConfig?)

hardys · 2020-11-19T16:16:32Z

@cgwalters followup question - what would an openshift-install interface to provide arbitrary NM keyfiles to pass into coreos-install --copy-network look like?

I'm not aware of any other platforms doing that, so it's another point of friction if we're considering such an approach (regardless of the underlying tool that installs RHCOS and writes those files)

kirankt · 2020-11-20T16:29:32Z

FYI. I have a WIP branch of an implementation of this enhancement here: https://github.com/kirankt/installer/tree/ignition-update-check

cgwalters · 2020-11-20T17:16:15Z

In UPI scenarios, the pointer Ignition config is provided "manually' by the admin. For example in a bare metal PXE scenario it might be manually copied to a static webserver. The pointer configuration is also hence not managed by the cluster. Bringing that under cluster management has bootstrapping problems. I don't see how this enhancement could solve it in general.

For IPI scenarios, I think we are in total control of everything end-to-end; we shouldn't support manually customizing the Ignition config, only injecting MachineConfig. Being in total control over the pointer config will allow us to do something better than #443 in IPI.

I think BMIPI has created a special case exception between these two because it seems like it's effectively re-injecting customized pointer configs into the cluster and we currently need networking before fetching the final config.

So with this enhancement would the BMIPI code look specifically for this machineconfig object name and include it in the pointer config?

kirankt · 2020-11-20T20:12:41Z

In UPI scenarios, the pointer Ignition config is provided "manually' by the admin. For example in a bare metal PXE scenario it might be manually copied to a static webserver. The pointer configuration is also hence not managed by the cluster. Bringing that under cluster management has bootstrapping problems. I don't see how this enhancement could solve it in general.

For IPI scenarios, I think we are in total control of everything end-to-end; we shouldn't support manually customizing the Ignition config, only injecting MachineConfig. Being in total control over the pointer config will allow us to do something better than #443 in IPI.

I think BMIPI has created a special case exception between these two because it seems like it's effectively re-injecting customized pointer configs into the cluster and we currently need networking before fetching the final config.

So with this enhancement would the BMIPI code look specifically for this machineconfig object name and include it in the pointer config?

From what I understand, converting the pointerignition to machineconfig during the 'create' target (which is what the above linked implementation does) preserves any customization done by the user. We have couple options before we do the conversion. 1) We can remove the URL and CA Cert parts and just send any customization as a MachineConfig, which from my tests has rendered correctly by MCO. 2) Send the whole pointer ignition as-is. This seems to work OK as well because the MCO, is smart to fetch the contents of the data URL and render the final config.
Perhaps, we can consider this conversion an intermediate step, where we can log/warn the users that we will deprecate direct customization of the pointer ignition and perhaps shepherd them towards creating machineconfigs instead. Further down the road, we can completely do away with the pointer ignition -> machineconfigs code completely.

hardys · 2020-11-23T09:56:31Z

The pointer configuration is also hence not managed by the cluster. Bringing that under cluster management has bootstrapping problems

I'm not sure I follow - the enhancement to manage the pointer ignition was already proposed, accepted and implemented.

It got reverted because it broke this pointer ignition customization interface (for all platforms) without deprecation/warning - this enhancement is an attempt to resolve that.

Note that the changes under discussion in this enhancement would not change the flow at all for UPI, it's only about reflecting pointer ignition customizations via the MCO rendered config in IPI deployments.

For IPI scenarios, I think we are in total control of everything end-to-end; we shouldn't support manually customizing the Ignition config, only injecting MachineConfig. Being in total control over the pointer config will allow us to do something better than #443 in IPI.

That's why I discussed potentially deprecating this customization interface (potentially only for IPI), but we can't just remove an existing interface without deprecation because we know users are using it, rightly or wrongly.

Currently there's no way to detect the pointer ignition is customized and warn the user to do something else, but if we modify the installer as described then we could tell the user to use a MachineConfig instead.

Note that injecting network configuration via MachineConfig doesn't actually solve the problem we're interested in for baremetal, e.g the case where the controlplane network is on a non-default vlan (because networking needs to be configured at the point of evaluating the pointer config merge URL)

I think BMIPI has created a special case exception between these two because it seems like it's effectively re-injecting customized pointer configs into the cluster and we currently need networking before fetching the final config.

BM IPI just injects the pointer ignition, there's nothing special about that compared to other platforms.

To enable bond+vlan configuration, we already implemented a solution where we just injected the MCO rendered config, but had to revert it because some users in the field are using pointer ignition customizations.

For the same reason the work from @runcom last cycle to manage the pointer ignition inside the MCO was also reverted.

The only special thing about BM IPI is that we currently don't use coreos-install, but AFAICT that doesn't help for IPI because there's no way to pass NM keyfiles to the --copy-network interface via the openshift-install anyway? That's just a UPI thing atm, right?

So with this enhancement would the BMIPI code look specifically for this machineconfig object name and include it in the pointer config?

No, the idea is we'd just consume the MCO rendered config, and ignore the pointer config, but ensure any user customizations injected via create ignition-configs is included in that config (to avoid breaking the existing interface, even if we document/warn that MachineConfig is preferred)

runcom · 2020-11-23T10:09:51Z

In UPI scenarios, the pointer Ignition config is provided "manually' by the admin. For example in a bare metal PXE scenario it might be manually copied to a static webserver. The pointer configuration is also hence not managed by the cluster. Bringing that under cluster management has bootstrapping problems. I don't see how this enhancement could solve it in general.

that drawback was indeed covered in https://github.com/openshift/enhancements/blob/master/enhancements/machine-config/user-data-secret-managed.md#drawbacks
with this enhancement, we're going to be able to manage the pointer ignition (as we would have done and we'll do for migrating it from ign v2 to v3 but we couldn't because we were missing any customization made). Now the customizations will be lifecycled with MCs and can just translate up/down as needed (not that we forsee any new migration soon but still) as well as anything else.
I think the UPI case was and it's still a drawback in this scenario but we're not covering that here (or in my other PR that was reverted)

cgwalters · 2020-11-23T14:24:31Z

I'm not sure I follow - the enhancement to manage the pointer ignition was already proposed, accepted and implemented.

Yes. I would clarify that as "manage the pointer config in IPI". Per above we simply cannot rely on doing so in UPI. And that means we need to be careful about what we can design that relates to the pointer config.

runcom · 2020-11-23T16:54:19Z

Yes. I would clarify that as "manage the pointer config in IPI". Per above we simply cannot rely on doing so in UPI. And that means we need to be careful about what we can design that relates to the pointer config.

so Colin, this just suffers from the same UPI drawbacks for which we'll really need a better story - for UPI, as soon as the cluster is up and running (which won't then care about where the ignition pointer is) it's still going to work normally as it's still reading the pointer config from an external server anyway on scale up.

In the case where an IPI user customizes the pointer config, this config is only persisted via the user-data secret, which was an issue when moving to an MCO templated pointer config: openshift#4228 It was also a problem for attempts for IPI baremetal to consume the MCO rendered config directly, with the aim of enabling network configuration via MachineConfig: openshift#3589 With this new approach, we detect the case where a change has been made to the pointer config by the user, and in that case it is stored via a MachineConfig manifest, such that any customizations are reflected in the MCO rendered config. Implements: openshift/enhancements#540 Co-Authored-By: Steven Hardy <[email protected]>

hardys · 2020-11-24T15:40:12Z

Installer PR up for review - openshift/installer#4413

Testing that and it seems to work well - when I inject a file via a customization to the master/worker pointer ignition I see:

$ oc get MachineConfig | grep installer
99-installer-ignition-master                                                                  3.1.0             46m
99-installer-ignition-worker                                                                  3.1.0             46m

And that change then shows up as expected in the MCS/rendered output

In the case where an IPI user customizes the pointer config, this config is only persisted via the user-data secret, which was an issue when moving to an MCO templated pointer config: openshift#4228 It was also a problem for attempts for IPI baremetal to consume the MCO rendered config directly, with the aim of enabling network configuration via MachineConfig: openshift#3589 With this new approach, we detect the case where a change has been made to the pointer config by the user, and in that case it is stored via a MachineConfig manifest, such that any customizations are reflected in the MCO rendered config. Implements: openshift/enhancements#540 Co-Authored-By: Steven Hardy <[email protected]>

cgwalters · 2020-11-24T16:23:11Z

OK now my concern here is around (mostly UPI) setups that are modifying the pointer config today for per-machine state. I want to be sure we're not breaking them.

The SOP today should be to run create ignition-configs and copy that rather than doing any modifications in place; you'd basically have to do that to have one per machine.

So I think this should be fine. The logic of storing it only if modified makes sense.

I would say though in the end this enhancement is mostly an aid to metal IPI.

Also, related to advanced networking configs - one option is to use a single NIC at install time by injecting kernel arguments just for the firstboot, then switch to bonding via MachineConfig. If metal IPI can inject kernel arguments only for the firstboot it could do this today without this enhancement.

Anyways though
/approve

hardys · 2020-11-25T10:10:45Z

OK now my concern here is around (mostly UPI) setups that are modifying the pointer config today for per-machine state. I want to be sure we're not breaking them.

The SOP today should be to run create ignition-configs and copy that rather than doing any modifications in place; you'd basically have to do that to have one per machine.

So I think this should be fine. The logic of storing it only if modified makes sense.

Yeah I don't think this will impact UPI setups at all.

I would say though in the end this enhancement is mostly an aid to metal IPI.

FWIW the reason I spent time on this proposal as an alternative to #467 is that I think it does provide some benefit beyond solving the day-1 networking problem for metal IPI:

It unblocks previously reverted MCO work from @runcom to have MCO manage the pointer ignition
It provides us a means to potentially warn IPI users they should prefer MachineConfig over pointer customizations in future

Also, related to advanced networking configs - one option is to use a single NIC at install time by injecting kernel arguments just for the firstboot, then switch to bonding via MachineConfig. If metal IPI can inject kernel arguments only for the firstboot it could do this today without this enhancement.

Yeah we don't have a clean way to do that in metal IPI atm, and there's no openshift-install configuration interface that supports it?

As previously mentioned, I've started work to enable metal IPI to use the RHCOS live-iso via metal3-io/metal3-docs#150 which may give us more options in future, but we'll need to figure out the openshift-install interfaces.

Anyways though
/approve

Thanks!

In the case where an IPI user customizes the pointer config, this config is only persisted via the user-data secret, which was an issue when moving to an MCO templated pointer config: openshift#4228 It was also a problem for attempts for IPI baremetal to consume the MCO rendered config directly, with the aim of enabling network configuration via MachineConfig: openshift#3589 With this new approach, we detect the case where a change has been made to the pointer config by the user, and in that case it is stored via a MachineConfig manifest, such that any customizations are reflected in the MCO rendered config. Implements: openshift/enhancements#540 Co-Authored-By: Steven Hardy <[email protected]>

This restores the work which was previously done via openshift#3276 but then reverted via openshift#3589 due to breaking users who customized the pointer ignition config in IPI deployments. A solution to that has been proposed via openshift#4413 - see openshift/enhancements#540 for more details. Note that some additional changes beyond the initial implementation were required due to the MCO now supporting multiple ignition versions, thus this depends on openshift-metal3/terraform-provider-ironic#46 Co-Authored-By: Steven Hardy <[email protected]>

hardys · 2020-12-03T10:55:01Z

This was implemented via openshift/installer#4413

In the case where an IPI user customizes the pointer config, this config is only persisted via the user-data secret, which was an issue when moving to an MCO templated pointer config: openshift#4228 It was also a problem for attempts for IPI baremetal to consume the MCO rendered config directly, with the aim of enabling network configuration via MachineConfig: openshift#3589 With this new approach, we detect the case where a change has been made to the pointer config by the user, and in that case it is stored via a MachineConfig manifest, such that any customizations are reflected in the MCO rendered config. Implements: openshift/enhancements#540 Co-Authored-By: Steven Hardy <[email protected]>

This restores the work which was previously done via openshift#3276 but then reverted via openshift#3589 due to breaking users who customized the pointer ignition config in IPI deployments. A solution to that has been proposed via openshift#4413 - see openshift/enhancements#540 for more details. Note that some additional changes beyond the initial implementation were required due to the MCO now supporting multiple ignition versions, thus this depends on openshift-metal3/terraform-provider-ironic#46 Co-Authored-By: Steven Hardy <[email protected]>

hardys · 2020-12-09T10:06:06Z

@kirankt since this is now implemented, perhaps you can give it the final lgtm so it merges?

kirankt · 2021-01-28T15:28:37Z

/lgtm

stbenjam · 2021-01-29T14:49:09Z

/test markdownlint

stbenjam · 2021-01-29T14:49:35Z

/retest

stbenjam · 2021-01-29T15:02:01Z

There's a few markdown issues to correct:

 enhancements/machine-config/custom-ignition-machineconfig.md:70:1 MD007/ul-indent Unordered list indentation [Expected: 0; Actual: 1]
enhancements/machine-config/custom-ignition-machineconfig.md:72:1 MD007/ul-indent Unordered list indentation [Expected: 0; Actual: 1]
enhancements/machine-config/custom-ignition-machineconfig.md:75:1 MD007/ul-indent Unordered list indentation [Expected: 0; Actual: 1]
enhancements/machine-config/custom-ignition-machineconfig.md:125:1 MD007/ul-indent Unordered list indentation [Expected: 0; Actual: 2]
enhancements/machine-config/custom-ignition-machineconfig.md:126:1 MD007/ul-indent Unordered list indentation [Expected: 0; Actual: 2]
enhancements/machine-config/custom-ignition-machineconfig.md:157:1 MD007/ul-indent Unordered list indentation [Expected: 0; Actual: 2]
enhancements/machine-config/custom-ignition-machineconfig.md:159:1 MD007/ul-indent Unordered list indentation [Expected: 0; Actual: 2]

This is an alternative to openshift#467

stbenjam · 2021-01-29T17:08:09Z

/approve
/lgtm

openshift-ci-robot · 2021-01-29T17:08:28Z

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: cgwalters, kirankt, stbenjam

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

~~OWNERS~~ [stbenjam]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

openshift-ci-robot requested review from adambkaplan and stbenjam November 16, 2020 17:55

openshift-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Nov 16, 2020

openshift-ci-robot requested review from cgwalters, kirankt, runcom and yuqi-zhang November 16, 2020 17:56

stbenjam reviewed Nov 17, 2020

View reviewed changes

enhancements/machine-config/custom-ignition-machineconfig.md Show resolved Hide resolved

cgwalters reviewed Nov 18, 2020

View reviewed changes

kirankt mentioned this pull request Nov 20, 2020

baremetal: send full ignition to masters openshift/installer#4359

Closed

cgwalters mentioned this pull request Nov 24, 2020

Add oc ignition convert3 openshift/oc#628

Closed

hardys mentioned this pull request Nov 24, 2020

Create a machineconfig for IPI pointer ignition customizations openshift/installer#4413

Merged

This was referenced Nov 27, 2020

baremetal: send full ignition to masters openshift/installer#4427

Merged

Add MCO Flattened Ignition proposal #467

Closed

hardys force-pushed the ignition_machineconfig branch from aed19d3 to e4ff697 Compare December 3, 2020 10:54

markmc mentioned this pull request Jan 4, 2021

First iteration of running the Assisted Installer in end-user clusters. #574

Merged

jlebon mentioned this pull request Jan 8, 2021

nm-initrd-generator should support rd.net.timeout.carrier karg coreos/fedora-coreos-tracker#708

Closed

openshift-ci-robot assigned kirankt Jan 28, 2021

openshift-ci-robot added the lgtm Indicates that a PR is ready to be merged. label Jan 28, 2021

kirankt approved these changes Jan 28, 2021

View reviewed changes

openshift-ci-robot removed the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jan 28, 2021

Installer/MCO: store pointer ignition customizations in MachineConfig

7b61959

This is an alternative to openshift#467

hardys force-pushed the ignition_machineconfig branch from e4ff697 to 7b61959 Compare January 29, 2021 16:45

openshift-ci-robot removed the lgtm Indicates that a PR is ready to be merged. label Jan 29, 2021

openshift-ci-robot assigned stbenjam Jan 29, 2021

openshift-ci-robot added lgtm Indicates that a PR is ready to be merged. approved Indicates a PR has been approved by an approver from all required OWNERS files. labels Jan 29, 2021

openshift-merge-robot merged commit 382dbf0 into openshift:master Jan 29, 2021

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Installer/MCO: store pointer ignition customizations in MachineConfig #540

Installer/MCO: store pointer ignition customizations in MachineConfig #540

hardys commented Nov 16, 2020

hardys commented Nov 16, 2020

cgwalters Nov 18, 2020

hardys Nov 19, 2020

hardys Nov 19, 2020

cgwalters Nov 18, 2020

hardys Nov 19, 2020

cgwalters commented Nov 19, 2020

hardys commented Nov 19, 2020 •

edited

Loading

hardys commented Nov 19, 2020

kirankt commented Nov 20, 2020

cgwalters commented Nov 20, 2020 •

edited

Loading

kirankt commented Nov 20, 2020

hardys commented Nov 23, 2020 •

edited

Loading

runcom commented Nov 23, 2020

cgwalters commented Nov 23, 2020

runcom commented Nov 23, 2020

hardys commented Nov 24, 2020

cgwalters commented Nov 24, 2020

hardys commented Nov 25, 2020 •

edited

Loading

hardys commented Dec 3, 2020

hardys commented Dec 9, 2020

kirankt commented Jan 28, 2021

stbenjam commented Jan 29, 2021

stbenjam commented Jan 29, 2021

stbenjam commented Jan 29, 2021

stbenjam commented Jan 29, 2021

openshift-ci-robot commented Jan 29, 2021


		#### Story 2

		As a baremetal IPI user, I need to deploy in an environment where network

Installer/MCO: store pointer ignition customizations in MachineConfig #540

Installer/MCO: store pointer ignition customizations in MachineConfig #540

Conversation

hardys commented Nov 16, 2020

hardys commented Nov 16, 2020

cgwalters Nov 18, 2020

Choose a reason for hiding this comment

hardys Nov 19, 2020

Choose a reason for hiding this comment

hardys Nov 19, 2020

Choose a reason for hiding this comment

cgwalters Nov 18, 2020

Choose a reason for hiding this comment

hardys Nov 19, 2020

Choose a reason for hiding this comment

cgwalters commented Nov 19, 2020

hardys commented Nov 19, 2020 • edited Loading

hardys commented Nov 19, 2020

kirankt commented Nov 20, 2020

cgwalters commented Nov 20, 2020 • edited Loading

kirankt commented Nov 20, 2020

hardys commented Nov 23, 2020 • edited Loading

runcom commented Nov 23, 2020

cgwalters commented Nov 23, 2020

runcom commented Nov 23, 2020

hardys commented Nov 24, 2020

cgwalters commented Nov 24, 2020

hardys commented Nov 25, 2020 • edited Loading

hardys commented Dec 3, 2020

hardys commented Dec 9, 2020

kirankt commented Jan 28, 2021

stbenjam commented Jan 29, 2021

stbenjam commented Jan 29, 2021

stbenjam commented Jan 29, 2021

stbenjam commented Jan 29, 2021

openshift-ci-robot commented Jan 29, 2021

hardys commented Nov 19, 2020 •

edited

Loading

cgwalters commented Nov 20, 2020 •

edited

Loading

hardys commented Nov 23, 2020 •

edited

Loading

hardys commented Nov 25, 2020 •

edited

Loading