-
Notifications
You must be signed in to change notification settings - Fork 471
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[machine-config-operator/baremetal] MCO declarative network configuration #399
Conversation
/cc @cgwalters |
/retitle [machine-config-operator] MCO declarative network configuration |
There was a past nmstate related enhancement. Why did this one get created vs evolving the existing one? |
@russellb as I recall, the previous enhancement was for kubernetes-nmstate, whereas this one only mentions nmstate (the non-kubernetes variant) as an implementation detail. My biggest concern with this approach is that this takes us from MachineConfigs (typically) being used to describe a pool of machines to them describing a single machine. It's going to be very cumbersome to use this mechanism to configure a cluster full of unique machines (e.g. each machine needs to be configured with a static IP address) and I imagine we are almost immediately going to be asked to build a templating mechanism so that customers don't need to create a MachineConfigPool for every machine. |
Right, but I remember feedback pushing that enhancement toward discussing the problem more generally and also evolving the proposal to be better integrated with MCO, so I expected something like this just to come in as another revision of the existing enhancement. At a minimum I'd expect them to link to each other to explain the relationship. Is the other one now deprecated and should be closed? Are they to be considered as alternatives to each other? |
Yes. The biggest objection to the original enhancement, which proposed adding the kubernetes upstream kubernetes-nmstate, was that it added a new way to configure Nodes separate from MachineConfig and that it potentially delayed problems to the following reboot. It had other issues such as not having a rollout functionality (though it has rollback).
This issue was not present in the kubernetes-nmstate Enhancement proposal. The current proposal basically keeps the API that kubernetes-nmstate users are familiar with and adds and solves the other objections. The biggest change for kubernetes-nmstate users, an a significant one at that, is the granularity. Doing anything with MachineConfig that does not target the entire MachinePool is not really possible at the moment. Of course, there's ways to work around it. Of the top of my head, we could keep having the user do NodeNetworkConfigurationPolicy and the controller (as in the current proposal), writes multiple files to the MachineConfig for the appropriate MachineConfigPool. That implies that each node in a MachinePool will have several of these files written to its network configuration directory, with each file being targeted at a specific node. I believe that this is fine since:
|
That is a good question. I could see both, but I thought that this supersedes the old one. I certainly hoped for the approach proposed here to get an approval quickly since it addresses the objections to the other. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Trying to get up to speed: left some questions/comments. I'm not clear on :is this 1 new rendered-config per node? or 1 new rendered config with a bunch of different NM settings for each node? The former seems like a very confusing ui, the latter means that a MC is no longer an easy to grok state for a set of nodes but needs to be parsed closely to figure out what is on each node with different nodes potentially succeeding and failing from the same config? (Note: if 1 node fails on a config we stop applying it to the pool - what do we do in this case?)
This is proposing 2 major things the MCO doesn't actually do yet at once: per node configuration and rebootless updates. Is there a way to tackle this in stages?
I don't doubt that this can be implemented in some way, but I do worry about the maintainability & what issues this will raise when troubleshooting - MCO team will be tasked with maintaining all of this when we have no expertise along with the issues that will come from those 2 very major changes. I also have a concern that the MCO will start being tasked with per node configurations across entire clusters which is the opposite of the MCO (and the problem that I thought we were moving away from).
I'd like to hear about the alternative Option A: "include kubernetes-nmstate as a standalone component within OpenShift" Would this make the network configuration more easily maintainable, etc… and why this is a bad choice for this specific chagne. Especially if as Alex mentioned this might require some tooling to create configs, it seems like this nmstate standalone component might make sense from a maintainability perspective?
There's a whole lot going on here, but overall |
/cc @cgwalters @runcom @kikisdeliveryservice @ericavonb |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok so I watched the video and re-read this several times and Option C seems like a pretty stable approach at least vis-a-vis the MCO
On the MCO side, you'd be making changes to /templates/common/baremetal (or somewhere in templates) to laydown the files/create .mount unit, then post kubelet leverage kubernetes-nmstate to make changes to that /tmp/nm-system-connections?
If eventually the user lays down some other MC that causes a reboot the MCO goes thru the same thing above, /tmp/nm-system-connection changes would be lost and the kubernetes-nmstate post-kubelet would do it's thing and reconcile the /tmp/nm-system.. again?
But just to be clear those initial template changes laid down by the MCO would happen during an upgrade to 4.x (which would obviously incur a reboot), but the idea is that when the user decides to leverage kubernetes-nmstate they won't incur an additional reboot to set that using NodeNetworkConfigurationPolicy via nmstate.io ? or that if they want to make changes they'd be using the NNCP to make the change also not incurring a reboot.
Do I have it correct?
If you need persistent changes, you do them with MCO. | ||
Kubernetes-nmstate will only affect post-kubelet so we comply with the "MC owns every configuration until boot time". | ||
|
||
[Link](https://asciinema.org/a/uMqwIpvfhuI67ShT12csYXm3h) to an asciicast showing this option. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This video is really really helpful
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, you have it correct. We skip the reboot, and the configuration for kubernetes-nmstate is ephemeral on disk.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
awesome thanks!
2. Create two temporary directories for overlay purposes. | ||
3. Mount an overlay combining the NetworkManager standard /etc/NetworkManager/system-connections and mount it at the path pointed to by 1. e.g., /etc/NetworkManager/system-connections-merged | ||
|
||
kubernetes-nmstate will then operate as is, but the keyfiles that ultimately are written by nmstate would effectively be ephemeral. When a node is rebooted, the kubernetes-nmstate-handler will re-process any existing NodeNetworkConfigurationPolicy CRs, and put the configuration back in place. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Aside but recently I wrote: https://blog.verbum.org/2020/08/22/immutable-%E2%86%92-reprovisionable-anti-hysteresis/
It's not that we want the configuration written by nmstate to be ephemeral per se - it's that the "source of truth" should live in one place - in this case, etcd (and not /etc
). By having the implementation of this be ephemeral we're avoiding hysteresis during a node boot - there's no dependency during the boot on the previous configuration.
ac9712e
to
c62ef34
Compare
|
||
* Replace SRIOV operator | ||
* Configure and control primary interface on Day 1, including bonds and VLANs. | ||
* Part of the effort for that: [Add MCO Flattened Ignition proposal](https://github.com/openshift/enhancements/pull/467) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@bcrochet thanks for adding the notes here, one minor follow-up, I think the proposal is not going to work in day 2 for primary node interface, agree?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would assume not. But don't quote me on that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would it make sense to update this statement to Configure and control primary interface on Day 1 and Day 2
as the non-goal?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there any protection against trying to use this to adjust primary interface configuration if it's not intended to be used that way?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think there is a protection mechanism in current proposal (this is also true for additional host interfaces if multiple components try to configure the same device), my understanding of the flow here is it will fail in the case that network configuration is wrong, deleting the NNCP + rebooting node(s) would erase the ephemeral network configuration and recover.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the failure scenario, it wouldn't even require a reboot. But to answer @russellb , no, there is no protection other than that the admin would have to affirmatively try to do it.
…tion Enhancement proposal to extend MCO for declarative network configuration.
c62ef34
to
bcb5c85
Compare
/assign @kikisdeliveryservice |
authors: | ||
- "@bcrochet" | ||
reviewers: | ||
- "@cwalters" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's get some MCO team members added here as well. @runcom, @kikisdeliveryservice, @yuqi-zhang, @ericavonb, @sinnykumari
I'm happy with Option C and really excited you were able to come with it. Thank you for your hard work on this @bcrochet ! /lgtm |
somehow @cgwalters approval didnt end up tagging this? |
ooh we need someone from the Owners file for that final approval.. /assign @dgoodwin |
/approve |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/approve
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: bcrochet, cgwalters, dgoodwin, kikisdeliveryservice The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
Enhancement proposal to extend MCO for declaritive network configuration.