-
Notifications
You must be signed in to change notification settings - Fork 254
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
✨ Pass preprovisioningNetworkData to Ironic #1380
Conversation
@dtantsur @Rozzii - as discussed on slack - in my environment this allows me to specify a secret with network_data.json contents via Also @zaneb if you could take a look that would be much appreciated, thanks! |
/test-centos-e2e-integration-main |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In principle, I'm in favour of this. The idea behind preprovisioningNetworkData is that it is the default for the networkData during provisioning, and the networkData itself prior to provisioning, so this would be consistent with that.
However, there is an issue with this implementation - during provisioning at least, we will bounce ironic back and forth between the preprovisioningNetworkData and the actual networkData (assuming they are different) on every reconcile. It'll set the former during ValidateManagementAccess() and switch back to the latter immediately afterwards during Provision(). I'd assume this carries a risk of races.
We could restrict the setting of the preprovisioningNetworkData to certain BMO states (presumably Inspecting and Deprovisioning), but I think we'd still have an issue where during provisioning if we have to boot IPA again then it might get different network data than if we don't. I think to get consistent behaviour we have to do this only when moving the ironic node to manageable (from either enrolled, clean failed, inspect failed, adopt failed, or available) or deleting.
If the ironic provisioner is making the decision of whether it needs the preprovisiningNetworkData, it may also pay to let it call the function to retrieve it when required (similar to provisioner.HostConfigData) rather than calling it unconditionally from the controller.
I'm not sure this is correct - the code I'm adding here sets the Node network_data (which is not currently set at all by BMO) - @dtantsur can confirm but IIUC this API is only used to provide network configuration to the IPA Ramdisk pre-provisioning (so it makes sense to only consume Then on Provision we'll use the existing code, which specifies a config drive with the Am I missing something? By setting currently-unset the Node network_data, I don't see how we can introduce a race with the config-drive data? |
7ef443e
to
3d86d50
Compare
Oh, sorry, you're right. I assumed that this was how we were passing the networkData today, but evidently that's set in the ConfigDrive instead. |
3d86d50
to
fd944ad
Compare
/test-centos-e2e-integration-main |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As we have discussed it prior outside this PR, I am in favor of this approach, via custom patches this is already a way of using "preprovisioningNetworkData" in some downstream organizations as far as I have heard. Thank you @hardys for implementing this for the upstream repo! I can't really give a confident opinion about the go code, as far as I can tell it is fine I don't see really any inconsistencies but I am not that familiar with the BMO code.
In any case from my POV it LGTM!
} | ||
numUpdates := len(updater.Updates) | ||
updater.SetTopLevelOpt("network_data", networkData, ironicNode.NetworkData) | ||
if len(updater.Updates) != numUpdates { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: maybe we should change SetTopLevelOpt to return a boolean (updated or not)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes I thought the same - I can perhaps look at a follow-up which refactors things to do that, but it seemed outside of the scope of this PR.
/lgtm |
Adds a convenience function similar to the existing NetworkData()
When no PreprovisioningImage integration is enabled we have no way currently to retrieve the preprovisioningNetworkData in the IPA ramdisk which prevents the Ironic DHCP-less flow[1] from being possible. With this change we'll pass the preprovisioningNetworkData to Ironic so it can be injected into the ramdisk image without requiring any custom image-build controller. [1] https://docs.openstack.org/ironic/latest/admin/dhcp-less.html
fd944ad
to
7de3670
Compare
/test-centos-e2e-integration-main |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/lgtm
@Rozzii: adding LGTM is restricted to approvers and reviewers in OWNERS files. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
/approve |
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: dtantsur, Rozzii The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/lgtm
if hcd.host.Spec.PreprovisioningNetworkDataName == "" { | ||
return "", nil | ||
} | ||
networkDataRaw, err := hcd.getSecretData( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I did some more digging into the difference between ObtainSecret()
(used here) and AcquireSecret()
, after having got it wrong several times 😄
Both add a label to the Secret so that it gets watched and cached.
Only AcquireSecret()
adds an owner reference so that the BMH gets reconciled if the Secret changes.
In the case of network/user/metaData, what matters is the value at a point in time, when you start provisioning. It's the start of provisioning that kicks off the reconcile, so there's no need to re-reconcile when the Secret changes.
There's a strong argument that preprovisioningNetworkData is the same, and what matters is the value at the point in time where you start inspecting/preparing/deprovisioning.
There's probably another argument that since ironic may reboot the node in response to its own internal events (not the BMO), then we should always provide it with the most up-to-date data we know of. In practice we will get there eventually because we always requeue every few minutes at most, but there may be situations where something unintuitive happens.
This would be interesting to discuss further but not worth blocking on.
What this PR does / why we need it:
Ironic supports DHCP-less provisioning, which I expected to work via Metal3, particularly since we already provide the IPAM controller which can be used to statically assign control plane IPs.
Recent discussions however highlighted that this flow is not currently possible via Metal3, unless you implement a custom
PreprovisioningImage
controller, or have other downstream customisations.In the case where
--build-preprov-image
is passed to BMO, we enable creation of thePreprovisioningImage
CR and an internal default controller (which does not embed thepreprovisioningNetworkData
, but downstream custom controllers are doing so, so we don't want to break existing users opting into this flow)In the case where
--build-preprov-image
is not passed, Ironic can be configured to injectnetwork_data
into the Ramdisk, and IMO this seems like reasonable default behaviour when no image controller is enabled.In this mode if the user provides
preprovsioningNetworkDataName
we'll set the Nodenetwork_data
on initial registration of the Ironic node.This means when IPA boots for inspection, the virtual media device can be mounted and the
openstack/latest/network_data.json
file retrieved to configure the network (e.g byglean
in the upstream Ironic flow, but users can build a Ramdisk image with any tool/script that consumes this file)Note that I don't think the current upstream IPA image currently includes simple-init/glean so this will still require a custom Ramdisk image, but it does mean we can support DHCP-less without mandating a custom
PreprovisioningImage
controller.