Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add BMH live-iso proposal #150

Merged
merged 1 commit into from
Jan 22, 2021
Merged

Conversation

hardys
Copy link
Member

@hardys hardys commented Nov 10, 2020

Adds a proposal to expose an Ironic API that allows us to boot an arbitrary live-iso image, vs only specifying an image to be written to disk by IPA.

@metal3-io-bot metal3-io-bot added approved Indicates a PR has been approved by an approver from all required OWNERS files. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Nov 10, 2020
@hardys
Copy link
Member Author

hardys commented Nov 10, 2020

/cc @zaneb @dhellmann

design/baremetal-operator/bmh_boot_iso.md Outdated Show resolved Hide resolved
design/baremetal-operator/bmh_boot_iso.md Outdated Show resolved Hide resolved
design/baremetal-operator/bmh_boot_iso.md Outdated Show resolved Hide resolved
@hardys hardys removed the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Nov 11, 2020
design/baremetal-operator/bmh_boot_iso.md Outdated Show resolved Hide resolved
design/baremetal-operator/bmh_boot_iso.md Outdated Show resolved Hide resolved
design/baremetal-operator/bmh_boot_iso.md Outdated Show resolved Hide resolved
design/baremetal-operator/bmh_boot_iso.md Outdated Show resolved Hide resolved
design/baremetal-operator/bmh_boot_iso.md Outdated Show resolved Hide resolved
fast-track provisioning in this workflow (since it's booting two different iso images)
so this adds a reboot into the process. This can potentially be avoided in some cases
where we can provide a status annotation on creation of the BMH such that inspection
is not performed.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unfortunately, currently fast-track does not really work with virtual media at all (reboots always happen). I'll be working on it in the coming weeks.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@dtantsur thanks, if you have a link for this please let me know, and I'll add it to the list of Ironic roadmap items

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

design/baremetal-operator/bmh_boot_iso.md Outdated Show resolved Hide resolved
design/baremetal-operator/bmh_boot_iso.md Outdated Show resolved Hide resolved
@metal3-io-bot metal3-io-bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Nov 18, 2020
@hardys
Copy link
Member Author

hardys commented Nov 18, 2020

Ok thanks for the feedback all - I've pushed an update which I think addresses the feedback so far :)

Copy link
Member

@dhellmann dhellmann left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm happy with the state of this. There are a few details we could fill in as updates to this PR or as separate PRs later.

/approve

iso is booted, we don't have a way to detach the virtualmedia and will have to
rely e.g on efibootmgr in whatever iso gets booted to ensure the correct boot device.

- The inspection of a BMH will still rely on booting IPA and there's no support for
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We wouldn't do it as part of this enhancement, but should we add a flag to disable inspection to support workflows that get the hardware details another way?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes we could do, I was thinking we'd rely on the existing interface e.g pass some status annotation at the point of creating the BMH, but then we'd need to change behavior so it can be updated after the hardware details are collected.

A flag may be clearer though, I can either include that here or in a follow-up proposal when we have the boot-iso part working?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For the use case you're working on, the host resource needs to exist before the ISO with the alternative agent is booted, so there's no opportunity to collect valid data in advance. We could pass fake data, but that feels wrong.

How about if we say that by default if there is a live image on the host we would not perform inspection, under the assumption that we either don't care about hardware details in those cases or the details will come from somewhere else later. Then we can use other design docs to cover an API for setting hardware details after the first reconciliation and to add a flag like spec.liveImage.enableInspection to restore the ability to (optionally) perform inspection with IPA before attaching the ISO.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure making inspection conditional on the spec.liveImage will work, because we may create the BMH resource then set the image reference later, and in that case we want the BMH to enroll and be ready to boot the iso, but not perform the existing inspection steps?

Perhaps adding an explicit flag elsewhere in the spec would be clearer?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's just have an annotation that causes inspection to be skipped. That's simple to do, works for all use cases, and avoids making it part of the Spec which is a bit icky.
There is a proposal I can't find right now to add an annotation to force a new inspection. Maybe we could use the same key with different values (e.g. never to skip it even the first time vs. refresh to trigger a new one).

Note that to add the data externally later on you'll need RBAC write access to the Status, which is unusual.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@zaneb I think this deserves a separate enhancement proposal - I suspect we need a way to skip inspection, but also add the data via the existing status annotation (avoiding the status RBAC issue you mention), which beyond the scope of this PR - I'll start a new one if that sounds reasonable.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with moving the inspection discussion to another enhancement.

design/baremetal-operator/bmh_live_boot.md Outdated Show resolved Hide resolved
Copy link
Member

@zaneb zaneb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/approve


Note that in this mode, `rootDeviceHints` and `userData` will not be used
since the image won't be written to disk, and Ironic doesn't
support passing user-data in addition to the iso attachment at present.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's about time we added a webhook to validate this kind of stuff.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's a period of time when validation web hooks aren't applied, between the time the CRD is installed (and the API server supports the new resource type) and when the web hook server is registered and starts.

iso is booted, we don't have a way to detach the virtualmedia and will have to
rely e.g on efibootmgr in whatever iso gets booted to ensure the correct boot device.

- The inspection of a BMH will still rely on booting IPA and there's no support for
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's just have an annotation that causes inspection to be skipped. That's simple to do, works for all use cases, and avoids making it part of the Spec which is a bit icky.
There is a proposal I can't find right now to add an annotation to force a new inspection. Maybe we could use the same key with different values (e.g. never to skip it even the first time vs. refresh to trigger a new one).

Note that to add the data externally later on you'll need RBAC write access to the Status, which is unusual.

@hardys
Copy link
Member Author

hardys commented Dec 11, 2020

Ok I added the Ironic links, and if folks agree I think the skipping inspection workflow would be best discussed in a new proposal - implementation of this doesn't directly depend on it, so I'll create a new PR where we can decide the best approach for that.

hardys pushed a commit to hardys/ironic-image that referenced this pull request Dec 11, 2020
iso is booted, we don't have a way to detach the virtualmedia and will have to
rely e.g on efibootmgr in whatever iso gets booted to ensure the correct boot device.

- The inspection of a BMH will still rely on booting IPA and there's no support for
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with moving the inspection discussion to another enhancement.

@metal3-io-bot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: dhellmann, hardys, zaneb

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:
  • OWNERS [dhellmann,hardys,zaneb]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@hardys
Copy link
Member Author

hardys commented Dec 16, 2020

/retitle Add BMH LiveImage proposal

@metal3-io-bot metal3-io-bot changed the title Add BMH boot-iso proposal Add BMH LiveImage proposal Dec 16, 2020
hardys pushed a commit to hardys/metal3-dev-env that referenced this pull request Dec 18, 2020
Add initial support for downloading an FCOS iso, related to
development of metal3-io/metal3-docs#150
hardys pushed a commit to hardys/metal3-dev-env that referenced this pull request Dec 18, 2020
Add initial support for downloading an FCOS iso, related to
development of metal3-io/metal3-docs#150
hardys pushed a commit to hardys/metal3-dev-env that referenced this pull request Dec 21, 2020
Add initial support for downloading an FCOS iso, related to
development of metal3-io/metal3-docs#150
@hardys
Copy link
Member Author

hardys commented Dec 22, 2020

I think this is now ready to merge?

We'll also need some way to disable cleaning as discussed via #151 and I'll raise a new enhancement to discuss how we can skip inspection.

@dhellmann
Copy link
Member

I think this is now ready to merge?

I think so. @zaneb approved an earlier version and I'm happy with this version. It would be good to have a thumbs-up from someone else, too.

Maybe we can start the lazy consensus process this week?

@hardys
Copy link
Member Author

hardys commented Jan 7, 2021

Discussion on the implementation PR and slack indicates we may want to go with a different interface so I'll hold this until that direction is agreed.

/hold

@metal3-io-bot metal3-io-bot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Jan 7, 2021

### Risks and Mitigations

- Currently there is no detachment API in ironic, so in the case where an installer
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is now! Not sure we have any documentation about it, but ping @bfournie or myself for pointers.

avoided in some cases where we can provide a status annotation on creation of
the BMH such that inspection is not performed.

- Similarly cleaning of a BMH will rely on IPA, so will require a reboot which may
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For clarity: in-band cleaning. There is also out-of-band cleaning that is done via talking to the BMC, and we'll keep using that. It is only available as manual cleaning now, so it's fine to disable the automatic one.


[Allow config-drive with virtualmedia ramdisk deploy](https://storyboard.openstack.org/#!/story/2008380)

[Method to detach virtualmedia after initial boot](https://storyboard.openstack.org/#!/story/2008363)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Merged

these don't block the initial implementation but may be desirable to make
this feature more flexible:

[Allow config-drive with virtualmedia ramdisk deploy](https://storyboard.openstack.org/#!/story/2008380)
Copy link
Member

@dtantsur dtantsur Jan 11, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On review, will try to get it merged this week.

UPD: merged

@hardys
Copy link
Member Author

hardys commented Jan 12, 2021

Alternative BMH API proposed via metal3-io/baremetal-operator#759 - if we're in agreement with that I'll rework this to reflect the new interface.

@hardys
Copy link
Member Author

hardys commented Jan 12, 2021

Updated to reflect interface in metal3-io/baremetal-operator#759

/hold cancel

@metal3-io-bot metal3-io-bot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Jan 12, 2021
@hardys
Copy link
Member Author

hardys commented Jan 13, 2021

/retitle Add BMH live-iso proposal

@metal3-io-bot metal3-io-bot changed the title Add BMH LiveImage proposal Add BMH live-iso proposal Jan 13, 2021
@hardys
Copy link
Member Author

hardys commented Jan 21, 2021

The implementation landed ref metal3-io/baremetal-operator#759 so it'd be good to merge this - @zaneb could you revisit perhaps? Thanks!

@dhellmann
Copy link
Member

/retest
/lgtm

Since the code has merged, let's land this and update it in another PR if we're missing any details.

@metal3-io-bot metal3-io-bot added the lgtm Indicates that a PR is ready to be merged. label Jan 22, 2021
@zaneb
Copy link
Member

zaneb commented Jan 22, 2021

Netlify tests are failing with make: *** No rule to make target 'build'. Stop., so I would infer that this needs to be rebased to pick up whatever tooling support was added in the repo for those jobs.
(I miss Zuul.)

@metal3-io-bot metal3-io-bot removed the lgtm Indicates that a PR is ready to be merged. label Jan 22, 2021
@hardys
Copy link
Member Author

hardys commented Jan 22, 2021

Netlify tests are failing with make: *** No rule to make target 'build'. Stop., so I would infer that this needs to be rebased to pick up whatever tooling support was added in the repo for those jobs.
(I miss Zuul.)

Thanks, now rebased and the jobs completed OK

@dhellmann
Copy link
Member

/lgtm

@metal3-io-bot metal3-io-bot added the lgtm Indicates that a PR is ready to be merged. label Jan 22, 2021
@metal3-io-bot metal3-io-bot merged commit 62dc0de into metal3-io:master Jan 22, 2021
derekhiggins pushed a commit to derekhiggins/ironic-image that referenced this pull request Feb 16, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. lgtm Indicates that a PR is ready to be merged. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants