RFE: Move user-data secret creation from the installer into the MCO #683
I had forgotten about this issue, but actually came back to this idea as part of #736
I thought about this more, and it makes sense to do this, but I don't see how we get out of the installer understanding Ignition, because it needs to generate it for the bootstrap machine and, currently, the masters. Improving this also introduces a new cross-dependency between the MCO and the machineAPI operator, which I think is a good thing, but currently the installer pre-generating the user data implicitly avoids that dependency.
This is prep for fixing openshift#683. Today the installer generates this "pointer Ignition" config, which ends up as e.g. user-data in AWS and is just a pair of (MCS URL, root CA). We need to do this because of size limitations on AWS user data. It makes a lot of sense for the MCO to be in control of generating the pointer Ignition config too, as it helps centralize knowledge of Ignition. Ultimately, the goal here is tighter integration between machineAPI and the MCO; e.g. the machineAPI asks to generate the pointer Ignition config, rather than having it stored as a static secret. This could then unlock the ability for the MCO to inject e.g. one-time auth tokens.
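For reference, a minimal sketch of what such a pointer config contains, assuming Ignition spec v2.2 and made-up URL/CA values (the real installer/MCO output differs in detail):

```go
package main

import (
	"encoding/json"
	"fmt"
)

func main() {
	// The pointer config is essentially (MCS URL, root CA): it tells Ignition
	// to fetch the real config from the Machine Config Server and which CA to
	// trust while doing so. Values below are placeholders.
	mcsURL := "https://api-int.example.com:22623/config/worker"
	rootCA := "data:text/plain;charset=utf-8;base64,<root-ca>"

	pointer := map[string]interface{}{
		"ignition": map[string]interface{}{
			"version": "2.2.0",
			"config": map[string]interface{}{
				"append": []map[string]string{{"source": mcsURL}},
			},
			"security": map[string]interface{}{
				"tls": map[string]interface{}{
					"certificateAuthorities": []map[string]string{{"source": rootCA}},
				},
			},
		},
	}
	out, _ := json.MarshalIndent(pointer, "", "  ")
	fmt.Println(string(out)) // this JSON is what lands in the user-data secret
}
```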
PR in #759
Discussed this with some machineAPI folks today. One thing that came out of that is that a likely good design is for the MCO to watch for the creation of new Machines and generate the user-data secret for each one. And we'd clean up the generated secrets when a machine was deleted.
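To make that shape concrete, here is a rough sketch under assumed names and APIs (the controller name, the secret key, and `renderPointerIgnition` are placeholders, not actual MCO code): a controller creates an owned per-machine secret on Machine creation, so deleting the Machine garbage-collects the secret.

```go
package controllers

import (
	"context"

	corev1 "k8s.io/api/core/v1"
	apierrors "k8s.io/apimachinery/pkg/api/errors"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/apimachinery/pkg/apis/meta/v1/unstructured"
	ctrl "sigs.k8s.io/controller-runtime"
	"sigs.k8s.io/controller-runtime/pkg/client"
)

// UserDataReconciler is a hypothetical MCO controller for the flow above.
type UserDataReconciler struct {
	client.Client
}

func (r *UserDataReconciler) Reconcile(ctx context.Context, req ctrl.Request) (ctrl.Result, error) {
	// Fetch the Machine that triggered this reconcile.
	machine := &unstructured.Unstructured{}
	machine.SetAPIVersion("machine.openshift.io/v1beta1")
	machine.SetKind("Machine")
	if err := r.Get(ctx, req.NamespacedName, machine); err != nil {
		return ctrl.Result{}, client.IgnoreNotFound(err)
	}

	secret := &corev1.Secret{
		ObjectMeta: metav1.ObjectMeta{
			// Per-machine secret name, e.g. "worker-1-user-data".
			Name:      machine.GetName() + "-user-data",
			Namespace: machine.GetNamespace(),
			// The owner reference makes Kubernetes garbage-collect the
			// secret when the Machine is deleted, covering the cleanup
			// half of the proposal.
			OwnerReferences: []metav1.OwnerReference{{
				APIVersion: machine.GetAPIVersion(),
				Kind:       machine.GetKind(),
				Name:       machine.GetName(),
				UID:        machine.GetUID(),
			}},
		},
		Data: map[string][]byte{
			// renderPointerIgnition stands in for the MCO's own
			// pointer-config generation for this machine's role/pool.
			"userData": renderPointerIgnition(machine),
		},
	}
	if err := r.Create(ctx, secret); err != nil && !apierrors.IsAlreadyExists(err) {
		return ctrl.Result{}, err
	}
	return ctrl.Result{}, nil
}

func renderPointerIgnition(machine *unstructured.Unstructured) []byte {
	return []byte("{}") // placeholder
}
```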
@cgwalters I like that option. Seems promising.
By reading the code in the machine-api-operator: it's not the MAO itself that would spin; all the providers must be tweaked to spin (https://github.com/openshift/installer/blob/master/pkg/asset/machines/aws/machines.go#L106), which makes it trickier AFAICT. I'll sync with Alberto on this.
The change of name here to be per-machine will also imply that MachineSets aren't holding that anymore (meaning more changes in MachineSets as well, I guess).
Yeah, you are right, but that's our burden; it's a limitation in our API and we'll have to deal with it. E.g. we could let providers return a well-known error type and let the core machine controller react accordingly, or consider deprecating the providers' userData field, or have a new backward-compatible machine.enableAutomaticBootstrapping=true...
sgtm. Do you think we could try to extend this identity awareness to completely remove the need for the machineApprover? Would the MCO have the ability to track whether that owned secret resulted in a successful node or a failure? If so, could we let this new controller signal this back to the machine object, i.e. as a condition, to leave a breadcrumb for users on why that machine didn't become a node?
As far as I understand this, as Machine API, we would need to drop the user-data secret reference in each of our providers and instead have some sort of role field so that the MCO knows whether the machine is a worker or a master. This would probably be better on the machine rather than the providerSpec, so that the MCO can read it easily and MAPI can spin waiting for it before calling the actuators. We wouldn't need to update the actuator interface; they would just need to know how to find the secret from the MachineSpec rather than the ProviderSpec. If we didn't deprecate the user-data-secret field, it would somehow need to be modified as machines are created, wouldn't it? We would still want it to point to a valid secret name, but the MachineSet can't have it set to the right name. A webhook could do this, but it feels a little clunky?
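Purely as an illustration of the API shape being discussed (none of these field names exist today; `Role`, the placement of `UserDataSecret`, and the stand-in `ProviderSpec` are all assumptions), the change could look roughly like:

```go
package sketch

import (
	corev1 "k8s.io/api/core/v1"
)

// ProviderSpec stands in for today's provider-specific embedded spec.
type ProviderSpec struct {
	// provider-specific fields elided
}

// MachineSpec sketches the proposal: the role and the user-data reference
// live on the Machine itself, where the MCO can read them and the machine
// controller can wait on them, instead of inside each ProviderSpec.
type MachineSpec struct {
	// Role tells the MCO which pointer Ignition config this machine needs,
	// e.g. "master" or "worker".
	Role string `json:"role,omitempty"`

	// UserDataSecret points at the generated secret; kept for backward
	// compatibility if the per-provider field is not deprecated outright.
	UserDataSecret *corev1.LocalObjectReference `json:"userDataSecret,omitempty"`

	// ProviderSpec stays provider-specific, as today.
	ProviderSpec ProviderSpec `json:"providerSpec"`
}
```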
Would your suggestion be to move to unique bootstrap tokens per machine and move back to the built-in K8s CSR approving mechanisms?
If it didn't track that a node joined the cluster, how would it clean up resources that are no longer needed? Time based? Owner Ref to Machine?
This issue predates openshift/enhancements, but if there is renewed interest it might make sense to port it to an enhancement so you can freeze out the places that have consensus. I'm fine either way, but thought I'd mention it in case folks were interested.
I'm going to create an enhancement, but I've been thinking a lot about this the past few days. Given the current schedule, I think coupling MCO/MAO is going to be non-trivial to accomplish in the timeframe we have to ship Ignition v3; or rather, it will require more than this idea I came up with without tightly coupling Machines/MachineSets into the MCO:
I think the above is a good approach: we leverage the controllers we already have in place in MCO/MAO, we use the Statuses of the CRDs to synchronize between controllers, and it requires fewer changes to both projects (MAO: 2 major? changes). Let me know what you all think in the meantime (/cc @yuqi-zhang)
Overall 👍 from me! This sounds like a simple integration, which is good because it'll be easy to implement and understand, and we can do something more sophisticated later. But, let's at least briefly look at
Basically, if we can create a flow where the MAO uses an annotation on its MachineSet that selects the secret (which we need for control plane vs worker anyway), then we can extend it to generic pools and ideally, eventually, machine-specific configs.
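A tiny sketch of that flow (the annotation key and the default are made up, not an existing API):

```go
package sketch

// poolAnnotation is a hypothetical annotation a MachineSet could carry so
// the MCO knows which pool's pointer config backs that set's user-data secret.
const poolAnnotation = "machineconfiguration.openshift.io/pool"

// poolForMachineSet maps a MachineSet's annotations to a pool name, falling
// back to "worker" when nothing is set (an assumed default).
func poolForMachineSet(annotations map[string]string) string {
	if pool, ok := annotations[poolAnnotation]; ok && pool != "" {
		return pool // e.g. "worker", "infra", or a custom pool
	}
	return "worker"
}
```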
Don't get me wrong, I 100% support that per-machine secret, and it indeed would allow for more sophisticated scenarios like the one highlighted there! I think that design would be best once we move forward, but I feel the migration to Ignition v3 brings along a lot of complexity and pitfalls that I'd love to remedy by starting with something simpler (hope that makes sense). I also think the per-machine secret would still be valid once we go the simpler path; the simpler path itself isn't a stopgap but a simpler implementation, so it makes sense in the future to allow per-machine secrets to implement things like #1619 #1720
I just realized we can only do this in the code, but the installer should still create the master|worker.ign for UPI to work correctly, and the MCO isn't running until bootkube.
The abstraction that I'd like to see implemented can be summarized as such:

```
ObjRef:
  Name: string
  Namespace: string
  Kind: string (or some other concrete kind type)
  KeyName: string
  PrependSelfName: bool

Machine:
  ...
  ProviderSpec: {}
  UserObjects: [] ObjRef
```

The machine-controller could fetch one or more objects per machine with the dynamic client during create. These objects would be passed to each actuator (cloud provider) as a `map[string]interface{}`. Each actuator would verify the type and presence of the items fetched.

Consider the fields in ObjRef. Name and Namespace are obvious. KeyName is what goes in the key portion of the `map[string]interface{}`; the actuator already knows what keys it wants. PrependSelfName prepends the machine's own name to the object's Name it's looking for. For example, for a machine named "worker-1", this would cause the machine-controller to request the secret named "worker-1-user-data" from namespace openshift-machine-api. If that object is found, it is placed into the map we pass to the actuator, e.g.:

```
c.Get(mything)
userObjects["instance-user-data"] = mything
actuator.Create(userObjects...)
```

This kind of abstraction could be widely supported; I'm sure there are other k8s components that need to do similar things. In this manner, we could support all the clouds similarly, and since it supports arbitrary objects, it is quite extensible.
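A hedged Go sketch of that lookup (ObjRef and its fields are the proposal above, not an existing API; only Secrets are handled here, and the name-joining rule for PrependSelfName is an assumption):

```go
package machinecontroller

import (
	"context"
	"fmt"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/apimachinery/pkg/runtime/schema"
	"k8s.io/client-go/dynamic"
)

// ObjRef mirrors the proposed abstraction above.
type ObjRef struct {
	Name            string
	Namespace       string
	Kind            string
	KeyName         string
	PrependSelfName bool
}

// fetchUserObjects resolves each ObjRef with the dynamic client and returns
// the results keyed by KeyName, ready to hand to the actuator.
func fetchUserObjects(ctx context.Context, dyn dynamic.Interface, machineName string, refs []ObjRef) (map[string]interface{}, error) {
	userObjects := map[string]interface{}{}
	for _, ref := range refs {
		name := ref.Name
		if ref.PrependSelfName {
			// e.g. machine "worker-1" + Name "user-data" => "worker-1-user-data"
			// (the joining rule is an assumption).
			name = machineName + "-" + ref.Name
		}
		// Only Secrets are sketched; a real implementation would map
		// ref.Kind to the right GroupVersionResource.
		gvr := schema.GroupVersionResource{Version: "v1", Resource: "secrets"}
		obj, err := dyn.Resource(gvr).Namespace(ref.Namespace).Get(ctx, name, metav1.GetOptions{})
		if err != nil {
			return nil, fmt.Errorf("fetching %s/%s: %w", ref.Namespace, name, err)
		}
		userObjects[ref.KeyName] = obj
	}
	return userObjects, nil
}

// The machine-controller would then call, roughly:
//   userObjects, err := fetchUserObjects(ctx, dyn, machine.Name, machine.Spec.UserObjects)
//   actuator.Create(ctx, machine, userObjects)
```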
Currently the installer creates pointer Ignition configs which end up as user data secrets referenced by the installer-generated Machine(Sets). That makes things like the Ignition v2 (spec 3) transition tricky, because you need to touch both the installer (openshift/installer#1468) and MCO (#583) with some kind of ratcheting.
I think we should drop the installer's pointer Ignition-config secrets and instead have the MCO create pointer Ignition-config secrets for any roles it supports (currently master and worker). This consolidates knowledge of supported roles and their configuration in the MCO and simplifies the installer->MCO interface. The MCO can put the MCS cert into the pointer configs without having to worry about cross-repo communication (and it is better positioned to update the secrets if/when the MCS cert rotates). The MCO can switch Ignition versions without involving the installer. The MCO can move the MCS to a new URI scheme without involving the installer. The interface is reduced to just the installer relying on "the cluster will have `master-user-data` and `worker-user-data` secrets".

Thoughts?