Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ARO-9570: Add a controller to the ARO operator to lay down etchosts #3781

Merged
merged 5 commits into from
Aug 28, 2024

Conversation

lranjbar
Copy link
Collaborator

Which issue this PR addresses:

Fixes ARO-9570

What this PR does / why we need it:

This PR adds a MachineConfigs for the ARO operator to lay down:

99-master-aro-etc-hosts-gateway-domains
99-worker-aro-etc-hosts-gateway-domains

During the UDR installation incident we have gone back and forth on how resolve the root issue. The root cause is that the gateway domains are not populated for OpenShift images to be pulled and installed. These domains are IPs so we can pass them into /etc/hosts to resolve them before other DNS resolution options are available.

Test plan for issue:

I've some minimal smoke testing on the operator with this PR.

Is there any documentation that needs to be updated for this PR?

How do you know this will function as expected in production?

@lranjbar lranjbar force-pushed the lranjbar/ARO-9570-udr-etchosts-operator-patch branch from 0456936 to 9a4ad4a Compare August 20, 2024 17:09
Comment on lines 212 to 244
b, err := json.Marshal(ignConfig)
if err != nil {
return nil, err
}

// canonicalise the machineconfig payload the same way as MCO
var i interface{}
err = json.Unmarshal(b, &i)
if err != nil {
return nil, err
}

rawExt := runtime.RawExtension{}
rawExt.Raw, err = json.Marshal(i)
if err != nil {
return nil, err
}

return &mcfgv1.MachineConfig{
TypeMeta: metav1.TypeMeta{
APIVersion: mcfgv1.SchemeGroupVersion.String(),
Kind: "MachineConfig",
},
ObjectMeta: metav1.ObjectMeta{
Name: fmt.Sprintf("99-%s-aro-etc-hosts-gateway-domains", role),
Labels: map[string]string{
"machineconfiguration.openshift.io/role": role,
},
},
Spec: mcfgv1.MachineConfigSpec{
Config: rawExt,
},
}, nil
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not strictly necessary for this PR if we're in a rush to get this merged, but this portion seems like it could be factored out into a shared utility function that accepts the ignition config, the role, and the MachineConfig name and returns the MachineConfig.

@bitoku
Copy link
Collaborator

bitoku commented Aug 22, 2024

we have to change this file as well to add this operator to aro-operator, don't we?

ARO-RP/cmd/aro/operator.go

Lines 96 to 101 in 1b6e0e7

if role == pkgoperator.RoleMaster {
if err = (genevalogging.NewReconciler(
log.WithField("controller", genevalogging.ControllerName),
client, dh)).SetupWithManager(mgr); err != nil {
return fmt.Errorf("unable to create controller %s: %v", genevalogging.ControllerName, err)
}

@lranjbar lranjbar force-pushed the lranjbar/ARO-9570-udr-etchosts-operator-patch branch from 0fe87e7 to dcd741e Compare August 22, 2024 20:15
@lranjbar lranjbar force-pushed the lranjbar/ARO-9570-udr-etchosts-operator-patch branch from dcd741e to 7361eae Compare August 22, 2024 20:25
Comment on lines +69 to +70
err = r.Client.Get(ctx, types.NamespacedName{Namespace: "openshift-machine-api", Name: "99-master-aro-etc-hosts-gateway-domains"}, mc)
if kerrors.IsNotFound(err) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Regarding the error handling here. It looks like we will drop any error from line 69 that isn't a kerrors.IsNotFound error. Should we make changes to handle other errors on the Get?

Here's an example of where we use IsNotFound, but then ensure to catch any other errors too:
https://github.com/Azure/ARO-RP/blob/master/pkg/operator/controllers/autosizednodes/autosizednodes_controller.go#L79-L91


// If 99-worker-aro-etc-hosts-gateway-domains doesn't exist, create it
err = r.Client.Get(ctx, types.NamespacedName{Namespace: "openshift-machine-api", Name: "99-worker-aro-etc-hosts-gateway-domains"}, mc)
if kerrors.IsNotFound(err) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same comment as above.

@mociarain
Copy link
Collaborator

/azp run ci,e2e

Copy link

Azure Pipelines successfully started running 2 pipeline(s).

Copy link
Collaborator

@cadenmarchese cadenmarchese left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My previous comments aren't blocking, so I'll go ahead and approve. thanks!

@mociarain mociarain merged commit 2f651e7 into master Aug 28, 2024
21 checks passed
@mociarain mociarain deleted the lranjbar/ARO-9570-udr-etchosts-operator-patch branch August 28, 2024 12:43
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
loki next-release To be included in the next RP release rollout
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants