Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Stop Remediation When NHC Timed Out Annotation Exists #72

Merged
merged 3 commits into from
Aug 1, 2023
Merged
Show file tree
Hide file tree
Changes from 2 commits
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
19 changes: 19 additions & 0 deletions controllers/fenceagentsremediation_controller.go
Original file line number Diff line number Diff line change
Expand Up @@ -22,6 +22,7 @@ import (
"fmt"

"github.com/go-logr/logr"
commonAnnotations "github.com/medik8s/common/pkg/annotations"

apiErrors "k8s.io/apimachinery/pkg/api/errors"
"k8s.io/apimachinery/pkg/runtime"
Expand All @@ -35,6 +36,8 @@ import (
)

const (
// errors
errorNhcTimedOut = "stop remediation when NHC timed out annotaion exists"
errorMissingParams = "nodeParameters or sharedParameters or both are missing, and they cannot be empty"
errorMissingNodeParams = "node parameter is required, and cannot be empty"
SuccessFAResponse = "Success: Rebooted"
Expand Down Expand Up @@ -129,6 +132,13 @@ func (r *FenceAgentsRemediationReconciler) Reconcile(ctx context.Context, req ct
return emptyResult, nil
}

// Check NHC timeout annotation
if isTimedOutByNHC(far) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this can go before the if block above?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it can

r.Log.Info("FAR remediation was stopped by Node Healthcheck Operator")
// TODO: update status and return its error
return emptyResult, errors.New(errorNhcTimedOut)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why returning an error?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought that if we wouldn't return an error, then on the next reconcile the CR will be processed and will pass this if on the way to execute FA and delete workloads. But thinking about it again, the CR will be stopped here until the NHC annotation would be removed from the CR

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the CR will be stopped here until the NHC annotation would be removed from the CR

exactly, and removing the annotation will trigger a reconcile automatically, no need to requeue ourself 🙂

}

// Fetch the FAR's pod
r.Log.Info("Fetch FAR's pod")
pod, err := utils.GetFenceAgentsRemediationPod(r.Client)
Expand Down Expand Up @@ -176,6 +186,15 @@ func (r *FenceAgentsRemediationReconciler) Reconcile(ctx context.Context, req ct
return emptyResult, nil
}

// isTimedOutByNHC checks if NHC set a timeout annotation on the CR
func isTimedOutByNHC(far *v1alpha1.FenceAgentsRemediation) bool {
if far != nil && far.Annotations != nil && far.DeletionTimestamp == nil {
_, isTimeoutIssued := far.Annotations[commonAnnotations.NhcTimedOut]
return isTimeoutIssued
}
return false
}

// buildFenceAgentParams collects the FAR's parameters for the node based on FAR CR, and if the CR is missing parameters
// or the CR's name don't match nodeParamter name or it has an action which is different than reboot, then return an error
func buildFenceAgentParams(far *v1alpha1.FenceAgentsRemediation) ([]string, error) {
Expand Down
2 changes: 1 addition & 1 deletion go.mod
Original file line number Diff line number Diff line change
Expand Up @@ -4,7 +4,7 @@ go 1.20

require (
github.com/go-logr/logr v1.2.4
github.com/medik8s/common v1.2.0
github.com/medik8s/common v1.7.0
github.com/onsi/ginkgo/v2 v2.11.0
github.com/onsi/gomega v1.27.8
github.com/openshift/api v0.0.0-20230621174358-ea40115b9fa6
Expand Down
4 changes: 2 additions & 2 deletions go.sum
Original file line number Diff line number Diff line change
Expand Up @@ -97,8 +97,8 @@ github.com/mailru/easyjson v0.7.7 h1:UGYAvKxe3sBsEDzO8ZeWOSlIQfWFlxbzLZe7hwFURr0
github.com/mailru/easyjson v0.7.7/go.mod h1:xzfreul335JAWq5oZzymOObrkdz5UnU4kGfJJLY9Nlc=
github.com/matttproud/golang_protobuf_extensions v1.0.4 h1:mmDVorXM7PCGKw94cs5zkfA9PSy5pEvNWRP0ET0TIVo=
github.com/matttproud/golang_protobuf_extensions v1.0.4/go.mod h1:BSXmuO+STAnVfrANrmjBb36TMTDstsz7MSK+HVaYKv4=
github.com/medik8s/common v1.2.0 h1:xgQOijD3rEn+PfCd4lJuV3WFdt5QA6SIaqF01rRp2as=
github.com/medik8s/common v1.2.0/go.mod h1:ZT/3hfMXJLmZEcqmxRWB5LGC8Wl+qKGGQ4zM8hOE7PY=
github.com/medik8s/common v1.7.0 h1:JwimhWigPTAszGG2jYJmh+uVHq2nFUPRRljw7ZhlQhg=
github.com/medik8s/common v1.7.0/go.mod h1:ZT/3hfMXJLmZEcqmxRWB5LGC8Wl+qKGGQ4zM8hOE7PY=
github.com/moby/spdystream v0.2.0 h1:cjW1zVyyoiM0T7b6UoySUFqzXMoqRckQtXwGPiBhOM8=
github.com/moby/spdystream v0.2.0/go.mod h1:f7i0iNDQJ059oMTcWxx8MA/zKFIuD/lY+0GqbN2Wy8c=
github.com/modern-go/concurrent v0.0.0-20180228061459-e0a39a4cb421/go.mod h1:6dJC0mAP4ikYIbvyc7fijjWJddQyLn8Ig3JB5CqoB9Q=
Expand Down

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

3 changes: 2 additions & 1 deletion vendor/modules.txt
Original file line number Diff line number Diff line change
Expand Up @@ -92,8 +92,9 @@ github.com/mailru/easyjson/jwriter
# github.com/matttproud/golang_protobuf_extensions v1.0.4
## explicit; go 1.9
github.com/matttproud/golang_protobuf_extensions/pbutil
# github.com/medik8s/common v1.2.0
# github.com/medik8s/common v1.7.0
## explicit; go 1.20
github.com/medik8s/common/pkg/annotations
github.com/medik8s/common/pkg/labels
# github.com/moby/spdystream v0.2.0
## explicit; go 1.13
Expand Down