Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve Verification of FA Success Response #70

Merged
merged 3 commits into from
Aug 1, 2023

Conversation

razo7
Copy link
Member

@razo7 razo7 commented Jul 30, 2023

checkFarLogs was used to verify whether FAR pod had a success response from FA command.

  • When we are running more than one remediation then checkFarLogs function could have find the success message of the first CR since we have searched for the response without any CR identification (see old discussion). Now we are using the node name to verify that (since we won't remediate a node more than once).

  • When the FAR pod resides in the unhealthy node, then it will be evicted and one FA command is run only once (after Add Status Conditions for FAR CR #69), then when it will be run again on a new node it won't try to execute the FA command. Therefore, for this scenario we are skipping the check of checkFarLogs function.

@openshift-ci
Copy link
Contributor

openshift-ci bot commented Jul 30, 2023

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

@razo7
Copy link
Member Author

razo7 commented Jul 30, 2023

/test 4.13-openshift-e2e

Copy link
Contributor

@clobrano clobrano left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some typos and an observation

test/e2e/far_e2e_test.go Outdated Show resolved Hide resolved
test/e2e/far_e2e_test.go Outdated Show resolved Hide resolved
test/e2e/far_e2e_test.go Outdated Show resolved Hide resolved
func checkFarLogs(logString string) {
// buildExpectedLogOutput returns a string with a node identifier and a success message for the reboot action
func buildExpectedLogOutput(nodeName, sucesssMessage string) string {
expectedString := "\"Node name\": \"" + nodeName + "\", \"Response\": \"" + sucesssMessage
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is there a reason not to use fmt.Sprintf? I should be easier to read

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point

@razo7
Copy link
Member Author

razo7 commented Jul 31, 2023

/test 4.13-openshift-e2e

When reboot is running only once and it is running on FAR node, then FAR pod will be recreated on a new node and since the FA command won't be executed again, then the log won't include any success message, so we won't verify the FAR success message for this scenario
checkFarLogs will use node name in order to verify a successful response message, since prior to this change checkFarLogs searched for the existance of success message which could have been from an earlier remediation
@razo7
Copy link
Member Author

razo7 commented Jul 31, 2023

/test 4.13-openshift-e2e

@razo7 razo7 marked this pull request as ready for review August 1, 2023 05:42
EventuallyWithOffset(1, func() string {
pod, err := utils.GetFenceAgentsRemediationPod(k8sClient)
if err != nil {
log.Error(err, "failed to get FAR pod. Might try again")
return ""
}
if pod.Spec.NodeName == farNodeName {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: IIUC farNodeName is the node that being remidiated.
I think that the name is a bit confusing in that context (initially I thought farNodeName is the node that Far Pod is running on).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, it was indeed confusing.

Variable name and function declaration
Copy link
Contributor

@clobrano clobrano left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/lgtm
Giving others a chance to review old comments, feel free to unhold
/hold

@openshift-ci
Copy link
Contributor

openshift-ci bot commented Aug 1, 2023

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: clobrano, razo7

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@razo7
Copy link
Member Author

razo7 commented Aug 1, 2023

I think it is safe to merge
/unhold

@openshift-merge-robot openshift-merge-robot merged commit 7abfb43 into medik8s:main Aug 1, 2023
1 check passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants