[ExternalNode] Implement SupportbundleCollection status on Controller #4249

Merged
merged 1 commit into from
Nov 17, 2022

Conversation

wenyingd
Contributor

Manage support bundle collection status on Controller side.

  1. Aggregate the SupportBundleCollectionNodeStatus reports and write the result to the CRD status.
  2. Report the CollectionFailure condition when any Node or ExternalNode reports a failure.

Signed-off-by: wenyingd [email protected]

@codecov

codecov bot commented Sep 27, 2022

Codecov Report

Merging #4249 (9e1da34) into main (6003cfa) will increase coverage by 0.57%.
The diff coverage is 90.51%.

@@            Coverage Diff             @@
##             main    #4249      +/-   ##
==========================================
+ Coverage   64.53%   65.10%   +0.57%     
==========================================
  Files         397      398       +1     
  Lines       56239    56484     +245     
==========================================
+ Hits        36292    36775     +483     
+ Misses      17287    17051     -236     
+ Partials     2660     2658       -2     

| Flag | Coverage Δ |
|---|---|
| e2e-tests | 38.76% <2.19%> (?) |
| integration-tests | 34.60% <ø> (+0.04%) ⬆️ |
| kind-e2e-tests | 48.44% <2.56%> (+0.95%) ⬆️ |
| unit-tests | 48.97% <89.05%> (+0.24%) ⬆️ |

| Impacted Files | Coverage Δ |
|---|---|
| cmd/antrea-controller/controller.go | 0.00% <0.00%> (ø) |
| ...istry/controlplane/supportbundlecollection/rest.go | 79.48% <ø> (ø) |
| ...g/controller/supportbundlecollection/controller.go | 81.99% <90.15%> (+6.06%) ⬆️ |
| pkg/apiserver/apiserver.go | 91.09% <100.00%> (+0.09%) ⬆️ |
| ...ntrolplane/supportbundlecollection/subresources.go | 100.00% <100.00%> (ø) |
| pkg/agent/cniserver/ipam/ipam_service.go | 76.40% <0.00%> (-12.36%) ⬇️ |
| pkg/controller/networkpolicy/tier.go | 50.00% <0.00%> (-5.00%) ⬇️ |
| pkg/agent/interfacestore/interface_cache.go | 97.36% <0.00%> (-2.64%) ⬇️ |
| ...gent/controller/networkpolicy/status_controller.go | 79.16% <0.00%> (-2.50%) ⬇️ |
| pkg/agent/openflow/multicluster.go | 44.85% <0.00%> (-1.87%) ⬇️ |

... and 35 more

@wenyingd wenyingd changed the title Implement SupportbundleCollection status on Controller [ExternalNode] Implement SupportbundleCollection status on Controller Sep 28, 2022
@tnqn tnqn added this to the Antrea v1.9 release milestone Oct 11, 2022
@wenyingd wenyingd requested a review from tnqn October 24, 2022 09:31
"antrea.io/antrea/pkg/apis/controlplane"
)

// StatusREST implements the REST endpoint for getting NetworkPolicy's obj.
Member

need update

Member

reminder

Contributor Author

updated.

@@ -169,6 +173,8 @@ func (c *Controller) Run(stopCh <-chan struct{}) {

go wait.Until(c.worker, time.Second, stopCh)

go c.StatusController.Run(stopCh)
Member

This looks a bit strange. If we decouple supportBundleController and statusBundleStatusController, it doesn't make sense to inject the latter into the former and let the former start both. As seen from the change, the only purpose of this injection is to start the status controller.

But here I feel there is no need for a separate statusController, as the existing controller would also update status. If we just add some logic to the existing controller, a lot of repeated code could be saved, and there would be no need to worry about race conditions between the two controllers.

@wenyingd wenyingd force-pushed the supportbundle_status branch 2 times, most recently from e0a7989 to 5ef40c9 Compare November 3, 2022 04:01

Comment on lines 317 to 318
// Update internal SupportBundleCollection status. This is triggered when Agent reports status.
// The event that SupportBundleCollection CR status update will not trigger this logic.
Member

I think the comments are not accurate and may be unnecessary.
It could also be triggered by the scheduling for expiry handling, and it seems the expiry handling would now never be executed, as it's in createInternalSupportBundleCollection. I think the function name createInternalSupportBundleCollection is not accurate, and some logic should be moved to updateStatus.

Contributor Author

Removed the comments, moved the call to updateStatus out of createInternalSupportBundleCollection, and now call updateStatus only in syncSupportBundleCollection.

}

func conditionSliceEqualsIgnoreLastTransitionTime(as, bs []v1alpha1.SupportBundleCollectionCondition) bool {
sort.Slice(as, func(i, j int) bool {
Member

A way to simplify this is to always set the full condition list whenever updating status, instead of only setting Completed to True but never False, and to always use the same order when storing conditions in the slice.

Contributor Author

It is difficult to provide a full condition list every time the status is updated. For example, if Started is false, it is meaningless to set CollectionFailure to false and Completed to false. Besides, we expect to use CollectionFailure=False as part of the signal that all Nodes have uploaded their files to the server. So I think the condition status should be set to false only once all Nodes have uploaded successfully. If it is set in advance (because no failure had been found at that time, though one may occur later), it is more likely to cause misunderstanding.

I changed the logic to sort the conditions before executing the update operation to reduce the complexity, and to set the condition status of CollectionComplete and bundleCollected whenever updating the status. For CollectionFailure, the status is set when a failure is reported from any Node or when no failure is reported from all Nodes.

Member

I think the condition is there regardless of the stage of the realization; it seems you want to use "unset" to represent another state in addition to True and False. The condition value doesn't have to mean the final state; it can change based on the actual state. For example, when Started is false, it doesn't seem wrong that CollectionFailure is False and Completed is False, as it just means there is no collection failure yet and the collection is not completed yet. They are obvious and redundant, but not wrong and not ambiguous.

Contributor Author

Got it, updated.

Comment on lines 379 to 382
_, oldInternalBundleExists, _ := c.supportBundleCollectionStore.Get(bundle.Name)
if oldInternalBundleExists {
klog.InfoS("Internal SupportBundleCollection already exists", "name", bundle.Name)
return nil
return nil, nil
Member

This can never happen, since you have checked that the object doesn't exist before calling the create method.

Contributor Author

Yes, thanks for catching it, removed.

return nil
}
toUpdate.Status = *updatedStatus
klog.V(2).InfoS("Updating SupportBundleCollection", "supportBundleCollection", name, "status", klog.KObj(toUpdate))
Member

klog.KObj prints only the object name, which duplicates the existing name field; it does not print the status part of the object.

Contributor Author

updated.

tnqn
tnqn previously approved these changes Nov 15, 2022
Member

LGTM

Comment on lines 700 to 708
updateStatusFunc := func(currentNodes, desiredNodes int, updatedConditions []v1alpha1.SupportBundleCollectionCondition) error {
status := &v1alpha1.SupportBundleCollectionStatus{
SucceededNodes: int32(currentNodes),
DesiredNodes: int32(desiredNodes),
Conditions: updatedConditions,
}
klog.V(2).InfoS("Updating SupportBundleCollection status", "supportBundleCollection", internalBundleCollection.Name, "status", status)
return c.updateSupportBundleCollectionStatus(internalBundleCollection.Name, status)
}
Member

nit: there seems to be no benefit in having this as a function now.

Contributor Author

Yes, removed.

@wenyingd
Contributor Author

/test-all

tnqn
tnqn previously approved these changes Nov 15, 2022
Member

LGTM

@tnqn
Member

tnqn commented Nov 15, 2022

@jianjuns @mengdie-song please let me know if you will take another look or if this looks good to you.

Contributor

@mengdie-song mengdie-song left a comment

I want to confirm a RBAC question. For now, it seems that controller only has update privilege for supportbundlecollections/status. Then if agent side calls UpdateStatus function, it should use create privilege?

@wenyingd
Contributor Author

> I want to confirm a RBAC question. For now, it seems that controller only has update privilege for supportbundlecollections/status. Then if agent side calls UpdateStatus function, it should use create privilege?

The Controller needs the update privilege to update the status of the SupportBundleCollection CR, and the Agent needs the create privilege on the internal resource supportbundlecollections/status.
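
The split described above can be sketched as two separate rule sets. This is an illustrative model only, not the RBAC manifests from this PR: the `rule` type and `allows` helper are hypothetical, and the API group names `crd.antrea.io` and `controlplane.antrea.io` are assumptions about where the CR and the internal resource live.

```go
package main

import "fmt"

// rule is a simplified, hypothetical stand-in for an RBAC policy rule.
type rule struct {
	APIGroup string
	Resource string
	Verbs    []string
}

// allows reports whether any rule grants verb on the given group and resource.
func allows(rules []rule, group, resource, verb string) bool {
	for _, r := range rules {
		if r.APIGroup != group || r.Resource != resource {
			continue
		}
		for _, v := range r.Verbs {
			if v == verb {
				return true
			}
		}
	}
	return false
}

const (
	crdGroup      = "crd.antrea.io"          // assumed group of the CR
	internalGroup = "controlplane.antrea.io" // assumed group of the internal resource
	statusRes     = "supportbundlecollections/status"
)

// The Controller updates the status of the CR.
var controllerRules = []rule{{APIGroup: crdGroup, Resource: statusRes, Verbs: []string{"update"}}}

// The Agent reports its Node status by creating the internal status subresource.
var agentRules = []rule{{APIGroup: internalGroup, Resource: statusRes, Verbs: []string{"create"}}}

func main() {
	fmt.Println(allows(controllerRules, crdGroup, statusRes, "update")) // true
	fmt.Println(allows(agentRules, internalGroup, statusRes, "create")) // true
	fmt.Println(allows(agentRules, crdGroup, statusRes, "update"))      // false
}
```

The resource path is the same string in both rule sets; what separates the two permissions is the API group and the verb.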

@wenyingd
Contributor Author

> I want to confirm a RBAC question. For now, it seems that controller only has update privilege for supportbundlecollections/status. Then if agent side calls UpdateStatus function, it should use create privilege?

Thanks for raising it. The failure exposed an issue in my Controller apiserver code, which is resolved in my latest change.

Contributor

@mengdie-song mengdie-song left a comment

LGTM

Member

@tnqn tnqn left a comment

@wenyingd I see controller.go is updated, any change in it I need to recheck?

@wenyingd
Contributor Author

> @wenyingd I see controller.go is updated, any change in it I need to recheck?

There is no logic change in controller.go. The change renames the field SucceededNodes to CollectedNodes in SupportBundleCollectionStatus, because the old name did not match the name defined in the CRD.

Member

@tnqn tnqn left a comment

LGTM

@tnqn
Member

tnqn commented Nov 17, 2022

/test-all

@tnqn
Member

tnqn commented Nov 17, 2022

/skip-conformance which failed on an irrelevant case (and was an environment issue)

@tnqn tnqn merged commit 073dbf9 into antrea-io:main Nov 17, 2022
GraysonWu pushed a commit to GraysonWu/antrea that referenced this pull request Jan 27, 2023
heanlan pushed a commit to heanlan/antrea that referenced this pull request Mar 29, 2023
@wenyingd wenyingd deleted the supportbundle_status branch May 30, 2023 07:01