Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Report NetworkPolicy destined for Services as invalid when proxyAll is disabled #5591

Closed

Conversation

hongliangl
Copy link
Contributor

@hongliangl hongliangl commented Oct 17, 2023

When AntreaProxy proxyAll is not enabled, the following usages are invalid:

  • For an ingress NetworkPolicy, which is applied to Services. Note that,
    such ingress NetworkPolicy can be only applied to NodePort Services. As
    a result, proxyAll should be also enabled.
  • For an egress NetworkPolicy, which is to Services. Note that, headless
    Services are not supported.

Signed-off-by: Hongliang Liu [email protected]

@@ -863,6 +870,11 @@ func (c *ruleCache) GetCompletedRule(ruleID string) (completedRule *CompletedRul

r := obj.(*rule)

if !c.antreaProxyEnabled && r.Direction == v1beta.DirectionOut && len(r.To.ToServices) != 0 {
klog.V(2).InfoS("Field `ToServices` can be only used when AntreaProxy is enabled", "ruleID", ruleID)
return nil, false, false
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We support reporting realization failure to NetworkPolicy condition. The error should be reported to antrea-controller to update the condition, instead of ignoring the rule without any error.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we have any existing methods to report the message like this from agent to controller currently?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We haven't done it before in agent but I think the method is there, just change v1beta2.NetworkPolicyStatus to reflect the failure. It could be generic for any failure we get from the reconciler.
BTW, I think cache.go is not a good place to perform the validation. It's implementation specific so should be validated by reconciler.go

@hongliangl hongliangl changed the title Validate the egress Antrea policy rule with field ToServices Report invalid egress NetworkPolicy for Services when AntreaProxy is disabled Oct 30, 2023
@hongliangl hongliangl requested a review from tnqn October 30, 2023 10:33
@hongliangl hongliangl added this to the Antrea v1.15 release milestone Oct 30, 2023
@hongliangl hongliangl changed the title Report invalid egress NetworkPolicy for Services when AntreaProxy is disabled Report invalid ingress NetworkPolicy destined for Services when AntreaProxy is disabled Oct 31, 2023
@hongliangl hongliangl changed the title Report invalid ingress NetworkPolicy destined for Services when AntreaProxy is disabled Report NetworkPolicy destined for Services as invalid when AntreaProxy is disabled Oct 31, 2023
pkg/agent/controller/networkpolicy/status_controller.go Outdated Show resolved Hide resolved
NodeName: testNode1,
Generation: 1,
RealizationFailure: true,
Message: `[{"rule_id":"2e8b15c49b690893","reason":"AntreaProxy is not enabled"}]`,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IIRC, this Message will be included in the Status for the network policy CR, using a NetworkPolicyCondition.
Should we make some transformation in the controller, given that the rule ID is not meaningful to users?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we make some transformation in the controller, given that the rule ID is not meaningful to users?

You are right, the rule ID is not meaningful to users. Do you mean that we should provide more information in the message? Like Service name and namespace? Or something else?

@@ -646,6 +646,14 @@ func (c *Controller) syncRule(key string) error {
return nil
}

if !c.antreaProxyEnabled && rule.isToServicesPolicyRule() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we handle both cases (applied to service / selects services as peers with toServices) separately?
It would make sense to me because:

  1. the first case (appliedTo) doesn't only require AntreaProxy, I think it also requires proxyAll? Should we add this to the validation?
  2. we may be able to generate a better error message
  3. isToServicesPolicyRule is not a great function name IMO; the name implies that we only consider the second case (toServices)

Copy link
Contributor Author

@hongliangl hongliangl Nov 1, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the first case (appliedTo) doesn't only require AntreaProxy, I think it also requires proxyAll? Should we add this to the validation?

The feature applied to Service should only require AntreaProxy. It is added in this #2755. Do you mean that we should add validation here

rule, effective, realizable := c.ruleCache.GetCompletedRule(key)

If so, for the first case, the expected result of realizable should be false. But it is a little strange that adding antreaProxyEnabled member to struct ruleCache.

we may be able to generate a better error message

Do you mean that we should generate more detailed error log here?

isToServicesPolicyRule is not a great function name IMO; the name implies that we only consider the second case (toServices)

How about isAppliedToServiceIngressPolicyRule() and isToServicesEgressPolicyRule()?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the appliedTo case (different from toServices as an egress rule) was added in #3997, and does require ProxyAll?
@tnqn am I missing something?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@antoninbas Sorry for updating late. I have verified it with @GraysonWu. It does require proxyAll for appliedTo case.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I believe both cases need AntreaProxy, and AppliedToService also needs proxyAll.

pkg/agent/controller/networkpolicy/status_controller.go Outdated Show resolved Hide resolved
@hongliangl hongliangl force-pushed the 20231017-to-service-validate branch 3 times, most recently from da42ea1 to 143fc44 Compare November 1, 2023 07:46
Comment on lines 652 to 665
if !(c.antreaProxyEnabled && c.proxyAllEnabled) && rule.isAppliedToServicesIngressPolicyRule() {
klog.InfoS("Ingress policy rule applied to Services is valid only when both AntreaProxy and proxyAll are enabled, skipping", "ruleID", key)
if c.statusManagerEnabled {
c.statusManager.SetRuleAsInvalid(key, rule.PolicyUID, rule.PolicyName, rule.Name, "AntreaProxy or ProxyAll is not enabled")
}
return nil
}
if !c.antreaProxyEnabled && rule.isToServicesEgressPolicyRule() {
klog.InfoS("Egress policy rule to Services is valid only when AntreaProxy is enabled, skipping", "ruleID", key)
if c.statusManagerEnabled {
c.statusManager.SetRuleAsInvalid(key, rule.PolicyUID, rule.PolicyName, rule.Name, "AntreaProxy is not enabled")
}
return nil
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we shouldn't add these implementatic specific checks in controller. Such special handling could eventually make syncRule extremely long and very specific to implementation, especially when the controller is extended to support more scenarios, like Node policy.

It should be the reconciler instead of the controller to do such check as it's the one knowing how to implement the rule. The reconciler should return an error to controller when it's unable to implement it and the controller should just report the error in a generic way to antrea-controller, without the need to know what's the error. In this way, the workflow can be reused by any realization failure in the future.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just found I commented the same before: #5591 (comment)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you meant that we should do such checks in reconciler.go. If a rule is invalid, it should return an error, then we report the error to antrea-controller through statusController? If so, do we still need to requeue the rule? If statusController is not enabled, what should we do? Just log something?

Comment on lines 67 to 68
// invalidRules keeps track of the invalid NetworkPolicy rules.
invalidRules cache.Indexer
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we change realizedRules to track both realized and failed rules? otherwise you will have to ensure the two storage are exclusive. We could also change SetRuleRealization to support setting both success and failure.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Make sense, I'll try that.

…s disabled

When AntreaProxy is not enabled, the following usages are invalid:

- For an ingress NetworkPolicy, which is applied to Services. Note that,
  such ingress NetworkPolicy can be only applied to NodePort Services. As
  a result, proxyAll should be also enabled.
- For an egress NetworkPolicy, which is to Services. Note that, headless
  Services are not supported.

Signed-off-by: Hongliang Liu <[email protected]>
@hongliangl hongliangl changed the title Report NetworkPolicy destined for Services as invalid when AntreaProxy is disabled Report NetworkPolicy destined for Services as invalid when proxyAll is disabled Nov 17, 2023
…s disabled

When proxyAll is not enabled, the following usages are invalid:

- For an ingress NetworkPolicy, which is applied to Services. Note that,
  such ingress NetworkPolicy can be only applied to NodePort Services. As
  a result, proxyAll should be also enabled.
- For an egress NetworkPolicy, which is to Services. Note that, headless
  Services are not supported.

Signed-off-by: Hongliang Liu <[email protected]>
Copy link
Contributor

github-actions bot commented Apr 9, 2024

This PR is stale because it has been open 90 days with no activity. Remove stale label or comment, or this will be closed in 90 days

@github-actions github-actions bot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Apr 9, 2024
@hongliangl hongliangl removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Apr 15, 2024
@hongliangl hongliangl removed this from the Antrea v2.0 release milestone Apr 15, 2024
Copy link
Contributor

This PR is stale because it has been open 90 days with no activity. Remove stale label or comment, or this will be closed in 90 days

@github-actions github-actions bot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jul 15, 2024
@github-actions github-actions bot closed this Oct 13, 2024
@luolanzone
Copy link
Contributor

luolanzone commented Oct 14, 2024

@hongliangl is this still a valid change? please reopen it if you plan to move forward.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants