-
Notifications
You must be signed in to change notification settings - Fork 370
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add network policy related metadata to flow records #2163
Conversation
Codecov Report
@@ Coverage Diff @@
## main #2163 +/- ##
==========================================
+ Coverage 61.83% 65.42% +3.59%
==========================================
Files 276 276
Lines 21145 21183 +38
==========================================
+ Hits 13074 13858 +784
+ Misses 6702 5918 -784
- Partials 1369 1407 +38
Flags with carried forward coverage won't be shown. Click here to find out more.
|
4ae17e4
to
9bc8e51
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for adding more netpol info in flow records. Could you improve the flow aggregator e2e test as well, where we check the netpol info?
For sure. I'm also trying to add the e2e test, just facing some interface issue. If I understood correctly, we are using packages networking.v1, meta.v1 to initialize network policies, in flowaggregator_test.go. The interfaces include policy UID, but don’t include policy type, rule name, rule priority(I don't think we need to test on rule priority, since it's actually the ovs flow priority internally computed from tier priority, policy priority, rule priority). Do we need to change to other interfaces in order to construct network policies including policy type and rule name? I noticed antrea/pkg/apis/controlplane/types.go offers that kind of interface. To make those changes, I might need to change the createNetworkPolicy, deleteNetworkPolicy functions in framework.go, along with other e2e test files which make use of those two functions. |
Sure, I think those fields policy UID are present in Antrea network policy. Instead of changing the existing test with K8s policy, can you add a new test with the Antrea network policy for one of the flow cases? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Don't forget to test with elk flow collector. Some changes are needed as we are adding new fields.
conn.IngressNetworkPolicyRuleName = rule.Name | ||
// TODO: tier name, tier priority, policy priority to be added when support available | ||
if policy.Type == "K8sNetworkPolicy" { | ||
conn.IngressNetworkPolicyRulePriority = -1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Better to use the const defined in go-ipfix
pkg/flowaggregator/flowaggregator.go
Outdated
"ingressNetworkPolicyUID", | ||
"ingressNetworkPolicyType", | ||
"ingressNetworkPolicyRuleName", | ||
"ingressNetworkPolicyRulePriority", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe go-ipfix has not supported correlation of data type signed32
, we should add that before using it here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure if I get it. I have already changed rule priority's type to signed32 in go-ipfix. In the output of template set, it shows rule priority's length to be 4.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I get it.. Will make more correlated changes in go-ipfix
pkg/flowaggregator/flowaggregator.go
Outdated
"egressNetworkPolicyUID", | ||
"egressNetworkPolicyType", | ||
"egressNetworkPolicyRuleName", | ||
"egressNetworkPolicyRulePriority", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same above.
if policy == nil { | ||
// This should not happen because the rule flow ID to rule mapping is | ||
// preserved for max(5s, flowPollInterval) even after the rule deletion. | ||
klog.Warningf("Cannot find NetworkPolicy that has rule with egressOfID %v", egressOfID) | ||
} else { | ||
conn.EgressNetworkPolicyName = policy.Name | ||
conn.EgressNetworkPolicyNamespace = policy.Namespace | ||
conn.EgressNetworkPolicyUID = string(policy.UID) | ||
switch policy.Type { | ||
case "K8sNetworkPolicy": |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be better to use const from definitions likepkg/apis/controlplane/types.go
instead of string here.
conn.EgressNetworkPolicyType = 3 | ||
} | ||
conn.EgressNetworkPolicyRuleName = rule.Name | ||
if policy.Type == "K8sNetworkPolicy" { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same here.
expConn.EgressNetworkPolicyUID = string(np2.UID) | ||
switch np2.Type { | ||
case cpv1beta.K8sNetworkPolicy: | ||
expConn.EgressNetworkPolicyType = 1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we can use const defined in go-ipfix instead of self-assigned value, same for next several uses.
831ef4a
to
0b02e77
Compare
5dfd494
to
f99bac8
Compare
cadc529
to
19d8b58
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Few comments in the e2e test, otherwise LGTM
} | ||
}) | ||
|
||
// InterNodeFlows tests the case, where Pods are deployed on different Nodes | ||
// and their flow information is exported as IPFIX flow records. | ||
t.Run("InterNodeFlows", func(t *testing.T) { | ||
np1, np2 := deployNetworkPolicies(t, data, "perftest-a", "perftest-c") | ||
anp1, anp2 := deployAntreaNetworkPolicies(t, data, "perftest-a", "perftest-c") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A comment that says Antrea network policy is being tested here right before the test and at "IntraNodeFlows" K8s network policy being tested would be useful.
test/e2e/flowaggregator_test.go
Outdated
@@ -153,33 +172,31 @@ func testHelper(t *testing.T, data *TestData, podAIPs *PodIPs, podBIPs *PodIPs, | |||
} | |||
}() | |||
// TODO: Skipping bandwidth check for Intra-Node flows as it is flaky. | |||
// test on K8s network policies |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Move the comment to right before the test.
Change the name of deployNetworkPolicies
to deployK8sNetworkPolicies
and similarly for related functions.
test/e2e/flowaggregator_test.go
Outdated
// Check if records have both ingress and egress network policies. | ||
if !strings.Contains(record, ingressNetworkPolicyName) { | ||
t.Errorf("Record does not have NetworkPolicy name with ingress rule") | ||
} | ||
if !strings.Contains(record, fmt.Sprintf("%s: %s", "ingressNetworkPolicyNamespace", testNamespace)) { | ||
t.Errorf("Record does not have correct ingressNetworkPolicyNamespace") | ||
} | ||
if !strings.Contains(record, "ingressNetworkPolicyUID") { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could we add a custom UID when adding configuring the policy and validate that here?
You have to change the underlying createNetworkPolicy
function a little bit.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For K8s network policy, the UID can be customized. But for Antrea network policy, the AntreaNetworkPolicySpecBuilder
doesn't support us to customize the UID. Do you think that's ok for us to only validate K8s np's UID? I feel testing on K8s np is indeed good enough, but it may looks a bit inconsistent, for testing on the different fields with different flow types?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Testing only for k8s NP is fine with me.
test/e2e/flowaggregator_test.go
Outdated
if !strings.Contains(record, egressNetworkPolicyName) { | ||
t.Errorf("Record does not have NetworkPolicy name with egress rule") | ||
} | ||
if !strings.Contains(record, fmt.Sprintf("%s: %s", "egressNetworkPolicyNamespace", testNamespace)) { | ||
t.Errorf("Record does not have correct egressNetworkPolicyNamespace") | ||
} | ||
if !strings.Contains(record, "egressNetworkPolicyUID") { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same here
test/e2e/flowaggregator_test.go
Outdated
if !strings.Contains(record, fmt.Sprintf("%s: %s", "ingressNetworkPolicyNamespace", testNamespace)) { | ||
t.Errorf("Record does not have correct ingressNetworkPolicyNamespace") | ||
} | ||
if !strings.Contains(record, "ingressNetworkPolicyUID") { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same here.
test/e2e/framework.go
Outdated
busyboxImage = "projects.registry.vmware.com/library/busybox" | ||
nginxImage = "projects.registry.vmware.com/antrea/nginx" | ||
perftoolImage = "projects.registry.vmware.com/antrea/perftool" | ||
// ipfixCollectorImage = "projects.registry.vmware.com/antrea/ipfix-collector:v0.5.1" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
are you not able to access harbor regisstry?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Have changed it back. Thanks for reminding!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think you pushed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just pushed. I was waiting for Srikar's reply on the comments of e2e test. Sorry for the confusion.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sorry for the late review
test/e2e/framework.go
Outdated
busyboxImage = "projects.registry.vmware.com/library/busybox" | ||
nginxImage = "projects.registry.vmware.com/antrea/nginx" | ||
perftoolImage = "projects.registry.vmware.com/antrea/perftool" | ||
// ipfixCollectorImage = "projects.registry.vmware.com/antrea/ipfix-collector:v0.5.1" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think you pushed?
114: | ||
- :string | ||
- :ingressNetworkPolicyUID | ||
115: | ||
- :uint8 | ||
- :ingressNetworkPolicyType | ||
116: | ||
- :int32 | ||
- :ingressNetworkPolicyPriority | ||
117: | ||
- :string | ||
- :egressNetworkPolicyUID | ||
118: | ||
- :uint8 | ||
- :egressNetworkPolicyType | ||
119: | ||
- :int32 | ||
- :egressNetworkPolicyPriority |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
while adding the policy type makes sense to me, I am not so sure about adding the UID and the priority... The UID is duplicate information once you have the type and the name. The priority is also unnecessary IMO since we can retrieve it based on the rule name. We are just growing the record size, but I don't see much benefit. If a tool needs to retrieve this information, it could be given access to the K8s API. Any opinion on this @jianjuns and @tnqn?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For UID, yes, it means nothing for most users and it cannot be used to get a resource from K8s API. Normally it's only useful to distinguish resources having same name while existing in different time. If the records are only used for flow visibility, I would agree UID has no much value.
For priority, it seems that it's the openflow priority which is dynamically assigned on each Node, so even same policy rule will have different values when agents export them. Will it be really useful?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Regarding UID, the use case we thought of earlier is not to show to the user but it's for integration with systems, where there is access to k8s API of the cluster and UID could be used to fetch the network policy object rather than using the name.
For priority, I was not aware that each agent computes its own priority for a given rule. If that is the case, I agree that it cannot be used. We should show policy priority or tier priority. IMO, we can remove this rule priority for now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Normally it's only useful to distinguish resources having same name while existing in different time.
@tnqn Do you think it is a case that we need to handle here? If yes, it feels like we should also include the UID for the other K8s resources as well, e.g. source & destination Pods.
I feel like it's all or nothing. Either we include UIDs for all K8s resources, in which case it makes the records larger and adds complexity, or we include it for none of them. The advantage of including the UID is to make sure that we get the correct resource by name from the K8s API.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Regarding UID, the use case we thought of earlier is not to show to the user but it's for integration with systems, where there is access to k8s API of the cluster and UID could be used to fetch the network policy object rather than using the name.
@srikartati But UID in K8s can not be used to fetch object, the primary key in API and etcd is [namespace/]name.
@antoninbas I feel resources with same name usually mean same thing, and pods' names are usually auto-generated randomly so unlikely conflict.. Maybe we can start from not including it and consider adding it when user complains not being able to distinguish records of historial objects causes real problem.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@srikartati But UID in K8s can not be used to fetch object, the primary key in API and etcd is [namespace/]name.
@tnqn I meant in systems where the K8s objects are stored in a separate inventory and given that it is organized by using UID. At the same time, we can request a different inventory organization as well.. not a concrete use-case.
I see Antonin's point that we may need UID for all objects in that case. I agree that we can remove it for now and take it up in the future if there is a strong requirement.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed. Let's remove it for now. It's easier to just add something later if we actually need it.
19d8b58
to
02c5a6f
Compare
"antrea.io/antrea/pkg/querier" | ||
"github.com/vmware/go-ipfix/pkg/registry" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
wrong place
@@ -23,12 +23,15 @@ import ( | |||
corev1 "k8s.io/api/core/v1" | |||
"k8s.io/klog/v2" | |||
|
|||
"github.com/vmware/go-ipfix/pkg/registry" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should be in the same group as "k8s.io/klog/v2"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed, thanks!
114: | ||
- :string | ||
- :ingressNetworkPolicyUID | ||
115: | ||
- :uint8 | ||
- :ingressNetworkPolicyType | ||
116: | ||
- :int32 | ||
- :ingressNetworkPolicyPriority | ||
117: | ||
- :string | ||
- :egressNetworkPolicyUID | ||
118: | ||
- :uint8 | ||
- :egressNetworkPolicyType | ||
119: | ||
- :int32 | ||
- :egressNetworkPolicyPriority |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For UID, yes, it means nothing for most users and it cannot be used to get a resource from K8s API. Normally it's only useful to distinguish resources having same name while existing in different time. If the records are only used for flow visibility, I would agree UID has no much value.
For priority, it seems that it's the openflow priority which is dynamically assigned on each Node, so even same policy rule will have different values when agents export them. Will it be really useful?
02c5a6f
to
15320b0
Compare
114: | ||
- :string | ||
- :ingressNetworkPolicyUID | ||
115: | ||
- :uint8 | ||
- :ingressNetworkPolicyType | ||
116: | ||
- :int32 | ||
- :ingressNetworkPolicyPriority | ||
117: | ||
- :string | ||
- :egressNetworkPolicyUID | ||
118: | ||
- :uint8 | ||
- :egressNetworkPolicyType | ||
119: | ||
- :int32 | ||
- :egressNetworkPolicyPriority |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Regarding UID, the use case we thought of earlier is not to show to the user but it's for integration with systems, where there is access to k8s API of the cluster and UID could be used to fetch the network policy object rather than using the name.
For priority, I was not aware that each agent computes its own priority for a given rule. If that is the case, I agree that it cannot be used. We should show policy priority or tier priority. IMO, we can remove this rule priority for now.
test/e2e/flowaggregator_test.go
Outdated
networkingv1 "k8s.io/api/networking/v1" | ||
metav1 "k8s.io/apimachinery/pkg/apis/meta/v1" | ||
"k8s.io/apimachinery/pkg/util/wait" | ||
|
||
crdv1alpha1 "antrea.io/antrea/pkg/apis/crd/v1alpha1" | ||
. "antrea.io/antrea/test/e2e/utils" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do you need a period here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
removed, thanks
test/e2e/flowaggregator_test.go
Outdated
// Check if records have both ingress and egress network policies. | ||
if !strings.Contains(record, ingressNetworkPolicyName) { | ||
t.Errorf("Record does not have NetworkPolicy name with ingress rule") | ||
} | ||
if !strings.Contains(record, fmt.Sprintf("%s: %s", "ingressNetworkPolicyNamespace", testNamespace)) { | ||
t.Errorf("Record does not have correct ingressNetworkPolicyNamespace") | ||
} | ||
if !strings.Contains(record, "ingressNetworkPolicyUID") { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Testing only for k8s NP is fine with me.
docs/network-flow-visibility.md
Outdated
| ingressNetworkPolicyType | 56506 | 115 | unsigned8 | | ||
| ingressNetworkPolicyRuleName | 56506 | 111 | string | | ||
| ingressNetworkPolicyRulePriority | 56506 | 116 | signed32 | | ||
| egressNetworkPolicyName | 56506 | 141 | string | |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same above
docs/network-flow-visibility.md
Outdated
| ingressNetworkPolicyNamespace | 56506 | 111 | string | | ||
| ingressNetworkPolicyUID | 56506 | 114 | string | | ||
| ingressNetworkPolicyType | 56506 | 115 | unsigned8 | | ||
| ingressNetworkPolicyRuleName | 56506 | 111 | string | |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is the field ID correct?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed, thanks for reminding
test/e2e/flowaggregator_test.go
Outdated
ingressNetworkPolicyUID: ac42f0ba-f580-46ae-8336-2bfc45024a9a | ||
ingressNetworkPolicyType: 2 | ||
ingressNetworkPolicyRuleName: test-ingress-rule-name | ||
ingressNetworkPolicyRulePriority: 50000 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
extra indent here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
have no idea why there is an extra indent, which doesn't exist locally.
There is also extra indent in the main:
antrea/test/e2e/flowaggregator_test.go
Line 70 in c21592c
flowType: 1 |
test/e2e/flowaggregator_test.go
Outdated
egressNetworkPolicyUID: bbbc0320-b461-45e3-987a-d2d7caf7aad3 | ||
egressNetworkPolicyType: 2 | ||
egressNetworkPolicyRuleName: test-egress-rule-name | ||
egressNetworkPolicyRulePriority: 50000 | ||
flowType: 1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same here.
b632244
to
3702994
Compare
/test-all |
42aaf8e
to
63a951b
Compare
/test-all |
/test-e2e |
/test-networkpolicy |
63a951b
to
bc91393
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
/test-all |
change to const in unit-test add e2e test add new fields to doc changes on elk-flow-collector side change dependency module and image name change to const change ipfix dependency back to before v0.5.1 fix api issue Signed-off-by: heanlan <[email protected]>
bc91393
to
2395aa1
Compare
/test-all |
jenkins-e2e failed on somewhere unrelated with this PR. TestFlowAggregator passed all the tests. --- PASS: TestFlowAggregator (225.18s) |
Flow aggregator tests have passed: TestUserProvidedCert test has failed. |
@srikartati Sure! |
Fixes #1545