Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Multicast] Use group as flow actions for multicast traffic #3508

Merged
merged 1 commit into from
May 6, 2022

Conversation

wenyingd
Copy link
Contributor

@wenyingd wenyingd commented Mar 23, 2022

  1. Add local Pod receivers into an OpenFlow type "all" group for each
    multicast group, and use such groups in the flow actions. Remove a Pod
    from group buckets if the Pod has left the multicast group or is deleted
    before leaving the multicast group.
  2. Improve e2e tests.

Signed-off-by: wenyingd [email protected]

@wenyingd
Copy link
Contributor Author

/test-all
/test-multicast-e2e
/skip-windows-all
/skip-ipv6-all
/skip-ipv6-only-all

@codecov-commenter
Copy link

codecov-commenter commented Mar 23, 2022

Codecov Report

Merging #3508 (95758d1) into main (be1116e) will decrease coverage by 14.85%.
The diff coverage is 36.78%.

Impacted file tree graph

@@             Coverage Diff             @@
##             main    #3508       +/-   ##
===========================================
- Coverage   64.50%   49.65%   -14.86%     
===========================================
  Files         278      391      +113     
  Lines       39500    55277    +15777     
===========================================
+ Hits        25481    27447     +1966     
- Misses      12026    25711    +13685     
- Partials     1993     2119      +126     
Flag Coverage Δ
integration-tests 38.18% <ø> (?)
kind-e2e-tests 32.19% <7.38%> (-19.96%) ⬇️
unit-tests 43.78% <34.71%> (-0.03%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
pkg/agent/openflow/framework.go 88.77% <0.00%> (-0.92%) ⬇️
pkg/agent/openflow/multicast.go 0.00% <0.00%> (ø)
pkg/agent/openflow/pipeline.go 62.47% <0.00%> (-12.45%) ⬇️
pkg/agent/proxy/proxier.go 57.07% <0.00%> (-5.86%) ⬇️
pkg/agent/openflow/client.go 57.81% <5.88%> (-14.37%) ⬇️
pkg/agent/cniserver/pod_configuration.go 35.37% <50.00%> (-18.39%) ⬇️
pkg/agent/multicast/mcast_controller.go 61.38% <50.52%> (-6.07%) ⬇️
pkg/ovs/openflow/ofctrl_bridge.go 56.07% <60.00%> (-0.84%) ⬇️
pkg/agent/controller/egress/egress_controller.go 58.60% <100.00%> (-15.08%) ⬇️
pkg/agent/controller/networkpolicy/cache.go 85.71% <100.00%> (-3.02%) ⬇️
... and 237 more

@wenyingd
Copy link
Contributor Author

/test-multicast-e2e

@wenyingd wenyingd force-pushed the multicast_flexible_pipeline branch from 6becda3 to 995fb39 Compare March 28, 2022 09:16
@wenyingd
Copy link
Contributor Author

/test-all
/test-multicast-e2e
/skip-windows-all
/skip-ipv6-all
/skip-ipv6-only-all

@wenyingd wenyingd force-pushed the multicast_flexible_pipeline branch 2 times, most recently from 67ec708 to 0f7e808 Compare March 29, 2022 09:05
@wenyingd
Copy link
Contributor Author

/test-all
/test-multicast-e2e
/skip-windows-all
/skip-ipv6-all
/skip-ipv6-only-all

@wenyingd
Copy link
Contributor Author

/test-multicast-e2e

@wenyingd
Copy link
Contributor Author

/test-all
/test-multicast-e2e
/skip-windows-all
/skip-ipv6-all
/skip-ipv6-only-all

Copy link
Contributor

@jianjuns jianjuns left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the commit message:

Add local pod receivers into an OpenFlow "all" type group for each
multicast group, and use such group in the flow action for multicast
traffic. Remove pod from group buckets if the pod has left the
multicast group or is deleted before leaving multicast group.

"all" type group -> type "all" group

such group in the flow action -> such groups in the flow actions

Remove pod -> Remove a Pod

pod -> Pod

leaving multicast group -> leaving the multicast group

pkg/agent/multicast/mcast_controller.go Outdated Show resolved Hide resolved
pkg/agent/multicast/mcast_controller.go Outdated Show resolved Hide resolved
pkg/agent/multicast/mcast_controller.go Outdated Show resolved Hide resolved
pkg/agent/multicast/mcast_controller.go Outdated Show resolved Hide resolved
pkg/agent/multicast/mcast_controller.go Outdated Show resolved Hide resolved
pkg/agent/multicast/mcast_controller.go Outdated Show resolved Hide resolved
pkg/agent/multicast/mcast_controller.go Outdated Show resolved Hide resolved
pkg/agent/multicast/mcast_controller.go Outdated Show resolved Hide resolved
pkg/agent/multicast/mcast_controller.go Outdated Show resolved Hide resolved
// leave event so that antrea can remove the corresponding interface from local multicast receivers on OVS. This function
// should be called if the removed Pod receiver fails to send IGMP leave message before deletion.
func (c *Controller) removeLocalInterface(ifConfig *interfacestore.InterfaceConfig) {
groupStatuses := c.getGroupMemberStatusesByPod(ifConfig.InterfaceName)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we have a race condition that a Pod is recreated with the same interface name? Maybe the chance (a new interface is created and even joined groups) is too low to consider. Just asking.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The group cache is actually written by only one worker, so here we only constructs the event, and let the writable worker to modify cache.

In theory, the race condition should happens when a Pod with same name and namespace is created before we find the old Pod interfaces is completely removed from memory interface store. That requirs the time to receive the event is longer than preparing for a new Pod. But the event is triggered in memory. I don't think this race condition exists.

@wenyingd wenyingd force-pushed the multicast_flexible_pipeline branch from 6db7759 to 1f3ffd8 Compare April 1, 2022 15:33
pkg/agent/multicast/mcast_controller.go Outdated Show resolved Hide resolved
pkg/agent/multicast/mcast_controller.go Outdated Show resolved Hide resolved
@@ -189,8 +189,16 @@ type OFBridge struct {
multipartReplyChs map[uint32]chan *openflow13.MultipartReply
}

func (b *OFBridge) CreateGroupTypeAll(id GroupIDType) Group {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CreateGroupTypeAll(id GroupIDType) and CreateGroup(id GroupIDType) confuses me. Why not CreateGroupTypeAll(id GroupIDType) and CreateGroupTypeSelect(id GroupIDType) or just CreateGroup(id GroupIDType, , groupType ofctrl.GroupType) ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since CreateGroup is an existing function to use type "select" when creating OpenFlow group, which is called in AntreaProxy function. I don't want to change the existing code unrelated with this change, that's why I didn't rename it. @hongliangl Would you share your point, do you think the existing function CreateGroup as CreateGroupTypeSelect is better?

Copy link
Contributor

@hongliangl hongliangl Apr 11, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can make another PR to rename CreateGroup to CreateGroupTypeSelect since we have another type of group.

@ceclinux
Copy link
Contributor

ceclinux commented Apr 7, 2022

/test-multicast-e2e

@ceclinux
Copy link
Contributor

ceclinux commented Apr 8, 2022

After checking the test code and running several rounds on e2e testbed,
I suggest adding two t.Parallel() in testMulticastBetweenPodsInThreeNodes and testMulticastBetweenPodsInTwoNodes to parallelize tests to further speed up e2e tests, cutting total running time in half approximately.

t.Run(mc.name, func(t *testing.T) {
		t.Parallel()
		runTestMulticastBetweenPods(t, data, mc, nodeMulticastInterfaces)
})

@wenyingd wenyingd force-pushed the multicast_flexible_pipeline branch 2 times, most recently from e52a6e0 to 0c43c42 Compare April 11, 2022 10:29
@wenyingd wenyingd force-pushed the multicast_flexible_pipeline branch from 98fcbe9 to bdb43b6 Compare April 25, 2022 07:11
@wenyingd
Copy link
Contributor Author

/test-all
/test-multicast-e2e
/skip-windows-all
/skip-ipv6-all
/skip-ipv6-only-all

@wenyingd
Copy link
Contributor Author

/test-networkpolicy
/test-e2e

@wenyingd wenyingd force-pushed the multicast_flexible_pipeline branch from bdb43b6 to 87629c0 Compare April 26, 2022 07:07
@wenyingd
Copy link
Contributor Author

/test-all
/test-multicast-e2e
/skip-windows-all
/skip-ipv6-all
/skip-ipv6-only-all

@@ -37,7 +37,20 @@ type Subscriber interface {

type Notifier interface {
// Notify sends an event to the channel.
Notify(string) bool
Notify(string, EventType, ...string) bool
Copy link
Member

@tnqn tnqn Apr 28, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The channel is designed to be generic. Having EventType would limit its usage scenarios.
Since string cannot meet your requirement. It could use a more generic struct: interface{}. A specific channel's consumer should convert it to a specific type, just like the generic workqueue and store interfaces.
The type could be defined in pkg/agent/types/event.go:

type CNIEvent struct {
	PodNamespace string
	PodName string
        IsAdd bool
        ContainerID string
}

Copy link
Contributor Author

@wenyingd wenyingd Apr 29, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Make sence, I would change. One question is, where would you suggest to place such event types, e.g., "CNIEvent", as it should be accssed by different packages? @tnqn

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

By now, I put such event types in package pkg/util/channel

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The package for generic channel is not suitable for concrete event types. I suggested a place in above comment:

The type could be defined in pkg/agent/types/event.go

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And perhaps name it PodUpdate to avoid confusion between it and the real CNI event.

@wenyingd wenyingd force-pushed the multicast_flexible_pipeline branch 2 times, most recently from daa38db to 57a6bc5 Compare April 29, 2022 03:24
@wenyingd wenyingd requested review from tnqn and removed request for liu4480 April 29, 2022 14:23
Copy link
Member

@tnqn tnqn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since this PR switches to use group for multicast, are the previous configurations like mcast_snooping_enable and mcast-snooping-disable-flood-unregistered still required?

// installedGroups saves the groups which are configured on both OVS and the host.
installedGroups sets.String
installedGroupsMutex sync.RWMutex
mRouteClient *MRouteClient
ovsBridgeClient ovsconfig.OVSBridgeClient
podDeletionCh chan *interfacestore.InterfaceConfig
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

forget to remove?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

removed.

pkg/agent/multicast/mcast_controller.go Outdated Show resolved Hide resolved
pkg/agent/multicast/mcast_controller.go Outdated Show resolved Hide resolved
pkg/agent/multicast/mcast_controller.go Outdated Show resolved Hide resolved
pkg/agent/openflow/multicast.go Show resolved Hide resolved
group := value.(binding.Group)
group.Reset()
if err := group.Add(); err != nil {
klog.Errorf("Error when replaying cached group %d: %v", id, err)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
klog.Errorf("Error when replaying cached group %d: %v", id, err)
klog.ErrorS(err, "Error when replaying cached group", "group", id)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

updated.

@wenyingd wenyingd force-pushed the multicast_flexible_pipeline branch from 57a6bc5 to 0de756a Compare May 5, 2022 08:35
@wenyingd wenyingd requested a review from tnqn May 5, 2022 09:27
@wenyingd
Copy link
Contributor Author

wenyingd commented May 5, 2022

Since this PR switches to use group for multicast, are the previous configurations like mcast_snooping_enable and mcast-snooping-disable-flood-unregistered still required?

No, these configurations are not required. I removed the call in my latest update. But do you think we should also remove the settings in OVSDB for upgrade case? It doesn't take bad effect even if we don't remove them because we don't use "normal" actions in OpenFlow entries, so the traffic is actually forwarded by Antrea flows but not OVS multicast db cache.

@wenyingd wenyingd force-pushed the multicast_flexible_pipeline branch from 0de756a to 1ca467a Compare May 5, 2022 09:41
@tnqn
Copy link
Member

tnqn commented May 5, 2022

Since this PR switches to use group for multicast, are the previous configurations like mcast_snooping_enable and mcast-snooping-disable-flood-unregistered still required?

No, these configurations are not required. I removed the call in my latest update. But do you think we should also remove the settings in OVSDB for upgrade case? It doesn't take bad effect even if we don't remove them because we don't use "normal" actions in OpenFlow entries, so the traffic is actually forwarded by Antrea flows but not OVS multicast db cache.

I think no need to remove the setting for ugprade case. It was alpha and harmless as you pointed out.

@wenyingd wenyingd force-pushed the multicast_flexible_pipeline branch from 1ca467a to fa7bca7 Compare May 5, 2022 09:51
@wenyingd
Copy link
Contributor Author

wenyingd commented May 5, 2022

/test-all
/test-multicast-e2e
/skip-windows-all
/skip-ipv6-all
/skip-ipv6-only-all

tnqn
tnqn previously approved these changes May 5, 2022
Copy link
Member

@tnqn tnqn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, a minor comment

pkg/agent/cniserver/pod_configuration.go Outdated Show resolved Hide resolved
@wenyingd
Copy link
Contributor Author

wenyingd commented May 5, 2022

/test-all
/test-multicast-e2e
/skip-windows-all
/skip-ipv6-all
/skip-ipv6-only-all

1. Add local Pod receivers into an OpenFlow type "all" group for each
multicast group, and use such groups in the flow actions. Remove a Pod
from group buckets if the Pod has left the multicast group or is
deleted before leaving the multicast group.
2. Improve multicast e2e tests.

Signed-off-by: wenyingd <[email protected]>
Co-authored-by: Ruochen Shen <[email protected]>
@wenyingd wenyingd force-pushed the multicast_flexible_pipeline branch from 95758d1 to f50f7ca Compare May 6, 2022 00:21
@wenyingd
Copy link
Contributor Author

wenyingd commented May 6, 2022

/test-all
/test-multicast-e2e
/skip-windows-all
/skip-ipv6-all
/skip-ipv6-only-all

Copy link
Member

@tnqn tnqn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@tnqn tnqn merged commit 50d33be into antrea-io:main May 6, 2022
@wenyingd wenyingd deleted the multicast_flexible_pipeline branch May 6, 2022 05:57
@wenyingd wenyingd mentioned this pull request Sep 14, 2022
12 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
action/release-note Indicates a PR that should be included in release notes.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants