Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Cache Flow modification message in Antrea Agent #4495

Merged
merged 1 commit into from
May 19, 2023

Conversation

wenyingd
Copy link
Contributor

@wenyingd wenyingd commented Dec 19, 2022

he motivation for this patch is to reduce the memory used for storing the installed OpenFlow flow entries to replay them when OVS is re-connected. In the existing code, struct ofctrl.Flow is stored in the memory, which has taken a lot of memory because all kinds of the candidate Match fields are allocated in one such object although they may not use in the final Flow.

This patch is to store the FlowModification message as a substituation of the ofctrl.Flow objects. As the FlowModification message is the obejct which was sent to OVS via the OF connections in final, it contains enough information to replay the OpenFlow flow entry. Besides, a FlowModification only contains the necessary fields which are set with values.

A test with this patch may reduce about 140M memory on a setup with about 130K OpenFlow entries in antrea-agent (from 380M to 240M)

Signed-off-by: wenyingd [email protected]

@codecov
Copy link

codecov bot commented Dec 19, 2022

Codecov Report

Merging #4495 (656a1f9) into main (4de4392) will decrease coverage by 0.45%.
The diff coverage is n/a.

❗ Current head 656a1f9 differs from pull request most recent head 7740f91. Consider uploading reports for the commit 7740f91 to get more accurate results

Impacted file tree graph

@@            Coverage Diff             @@
##             main    #4495      +/-   ##
==========================================
- Coverage   68.31%   67.87%   -0.45%     
==========================================
  Files         400      403       +3     
  Lines       58324    57690     -634     
==========================================
- Hits        39843    39155     -688     
- Misses      15708    15823     +115     
+ Partials     2773     2712      -61     
Flag Coverage Δ
e2e-tests ?
integration-tests 34.58% <0.00%> (+0.13%) ⬆️
kind-e2e-tests 46.69% <0.00%> (+0.11%) ⬆️
unit-tests 56.69% <0.00%> (-1.24%) ⬇️
Impacted Files Coverage Δ
cmd/antrea-agent/options_linux.go 0.00% <0.00%> (-100.00%) ⬇️
pkg/agent/wireguard/client_windows.go 0.00% <0.00%> (-100.00%) ⬇️
cmd/antrea-agent/util.go 0.00% <0.00%> (-85.00%) ⬇️
pkg/agent/util/ethtool/ethtool_linux.go 0.00% <0.00%> (-70.00%) ⬇️
pkg/monitor/controller.go 51.73% <0.00%> (-26.09%) ⬇️
pkg/monitor/agent.go 84.21% <0.00%> (-15.79%) ⬇️
pkg/agent/wireguard/client_linux.go 66.24% <0.00%> (-15.28%) ⬇️
cmd/antrea-agent/options.go 23.20% <0.00%> (-13.27%) ⬇️
pkg/agent/agent_linux.go 4.65% <0.00%> (-10.80%) ⬇️
pkg/graphviz/traceflow.go 55.28% <0.00%> (-7.62%) ⬇️
... and 95 more

@wenyingd
Copy link
Contributor Author

/test-all
/test-windows-all

@wenyingd wenyingd force-pushed the agent_performance branch 2 times, most recently from 98c694c to f69a41f Compare February 1, 2023 07:07
@wenyingd
Copy link
Contributor Author

wenyingd commented Feb 1, 2023

/test-all
/test-windows-all

@wenyingd
Copy link
Contributor Author

wenyingd commented Feb 7, 2023

/test-windows-all

pkg/agent/openflow/client_test.go Show resolved Hide resolved
pkg/agent/openflow/framework.go Outdated Show resolved Hide resolved
pkg/agent/openflow/framework.go Outdated Show resolved Hide resolved
pkg/ovs/openflow/interfaces.go Outdated Show resolved Hide resolved
pkg/ovs/openflow/ofctrl_bridge.go Outdated Show resolved Hide resolved
pkg/agent/openflow/client_test.go Outdated Show resolved Hide resolved
pkg/agent/openflow/client_test.go Outdated Show resolved Hide resolved
@@ -1046,6 +1116,10 @@ func preparePipelines() {
}
}
pipelineMap[pipelineID] = generatePipeline(pipelineID, requiredTables)
for _, obj := range tableCache.List() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why we need to add this?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Set ofctrl.Table with a valid TableID to ensure the OpenFlow Modification messages can be generated successfully in tests. Since the TableID is used when generating OF messages, without this step, the test may be crashed. newFakeClient is not suitable to mock client in the related cases.

@wenyingd wenyingd force-pushed the agent_performance branch 2 times, most recently from 350b7ea to 6ab2d29 Compare March 24, 2023 08:28
@wenyingd wenyingd requested a review from tnqn March 27, 2023 02:15
@wenyingd wenyingd added this to the Antrea v1.12 release milestone Mar 27, 2023
@tnqn
Copy link
Member

tnqn commented Mar 28, 2023

@wenyingd this is a large size change, could you add more details about it for reviewers to understand?

  1. What's the purpose of the change? (I know it but other reviewers don't)
  2. How "caching flow modification message in Antrea Agent" could achieve the goal?
  3. A evaluation of the improvement, for example, on a Node with XXX flows, how much memory it consumes before and now?

@wenyingd
Copy link
Contributor Author

@wenyingd this is a large size change, could you add more details about it for reviewers to understand?

1. What's the purpose of the change? (I know it but other reviewers don't)

2. How "caching flow modification message in Antrea Agent" could achieve the goal?

3. A evaluation of the improvement, for example, on a Node with XXX flows, how much memory it consumes before and now?

Updated in the commit message.

Comment on lines 400 to 403
key string
message ofctrl.OpenFlowModMessage
table binding.Table
isDropFlow bool
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I reviewed the current usage of key, table, and isDropFlow. The first one seems only used by methods added for antctl, which can be easily changed to return flows directly instead. The latter two are used only by flowsToTrace, which could actually get the same information from structs like conjunction or context. So perhaps we don't need to create this struct?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For table and isDropFlow fields, I don't think it is a good idea to re-generate the flows from conjunction or context for flowsToTrace which means consuming CPU calculation for the memory as the two fields are only two bytes per object in total. Instead, maybe I can get these information from the OpenFlowModMessage. Let me try to remove these two fields.

About key field, I would like to use another patch to construct from the flow modification mesage because it may introduce more changes. What do you think @tnqn

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed the fields according to the offline discussion, and publish the functions to dynamically generate Flow string or match string from a FlowModification mesage. Could you review it again? @tnqn

@wenyingd wenyingd force-pushed the agent_performance branch 4 times, most recently from cbbf692 to ec409ec Compare April 4, 2023 02:07
@luolanzone luolanzone mentioned this pull request Apr 4, 2023
2 tasks
@wenyingd wenyingd force-pushed the agent_performance branch 5 times, most recently from 6809cd6 to 85a6be6 Compare May 16, 2023 07:08
pkg/agent/openflow/client.go Outdated Show resolved Hide resolved
pkg/agent/openflow/client.go Outdated Show resolved Hide resolved
pkg/agent/openflow/pipeline.go Outdated Show resolved Hide resolved
pkg/agent/openflow/pipeline.go Outdated Show resolved Hide resolved
pkg/ovs/openflow/ofctrl_flow.go Outdated Show resolved Hide resolved
Comment on lines 107 to 103
func (f *ofFlow) MatchString() string {
repr := fmt.Sprintf("table=%d", f.table.id)
if f.protocol != "" {
repr = fmt.Sprintf("%s,%s", repr, f.protocol)
}

if len(f.matchers) > 0 {
repr = fmt.Sprintf("%s,%s", repr, strings.Join(f.matchers, ","))
flowMod, err := f.getFlowMod()
if err != nil {
return ""
}
return repr
return FlowModMatchString(flowMod)
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Got it, now I see the called functions are different

Copy link
Member

@tnqn tnqn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add the description to the commit message as well for future reference

pkg/ovs/openflow/utils.go Outdated Show resolved Hide resolved
Copy link
Member

@tnqn tnqn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Commit message is not formated properly. Follow the link in https://github.com/antrea-io/antrea/blob/main/CONTRIBUTING.md#getting-reviewers

@tnqn
Copy link
Member

tnqn commented May 16, 2023

The commit message is not wrapped with the same length limit. I will re-format it when merging. Would suggest to automate the formatting (e.g. set your .vimrc) to avoid efforts spent on this.

@wenyingd
Copy link
Contributor Author

/test-all
/test-windows-all
/test-ipv6-all
/test-multicluster-e2e

tnqn
tnqn previously approved these changes May 17, 2023
@tnqn
Copy link
Member

tnqn commented May 17, 2023

need rebase

tnqn
tnqn previously approved these changes May 18, 2023
@tnqn
Copy link
Member

tnqn commented May 18, 2023

/test-all

@tnqn
Copy link
Member

tnqn commented May 18, 2023

/test-ipv6-only-all
/test-ipv6-all
/test-windows-all

The motivation for this patch is to reduce the memory used for storing
the installed OpenFlow flow entries to replay them when OVS is re-connected.
In the existing code, struct ofctrl.Flow is stored in the memory, which
has taken a lot of memory because all kinds of the candidate Match fields
are allocated in one such object although they may not use in the final Flow.

This patch is to store the FlowModification message as a substituation of the
ofctrl.Flow objects. As the FlowModification message is the obejct which
was sent to OVS via the OF connections in final, it contains enough information
to replay the OpenFlow flow entry. Besides, a FlowModification only contains
the necessary fields which are set with values.

A test with this patch may reduce about 140M memory on a setup with about
130K OpenFlow entries in antrea-agent (from 380M to 240M)

Signed-off-by: wenyingd <[email protected]>
@tnqn
Copy link
Member

tnqn commented May 18, 2023

/test-all
/test-windows-all
/test-ipv6-all
/test-multicluster-e2e

@tnqn tnqn added the action/release-note Indicates a PR that should be included in release notes. label May 19, 2023
@tnqn tnqn merged commit 11026df into antrea-io:main May 19, 2023
@wenyingd wenyingd deleted the agent_performance branch May 30, 2023 06:45
ceclinux pushed a commit to ceclinux/antrea that referenced this pull request Jun 5, 2023
The motivation for this patch is to reduce the memory used for storing
the installed OpenFlow flow entries to replay them when OVS is re-connected.
In the existing code, struct ofctrl.Flow is stored in the memory, which
has taken a lot of memory because all kinds of the candidate Match fields
are allocated in one such object although they may not use in the final Flow.

This patch is to store the FlowModification message as a substituation of the
ofctrl.Flow objects. As the FlowModification message is the obejct which
was sent to OVS via the OF connections in final, it contains enough information
to replay the OpenFlow flow entry. Besides, a FlowModification only contains
the necessary fields which are set with values.

A test with this patch may reduce about 140M memory on a setup with about
130K OpenFlow entries in antrea-agent (from 380M to 240M)

Signed-off-by: wenyingd <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
action/release-note Indicates a PR that should be included in release notes. kind/cleanup Categorizes issue or PR as related to cleaning up code, process, or technical debt.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants