Fix ARP flow not installed issue in networkpolicy-only mode #1575

Merged on Nov 19, 2020 (1 commit)

Contributor

@Dyanngg Dyanngg commented Nov 18, 2020

This PR fixes #1572

#1518 changed arpResponderStaticFlow to be installed only in IPv4 mode, since 'ARP uses broadcast, but IPv6 doesn't support broadcast' (#1272 (comment)).
However, IsIPv4Enabled() is determined by c.nodeConfig.PodIPv4CIDR != nil (pkg/agent/openflow/client.go#L827), and in pkg/agent/agent.go, when networkpolicy-only mode is enabled, the code returns before nodeConfig.PodIPv4CIDR is ever set:

	if i.networkConfig.TrafficEncapMode.IsNetworkPolicyOnly() {
		return nil
	}

	// Parse all PodCIDRs first, so that we could support IPv4/IPv6 dual-stack configurations.
	if node.Spec.PodCIDRs != nil {
		for _, podCIDR := range node.Spec.PodCIDRs {
			_, localSubnet, err := net.ParseCIDR(podCIDR)
			if err != nil {
				klog.Errorf("Failed to parse subnet from CIDR string %s: %v", node.Spec.PodCIDR, err)
				return err
			}
			if localSubnet.IP.To4() != nil {
				if i.nodeConfig.PodIPv4CIDR != nil {
					klog.Warningf("One IPv4 PodCIDR is already configured on this Node, ignore the IPv4 Subnet CIDR %s", localSubnet.String())
				}

As a result, the arpResponderStaticFlow is never installed in IPv4 networkpolicy-only mode, which leaves Pods unable to reach the apiserver:

 kubectl exec -it antrea-agent-vg6rf -n kube-system -c antrea-ovs -- ovs-ofctl dump-flows br-int table=20
 cookie=0x1000000000000, duration=2756.860s, table=20, n_packets=0, n_bytes=0, priority=190,arp actions=NORMAL
 cookie=0x1000000000000, duration=2756.861s, table=20, n_packets=0, n_bytes=0, priority=0 actions=drop
sonobuoy logs
namespace="sonobuoy" pod="sonobuoy" container="kube-sonobuoy"
time="2020-11-18T22:04:29Z" level=info msg="Scanning plugins in ./plugins.d (pwd: /)"
time="2020-11-18T22:04:29Z" level=info msg="Scanning plugins in /etc/sonobuoy/plugins.d (pwd: /)"
time="2020-11-18T22:04:29Z" level=info msg="Directory (/etc/sonobuoy/plugins.d) does not exist"
time="2020-11-18T22:04:29Z" level=info msg="Scanning plugins in ~/sonobuoy/plugins.d (pwd: /)"
time="2020-11-18T22:04:29Z" level=info msg="Directory (~/sonobuoy/plugins.d) does not exist"
time="2020-11-18T22:04:59Z" level=error msg="could not get api group resources: Get \"https://10.100.0.1:443/api?timeout=32s\": dial tcp 10.100.0.1:443: i/o timeout"
time="2020-11-18T22:04:59Z" level=info msg="no-exit was specified, sonobuoy is now blocking"
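
For anyone who wants to script the check on the flow dump above, the following is a rough sketch that parses the two dumped lines and reports whether any ARP flow exists above the priority=190 NORMAL rule. The `flow` type and both helper functions are hypothetical, and treating "an ARP flow with priority above 190" as a sign that the responder flow is present is an assumption made for illustration, not something the dump itself states.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// flow holds the fields of one `ovs-ofctl dump-flows` line that matter here.
type flow struct {
	Priority int
	Match    string // e.g. "arp"; empty for a match-all flow
	Actions  string
}

// parseFlow extracts priority, match and actions from a single dump line.
func parseFlow(line string) (flow, error) {
	parts := strings.SplitN(strings.TrimSpace(line), " actions=", 2)
	if len(parts) != 2 {
		return flow{}, fmt.Errorf("no actions found in %q", line)
	}
	idx := strings.LastIndex(parts[0], "priority=")
	if idx < 0 {
		return flow{}, fmt.Errorf("no priority found in %q", line)
	}
	// Everything after "priority=" is "<number>" or "<number>,<match fields>".
	spec := strings.SplitN(parts[0][idx+len("priority="):], ",", 2)
	prio, err := strconv.Atoi(spec[0])
	if err != nil {
		return flow{}, fmt.Errorf("bad priority in %q: %v", line, err)
	}
	f := flow{Priority: prio, Actions: parts[1]}
	if len(spec) == 2 {
		f.Match = spec[1]
	}
	return f, nil
}

// hasARPFlowAbove reports whether any ARP flow has a priority above the given
// value (assumption: a responder flow would outrank the NORMAL rule).
func hasARPFlowAbove(flows []flow, priority int) bool {
	for _, f := range flows {
		fields := strings.Split(f.Match, ",")
		if fields[0] == "arp" && f.Priority > priority {
			return true
		}
	}
	return false
}

func main() {
	dump := []string{
		"cookie=0x1000000000000, duration=2756.860s, table=20, n_packets=0, n_bytes=0, priority=190,arp actions=NORMAL",
		"cookie=0x1000000000000, duration=2756.861s, table=20, n_packets=0, n_bytes=0, priority=0 actions=drop",
	}
	var flows []flow
	for _, line := range dump {
		f, err := parseFlow(line)
		if err != nil {
			fmt.Println("skipping line:", err)
			continue
		}
		flows = append(flows, f)
	}
	fmt.Println("ARP flow above priority 190:", hasARPFlowAbove(flows, 190))
}
```

On the two lines from the dump this prints `ARP flow above priority 190: false`, consistent with table 20 holding only the NORMAL rule and the default drop.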

This PR fixes the issue by always installing arpResponderStaticFlow. In IPv6 cases, the flow will simply never be hit.
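
The shape of the change can be sketched as below. This is an illustration only, not the actual diff: `flowsBefore` and `flowsAfter` are hypothetical helpers, and the string `"arpNormalFlow"` is just a label I am using for the priority=190 NORMAL rule seen in the dump.

```go
package main

import "fmt"

// flowsBefore models the pre-fix behaviour: the ARP responder flow is only
// added when IPv4 is reported as enabled.
func flowsBefore(ipv4Enabled bool) []string {
	flows := []string{"arpNormalFlow"} // hypothetical label for the NORMAL rule
	if ipv4Enabled {
		flows = append(flows, "arpResponderStaticFlow")
	}
	return flows
}

// flowsAfter models the fix: the ARP responder flow is installed
// unconditionally, so the IPv4 check no longer matters for it.
func flowsAfter(_ bool) []string {
	return []string{"arpNormalFlow", "arpResponderStaticFlow"}
}

func main() {
	// In networkpolicy-only mode the IPv4 check evaluates to false.
	fmt.Println("before:", flowsBefore(false))
	fmt.Println("after: ", flowsAfter(false))
}
```

The trade-off is one extra flow in IPv6-only setups, which is harmless since no ARP traffic will match it.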

@antrea-bot
Collaborator

Thanks for your PR.
Unit tests and code linters are run automatically every time the PR is updated.
E2e, conformance and network policy tests can only be triggered by a member of the vmware-tanzu organization. Regular contributors to the project should join the org.

The following commands are available:

  • /test-e2e: to trigger e2e tests.
  • /skip-e2e: to skip e2e tests.
  • /test-conformance: to trigger conformance tests.
  • /skip-conformance: to skip conformance tests.
  • /test-all-features-conformance: to trigger conformance tests with all alpha features enabled.
  • /skip-all-features-conformance: to skip conformance tests with all alpha features enabled.
  • /test-whole-conformance: to trigger all conformance tests on linux.
  • /skip-whole-conformance: to skip all conformance tests on linux.
  • /test-networkpolicy: to trigger networkpolicy tests.
  • /skip-networkpolicy: to skip networkpolicy tests.
  • /test-windows-conformance: to trigger windows conformance tests.
  • /skip-windows-conformance: to skip windows conformance tests.
  • /test-windows-networkpolicy: to trigger windows networkpolicy tests.
  • /skip-windows-networkpolicy: to skip windows networkpolicy tests.
  • /test-hw-offload: to trigger ovs hardware offload test.
  • /skip-hw-offload: to skip ovs hardware offload test.
  • /test-all: to trigger all tests (except whole conformance).
  • /skip-all: to skip all tests (except whole conformance).

@codecov-io

codecov-io commented Nov 18, 2020

Codecov Report

Merging #1575 (0a87501) into master (9d3d10b) will increase coverage by 0.79%.
The diff coverage is 48.78%.

@@            Coverage Diff             @@
##           master    #1575      +/-   ##
==========================================
+ Coverage   63.31%   64.11%   +0.79%     
==========================================
  Files         170      181      +11     
  Lines       14250    15188     +938     
==========================================
+ Hits         9023     9738     +715     
- Misses       4292     4429     +137     
- Partials      935     1021      +86     
| Flag | Coverage Δ |
| --- | --- |
| e2e-tests | 48.22% <33.53%> (?) |
| kind-e2e-tests | 52.73% <44.11%> (-2.66%) ⬇️ |
| unit-tests | 40.67% <10.53%> (-0.61%) ⬇️ |

Flags with carried forward coverage won't be shown.

| Impacted Files | Coverage Δ |
| --- | --- |
| cmd/antrea-agent/agent.go | 0.00% <0.00%> (ø) |
| cmd/antrea-agent/options.go | 21.69% <0.00%> (+0.97%) ⬆️ |
| pkg/agent/config/node_config.go | 100.00% <ø> (ø) |
| pkg/agent/stats/collector.go | 97.72% <ø> (ø) |
| pkg/features/antrea_features.go | 16.66% <ø> (ø) |
| pkg/ovs/openflow/ofctrl_builder.go | 60.58% <0.00%> (-1.59%) ⬇️ |
| pkg/ovs/openflow/ofctrl_group.go | 48.57% <0.00%> (-4.56%) ⬇️ |
| pkg/ovs/openflow/ofctrl_action.go | 68.90% <17.14%> (+7.55%) ⬆️ |
| pkg/agent/agent.go | 55.38% <30.00%> (+6.67%) ⬆️ |
| pkg/agent/proxy/types/types.go | 60.00% <33.33%> (-24.62%) ⬇️ |

... and 69 more

Contributor

@antoninbas antoninbas left a comment

LGTM, thanks @Dyanngg

Member

@tnqn tnqn left a comment

LGTM

@antoninbas
Contributor

/test-all

@antoninbas
Contributor

/test-e2e

@tnqn tnqn merged commit a157663 into antrea-io:master Nov 19, 2020
antoninbas pushed a commit to antoninbas/antrea that referenced this pull request Nov 20, 2020
@Dyanngg Dyanngg deleted the arp-flow-fix branch November 3, 2021 00:52
Successfully merging this pull request may close these issues.

CI jobs for AKS & EKS are failing