Add NodePort support for Antrea Proxy on Linux #1471

Closed

Contributor

@weiqiangt weiqiangt commented Nov 2, 2020

Note: To verify this PR, the NodePort support is enabled. The feature gate should be disabled before merging this PR.
Resolves #1463.
Signed-off-by: Weiqiang Tang [email protected]

@antrea-bot
Collaborator

Thanks for your PR.
Unit tests and code linters are run automatically every time the PR is updated.
E2e, conformance and network policy tests can only be triggered by a member of the vmware-tanzu organization. Regular contributors to the project should join the org.

The following commands are available:

  • /test-e2e: to trigger e2e tests.
  • /skip-e2e: to skip e2e tests.
  • /test-conformance: to trigger conformance tests.
  • /skip-conformance: to skip conformance tests.
  • /test-whole-conformance: to trigger all conformance tests on linux.
  • /skip-whole-conformance: to skip all conformance tests on linux.
  • /test-networkpolicy: to trigger networkpolicy tests.
  • /skip-networkpolicy: to skip networkpolicy tests.
  • /test-windows-conformance: to trigger windows conformance tests.
  • /skip-windows-conformance: to skip windows conformance tests.
  • /test-windows-networkpolicy: to trigger windows networkpolicy tests.
  • /skip-windows-networkpolicy: to skip windows networkpolicy tests.
  • /test-hw-offload: to trigger ovs hardware offload test.
  • /skip-hw-offload: to skip ovs hardware offload test.
  • /test-all: to trigger all tests (except whole conformance).
  • /skip-all: to skip all tests (except whole conformance).

@codecov-io

codecov-io commented Nov 2, 2020

Codecov Report

❗ No coverage uploaded for pull request base (main@1f7775e).
The diff coverage is n/a.

@@           Coverage Diff           @@
##             main    #1471   +/-   ##
=======================================
  Coverage        ?   41.09%           
=======================================
  Files           ?      116           
  Lines           ?    14717           
  Branches        ?        0           
=======================================
  Hits            ?     6048           
  Misses          ?     8148           
  Partials        ?      521           
Flag Coverage Δ
unit-tests 41.09% <0.00%> (?)

@weiqiangt
Contributor Author

/test-all

@weiqiangt weiqiangt linked an issue Nov 2, 2020 that may be closed by this pull request
@weiqiangt weiqiangt added area/component/agent Issues or PRs related to the agent component kind/design Categorizes issue or PR as related to design. labels Nov 2, 2020
@antoninbas antoninbas added this to the Antrea v0.12.0 release milestone Nov 13, 2020
@weiqiangt weiqiangt force-pushed the antrea-proxy-nodeport-dup branch 9 times, most recently from 294c717 to d734e2b Compare January 15, 2021 07:51
Member

@tnqn tnqn left a comment

I haven't finished the review, but I ran some tests based on the patch and found an issue:
When scaling down backend Pods, the members of the OpenFlow group don't change, leading to connectivity issues.

@@ -116,3 +119,13 @@ featureGates:
# whenever a Pod's container defines a specific port to be exposed (each container can define a list of ports as pod.spec.containers[].ports),
# and all Node traffic directed to that port will be forwarded to the Pod.
#nplPortRange: 40000-41000

# The virtual IP for NodePort Service support. It must be a link-local IP otherwise the Agents will report error.
#nodePortVirtualIP: 169.254.169.110
Member

Have you tried the policy routing approach? I did an experiment and it worked as we discussed.
If NAT is not likely to be the final solution, it is better not to add this argument, to avoid compatibility issues in the future. I think we can just use a reserved IP in the alpha phase.

Contributor

Why do we care that it must be link-local?

Contributor Author

@tnqn, I tried the policy routing approach.
If the destination is a loopback address, we still need to do a DNAT. Considering this is an alpha feature for now, we can discuss it more in the future.

Contributor Author

@jianjuns, do you think we should use any forwardable address here?

Contributor

I want to understand the current design first. Why must the IPv4 address be loopback, while the IPv6 address must not be?

Contributor

I understand it better after reading the design proposal. I don't think we need to require the IP to be link-local; we can recommend link-local, though.
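A minimal sketch of what "recommend rather than require" could look like, assuming a hypothetical helper `checkNodePortVirtualIP` (not a function from this PR): an unparsable address is an error, while a valid but non-link-local address only produces a warning.

```go
package main

import (
	"fmt"
	"net"
)

// checkNodePortVirtualIP is a hypothetical helper, not code from this PR.
// It rejects unparsable addresses but only warns about non-link-local ones.
func checkNodePortVirtualIP(s string) (ip net.IP, warning string, err error) {
	ip = net.ParseIP(s)
	if ip == nil {
		return nil, "", fmt.Errorf("invalid NodePort virtual IP %q", s)
	}
	// IsLinkLocalUnicast covers 169.254.0.0/16 for IPv4 and fe80::/10 for IPv6.
	if !ip.IsLinkLocalUnicast() {
		warning = fmt.Sprintf("%s is not link-local; a link-local address is recommended", ip)
	}
	return ip, warning, nil
}

func main() {
	for _, s := range []string{"169.254.169.110", "10.0.0.1", "not-an-ip"} {
		ip, warning, err := checkNodePortVirtualIP(s)
		fmt.Println(ip, warning, err)
	}
}
```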

Contributor Author

Thanks @jianjuns, updated.

Member

I mean it is possible that we won't need DNAT in the future, as with the two approaches we have discussed. Even if we still need it for the loopback address (maybe we don't need it in that case either), I think hardcoding a reserved IP is enough at this stage.
I just want to avoid introducing implementation-specific configuration while the feature is not stable, so that we don't add upgrade complexity (if a user sets the option and we delete it in the future, the process will crash).
Also, it doesn't seem that a user can tell from the current comments what the configuration means or how to use it.
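The upgrade concern can be illustrated with a small sketch. It assumes the agent rejects unknown configuration keys at startup; the `agentConfig` struct and `parseStrict` helper are hypothetical, and the standard library's JSON decoder stands in for whatever parser the agent actually uses.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// agentConfig is a hypothetical, trimmed-down config struct from a future
// release in which the nodePortVirtualIP option has been removed.
type agentConfig struct {
	NPLPortRange string `json:"nplPortRange"`
}

// parseStrict fails on any key that the struct does not declare.
func parseStrict(raw string) (*agentConfig, error) {
	dec := json.NewDecoder(strings.NewReader(raw))
	dec.DisallowUnknownFields()
	var cfg agentConfig
	if err := dec.Decode(&cfg); err != nil {
		return nil, err
	}
	return &cfg, nil
}

func main() {
	// A config file written for the older release still sets the removed option.
	old := `{"nplPortRange":"40000-41000","nodePortVirtualIP":"169.254.169.110"}`
	if _, err := parseStrict(old); err != nil {
		fmt.Println("startup would fail:", err)
	}
}
```

Keeping the value as a constant during alpha avoids this failure mode, because there is no user-facing key to remove later.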

Contributor Author

@tnqn, thanks for the explanation. I removed these two options and made them constant for now.

Base automatically changed from master to main January 26, 2021 00:00
@weiqiangt weiqiangt force-pushed the antrea-proxy-nodeport-dup branch 3 times, most recently from 6d21ce5 to 550f9d2 Compare January 26, 2021 17:10
Contributor

@jianjuns jianjuns left a comment

Do you have a design doc for the reviewers to understand the design?

@@ -180,17 +184,27 @@ func run(o *Options) error {

var proxier k8sproxy.Provider
if features.DefaultFeatureGate.Enabled(features.AntreaProxy) {
var nodePortAddresses []*net.IPNet
Contributor

I feel we should have a separate feature gate to control whether AntreaProxy handles NodePort.

Contributor Author

Yes, I added a new feature gate, AntreaProxyNodePort, to control it. I also updated the workflow here: NodeAddress is now parsed only if both AntreaProxyNodePort and AntreaProxy are enabled.
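A minimal sketch of that gating, assuming a plain map in place of the real feature gate registry and a hypothetical helper `parseNodePortAddresses`: the addresses are only parsed when both gates are on.

```go
package main

import (
	"fmt"
	"net"
)

// parseNodePortAddresses is a hypothetical helper, not code from this PR.
// gates maps feature gate names to their enabled state.
func parseNodePortAddresses(gates map[string]bool, cidrs []string) ([]*net.IPNet, error) {
	// Skip parsing entirely unless both gates are enabled.
	if !gates["AntreaProxy"] || !gates["AntreaProxyNodePort"] {
		return nil, nil
	}
	out := make([]*net.IPNet, 0, len(cidrs))
	for _, c := range cidrs {
		_, ipNet, err := net.ParseCIDR(c)
		if err != nil {
			return nil, fmt.Errorf("invalid NodePort address %q: %w", c, err)
		}
		out = append(out, ipNet)
	}
	return out, nil
}

func main() {
	gates := map[string]bool{"AntreaProxy": true, "AntreaProxyNodePort": true}
	nets, err := parseNodePortAddresses(gates, []string{"192.168.0.0/24"})
	fmt.Println(nets, err)
}
```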

@weiqiangt
Contributor Author

> Do you have a design doc for the reviewers to understand the design?

I attached the issue which includes the design.

@weiqiangt
Contributor Author

> I haven't finished the review, but I ran some tests based on the patch and found an issue:
> When scaling down backend Pods, the members of the OpenFlow group don't change, leading to connectivity issues.

Thanks @tnqn. I have fixed this issue and added checks in e2e.
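The thread does not show how the fix was implemented. As an illustration only, the scale-down bug amounts to missing reconciliation between the installed group members and the current endpoints, which a hypothetical `staleMembers` helper could express like this:

```go
package main

import (
	"fmt"
	"sort"
)

// staleMembers is a hypothetical helper, not code from this PR. It returns
// the endpoints that are still installed in a group but are no longer
// desired, for example after backend Pods were scaled down.
func staleMembers(installed, desired []string) []string {
	want := make(map[string]struct{}, len(desired))
	for _, ep := range desired {
		want[ep] = struct{}{}
	}
	var stale []string
	for _, ep := range installed {
		if _, ok := want[ep]; !ok {
			stale = append(stale, ep)
		}
	}
	sort.Strings(stale)
	return stale
}

func main() {
	installed := []string{"10.0.0.2:80", "10.0.0.3:80", "10.0.0.4:80"}
	desired := []string{"10.0.0.2:80"}
	// The stale entries are the ones a group update would need to remove.
	fmt.Println(staleMembers(installed, desired))
}
```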

@weiqiangt
Contributor Author

/test-all

@weiqiangt
Contributor Author

/test-all

@weiqiangt
Contributor Author

Windows conformance failed due to an infrastructure failure; re-triggering it.
/test-windows-conformance

@weiqiangt weiqiangt force-pushed the antrea-proxy-nodeport-dup branch 2 times, most recently from e8ff683 to 41cbccb Compare February 3, 2021 16:26
@weiqiangt
Contributor Author

/test-all

@weiqiangt
Contributor Author

weiqiangt commented Feb 3, 2021

/jenkins-ipv6-ds-conformance
/jenkins-ipv6-only-conformance

@jianjuns
Contributor

jianjuns commented Feb 3, 2021

I do not mean to ask for a design change for this PR, but have we considered using TC to redirect traffic (https://man7.org/linux/man-pages/man8/tc-mirred.8.html)? Could it be faster or slower than iptables? @tnqn

@weiqiangt
Contributor Author

/test-all

Jenkins tests failed on some unrelated cases; re-triggering the tests to double-check.

[k8s.io] InitContainer [NodeConformance] should not start app containers and fail the pod if init containers fail on a RestartNever pod [Conformance]
[k8s.io] InitContainer [NodeConformance] should invoke init containers on a RestartAlways pod [Conformance]
[k8s.io] Security Context When creating a pod with readOnlyRootFilesystem should run the container with writable rootfs when readOnlyRootFilesystem=false [NodeConformance] [Conformance]
[k8s.io] Probing container should be restarted with a exec "cat /tmp/health" liveness probe [NodeConformance] [Conformance]
[k8s.io] Kubelet when scheduling a busybox Pod with hostAliases should write entries to /etc/hosts [LinuxOnly] [NodeConformance] [Conformance]

@weiqiangt
Contributor Author

/test-ipv6-ds-conformance
/test-ipv6-only-conformance

@weiqiangt
Contributor Author

/test-windows-conformance

@@ -1283,6 +1283,9 @@ data:
# Service traffic.
# AntreaProxy: true

# Enable NodePort Service support in AntreaProxy in antrea-agent.
AntreaProxyNodePort: true
Contributor

I would suggest not enabling it by default. Even if we enable it, it does not provide any value until we have a solution to remove kube-proxy.
@tnqn @antoninbas

Contributor

Agreed

Member

This is the plan. I think Weiqiang only enables it for testing. See PR description "To verify this PR, the NodePort support is enabled. The feature gate should be disabled before merging this PR."

Contributor Author

Yes, it is enabled only for testing. I will disable it when we're ready to merge.

@weiqiangt weiqiangt force-pushed the antrea-proxy-nodeport-dup branch 2 times, most recently from df06836 to 533826a Compare February 18, 2021 08:20
@weiqiangt weiqiangt changed the title Add NodePort support for Antrea Proxy Add NodePort support for Antrea Proxy on Linux Feb 18, 2021
@weiqiangt weiqiangt force-pushed the antrea-proxy-nodeport-dup branch 3 times, most recently from 5b3b1fd to 2342515 Compare February 24, 2021 07:58
}

func newOptions() *Options {
return &Options{
nodePortVirtualIP: net.ParseIP(defaultNodePortVirtualIP),
Contributor

Probably better to move these to setDefaults()? I am fine with keeping these two options hardcoded for now, but maybe later we should still make them configurable.

@@ -29,50 +29,54 @@ import (
const NodePortLocalChain = "ANTREA-NODE-PORT-LOCAL"

// IPTableRules provides a client to perform IPTABLES operations
-type iptablesRules struct {
+type IPTableRules struct {
Contributor

Did you drop the "s" (from "iptables" to "IPTable") on purpose? I feel it should be "IPTablesRules".

@@ -532,6 +532,15 @@ func (c *client) InstallGatewayFlows() error {
// Add flow to ensure the liveness check packet could be forwarded correctly.
flows = append(flows, c.localProbeFlow(gatewayIPs, cookie.Default)...)
flows = append(flows, c.ctRewriteDstMACFlows(gatewayConfig.MAC, cookie.Default)...)
if c.enableProxy {
Contributor

Should we add the flows only when NodePort is enabled?

func (c *client) arpNodePortVirtualResponderFlow() binding.Flow {
return c.pipeline[arpResponderTable].BuildFlow(priorityNormal).MatchProtocol(binding.ProtocolARP).
MatchARPOp(1).
MatchARPTpa(c.nodePortVirtualIPv4).
Contributor

When the Node IP is IPv6, there is no need to add this flow.

@@ -657,3 +794,145 @@ func (c *Client) UnMigrateRoutesFromGw(route *net.IPNet, linkName string) error
}
return nil
}

func (c *Client) ReconcileNodePort(nodeIPs []net.IP, svcEntries []*proxytypes.ServiceInfo) error {
Contributor

I did not see where this func is called, but that might be my mistake?

@@ -38,6 +39,20 @@ type Interface interface {
// It should do nothing if the routes don't exist, without error.
DeleteRoutes(podCIDR *net.IPNet) error

// AddRoutes should add the route to the NodePort virtual IP.
AddNodePortRoute(isIPv6 bool) error
Contributor

What do you think about adding a separate interface for AddNodePortRoute, AddNodePort, and DeleteNodePort, which can have different implementations on Linux (by the Route Client) and Windows (by OpenFlow)? Or maybe they are not needed on Windows at all, as the existing OpenFlow Client interfaces can already cover these functions?
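A rough sketch of what such a separate interface might look like. The interface name `NodePortConfigurer`, the signatures of `AddNodePort` and `DeleteNodePort`, and the in-memory `fakeConfigurer` are all hypothetical; only `AddNodePortRoute(isIPv6 bool) error` appears in the diff above.

```go
package main

import (
	"errors"
	"fmt"
)

// NodePortConfigurer is a hypothetical interface name. Only the
// AddNodePortRoute signature is taken from the diff; the other two
// signatures are assumptions made for illustration.
type NodePortConfigurer interface {
	AddNodePortRoute(isIPv6 bool) error
	AddNodePort(port uint16, protocol string) error
	DeleteNodePort(port uint16, protocol string) error
}

// fakeConfigurer records calls in memory. A Linux implementation would
// program routes and NAT rules instead, and a Windows one OpenFlow entries.
type fakeConfigurer struct {
	routes map[bool]bool
	ports  map[string]bool
}

func newFakeConfigurer() *fakeConfigurer {
	return &fakeConfigurer{routes: map[bool]bool{}, ports: map[string]bool{}}
}

func key(port uint16, protocol string) string {
	return fmt.Sprintf("%s/%d", protocol, port)
}

func (f *fakeConfigurer) AddNodePortRoute(isIPv6 bool) error {
	f.routes[isIPv6] = true
	return nil
}

func (f *fakeConfigurer) AddNodePort(port uint16, protocol string) error {
	if port == 0 {
		return errors.New("port must be non-zero")
	}
	f.ports[key(port, protocol)] = true
	return nil
}

func (f *fakeConfigurer) DeleteNodePort(port uint16, protocol string) error {
	delete(f.ports, key(port, protocol)) // deleting a missing entry is a no-op
	return nil
}

// Compile-time check that the fake satisfies the interface.
var _ NodePortConfigurer = (*fakeConfigurer)(nil)

func main() {
	var c NodePortConfigurer = newFakeConfigurer()
	fmt.Println(c.AddNodePortRoute(false), c.AddNodePort(30080, "TCP"), c.DeleteNodePort(30080, "TCP"))
}
```

Callers such as the proxier would then depend only on the interface, which keeps the platform-specific choice out of the shared code.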

Labels
area/component/agent Issues or PRs related to the agent component kind/design Categorizes issue or PR as related to design.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Antrea Proxy NodePort Service Support
8 participants