Remove Agent's dependency on proxy to access Antrea Service #6361
Conversation
@tnqn This is related to the idea you brought up in #6342 (comment). I considered 2 solutions:
I don't really have a strong preference between the 2 and I can go either way. One issue with the current approach is the transient error logs when the Agent is started before the Controller, but maybe it's pretty minor:
If we resolve every time …
Let me know if you have a preference between the 2.
endpointsInformer.Informer().AddEventHandler(cache.FilteringResourceEventHandler{
    FilterFunc: func(obj interface{}) bool {
        // The Endpoints resource for a Service has the same name as the Service.
        if service, ok := obj.(*corev1.Service); ok {
should it be Endpoints?
Thanks 😊 that explains why the tests are failing...
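For illustration, since the handler is registered on the Endpoints informer, the type assertion needs to target *corev1.Endpoints rather than *corev1.Service. A minimal sketch of the corrected filter, with a hypothetical helper name and parameters (not the exact PR code):

package resolver

import (
    corev1 "k8s.io/api/core/v1"
)

// endpointsFilter is a hypothetical helper returning a FilterFunc that only
// accepts the Endpoints resource matching the target Service. The Endpoints
// resource for a Service has the same name as the Service, so matching on
// namespace/name is enough.
func endpointsFilter(namespace, serviceName string) func(obj interface{}) bool {
    return func(obj interface{}) bool {
        if endpoints, ok := obj.(*corev1.Endpoints); ok {
            return endpoints.Namespace == namespace && endpoints.Name == serviceName
        }
        return false
    }
}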
        // We do not care about potential Status updates.
        if reflect.DeepEqual(newSvc.Spec, oldSvc.Spec) {
Maybe just check Generation, which is more efficient.
Is Generation available for all core resources? I never know.
I thought setting/updating generation was automatic for all resources, getting that impression from the Egress and AntreaNetworkPolicy CRs, but I just confirmed I was wrong: Service doesn't have a generation; generation calculation for core resources is added case by case.
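To illustrate the conclusion of this thread, a minimal sketch of the update filter (the helper name is hypothetical): because core resources like Service do not get metadata.Generation bumps, the filter falls back to a DeepEqual comparison of the Spec.

package resolver

import (
    "reflect"

    corev1 "k8s.io/api/core/v1"
)

// serviceSpecChanged reports whether a Service update is relevant to
// endpoint resolution. Unlike CRDs such as Egress, Service has no
// Generation to compare, so we DeepEqual the Spec and thereby ignore
// Status-only updates.
func serviceSpecChanged(oldObj, newObj interface{}) bool {
    oldSvc, okOld := oldObj.(*corev1.Service)
    newSvc, okNew := newObj.(*corev1.Service)
    if !okOld || !okNew {
        return false
    }
    return !reflect.DeepEqual(newSvc.Spec, oldSvc.Spec)
}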
        }
        return false
    },
    // Any change to Endpoints will trigger a resync.
It seems only a Subsets change will trigger a resync.
if err != nil {
    return err
}
// The separate Load an Store calls are safe because there is a single
Suggested change:
-// The separate Load an Store calls are safe because there is a single
+// The separate Load and Store calls are safe because there is a single
Force-pushed from 16a7c21 to d41062f.
@tnqn I addressed the comments and made some improvements. I also added a few unit tests for the new code. PTAL.
// the Endpoints resource is updated in a way that will cause this function to be called again.
if errors.IsServiceUnavailable(err) {
    klog.ErrorS(err, "Cannot resolve endpoint because Service is unavailable", "service", klog.KRef(r.namespace, r.serviceName))
    r.updateEndpointIfNeeded(nil)
@tnqn I wanted to bring this to your attention, as it means that GetAntreaClient can return (nil, non-nil error) when the Antrea Service is not available (e.g. the antrea-controller Pod is restarted). I believe that with the previous behavior, GetAntreaClient was guaranteed to always succeed after the first successful call (which doesn't mean that the Service could be accessed successfully). Let me know what you think. I think this is the right thing to do, and I don't think it should impact existing consumers of GetAntreaClient, but I can also remove the calls to r.updateEndpointIfNeeded(nil) if you prefer.
The change makes sense to me.
With the latest code, the new logs look much better because we avoid retries when unnecessary. Agent starting before Controller:
Controller restart (Pod deletion, new Pod scheduled on a different Node):
func NewEndpointResolver(kubeClient kubernetes.Interface, namespace, serviceName string, servicePort int32) *EndpointResolver {
    key := namespace + "/" + serviceName
    controllerName := fmt.Sprintf("ServiceEndpointResolver-%s", key)
Is ServiceEndpointResolver:kube-system/antrea more readable than ServiceEndpointResolver-kube-system/antrea?
serviceInformer.Informer().AddEventHandler(cache.FilteringResourceEventHandler{
    // FilterFunc ignores all Service events which do not relate to the named Service.
    // It should be redudant given the filtering that we already do at the informer level.
s/redudant/redundant, but do we still need to keep the filter?
I feel like it's ok to keep it. I saw the same pattern in ConfigMapCAController, even though I agree it is not needed.
/test-all
// If Antrea client is not ready within 5s, we assume that the Antrea Controller is not
// available. We proceed with our watches, which are likely to fail. In turn, this will
// trigger the fallback mechanism.
// 5s should be more than enough if the Antrea Controller is running correctly.
ctx, cancel := context.WithTimeout(wait.ContextForChannel(stopCh), 5*time.Second)
@tnqn I tried a few things, but this was the simplest and most "correct" IMO. Given that ConfigMapCAController is third-party code, it's hard to come up with a better solution (my preferred solution would have been to wait until both controllers have "synced" and processed their initial items).
This works well in practice: if the antrea-controller is already running, the client should be ready within 1 or 2 seconds, so we don't hit the timeout; otherwise, the timeout is short enough that we don't wait too long before falling back to local policy files.
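For illustration, the bounded-wait pattern from the quoted snippet, extracted into a self-contained sketch (function and parameter names are hypothetical): the wait is derived from the stop channel and capped at 5s, and the caller falls back to locally saved policies when the deadline is hit.

package agent

import (
    "context"
    "time"

    "k8s.io/apimachinery/pkg/util/wait"
)

// waitForAntreaClient waits for the Antrea client to become ready, but no
// longer than 5s and no longer than the lifetime of stopCh. It returns
// false when the deadline is hit, which is the caller's signal to fall
// back to saved policy files.
func waitForAntreaClient(stopCh <-chan struct{}, ready func(ctx context.Context) error) bool {
    // 5s should be more than enough if the Antrea Controller is running correctly.
    ctx, cancel := context.WithTimeout(wait.ContextForChannel(stopCh), 5*time.Second)
    defer cancel()
    return ready(ctx) == nil
}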
It looks good to me.
/test-all
I suppose this change is invisible to users and doesn't need to be in the release log?
LGTM once typos are fixed.
I don't feel very strongly either way, but if it were me I would mention it. I will add the release note.
We add Endpoint resolution to the AntreaClientProvider, so that when running in-cluster, accessing the Antrea Service (i.e., accessing the Antrea Controller API) no longer depends on the ClusterIP functionality provided by the K8s proxy, whether it is kube-proxy or AntreaProxy. This gives us more flexibility during Agent initialization. For example, when kube-proxy is removed and ProxyAll is enabled for AntreaProxy, accessing the Antrea Service no longer requires any routes or OVS flows installed by the Antrea Agent.
To implement this functionality, we add a controller (EndpointResolver) to watch the Antrea Service and the corresponding Endpoints resource. For every relevant update, the Endpoint is resolved and the new URL is sent to the AntreaClientProvider. This is a similar model to the one we already use for CA bundle updates.
Note that when the Service stops being available, we clear the Endpoint URL and notify listeners. This means that GetAntreaClient() can now return an error even if a previous call was successful.
We also update the NetworkPolicyController in the Agent, so that we fall back to saved policies in case the Antrea client does not become ready within 5s.
Signed-off-by: Antonin Bas <[email protected]>
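For illustration, a sketch of the notification model the commit message describes (all names are illustrative, not the exact Antrea API): the resolver stores the latest resolved Endpoint URL and fans change notifications out to registered listeners such as the AntreaClientProvider, clearing the URL when the Service becomes unavailable.

package resolver

import (
    "net/url"
    "sync"
)

// Listener is notified whenever the resolved Endpoint URL changes,
// mirroring the model already used for CA bundle updates.
type Listener interface {
    Notify()
}

// EndpointResolver (sketch) holds the latest resolved URL; nil means the
// Service currently has no available Endpoint.
type EndpointResolver struct {
    mu        sync.RWMutex
    endpoint  *url.URL
    listeners []Listener
}

func (r *EndpointResolver) AddListener(l Listener) {
    r.mu.Lock()
    defer r.mu.Unlock()
    r.listeners = append(r.listeners, l)
}

// updateEndpointIfNeeded replaces (or clears, when endpoint is nil) the
// stored URL, and notifies listeners only when the value actually changed.
func (r *EndpointResolver) updateEndpointIfNeeded(endpoint *url.URL) {
    r.mu.Lock()
    oldStr, newStr := "", ""
    if r.endpoint != nil {
        oldStr = r.endpoint.String()
    }
    if endpoint != nil {
        newStr = endpoint.String()
    }
    changed := oldStr != newStr
    r.endpoint = endpoint
    listeners := r.listeners
    r.mu.Unlock()
    if changed {
        for _, l := range listeners {
            l.Notify()
        }
    }
}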
Force-pushed from 2469992 to 00220b9.
/test-all
Linux e2e tests are currently failing because of a missing script:
LGTM
/test-all
/test-all
/test-vm-e2e
Until a set of "essential" flows has been installed. At the moment, we include NetworkPolicy flows (using podNetworkWait as the signal), Pod forwarding flows (reconciled by the CNIServer), and Node routing flows (installed by the NodeRouteController). This set can be extended in the future if desired.
We leverage the wrapper around sync.WaitGroup which was introduced previously in #5777. It simplifies unit testing, and we can achieve some symmetry with podNetworkWait.
We can also start leveraging this new wait group (flowRestoreCompleteWait) as the signal to delete flows from previous rounds. However, at the moment this is incomplete, as we don't wait for all controllers to signal that they have installed initial flows.
Because the NodeRouteController does not have an initial "reconcile" operation (like the CNIServer) to install flows for the initial Node list, we instead rely on a different mechanism provided by upstream K8s for controllers. When registering event handlers, we can request for the ADD handler to include a boolean flag indicating whether the object is part of the initial list retrieved by the informer. Using this mechanism, we can reliably signal through flowRestoreCompleteWait when this initial list of Nodes has been synced at least once.
This change is possible because of #6361, which removed the dependency on the proxy (kube-proxy or AntreaProxy) to access the Antrea Controller. Prior to #6361, there would have been a circular dependency in the case where kube-proxy was removed: flow-restore-wait will not be removed until the Pod network is "ready", which will not happen until the NetworkPolicy controller has started its watchers, and that depends on antrea Service reachability which depends on flow-restore-wait being removed.
Fixes #6338
Signed-off-by: Antonin Bas <[email protected]>
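The upstream mechanism mentioned above is client-go's ResourceEventHandlerDetailedFuncs, whose ADD handler receives a boolean indicating whether the object was part of the informer's initial list. A minimal sketch of how this could be wired up (the registration function and callback are hypothetical, not the exact Antrea code):

package agent

import (
    corev1 "k8s.io/api/core/v1"
    "k8s.io/client-go/informers"
    "k8s.io/client-go/tools/cache"
)

// registerNodeHandler registers a detailed ADD handler on the Node
// informer. The isInInitialList flag lets the controller tell initial-sync
// Nodes apart from later additions, which is what allows signaling
// flowRestoreCompleteWait once the initial Node list has been processed.
// markInitialNode is a hypothetical callback.
func registerNodeHandler(factory informers.SharedInformerFactory, markInitialNode func(node *corev1.Node)) {
    factory.Core().V1().Nodes().Informer().AddEventHandler(cache.ResourceEventHandlerDetailedFuncs{
        AddFunc: func(obj interface{}, isInInitialList bool) {
            if node, ok := obj.(*corev1.Node); ok && isInInitialList {
                markInitialNode(node)
            }
        },
    })
}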