
matchLabels under NetworkPolicy's main podSelector section is matching by ANY label match, not EVERY label match #1135

Closed
uipo78 opened this issue Aug 2, 2019 · 31 comments

Comments

@uipo78

uipo78 commented Aug 2, 2019

What happened:

matchLabels under a NetworkPolicy's top-level podSelector section appears to select pods that match ANY of the listed labels, rather than requiring EVERY label under that section to match. This contradicts this section of the Kubernetes docs.

What you expected to happen:

All labels under matchLabels of the NetworkPolicy's main podSelector must match in order for the network policy to apply to a pod.

How to reproduce it (as minimally and precisely as possible):

Suppose I have two hello-world apps, one with a service named hello-world-1 and the other with a service named hello-world-2, both exposed on port 8000. Suppose further that both share the label app=hello-world, while hello-world-1 also has the label number=one and hello-world-2 has the label number=two. If I deploy both in the same namespace with the following network policies in place, I expect hello-world-1 to be the only one that receives traffic:

apiVersion: extensions/v1beta1
kind: NetworkPolicy
metadata:
  name: networkpolicy-1
spec:
  podSelector:
    matchLabels:
      app: hello-world
      number: one
  policyTypes:
  - Ingress
  ingress:
    - {}   # empty rule: allow all ingress to the selected pods
---
apiVersion: extensions/v1beta1
kind: NetworkPolicy
metadata:
  name: networkpolicy-2
spec:
  podSelector:
    matchLabels:
      app: hello-world
      number: two
  policyTypes:
  - Ingress
  - Egress
  # no ingress or egress rules are specified, so all traffic to and from the selected pods is denied

However, both apps receive traffic.
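For completeness, here is a minimal sketch of the pod labels assumed above (pod names, image, and arguments are illustrative placeholders, not the actual deployments):

apiVersion: v1
kind: Pod
metadata:
  name: hello-world-1
  labels:
    app: hello-world
    number: one          # networkpolicy-1 should match this pod (app AND number)
spec:
  containers:
  - name: web
    image: hashicorp/http-echo    # assumption: any HTTP server listening on port 8000 works here
    args: ["-listen=:8000", "-text=hello-world-1"]
---
apiVersion: v1
kind: Pod
metadata:
  name: hello-world-2
  labels:
    app: hello-world
    number: two          # networkpolicy-2 should match this pod and deny its traffic
spec:
  containers:
  - name: web
    image: hashicorp/http-echo
    args: ["-listen=:8000", "-text=hello-world-2"]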

Anything else we need to know?:

Nope

Environment:

  • Kubernetes version (use kubectl version): 1.13.7
@sauryadas
Contributor

@aanandr can you please take a look?

@sauryadas
Contributor

@uipo78 Can you please confirm you are using Azure network policies?

@uipo78
Author

uipo78 commented Aug 11, 2019

I'm using standard Kubernetes network policies on clusters using the Azure CNI.

@aanandr

aanandr commented Aug 13, 2019

@uipo78 - the matching should be an ANY match. Please see this link - https://kubernetes.io/docs/concepts/services-networking/network-policies/
However, if you don't specify an ingress or egress section (or both) for the policy types you list, then that traffic should be denied. @saiyan86 - can you check?
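For reference, a namespace-wide default-deny-ingress policy in the spirit of the docs linked above looks roughly like this (the name is illustrative):

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-ingress   # illustrative name
spec:
  podSelector: {}              # selects every pod in the namespace
  policyTypes:
  - Ingress                    # Ingress is listed but no ingress rules are given, so all ingress is denied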

@uipo78
Author

uipo78 commented Aug 14, 2019

@aanandr I don't see anything about the behavior of label selectors specific to network policies in the docs that you referenced.

@uipo78
Author

uipo78 commented Aug 14, 2019

I'd also be surprised if label selectors behave differently for network policies than they do generally: https://kubernetes.io/docs/concepts/overview/working-with-objects/labels/#resources-that-support-set-based-requirements
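For what it's worth, my understanding is that a matchLabels map is just shorthand for a set of ANDed matchExpressions, so the podSelector in networkpolicy-1 above should be equivalent to:

podSelector:
  matchExpressions:
  - key: app
    operator: In
    values: ["hello-world"]
  - key: number
    operator: In
    values: ["one"]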

@aanandr

aanandr commented Aug 14, 2019

@uipo78 - I misread your earlier description. My apologies. You are right - if there are multiple match labels to select a Pod then they should use the AND clause. This is a bug and it has just been reported by a few other customers too. We are actively working on a fix.
Thanks for bringing this to our notice and providing all this info. Appreciate it.

@uipo78
Author

uipo78 commented Aug 22, 2019

I noticed that the most recent release includes this bullet:

Fixed Azure Network Policy bug with multiple labels under a matchLabels selector.

Is that related to this issue?

@uipo78
Author

uipo78 commented Sep 5, 2019

bueller?

@uipo78
Author

uipo78 commented Sep 10, 2019

@sauryadas @aanandr

@aanandr

aanandr commented Sep 10, 2019

@uipo78 - yes we rolled out a bunch of fixes recently and one of them fixes the issue reported here. I can confirm that.

@ball-hayden

Could you confirm the fixes have already been rolled out? I have been told by support that they won't be landing until 2019-09-20?

I also appear to still have the same version of azure-npm that I have had for the last couple of weeks - is there a version number in which the fixes land, please?

@uipo78
Author

uipo78 commented Sep 11, 2019

This is the release to which I was referring @ball-hayden: https://github.com/Azure/AKS/releases/tag/2019-08-19.

I'm going to close this issue, since @aanandr confirmed that the fix is part of the release mentioned above (that's where my bullet point comes from).

@uipo78 uipo78 closed this as completed Sep 11, 2019
@jnoller
Contributor

jnoller commented Sep 11, 2019

Heads up; this week's build does contain additional fixes. I'll reopen and then link the release going out now when the release notes are published.

@jnoller jnoller reopened this Sep 11, 2019
@uipo78
Author

uipo78 commented Sep 18, 2019

Ah ok, perfect. I was going to reopen this issue anyway because it doesn't appear to be resolved in 1.14.6 as mentioned earlier.

@saiyan86

saiyan86 commented Sep 19, 2019

FYI, the image that contains the fix is here: mcr.microsoft.com/containernetworking/azure-npm:v1.0.27. Source code, if you are curious.

@kagkarlsson

When and how does this change propagate out to managed AKS clusters? I see we are currently at azure-npm:v1.0.18

@blaw2422

@aanandr , could you tell me if this bug would also cause egress traffic to be blocked? If I use the example policy to "allow all egress traffic", I still have egress traffic blocked. https://kubernetes.io/docs/concepts/services-networking/network-policies/#default-allow-all-ingress-traffic
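For reference, the "allow all egress traffic" example from those docs is essentially the following (the name is illustrative):

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-all-egress   # illustrative name
spec:
  podSelector: {}
  egress:
  - {}                     # empty rule: allow all egress from the selected pods
  policyTypes:
  - Egress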

@saiyan86

> @aanandr , could you tell me if this bug would also cause egress traffic to be blocked? If I use the example policy to "allow all egress traffic", I still have egress traffic blocked. https://kubernetes.io/docs/concepts/services-networking/network-policies/#default-allow-all-ingress-traffic

Hi @blaw2422 the fix is here: Azure/azure-container-networking#398
and will be released as v1.0.28.

@saiyan86

> When and how does this change propagate out to managed AKS clusters? I see we are currently at azure-npm:v1.0.18

Hi @kagkarlsson it should be updated weekly. Did your cluster get the update?

@kagkarlsson

Ok, good to know that it is supposed to be weekly updates. To be honest I cannot confirm we have gotten the update because we switched to Calico after running into what seemed like bugs with the other implementation.

@uipo78
Author

uipo78 commented Oct 2, 2019

I didn't get the update. We're running on Kubernetes v1.14.6. The azure-npm image is still v1.0.27.

@uipo78
Author

uipo78 commented Oct 2, 2019

It's also worth mentioning that I reconstructed a cluster yesterday, and even then, the tag for azure-npm is still v1.0.27.

@uipo78
Author

uipo78 commented Oct 2, 2019

Is there a project manager that can provide accurate details on the release of this fix? At this point, it feels like the AKS team is conjecturing.

@uipo78
Author

uipo78 commented Oct 2, 2019

Just to confirm: we're observing this issue's original problem in azure-npm:v1.0.27.

@saiyan86

saiyan86 commented Oct 2, 2019

@jaer-tsun @matmerr

@aanandr

aanandr commented Oct 2, 2019

@uipo78 - apologies for the inconvenience. The fix for this issue is currently in aks-engine, and we are working with the AKS team to get it rolled out to AKS as well.

@HansK-p

HansK-p commented Feb 13, 2020

I must admit I'm also looking forward to a fix here. I have what I believe is the same problem in AKS v1.15.7 (and earlier versions).

My understanding and experience is that using both Egress and Ingress Network Policies gives a somewhat unpredictable and unstable result in AKS (it has worked fine with Calico for years).

This means we have to choose between using Egress or Ingress network policies in our namespaces, which limits our options: we have to decide whether to protect against a break-in to our pods or a break-out from them, but we can't do both at the same time. That is, we can't simultaneously deny network access from the namespace to resources outside the AKS cluster and restrict network access into the namespace from the outside.

It would be nice to get an update, or a correction if I've misunderstood the implications of this problem with AKS network policies.
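As a hypothetical sketch of what we would like to express in a single namespace (all names and selectors below are placeholders, not our actual configuration):

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: restrict-both-directions   # placeholder name
spec:
  podSelector: {}                  # every pod in the namespace
  policyTypes:
  - Ingress
  - Egress
  ingress:
  - from:
    - namespaceSelector:
        matchLabels:
          purpose: ingress         # placeholder: only allow traffic from a designated namespace
  egress:
  - to:
    - namespaceSelector: {}        # only allow egress to pods inside the cluster
  # note: DNS and other cluster services may need additional explicit egress rules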

@ghost ghost added the action-required label Jul 22, 2020
@ghost

ghost commented Jul 27, 2020

Action required from @Azure/aks-pm

@ghost ghost added the Needs Attention 👋 Issues needs attention/assignee/owner label Jul 27, 2020
@ghost

ghost commented Aug 6, 2020

Issue needing attention of @Azure/aks-leads

@palma21
Member

palma21 commented Aug 6, 2020

The fix for the OP's issue should have been released some months back. If you still experience the same behavior as the OP, please do comment back.

@palma21 palma21 closed this as completed Aug 6, 2020
@palma21 palma21 added resolution/fix-released and removed Needs Attention 👋 Issues needs attention/assignee/owner action-required labels Aug 6, 2020
@ghost ghost locked as resolved and limited conversation to collaborators Sep 6, 2020