Additional selectorLabels are not added to primary deployment #1312

Open
AliakseiVenski opened this issue Nov 9, 2022 · 9 comments

@AliakseiVenski

Describe the bug

I tried to use the selectorLabels option described there, but my additional selector labels are present only on the original deployment {deploy} and not on the primary {deploy}-primary created by Flagger.

original deployment:

spec:
  replicas: 0
  selector:
    matchLabels:
      app.kubernetes.io/instance: serviceName
      app.kubernetes.io/name: serviceName

primary deployment:

spec:
  replicas: 2
  selector:
    matchLabels:
      app.kubernetes.io/name: serviceName

Flagger values.yaml:

image:
  tag: 1.22.2
meshProvider: traefik
metricsServer: http://prometheus-operator-kube-p-prometheus.prometheus-operator:9090
resources:
  limits:
    cpu: 1000m
    memory: 512Mi
  requests:
    cpu: 20m
    memory: 64Mi
selectorLabels: "app.kubernetes.io/name,app.kubernetes.io/instance"

Flagger container spec:

spec:
      containers:
        - name: flagger
          image: ghcr.io/fluxcd/flagger:1.22.2
          command:
            - ./flagger
            - '-log-level=info'
            - '-mesh-provider=traefik'
            - >-
              -metrics-server=http://prometheus-operator-kube-p-prometheus.prometheus-operator:9090
            - '-selector-labels=app.kubernetes.io/name,app.kubernetes.io/instance'
            - '-enable-config-tracking=true'
            - '-slack-user=flagger'

P.S.: I'm using the Helm chart for this deployment.

To Reproduce

  • Configure Flagger to take additional selector labels into consideration by setting selectorLabels in its values.yaml
  • Add one of those additional labels to your deployment's selector (Helm)

Expected behavior

All labels specified in 'selectorLabels' are copied from the original deployment to the primary after a successful promotion.
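
For reference, a minimal sketch of what the reporter expects the primary deployment's selector to look like, with both entries from selectorLabels carried over (values follow the serviceName placeholder from the example above):

spec:
  replicas: 2
  selector:
    matchLabels:
      app.kubernetes.io/instance: serviceName
      app.kubernetes.io/name: serviceName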

Additional context

  • Flagger version: 1.22.2
  • Kubernetes version: 1.24.3
  • Service Mesh provider: -
  • Ingress provider: Traefik
@AliakseiVenski
Author

I noticed one more thing: after an unsuccessful promotion the deployment is rolled back, but after 5-10 seconds I see 'New revision detected' again and Flagger tries to promote the failed deployment a second time.

@AliakseiVenski
Author

AliakseiVenski commented Nov 29, 2022

@aryan9600 could you help please? I looked at #1227 and still do not know how to resolve this.

@AliakseiVenski
Author

I left just 'app.kubernetes.io/instance' as the single selector label, and now all services have this label as a selector, so I'm doing everything correctly. But if I specify more than one label, all labels after the first one are ignored by Flagger.
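
For anyone else hitting this, the single-label workaround described above comes down to this entry in the Flagger chart's values.yaml (the same file shown earlier in this issue):

selectorLabels: "app.kubernetes.io/instance"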

@AliakseiVenski
Author

Regarding deployment selector labels: they change only if you delete and then reinstall the Helm release, so the Flagger operator doesn't modify the primary deployment's selector labels on the fly (it does work for the service selector label).
That's already 2 bugs in one post :(

@jkotiuk

jkotiuk commented Mar 1, 2023

I'm observing the same issue. We are running Flagger 1.27.

In the Helm chart I've set the following values:
selectorLabels: "app.kubernetes.io/name,app.kubernetes.io/instance"

On the initial deployment we have the following labels:

spec:                                             
  selector:                                       
    matchLabels:                                  
      app.kubernetes.io/instance: xxx             
      app.kubernetes.io/name: xxx                 
  template:                                       
    metadata:                                     
      labels:                                     
        app.kubernetes.io/instance: xxx           
        app.kubernetes.io/name: xxx               
    spec:
      affinity:
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
          - labelSelector:
              matchLabels:
                app.kubernetes.io/instance: xxx
                app.kubernetes.io/name: xxx
            topologyKey: kubernetes.io/hostname
      topologySpreadConstraints:
      - labelSelector:
          matchLabels:
            app.kubernetes.io/instance: xxx
            app.kubernetes.io/name: xxx
        maxSkew: 1
        topologyKey: topology.kubernetes.io/zone
        whenUnsatisfiable: ScheduleAnyway

On the primary deployment, not all template labels are updated. This causes the affinity and topology spread constraint rules to not work at all due to the mismatch in labels.

spec:
  selector:
    matchLabels:
      app.kubernetes.io/name: xxx-primary
  template:
    metadata:
      labels:
        app.kubernetes.io/instance: xxx             <- missing primary
        app.kubernetes.io/name: xxx-primary
    spec:
      affinity:
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
          - labelSelector:
              matchLabels:
                app.kubernetes.io/instance: xxx-primary
                app.kubernetes.io/name: xxx-primary
            topologyKey: kubernetes.io/hostname
      topologySpreadConstraints:
      - labelSelector:
          matchLabels:
            app.kubernetes.io/instance: xxx-primary
            app.kubernetes.io/name: xxx-primary
        maxSkew: 1
        topologyKey: topology.kubernetes.io/zone
        whenUnsatisfiable: ScheduleAnyway

From my tests, only the first label listed in the selectorLabels parameter is updated in template.metadata.
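
For comparison, a sketch of the pod template labels that would keep the affinity and topology spread selectors above consistent, assuming Flagger rewrote every label listed in selectorLabels with the -primary suffix rather than only the first one:

spec:
  template:
    metadata:
      labels:
        app.kubernetes.io/instance: xxx-primary
        app.kubernetes.io/name: xxx-primary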

@pinkavaj
Contributor

pinkavaj commented Oct 9, 2023

That's exactly what the code does: https://github.com/fluxcd/flagger/blob/main/pkg/canary/daemonset_controller.go#L314. This completely breaks the deployment for us :(

@AliakseiVenski
Author

AliakseiVenski commented Oct 9, 2023

Hey @aryan9600, any comments on this? You can see that users are facing this issue, with impact ranging from small to high. Almost a year after the issue was created and still no feedback at all.

@AliakseiVenski
Author

P.S.: we recently updated Flagger to the latest version; the issue still persists.

@KrylixZA

KrylixZA commented May 21, 2024

Bumping this thread. I just ran into this problem while trying to implement safe deployments for an event-driven application using Dapr. Without getting into too many details, Dapr creates an <app-name>-dapr network service for service discovery as part of the features it provides. When the canary gets promoted and the labels are not copied over correctly, the Dapr network service loses track of the <app-name>-primary pods and the entire system stops working.

This is a total blocker for me :(
