No topology key found on hw nodes #400

Open
SNB-hz opened this issue Mar 27, 2023 · 8 comments · Fixed by #743
Assignees: lukasmetzner
Labels: bug (Something isn't working), pinned

Comments

@SNB-hz

SNB-hz commented Mar 27, 2023

In clusters with hardware nodes, a new PVC and its workload can be stuck in Pending state if they are scheduled without nodeAffinity.

Steps to reproduce:

  • run a cluster that includes a hardware worker, and label the hw node with instance.hetzner.cloud/is-root-server=true as mentioned in the README (see the command sketch after this list)
  • install CSI driver according to instructions
  • apply the test-pvc and pod mentioned in the README, using the default storageClass with WaitForFirstConsumer volumeBindingMode
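For reference, labeling the hardware node as described in the README looks roughly like this; the node name is a placeholder taken from the example output further down:

kubectl label node hardwarenode.testcluster instance.hetzner.cloud/is-root-server=true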

Expected Behaviour:

hcloud-csi-controller should provide the desired/required topology constraints to the k8s scheduler, which then schedules the pod on a node fulfilling the topology requirements.
As the hardware node does not run csi-driver and cannot mount hetzner cloud volumes, the workload should not be scheduled there.

Observed Behaviour:

  • Both pvc and pod are stuck in Pending state.
  • the container csi-provisioner of the CSI Controller deployment logs this error (a retrieval sketch follows this list):
'Warning' reason: 'ProvisioningFailed' failed to provision volume with StorageClass "hcloud-volumes": error generating accessibility requirements: no topology key found on CSINode hardwarenode.testcluster
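These provisioner logs can typically be fetched with something like the following; the Deployment name and the kube-system namespace are assumptions and may differ depending on how the driver was installed:

kubectl logs -n kube-system deployment/hcloud-csi-controller -c csi-provisioner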

More Info:
Tested with csi-driver 2.1.1 as well as 2.2.0, together with csi-provisioner 3.4.0

  • the DaemonSet for hcloud-csi-node does not run on the hw node
  • because of this, the csinode object for the node lists no driver:
kubectl get csinode
NAME                     DRIVERS       AGE
virtualnode.testcluster     1           1d
hardwarenode.testcluster    0           1d
  • the csinode object of the virtual node looks ok:
kubectl get csinode virtualnode.testcluster -oyaml
apiVersion: storage.k8s.io/v1
kind: CSINode
...
spec:
  drivers:
  - allocatable:
      count: 16
    name: csi.hetzner.cloud
    nodeID: "12769030"
    topologyKeys:
    - csi.hetzner.cloud/location
  • the csinode object of the hardware node does not have a driver and therefore no topology key, as the node intentionally runs no hcloud-csi-node pod due to the nodeAffinity:
kubectl get csinode hardwarenode.testcluster -oyaml
apiVersion: storage.k8s.io/v1
kind: CSINode
...
spec:
  drivers: null

Theory

It seems we are hitting this issue in csi-provisioner.
As the hardware node has no csi-driver pod and therefore no driver or topology key listed, the csi-provisioner breaks: it tries to build the preferred topology to pass to the scheduler, but because the hardware node has no topology key, it fails. Pod and PVC cannot finish scheduling and remain in Pending state forever.

Workaround

This issue can be avoided by making sure the object that uses the PVC (StatefulSet, Pod, etc.) cannot be scheduled on the hardware node in the first place. This can be done by specifying a nodeAffinity:

    spec:
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
            - matchExpressions:
              - key: instance.hetzner.cloud/is-root-server
                operator: NotIn
                values:
                - "true"

Proposed Solution

The external-provisioner issue lists a few possible solutions on the csi-driver side, such as running the csi-driver on all nodes, including hardware nodes.
The CSI controller would then need to be aware of which nodes are virtual and which are hardware when providing topology preferences to the k8s scheduler.

@hypery2k

I'm having the same issue; it seems that the wrong node is selected:

- apiVersion: v1
  kind: PersistentVolumeClaim
  metadata:
    annotations:
      volume.beta.kubernetes.io/storage-provisioner: csi.hetzner.cloud
      volume.kubernetes.io/selected-node: production-agent-large-srd
      volume.kubernetes.io/storage-provisioner: csi.hetzner.cloud
    creationTimestamp: "2023-03-24T05:54:30Z"
    finalizers:
    - kubernetes.io/pvc-protection
    labels:
      app.kubernetes.io/component: primary
      app.kubernetes.io/instance: pcf-app
      app.kubernetes.io/name: postgresql
    name: data-pcf-app-postgresql-0
    namespace: pen-testing
    resourceVersion: "4164426"
    uid: 0c39bdac-5540-4a34-b274-151a6409cdbf
  spec:
    accessModes:
    - ReadWriteOnce
    resources:
      requests:
        storage: 8Gi
    storageClassName: hcloud-volumes
    volumeMode: Filesystem
  status:
    phase: Pending
- apiVersion: v1
  kind: PersistentVolumeClaim
  metadata:
    annotations:
      meta.helm.sh/release-name: reconmap-app
      meta.helm.sh/release-namespace: pen-testing
      pv.kubernetes.io/bind-completed: "yes"
      pv.kubernetes.io/bound-by-controller: "yes"
      volume.beta.kubernetes.io/storage-provisioner: csi.hetzner.cloud
      volume.kubernetes.io/selected-node: production-storage-yhq
      volume.kubernetes.io/storage-provisioner: csi.hetzner.cloud
    creationTimestamp: "2023-03-22T08:09:04Z"
    finalizers:
    - kubernetes.io/pvc-protection
    labels:
      app: mysql
      app.kubernetes.io/managed-by: Helm
    name: reconmap-app-mysql-pv-claim
    namespace: pen-testing
    resourceVersion: "3367563"
    uid: e355ac30-2136-4193-8264-04e33bc335c8
  spec:
    accessModes:
    - ReadWriteOnce
    resources:
      requests:
        storage: 20Gi
    storageClassName: hcloud-volumes
    volumeMode: Filesystem
    volumeName: pvc-e355ac30-2136-4193-8264-04e33bc335c8
  status:
    accessModes:
    - ReadWriteOnce
    capacity:
      storage: 20Gi
    phase: Bound

I'm seeing these logs:

17m         Normal    WaitForFirstConsumer   persistentvolumeclaim/data-pcf-app-postgresql-0   waiting for first consumer to be created before binding
12m         Normal    ExternalProvisioning   persistentvolumeclaim/data-pcf-app-postgresql-0   waiting for a volume to be created, either by external provisioner "csi.hetzner.cloud" or manually created by system administrator
12m         Normal    Provisioning           persistentvolumeclaim/data-pcf-app-postgresql-0   External provisioner is provisioning volume for claim "pen-testing/data-pcf-app-postgresql-0"
12m         Warning   ProvisioningFailed     persistentvolumeclaim/data-pcf-app-postgresql-0   failed to provision volume with StorageClass "hcloud-volumes": error generating accessibility requirements: no topology key found on CSINode production-agent-large-srd
10m         Normal    WaitForFirstConsumer   persistentvolumeclaim/data-pcf-app-postgresql-0   waiting for first consumer to be created before binding
6s          Normal    ExternalProvisioning   persistentvolumeclaim/data-pcf-app-postgresql-0   waiting for a volume to be created, either by external provisioner "csi.hetzner.cloud" or manually created by system administrator
61s         Normal    Provisioning           persistentvolumeclaim/data-pcf-app-postgresql-0   External provisioner is provisioning volume for claim "pen-testing/data-pcf-app-postgresql-0"
61s         Warning   ProvisioningFailed     persistentvolumeclaim/data-pcf-app-postgresql-0   failed to provision volume with StorageClass "hcloud-volumes": error generating accessibility requirements: no topology key found on CSINode production-agent-large-srd

When manually updating the volume.kubernetes.io/selected-node annotation to production-storage-yhq, it works.
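For anyone trying the same workaround, the annotation can be changed with a command along these lines (PVC name, namespace, and target node taken from the example above; adjust to your cluster):

kubectl annotate pvc data-pcf-app-postgresql-0 -n pen-testing \
  volume.kubernetes.io/selected-node=production-storage-yhq --overwrite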

@samcday
Contributor

samcday commented Apr 15, 2023

As per the hint in the linked issue, perhaps this can be easily solved by setting allowed topologies on the StorageClass? That is, if the StorageClass has an allowedTopologies selector that accurately matches hcloud Nodes only, then we can be sure the Kubernetes scheduler won't try to schedule a Pod with hcloud PVC attachment(s) on non-hcloud nodes.

This only solves the issue for Kube; I have no idea about Swarm/Nomad.
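For illustration only, such a StorageClass might look roughly like the sketch below. allowedTopologies supports only In-style matches, so it needs a label that positively identifies cloud nodes; the instance.hetzner.cloud/provided-by key used here is an assumption, not necessarily what the driver ships:

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: hcloud-volumes
provisioner: csi.hetzner.cloud
volumeBindingMode: WaitForFirstConsumer
reclaimPolicy: Delete
allowVolumeExpansion: true
allowedTopologies:
- matchLabelExpressions:
  - key: instance.hetzner.cloud/provided-by # assumed label marking cloud (non-root) servers
    values:
    - cloud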

@github-actions

This issue has been marked as stale because it has not had recent activity. The bot will close the issue if no further action occurs.

@github-actions github-actions bot added the Stale label Jul 14, 2023
@github-actions github-actions bot closed this as not planned Won't fix, can't repro, duplicate, stale Aug 14, 2023
@apricote apricote removed the Stale label Aug 14, 2023
@apricote apricote reopened this Aug 14, 2023

This issue has been marked as stale because it has not had recent activity. The bot will close the issue if no further action occurs.

@github-actions github-actions bot added the Stale label Nov 12, 2023
@apricote apricote added bug Something isn't working pinned and removed Stale labels Nov 13, 2023
@lukasmetzner lukasmetzner self-assigned this Oct 9, 2024
lukasmetzner added a commit that referenced this issue Oct 29, 2024
Due to a bug in the scheduler a node with no driver instance might be
picked and the volume is stuck in pending as the "no capacity ->
reschedule" recovery is never triggered
[[0]](kubernetes/kubernetes#122109),
[[1]](kubernetes-csi/external-provisioner#544).

- See #400

---------

Co-authored-by: lukasmetzner <[email protected]>
Co-authored-by: Julian Tölle <[email protected]>
@samcday
Contributor

samcday commented Oct 29, 2024

Great to see such a thorough and satisfying conclusion/solution here! 👍

@apricote
Member

(never sure if you are sarcastic or not)

You can check the updated docs to learn more about it: https://github.com/hetznercloud/csi-driver/tree/main/docs/kubernetes#integration-with-root-servers

We ended up going with allowedTopologies in the StorageClass, as you suggested in #400 (comment)

The necessary label is automatically added by hcloud-cloud-controller-manager if the customer is running that in their cluster.
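If you want to verify that the shipped StorageClass now carries these constraints, inspecting it should show an allowedTopologies section (the exact label key depends on the driver and hcloud-cloud-controller-manager versions):

kubectl get storageclass hcloud-volumes -o yaml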

@samcday
Contributor

samcday commented Oct 29, 2024

I'm impressed that my customary acerbic wit has left such an indelible mark ;)

I wasn't being sarcastic at all! I had also tripped over the corresponding stuff in cluster-autoscaler - hence being impressed with the thoroughness of the fix here! (And of course that the fix took a similar shape to how I proposed also leaves me feeling additionally chuffed xD)

lukasmetzner added a commit that referenced this issue Nov 11, 2024
Due to a bug in the scheduler a node with no driver instance might be
picked and the volume is stuck in pending as the "no capacity ->
reschedule" recovery is never triggered
[[0]](kubernetes/kubernetes#122109),
[[1]](kubernetes-csi/external-provisioner#544).

- See #400

---------

Co-authored-by: lukasmetzner <[email protected]>
Co-authored-by: Julian Tölle <[email protected]>
lukasmetzner pushed a commit that referenced this issue Nov 12, 2024
### ⚠️ Removed Feature from v2.10.0

We have reverted a workaround for an upstream issue in the Kubernetes
scheduler where nodes without the CSI Plugin (e.g. Robot servers) would
still be considered for scheduling, but then creating and attaching the
volume fails with no automatic reconciliation of this error.

Due to variations in the CSI specification implementation, these changes
disrupted Nomad clusters, requiring us to revert them. We are actively
working on placing this workaround behind a feature flag, allowing
Kubernetes users to bypass the upstream issue.

This affects you, if you have set the Helm value
`allowedTopologyCloudServer` in v2.10.0. If you are affected by the
Kubernetes upstream issue, we will provide a fix in the next minor
version v2.11.0.

Learn more about this in
[#400](#400) and
[#771](#771).

### Bug Fixes

- reverted NodeGetInfo response as it breaks Nomad clusters (#776)

Co-authored-by: releaser-pleaser <>
@lukasmetzner
Contributor

Hi,

We encountered compatibility issues with Nomad clusters due to differences in CSI Spec implementations, which led us to revert our recent changes. We’ve now released v2.10.1 to address this. Moving forward, we’ll implement a feature flag to reintroduce this workaround, scheduled for release in v2.11.0.

We apologize for any inconvenience this may have caused.

Best regards,
Lukas

@lukasmetzner lukasmetzner reopened this Nov 12, 2024