Controller states: "Not enough available nodes" #440

Closed

boedy opened this issue Mar 24, 2023 · 4 comments

boedy commented Mar 24, 2023

I have a StatefulSet that's not able to deploy. See the information below.

PVC event:

```
failed to provision volume with StorageClass "fast": rpc error: code = Internal
desc = CreateVolume failed for pvc-de4e2543-6eca-4e78-a616-9f478c85e62b:
rpc error: code = ResourceExhausted desc = failed to enough replicas on requisite nodes:
Message: 'Not enough available nodes'
Details: 'Not enough nodes fulfilling the following auto-place criteria:
  * has a deployed storage pool named TransactionList [local-disk]
  * the storage pools have to have at least '10485760' free space
  * the current access context has enough privileges to use the node and the storage pool
  * the node is online
Auto-place configuration details:
  Place Count: 2
  Replicas on different nodes: TransactionList [Aux/topology/kubernetes.io/hostname]
  Replicas on same nodes: TransactionList [Aux/topology/topology.kubernetes.io/zone]
  Don't place with resource (List): [pvc-de4e2543-6eca-4e78-a616-9f478c85e62b]
  Node name: [eu-central-1771, eu-central-68d6, eu-central-7f14, eu-west-f5b4]
  Storage pool name: TransactionList [local-disk]
  Layer stack: TransactionList [DRBD, STORAGE]
Auto-placing resource: pvc-de4e2543-6eca-4e78-a616-9f478c85e62b'
```
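
For context, each auto-place criterion above maps onto a StorageClass parameter. A minimal sketch of what the "fast" StorageClass presumably looks like (the actual manifest isn't shown in this report, and the parameter keys are assumed from linstor-csi conventions):

```
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: fast
provisioner: linstor.csi.linbit.com
parameters:
  # "has a deployed storage pool named ... [local-disk]"
  linstor.csi.linbit.com/storagePool: local-disk
  # "Place Count: 2"
  linstor.csi.linbit.com/placementCount: "2"
  # "Replicas on same nodes: ... topology.kubernetes.io/zone"
  linstor.csi.linbit.com/replicasOnSame: topology.kubernetes.io/zone
  # "Replicas on different nodes: ... kubernetes.io/hostname"
  linstor.csi.linbit.com/replicasOnDifferent: kubernetes.io/hostname
```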

Controller logs:

```
14:17:45.638 [grizzly-http-server-0] INFO  LINSTOR/Controller - SYSTEM - New volume group with number '0' of resource group 'sc-d88271c8-8dc1-54a2-ab85-7aee8f9649b0' created.
14:17:46.041 [grizzly-http-server-0] INFO  LINSTOR/Controller - SYSTEM - New volume definition with number '0' of resource definition 'pvc-1076c61b-1dbb-42c5-b740-68f0873db644' created.
14:17:46.093 [MainWorkerPool-1] INFO  LINSTOR/Controller - SYSTEM - Drbd-auto-verify-Algo for pvc-1076c61b-1dbb-42c5-b740-68f0873db644 automatically set to crct10dif-pclmul
14:17:46.886 [MainWorkerPool-1] ERROR LINSTOR/Controller - SYSTEM - Not enough available nodes
14:22:48.636 [grizzly-http-server-0] INFO  LINSTOR/Controller - SYSTEM - New volume group with number '0' of resource group 'sc-d88271c8-8dc1-54a2-ab85-7aee8f9649b0' created.
14:22:49.021 [grizzly-http-server-0] INFO  LINSTOR/Controller - SYSTEM - New volume definition with number '0' of resource definition 'pvc-1076c61b-1dbb-42c5-b740-68f0873db644' created.
14:22:49.073 [MainWorkerPool-1] INFO  LINSTOR/Controller - SYSTEM - Drbd-auto-verify-Algo for pvc-1076c61b-1dbb-42c5-b740-68f0873db644 automatically set to crct10dif-pclmul
14:22:49.927 [MainWorkerPool-1] ERROR LINSTOR/Controller - SYSTEM - Not enough available nodes
14:23:04.981 [grizzly-http-server-1] INFO  LINSTOR/Controller - SYSTEM - New volume group with number '0' of resource group 'sc-d88271c8-8dc1-54a2-ab85-7aee8f9649b0' created.
14:23:05.376 [grizzly-http-server-1] INFO  LINSTOR/Controller - SYSTEM - New volume definition with number '0' of resource definition 'pvc-1076c61b-1dbb-42c5-b740-68f0873db644' created.
14:23:05.426 [MainWorkerPool-1] INFO  LINSTOR/Controller - SYSTEM - Drbd-auto-verify-Algo for pvc-1076c61b-1dbb-42c5-b740-68f0873db644 automatically set to crct10dif-pclmul
14:23:06.338 [MainWorkerPool-1] ERROR LINSTOR/Controller - SYSTEM - Not enough available nodes
14:25:40.300 [grizzly-http-server-0] INFO  LINSTOR/Controller - SYSTEM - New volume group with number '0' of resource group 'sc-d88271c8-8dc1-54a2-ab85-7aee8f9649b0' created.
14:25:40.691 [grizzly-http-server-0] INFO  LINSTOR/Controller - SYSTEM - New volume definition with number '0' of resource definition 'pvc-b874becc-aaa2-4700-b79f-a7a3e3f05bba' created.
14:25:40.741 [MainWorkerPool-1] INFO  LINSTOR/Controller - SYSTEM - Drbd-auto-verify-Algo for pvc-b874becc-aaa2-4700-b79f-a7a3e3f05bba automatically set to crct10dif-pclmul
14:27:52.141 [grizzly-http-server-1] INFO  LINSTOR/Controller - SYSTEM - New volume group with number '0' of resource group 'sc-d88271c8-8dc1-54a2-ab85-7aee8f9649b0' created.
14:27:52.533 [grizzly-http-server-1] INFO  LINSTOR/Controller - SYSTEM - New volume definition with number '0' of resource definition 'pvc-1076c61b-1dbb-42c5-b740-68f0873db644' created.
14:27:52.584 [MainWorkerPool-1] INFO  LINSTOR/Controller - SYSTEM - Drbd-auto-verify-Algo for pvc-1076c61b-1dbb-42c5-b740-68f0873db644 automatically set to crct10dif-pclmul
14:27:53.489 [MainWorkerPool-1] ERROR LINSTOR/Controller - SYSTEM - Not enough available nodes
14:06:48.238 [MainWorkerPool-1] INFO  LINSTOR/Controller - SYSTEM - Satellite eu-central-1771 reports a capacity of 0 kiB, allocated space 0 kiB, no errors
14:06:48.261 [MainWorkerPool-1] INFO  LINSTOR/Controller - SYSTEM - Satellite eu-central-7f14 reports a capacity of 0 kiB, allocated space 0 kiB, no errors
14:06:48.280 [MainWorkerPool-1] INFO  LINSTOR/Controller - SYSTEM - Satellite eu-west-f5b4 reports a capacity of 0 kiB, allocated space 0 kiB, no errors
14:06:48.311 [MainWorkerPool-1] INFO  LINSTOR/Controller - SYSTEM - Satellite eu-central-68d6 reports a capacity of 0 kiB, allocated space 0 kiB, no errors
```

```
linstor sp l
+----------------------------------------------------------------------------------------------------------------------------------------------------------+
| StoragePool          | Node            | Driver    | PoolName                         | FreeCapacity | TotalCapacity | CanSnapshots | State | SharedName |
|==========================================================================================================================================================|
| DfltDisklessStorPool | eu-central-1771 | DISKLESS  |                                  |              |               | False        | Ok    |            |
| DfltDisklessStorPool | eu-central-68d6 | DISKLESS  |                                  |              |               | False        | Ok    |            |
| DfltDisklessStorPool | eu-central-7f14 | DISKLESS  |                                  |              |               | False        | Ok    |            |
| DfltDisklessStorPool | eu-west-f5b4    | DISKLESS  |                                  |              |               | False        | Ok    |            |
| attached-storage     | eu-central-1771 | FILE_THIN | /mnt/attached-storage            |     9.23 GiB |      9.75 GiB | False        | Ok    |            |
| attached-storage     | eu-central-68d6 | FILE_THIN | /mnt/attached-storage            |   199.10 GiB |    225.04 GiB | False        | Ok    |            |
| attached-storage     | eu-central-7f14 | FILE_THIN | /mnt/attached-storage            |   192.43 GiB |    225.04 GiB | False        | Ok    |            |
| attached-storage     | eu-west-f5b4    | FILE_THIN | /mnt/attached-storage            |    70.86 GiB |     77.35 GiB | False        | Ok    |            |
| local-disk           | eu-central-1771 | FILE_THIN | /var/lib/piraeus-datastore/pool1 |   199.26 GiB |    225.04 GiB | False        | Ok    |            |
| local-disk           | eu-central-68d6 | FILE_THIN | /var/lib/piraeus-datastore/pool1 |   199.10 GiB |    225.04 GiB | False        | Ok    |            |
| local-disk           | eu-central-7f14 | FILE_THIN | /var/lib/piraeus-datastore/pool1 |   192.43 GiB |    225.04 GiB | False        | Ok    |            |
| local-disk           | eu-west-f5b4    | FILE_THIN | /var/lib/piraeus-datastore/pool1 |    70.86 GiB |     77.35 GiB | False        | Ok    |            |
+----------------------------------------------------------------------------------------------------------------------------------------------------------+
```

```
linstor v l
+--------------------------------------------------------------------------------------------------------------------------------------------+
| Node            | Resource                                 | StoragePool | VolNr | MinorNr | DeviceName    | Allocated | InUse  |    State |
|============================================================================================================================================|
| eu-central-1771 | pvc-a9cdd414-9118-4e48-8966-1a17b4124c6b | local-disk  |     0 |    1003 | /dev/drbd1003 |  9.09 GiB | Unused | UpToDate |
| eu-central-68d6 | pvc-a9cdd414-9118-4e48-8966-1a17b4124c6b | local-disk  |     0 |    1003 | /dev/drbd1003 |  9.09 GiB | InUse  | UpToDate |
| eu-central-7f14 | pvc-a9cdd414-9118-4e48-8966-1a17b4124c6b | local-disk  |     0 |    1003 | /dev/drbd1003 |  9.09 GiB | Unused | UpToDate |
| eu-central-1771 | pvc-f94ee9b9-76ac-460a-aede-3b904b986932 | local-disk  |     0 |    1000 | /dev/drbd1000 | 32.91 MiB | InUse  | UpToDate |
| eu-central-68d6 | pvc-f94ee9b9-76ac-460a-aede-3b904b986932 | local-disk  |     0 |    1000 | /dev/drbd1000 |   912 KiB | Unused | UpToDate |
| eu-central-7f14 | pvc-f94ee9b9-76ac-460a-aede-3b904b986932 | local-disk  |     0 |    1000 | /dev/drbd1000 |   912 KiB | Unused | UpToDate |
+--------------------------------------------------------------------------------------------------------------------------------------------+
```

I am able to create resources manually:

```
linstor r c eu-west-f5b4 pvc-a9cdd414-9118-4e48-8966-1a17b4124c6b
SUCCESS:
Description:
    New resource 'pvc-a9cdd414-9118-4e48-8966-1a17b4124c6b' on node 'eu-west-f5b4' registered.
Details:
    Resource 'pvc-a9cdd414-9118-4e48-8966-1a17b4124c6b' on node 'eu-west-f5b4' UUID is: d14b3ce5-3384-446e-87a5-81d1cd01bf3d
SUCCESS:
Description:
    Volume with number '0' on resource 'pvc-a9cdd414-9118-4e48-8966-1a17b4124c6b' on node 'eu-west-f5b4' successfully registered
Details:
    Volume UUID is: 4fd93704-aa3e-4291-aadd-12119c4fa815
SUCCESS:
    Added peer(s) 'eu-west-f5b4' to resource 'pvc-a9cdd414-9118-4e48-8966-1a17b4124c6b' on 'eu-central-1771'
SUCCESS:
    Added peer(s) 'eu-west-f5b4' to resource 'pvc-a9cdd414-9118-4e48-8966-1a17b4124c6b' on 'eu-central-68d6'
SUCCESS:
    Added peer(s) 'eu-west-f5b4' to resource 'pvc-a9cdd414-9118-4e48-8966-1a17b4124c6b' on 'eu-central-7f14'
SUCCESS:
    Created resource 'pvc-a9cdd414-9118-4e48-8966-1a17b4124c6b' on 'eu-west-f5b4'
SUCCESS:
Description:
    Resource 'pvc-a9cdd414-9118-4e48-8966-1a17b4124c6b' on 'eu-west-f5b4' ready
Details:
    Node(s): 'eu-west-f5b4', Resource: 'pvc-a9cdd414-9118-4e48-8966-1a17b4124c6b'
```
WanzenBug (Member) commented:

I guess eu-west is in a different zone from the eu-central nodes. The problem is that the auto-placer first locks in a preferred node (eu-west) and then tries to place the remaining replicas. That fails because there are not enough nodes in that zone.

By the way, you don't need to specify placement on different nodes with kubernetes.io/hostname; that is always implied.
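
Assuming the parameter keys sketched earlier, the zone pinning could then presumably be reduced to:

```
parameters:
  linstor.csi.linbit.com/storagePool: local-disk
  linstor.csi.linbit.com/placementCount: "2"
  # hostname anti-affinity between replicas is implied; no replicasOnDifferent needed
  linstor.csi.linbit.com/replicasOnSame: topology.kubernetes.io/zone
```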


boedy commented Mar 24, 2023

I just removed the eu-west node from the cluster. Still getting the same error:

```
failed to provision volume with StorageClass "fast": rpc error: code = Internal
desc = CreateVolume failed for pvc-2478242b-90f8-4ba1-9821-fd3c6de40917:
rpc error: code = ResourceExhausted desc = failed to enough replicas on requisite nodes:
Message: 'Not enough available nodes'
Details: 'Not enough nodes fulfilling the following auto-place criteria:
  * has a deployed storage pool named TransactionList [local-disk]
  * the storage pools have to have at least '10485760' free space
  * the current access context has enough privileges to use the node and the storage pool
  * the node is online
Auto-place configuration details:
  Place Count: 2
  Replicas on different nodes: TransactionList [Aux/topology/kubernetes.io/hostname]
  Replicas on same nodes: TransactionList [Aux/topology/topology.kubernetes.io/zone]
  Don't place with resource (List): [pvc-2478242b-90f8-4ba1-9821-fd3c6de40917]
  Node name: [eu-central-1771, eu-central-68d6, eu-central-7f14]
  Storage pool name: TransactionList [local-disk]
  Layer stack: TransactionList [DRBD, STORAGE]
Auto-placing resource: pvc-2478242b-90f8-4ba1-9821-fd3c6de40917'
```

```
+----------------------------------------------------------------------------------------------------------------------------------------------------------+
| StoragePool          | Node            | Driver    | PoolName                         | FreeCapacity | TotalCapacity | CanSnapshots | State | SharedName |
|==========================================================================================================================================================|
| DfltDisklessStorPool | eu-central-1771 | DISKLESS  |                                  |              |               | False        | Ok    |            |
| DfltDisklessStorPool | eu-central-68d6 | DISKLESS  |                                  |              |               | False        | Ok    |            |
| DfltDisklessStorPool | eu-central-7f14 | DISKLESS  |                                  |              |               | False        | Ok    |            |
| attached-storage     | eu-central-1771 | FILE_THIN | /mnt/attached-storage            |     9.23 GiB |      9.75 GiB | False        | Ok    |            |
| attached-storage     | eu-central-68d6 | FILE_THIN | /mnt/attached-storage            |   199.09 GiB |    225.04 GiB | False        | Ok    |            |
| attached-storage     | eu-central-7f14 | FILE_THIN | /mnt/attached-storage            |   192.39 GiB |    225.04 GiB | False        | Ok    |            |
| local-disk           | eu-central-1771 | FILE_THIN | /var/lib/piraeus-datastore/pool1 |   199.26 GiB |    225.04 GiB | False        | Ok    |            |
| local-disk           | eu-central-68d6 | FILE_THIN | /var/lib/piraeus-datastore/pool1 |   199.09 GiB |    225.04 GiB | False        | Ok    |            |
| local-disk           | eu-central-7f14 | FILE_THIN | /var/lib/piraeus-datastore/pool1 |   192.39 GiB |    225.04 GiB | False        | Ok    |            |
+----------------------------------------------------------------------------------------------------------------------------------------------------------+
```

WanzenBug (Member) commented:

Well, are the nodes actually in the same zone? Perhaps they are in the same region, but different zones?
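
One way to verify (a suggestion added here, not part of the original exchange) is to compare the zone label Kubernetes assigns to each node with the auxiliary properties LINSTOR recorded:

```
# zone label per node, as Kubernetes sees it
kubectl get nodes -L topology.kubernetes.io/zone

# auxiliary properties per node, as LINSTOR recorded them
linstor node list-properties eu-central-1771
```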


boedy commented Mar 24, 2023

Ah... yes, two of them are! The third one isn't.

Also, I had a nodeAffinity defined, causing it to schedule onto one of the servers that is alone in its zone:

```
  - key: topology.kubernetes.io/zone
    operator: NotIn
    values:
      - eu-central-nbg1-dc3
```
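
For completeness, that matchExpression presumably sat inside the StatefulSet's pod template, roughly like this (the surrounding structure is assumed; only the matchExpression itself comes from the report):

```
affinity:
  nodeAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
      nodeSelectorTerms:
        - matchExpressions:
            - key: topology.kubernetes.io/zone
              operator: NotIn
              values:
                - eu-central-nbg1-dc3
```

Excluding that zone pushed the pod onto a node that is alone in its zone, where a two-replica same-zone placement can never succeed.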

Mystery solved!🕵️ Thanks
