
OCPEDGE-1200: feat: allow dynamic add/remove of devices via pvmove #698

Draft
jakobmoellerdev wants to merge 2 commits into main

Conversation

jakobmoellerdev (Contributor)

This is a sample implementation of dynamic device selectors via pvmove and vgreduce. Below are the events vg-manager emits while migrating extents off the removed device, followed by the modified LVMCluster (with /dev/vdc removed from the deviceSelector):

apiVersion: v1
count: 1
eventTime: null
firstTimestamp: "2024-08-26T15:29:33Z"
involvedObject:
  apiVersion: lvm.topolvm.io/v1alpha1
  kind: LVMCluster
  name: my-lvmcluster
  namespace: openshift-storage
  uid: 0dc5f864-0404-4a44-a0f7-c47e296d1aea
kind: Event
lastTimestamp: "2024-08-26T15:29:33Z"
message: 'update on node openshift-storage/crc in volume group openshift-storage/vg1:
  moving extents from reduction candidates (/dev/vdc) to left pvs (/dev/vdb) to ensure
  consistent data after path removal'
metadata:
  creationTimestamp: "2024-08-26T15:29:33Z"
  name: my-lvmcluster.17ef50efdc764466
  namespace: openshift-storage
  resourceVersion: "819668"
  uid: 05d96f72-8864-46e2-862a-7d6b8f02a062
reason: ExtentMigrationStarted
reportingComponent: vg-manager
reportingInstance: ""
source:
  component: vg-manager
type: Normal
---
apiVersion: v1
count: 1
eventTime: null
firstTimestamp: "2024-08-26T15:30:03Z"
involvedObject:
  apiVersion: lvm.topolvm.io/v1alpha1
  kind: LVMCluster
  name: my-lvmcluster
  namespace: openshift-storage
  uid: 0dc5f864-0404-4a44-a0f7-c47e296d1aea
kind: Event
lastTimestamp: "2024-08-26T15:30:03Z"
message: 'update on node openshift-storage/crc in volume group openshift-storage/vg1:
  moved extents from reduction candidates (/dev/vdc) to left pvs (/dev/vdb) to ensure
  consistent data after path removal'
metadata:
  creationTimestamp: "2024-08-26T15:30:03Z"
  name: my-lvmcluster.17ef50f6de83df23
  namespace: openshift-storage
  resourceVersion: "819774"
  uid: e999fcac-52d4-42f9-9bc2-66c0815d3b44
reason: ExtentMigrationFinished
reportingComponent: vg-manager
reportingInstance: ""
source:
  component: vg-manager
type: Normal
---
apiVersion: lvm.topolvm.io/v1alpha1
kind: LVMCluster
metadata:
  annotations:
    kubectl.kubernetes.io/last-applied-configuration: |
      {"apiVersion":"lvm.topolvm.io/v1alpha1","kind":"LVMCluster","metadata":{"annotations":{},"name":"my-lvmcluster","namespace":"openshift-storage"},"spec":{"storage":{"deviceClasses":[{"default":true,"deviceSelector":{"paths":["/dev/vdb","/dev/vdc"]},"fstype":"ext4","name":"vg1","thinPoolConfig":{"name":"thin-pool-1","overprovisionRatio":10,"sizePercent":40}}]}}}
  creationTimestamp: "2024-08-26T15:29:16Z"
  finalizers:
  - lvmcluster.topolvm.io
  generation: 2
  name: my-lvmcluster
  namespace: openshift-storage
  resourceVersion: "819780"
  uid: 0dc5f864-0404-4a44-a0f7-c47e296d1aea
spec:
  storage:
    deviceClasses:
    - default: true
      deviceSelector:
        paths:
        - /dev/vdb
      fstype: ext4
      name: vg1
      thinPoolConfig:
        chunkSizeCalculationPolicy: Static
        name: thin-pool-1
        overprovisionRatio: 10
        sizePercent: 40
status:
  conditions:
  - lastTransitionTime: "2024-08-26T15:30:03Z"
    message: Reconciliation is complete and all the resources are available
    reason: ResourcesAvailable
    status: "True"
    type: ResourcesAvailable
  - lastTransitionTime: "2024-08-26T15:30:03Z"
    message: All the VGs are ready
    reason: VGsReady
    status: "True"
    type: VolumeGroupsReady
  deviceClassStatuses:
  - name: vg1
    nodeStatus:
    - deviceDiscoveryPolicy: Preconfigured
      devices:
      - /dev/vdb
      excluded:
      - name: /dev/vda
        reasons:
        - /dev/vda has children block devices and could not be considered
        - /dev/vda is not part of the device selector
      - name: /dev/vda1
        reasons:
        - /dev/vda1 has an invalid partition label "BIOS-BOOT"
        - /dev/vda1 is not part of the device selector
      - name: /dev/vda2
        reasons:
        - /dev/vda2 has an invalid filesystem signature (vfat) and cannot be used
        - /dev/vda2 is not part of the device selector
      - name: /dev/vda3
        reasons:
        - /dev/vda3 has an invalid filesystem signature (ext4) and cannot be used
        - /dev/vda3 has an invalid partition label "boot"
        - /dev/vda3 is not part of the device selector
      - name: /dev/vda4
        reasons:
        - /dev/vda4 has an invalid filesystem signature (xfs) and cannot be used
        - /dev/vda4 is not part of the device selector
      - name: /dev/vdc
        reasons:
        - /dev/vdc is not part of the device selector
      name: vg1
      node: crc
      status: Ready
  ready: true
  state: Ready
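
The spec above no longer lists /dev/vdc, while the last-applied-configuration annotation still shows it; the two events document the extent migration that bridges that change. As a rough illustration only (not the code in this PR; the helper name and error handling are made up), the sequence vg-manager has to drive boils down to a pvmove off the removed path followed by a vgreduce:

package main

import (
	"fmt"
	"os/exec"
)

// migrateExtentsAndRemove moves every allocated extent off removedPV onto the
// remaining PVs of the volume group, then drops the now-empty PV from the VG
// so the path can be detached. This mirrors the ExtentMigrationStarted /
// ExtentMigrationFinished events shown above.
func migrateExtentsAndRemove(vg, removedPV string) error {
	// pvmove <pv> relocates extents from <pv> to free space on the other PVs
	// in the same volume group.
	if out, err := exec.Command("pvmove", removedPV).CombinedOutput(); err != nil {
		return fmt.Errorf("pvmove %s: %v: %s", removedPV, err, out)
	}
	// vgreduce <vg> <pv> removes the emptied PV from the volume group.
	if out, err := exec.Command("vgreduce", vg, removedPV).CombinedOutput(); err != nil {
		return fmt.Errorf("vgreduce %s %s: %v: %s", vg, removedPV, err, out)
	}
	return nil
}

func main() {
	if err := migrateExtentsAndRemove("vg1", "/dev/vdc"); err != nil {
		fmt.Println(err)
	}
}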

Possible gotchas that need to be considered before merging:

  • How do we deal with long pvmoves (even empty thin pools can take upwards of a minute on an SSD under bad conditions)? Should the move run asynchronously? If not, how do we properly lock the LVMCluster against modification while it runs?
  • How do we deal with nodes crashing during a migration? Do we pick it up again by running pvmove without arguments (see the sketch after this list)? How do we test that?
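
For the crash-recovery question in the second bullet: pvmove records its progress in the VG metadata, and per the LVM documentation running pvmove with no arguments resumes moves that were interrupted, e.g. by a reboot. A hedged sketch of what a startup hook in vg-manager could look like (the function name is illustrative, not from this PR):

package main

import (
	"fmt"
	"os/exec"
)

// resumeInterruptedMoves restarts any pvmove that was still in flight when the
// node went down, before new deviceSelector changes are reconciled.
func resumeInterruptedMoves() error {
	// "pvmove" with no positional arguments picks up interrupted moves from
	// the metadata checkpoints left by the crashed run.
	if out, err := exec.Command("pvmove").CombinedOutput(); err != nil {
		return fmt.Errorf("resuming interrupted pvmove: %v: %s", err, out)
	}
	return nil
}

func main() {
	if err := resumeInterruptedMoves(); err != nil {
		fmt.Println(err)
	}
}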

@openshift-ci-robot commented Aug 26, 2024

@jakobmoellerdev: This pull request references OCPEDGE-1200 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the spike to target the "4.18.0" version, but no target version was set.

@openshift-ci-robot openshift-ci-robot added the jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. label Aug 26, 2024
@openshift-ci openshift-ci bot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Aug 26, 2024

openshift-ci bot commented Aug 26, 2024

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all


openshift-ci bot commented Aug 26, 2024

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: jakobmoellerdev

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci bot added approved Indicates a PR has been approved by an approver from all required OWNERS files. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Aug 26, 2024
Signed-off-by: Jakob Möller <[email protected]>
@openshift-ci openshift-ci bot added size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. and removed size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Aug 27, 2024
@jakobmoellerdev (Contributor Author)

/test e2e-aws-hypershift

2 similar comments
@jakobmoellerdev (Contributor Author)

/test e2e-aws-hypershift

@jakobmoellerdev (Contributor Author)

/test e2e-aws-hypershift


openshift-ci bot commented Sep 2, 2024

@jakobmoellerdev: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name: ci/prow/e2e-aws-hypershift
Commit: cbf1fac
Required: true
Rerun command: /test e2e-aws-hypershift

