
Backward compatibility and migration to 1.1 volumes from existing 1.0 versions #378

Closed
ShyamsundarR opened this issue May 17, 2019 · 25 comments
Labels
release-1.1.0 Track release 1.1 items

Comments

@ShyamsundarR
Contributor

As the plugin moves from the current config maps to a more descriptive VolumeID and RADOS objects (once #312 is merged) for storing the Kubernetes VolumeName and its backing image details, it needs to ensure the following for existing users:

  1. Backward compatibility with older Volumes provisioned
  2. Migration possibilities to the new scheme

1) Backward compatibility:

NOTE: This is as mentioned in this comment, #296 (comment)

For volumes created using existing 1.0 versions of the Ceph-CSI plugins, the following operations would be supported by the 1.1 version of the plugin:

  • DeleteVolume
  • DeleteSnapshot
  • NodePublishVolume (IOW, mounting and using the volume for required IO operations)

And the following would be unsupported:

  • CreateVolume from a snapshot source that is from an older version
  • CreateSnapshot from a volume source that is from an older version

NOTE: Support for PVs created by 0.3 versions requires a feasibility analysis to ensure the above compatibility can be guaranteed for them.

The method for doing this would be to continue using the existing config maps: on detecting an older-style VolumeID in the mentioned RPC requests, read the required data from the config maps and process the request.
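
A rough sketch of that approach is below (the helper names, metadata fields, and example IDs are illustrative assumptions, not the plugin's actual code): legacy IDs fall back to the retained config-map data, while new IDs carry everything needed on their own.

```go
package main

import "fmt"

// legacyVolMeta mirrors the kind of per-volume information the 1.0.0 config
// maps held (the field names here are illustrative, not the plugin's schema).
type legacyVolMeta struct {
	Monitors  string
	Pool      string
	ImageName string
}

// legacyStore stands in for the retained config maps (or a flattened file),
// keyed by the old-format VolumeID.
var legacyStore = map[string]legacyVolMeta{
	"csi-rbd-old-0001": {Monitors: "mon1:6789", Pool: "rbd", ImageName: "pvc-old-0001"},
}

// deleteVolume sketches how a 1.1 plugin can honour DeleteVolume for both ID
// formats. The real plugin would detect a legacy ID from the ID's structure
// itself, not by a store lookup as done here for brevity.
func deleteVolume(volumeID string) {
	if meta, isLegacy := legacyStore[volumeID]; isLegacy {
		// Legacy ID: pool/monitor details are not in the ID, so fall back
		// to the retained config-map data.
		fmt.Printf("legacy %s: delete image %s in pool %s via monitors %s\n",
			volumeID, meta.ImageName, meta.Pool, meta.Monitors)
		return
	}
	// New-format ID: cluster/pool are decodable from the ID itself, and the
	// image and OMap details live in RADOS (see the encoding sketch further
	// down in this issue).
	fmt.Printf("new-format %s: decode cluster/pool from the ID, then delete image and OMaps\n", volumeID)
}

func main() {
	deleteVolume("csi-rbd-old-0001")
	deleteVolume("ceph-cluster-1:2:6f1b6f98-6c4c-4d0c-8f3a-1f2d3e4c5b6a")
}
```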

2) Migration

These were discussed in issue #296

Option A: #296 (comment)
In short: create a new PVC using the new version of the plugin, use the Ceph CLIs to clone the image backing the older PVC to the image backing the newer PVC, and then update the pods that use the older PVC to use the new PVC (a rough sketch of the CLI steps follows further below).

Option B: #296 (comment)
Deals with PV.ClaimRef juggling as in the link above.

Option C: Migrate data using a PV to PV data copy

Option A is what I have tested and am far more comfortable recommending, as it would update the driver name as well as the required VolumeID and other fields, hence making it future-proof from a Ceph-CSI implementation standpoint.

Further, with Option A and the backward compatibility included as above, the migration can be staged and need not happen in one go, hopefully alleviating some concerns around downtime for the pods using the PVs.

Documentation and instructions for achieving this would be provided.
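
For illustration only, the Ceph CLI steps behind Option A could look roughly like the sketch below (driven here from Go via os/exec; the pool, image, and snapshot names are placeholders, and the exact procedure would be spelled out in the documentation mentioned above):

```go
package main

import (
	"fmt"
	"os/exec"
)

// runRBD shells out to the rbd CLI; Option A is driven by Ceph CLIs rather
// than by any ceph-csi code path.
func runRBD(args ...string) error {
	out, err := exec.Command("rbd", args...).CombinedOutput()
	if err != nil {
		return fmt.Errorf("rbd %v failed: %v: %s", args, err, out)
	}
	return nil
}

func main() {
	// Placeholder names: oldImage backs the 1.0-provisioned PVC, newImage is
	// the image name expected by the freshly created 1.1 PVC.
	const (
		pool     = "rbd"
		oldImage = "pvc-old-volume"
		newImage = "pvc-new-volume"
		snap     = "migrate"
	)

	steps := [][]string{
		// Snapshot and protect the image backing the old PVC.
		{"snap", "create", fmt.Sprintf("%s/%s@%s", pool, oldImage, snap)},
		{"snap", "protect", fmt.Sprintf("%s/%s@%s", pool, oldImage, snap)},
		// Clone it to the name of the image backing the new PVC; in practice
		// the placeholder image created for the new PVC would have to be
		// removed or renamed first, which is omitted here.
		{"clone", fmt.Sprintf("%s/%s@%s", pool, oldImage, snap), fmt.Sprintf("%s/%s", pool, newImage)},
		// Flatten so the new image no longer depends on the old parent.
		{"flatten", fmt.Sprintf("%s/%s", pool, newImage)},
	}
	for _, step := range steps {
		if err := runRBD(step...); err != nil {
			fmt.Println(err)
			return
		}
	}
	fmt.Println("image cloned; now repoint the workload pods at the new PVC")
}
```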

Other options:

  • convert config maps to RADOS objects as required by the 1.1 version of the plugin
    • This will not address the VolumeID that has already been exchanged with Kubernetes, which is immutable and hence cannot be modified
  • csi-translation-lib
    • Not applicable to this use case, as it is intended for in-tree to CSI migrations and cannot help here

NOTE: CephFS would be backward compatible, but may not be able to leverage the provided migration solution (Option A), as it involves snapshots and clones

@kfox1111
Contributor

Why not just read the data out of the configmaps, inject it into the omaps and drop the old configmaps?

@ShyamsundarR
Contributor Author

Why not just read the data out of the configmaps, inject it into the omaps and drop the old configmaps?

Mounting does not need the config maps and also does not use them in the current scheme.

That would leave the VolumeID that has been exchanged with Kubernetes in the old format. It would still mean translating that into a pool and monitors (or Ceph cluster) to find and delete the image. This information is not extractable from the old-format VolumeID; it is read from the config maps, and hence cannot be stored in RADOS (because we would not know where, as in which cluster/pool, to go and look for these values).

Since the VolumeID is immutable once exchanged with Kubernetes, it cannot be modified, and hence moving the config maps out of Kubernetes into RADOS is not going to help with the above backward compatibility.

@kfox1111
Contributor

Why was the volumeid format changed?

@ShyamsundarR
Contributor Author

Why was the volumeid format changed?

The config maps held the required cluster/pool to VolumeID relationship for a provisioned image. During a DeleteVolume call, we only get the VolumeID to work with in the request. Thus, without the config maps, we needed to encode into the VolumeID the information required to locate the cluster/pool where the image (and its related OMaps in RADOS) are present.

The above required us to change the VolumeID encoding rules.
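
As a minimal sketch of that idea (the struct name, field layout, and delimiter below are illustrative assumptions, not ceph-csi's actual compact encoding), the new-style VolumeID bundles a cluster reference, pool, and per-volume UUID so that DeleteVolume can locate everything from the ID alone:

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// volumeIdentifier is an illustrative breakdown of what a 1.1-style VolumeID
// has to carry so that a bare DeleteVolume(volumeID) request can locate the
// Ceph cluster, the pool, and the per-volume RADOS OMaps without any
// Kubernetes config map.
type volumeIdentifier struct {
	ClusterID  string // resolved to monitors/credentials via plugin configuration
	PoolID     int64  // pool that holds the image and its OMap metadata
	ObjectUUID string // per-volume UUID naming the image and its OMaps
}

// compose packs the fields into one string. The real plugin uses a compact
// encoding; a ':' delimited form is used here purely for readability.
func (vi volumeIdentifier) compose() string {
	return fmt.Sprintf("%s:%d:%s", vi.ClusterID, vi.PoolID, vi.ObjectUUID)
}

// decompose reverses compose; this is the step DeleteVolume relies on to
// find where to look, using only the VolumeID from the request.
func decompose(volumeID string) (volumeIdentifier, error) {
	parts := strings.SplitN(volumeID, ":", 3)
	if len(parts) != 3 {
		return volumeIdentifier{}, fmt.Errorf("malformed VolumeID %q", volumeID)
	}
	poolID, err := strconv.ParseInt(parts[1], 10, 64)
	if err != nil {
		return volumeIdentifier{}, fmt.Errorf("bad pool ID in %q: %v", volumeID, err)
	}
	return volumeIdentifier{ClusterID: parts[0], PoolID: poolID, ObjectUUID: parts[2]}, nil
}

func main() {
	id := volumeIdentifier{
		ClusterID:  "ceph-cluster-1",
		PoolID:     2,
		ObjectUUID: "6f1b6f98-6c4c-4d0c-8f3a-1f2d3e4c5b6a",
	}.compose()
	vi, err := decompose(id)
	fmt.Println(id, vi, err)
}
```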

@kfox1111
Contributor

So in the new format, the cluster/pool is encoded into the VolumeIDs?

@kfox1111
Contributor

Would sticking them in as volumeAttributes work too?

@ShyamsundarR
Contributor Author

Would sticking them in as volumeAttributes work too?

No, this is not passed to the Delete request, as detailed in container-storage-interface/spec#167 (there was one other discussion, whose link I am failing to find, about a timing issue regarding when the attribute information is deleted versus when the volume is deleted; in short, this information is not available).

Also, the cluster/pool in the new VolumeID is encoded as detailed here.

@kfox1111
Contributor

Hmm.. I see.

What about saad's suggestion of adding the cluster/pool into a secret and injecting it via 'credentials'?

That would decouple it from the volume id.

@kfox1111
Contributor

I guess that would still require an edit to the PVs, which may not be supported.

@kfox1111
Contributor

Another migration option was mentioned on sig-storage: force delete the PV while leaving the PVC. This would leave the PVC in the 'Lost' state. If you then delete all pods referencing the PVC, no new instances will be launched. Then you can re-upload the PV with any changes, and the PVC should become unlost and work again.

I have not tested this theory.

@ShyamsundarR
Contributor Author

I think it is important to decide on what the end result looks like here, and if we are looking at migration or just plain backward compatibility.

For backward compatibility, retain the config maps and function with them as before, just without adding any more entries to them. We could even flatten the config maps to a file and use that instead of having the plugin access an actual Kubernetes config map (but that raises the question of who stores this flat file and where). This is the simplest form of backward compatibility we can reach, from the code to the deployment as it exists.
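
A minimal sketch of that flat-file idea follows, assuming a hypothetical JSON layout and path (neither is something the plugin defines today):

```go
package main

import (
	"encoding/json"
	"fmt"
	"os"
)

// legacyEntry holds the per-volume details a 1.0.0 config map carried
// (illustrative field names).
type legacyEntry struct {
	Pool      string `json:"pool"`
	ImageName string `json:"imageName"`
	Monitors  string `json:"monitors"`
}

// loadFlattenedMetadata reads a JSON file keyed by old-format VolumeID, as a
// stand-in for accessing the Kubernetes config maps directly.
func loadFlattenedMetadata(path string) (map[string]legacyEntry, error) {
	data, err := os.ReadFile(path)
	if err != nil {
		return nil, err
	}
	entries := map[string]legacyEntry{}
	if err := json.Unmarshal(data, &entries); err != nil {
		return nil, err
	}
	return entries, nil
}

func main() {
	// The path below is purely hypothetical.
	entries, err := loadFlattenedMetadata("/var/lib/ceph-csi/legacy-volumes.json")
	if err != nil {
		fmt.Println("no flattened metadata:", err)
		return
	}
	fmt.Printf("loaded %d legacy volume records\n", len(entries))
}
```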

For migration, I think the end result is that everything is in the new format (the PV parameters, the VolumeID, and the metadata stored on the backend), as this would be ideal in the long run, instead of supporting any intermediate form/format/parameters in any entity.

For reaching the migration goal, I see the possibility of a "down" pod conversion of the PVC to a new-format PV (as detailed in Option A), without incurring a data copy. IOW, as we control the metadata on the Ceph cluster we can manipulate that, but we have to recreate the required objects on the Kubernetes cluster, as we do not have control over those.

@kfox1111 Are you attempting to find other ways to reach the same end state/goal? Or, is your end-goal different than what I am looking at?

@kfox1111
Contributor

Same goal really. The configmaps have always been a bit of a pain for operators to deal with. So not having to special case some old volumes would be better all around. The helm chart also has a bug in it that caused the configmaps to enter the wrong namespace (default). So one additional goal is to decide if we need yet another migration plan for those, migrating them from the wrong namespace to the right namespace, or does the migration plan for this issue solve the other issue too.

@ShyamsundarR
Contributor Author

Same goal really. The configmaps have always been a bit of a pain for operators to deal with. So not having to special case some old volumes would be better all around. The helm chart also has a bug in it that caused the configmaps to enter the wrong namespace (default). So one additional goal is to decide if we need yet another migration plan for those, migrating them from the wrong namespace to the right namespace, or does the migration plan for this issue solve the other issue too.

The current migration plan will get rid of the config maps, as we would not need them any longer. As a result, at the end of the migration it would also solve the problem of the config maps landing in the default namespace (as those would be deleted). Just to be on the same page, the current migration plan is to recreate the PV/PVC on the Kubernetes end but carry over the old image as the new image in the Ceph pool, and hence the old PV is deleted and so its config map entry is also removed.

@kfox1111
Contributor

I'm good with getting rid of the configmaps if we can, but deleting PVCs is really, really painful/error-prone/potentially disastrous.

My preferred list of ways to solve this (highest to lowest preference):

  1. get the k8s folks to add an api to let us just fix the pv's directly.
  2. write a tool that updates the pv's in etcd directly while the api server is off.
  3. backup pv's, force delete them, delete all pods referencing pvcs bound to pv, restore edited pv's.
  4. do nothing, keep supporting configmaps
  5. do the PV/PVC deletion recreation dance.
  6. create whole parallel pv/pvc's, migrate all the data, delete old pv/pvcs, retarget all workload.

@humblec
Collaborator

humblec commented May 24, 2019

I'm good with getting rid of the configmaps if we can, but deleting PVCs is really, really painful/error-prone/potentially disastrous.

We understand the manual effort involved, but unfortunately we are in that state. Also, please remember that this driver has not been declared stable and there were no stable releases before. We will make sure we do not break things again, but this change is inevitable.

My preferred list of ways to solve this (highest to lowest preference):
get the k8s folks to add an api to let us just fix the pv's directly.
......

PVs have been immutable from day zero for a reason. I don't think it is something the community is going to accept. You can give it a try for sure, and I can even help. But it looks to me to be very difficult to get this change accepted as an upstream solution, at least in the short term.

As an additional note, we are more than happy to have the above-mentioned process written up, or to get some contribution from the community, to mitigate the effect of this change!

ShyamsundarR added a commit to ShyamsundarR/ceph-csi that referenced this issue May 31, 2019
This commit adds support to mount and delete volumes provisioned by older
plugin versions (1.0.0), in order to provide backward compatibility for
1.0.0-created volumes.

It adds back the ability to specify where the older metadata was stored, using
the metadatastorage option to the plugin, and then uses the provided metadata
to mount and delete the older volumes.

It also supports a variety of ways in which monitor information may have been
specified (in the storage class, or in the secret), to keep the monitor
information current.

Testing done:
- Mount/Delete 1.0.0 plugin created volume with monitors in the StorageClass
- Mount/Delete 1.0.0 plugin created volume with monitors in the secret with
  a key "monitors"
- Mount/Delete 1.0.0 plugin created volume with monitors in the secret with
  a user specified key
- PVC creation and deletion with the current version (to ensure at the minimum
  no broken functionality)
- Tested some negative cases, where monitor information is missing in secrets
  or present with a different key name, to understand if failure scenarios work
  as expected

Updates ceph#378

Follow-up work:
- Documentation on how to upgrade to 1.1 plugin and retain above functionality
  for older volumes

Signed-off-by: ShyamsundarR <[email protected]>
ShyamsundarR added a commit to ShyamsundarR/ceph-csi that referenced this issue Jun 11, 2019
ShyamsundarR added a commit to ShyamsundarR/ceph-csi that referenced this issue Jul 2, 2019
ShyamsundarR added a commit to ShyamsundarR/ceph-csi that referenced this issue Jul 2, 2019
ShyamsundarR added a commit to ShyamsundarR/ceph-csi that referenced this issue Jul 8, 2019
mergify bot pushed a commit that referenced this issue Jul 8, 2019
@Madhu-1
Collaborator

Madhu-1 commented Jul 16, 2019

@ShyamsundarR anything pending on this one?

@ShyamsundarR
Contributor Author

@Madhu-1 what is the current status of CephFS? Also, we possibly need documentation on how to upgrade and use older volumes from 1.0.

@Madhu-1
Collaborator

Madhu-1 commented Jul 16, 2019

@poornimag anything pending for cephfs 1.0?

Note: I think we don't need to support migration of 0.3 to 1.0 or 1.1.0

@ShyamsundarR
Contributor Author

@poornimag anything pending for cephfs 1.0?

Note: I think we don't need to support migration of 0.3 to 1.0 or 1.1.0

If I understand the above right, you are stating "migration" is not supported, but "backward compatibility" is, correct?

@Madhu-1
Collaborator

Madhu-1 commented Jul 16, 2019

do we need to support backward compatibility?

@ShyamsundarR
Contributor Author

do we need to support backward compatibility?

Yes, that is what we agreed to do. So we should support "using" 1.0 created volumes, and by that I mean node services and the ability to delete such volumes.

@Madhu-1
Collaborator

Madhu-1 commented Jul 16, 2019

@ShyamsundarR backward compatibility for 1.0 is fine not for 0.3

@ShyamsundarR
Contributor Author

@ShyamsundarR backward compatibility for 1.0 is fine not for 0.3

Agreed.

@Madhu-1
Collaborator

Madhu-1 commented Jul 16, 2019

@ShyamsundarR, in that case, can you please fix the issue title

@ShyamsundarR changed the title from "Backward compatibility and migration to 1.1 volumes from existing 1.0 and 0.3 versions" to "Backward compatibility and migration to 1.1 volumes from existing 1.0 versions" on Jul 16, 2019
wilmardo pushed a commit to wilmardo/ceph-csi that referenced this issue Jul 29, 2019
@Madhu-1
Collaborator

Madhu-1 commented Apr 2, 2020

Closing this one, as we support mounting/deleting 1.0.0 PVCs.

@Madhu-1 Madhu-1 closed this as completed Apr 2, 2020