Storage: Ceph RBD lock concurrent snapshot migrations #13096
Merged
This sometimes causes an error in the pipeline when subsequent migrations are called right after one another and the earlier migration hasn't finished yet, because the snapshot requires locking.
The generic volume migration (refresh) triggers the Ceph RBD driver's `MountVolumeSnapshot()`, which calls `vol.MountLock()` for the respective volume. Before #13079 the driver's `MountVolumeSnapshot()` was also used for the initial migration using type `RBD`. As a result we also have to take the lock here, so that subsequent generic refreshes wait until the lock is released.
That was first observed in https://github.com/canonical/lxd/actions/runs/8206575452/job/22446018257?pr=13084 and caused by #13079.