
Storage: Ceph RBD lock concurrent snapshot migrations #13096

Merged

Conversation

roosterfish
Contributor

This sometimes causes an error in the pipeline when subsequent migrations are started right after one another and the earlier migration hasn't yet finished, because the snapshot requires locking.

The generic volume migration (refresh) triggers the Ceph RBD driver's MountVolumeSnapshot(), which calls vol.MountLock() for the respective volume. Before #13079, the driver's MountVolumeSnapshot() was also used for the initial migration using type RBD.
As a result we also have to take the lock here, so that subsequent generic refreshes wait until the lock is released.

This was first observed in https://github.com/canonical/lxd/actions/runs/8206575452/job/22446018257?pr=13084 and was caused by #13079.

@tomponline tomponline changed the title Ceph RBD: Lock concurrent snashot migrations Storage: Ceph RBD lock concurrent snashot migrations Mar 11, 2024
@roosterfish roosterfish changed the title Storage: Ceph RBD lock concurrent snashot migrations Storage: Ceph RBD lock concurrent snapshot migrations Mar 11, 2024
@tomponline tomponline merged commit 1dd0cac into canonical:main Mar 11, 2024
28 checks passed
@roosterfish roosterfish deleted the fix_ceph_concurrent_migrations branch March 11, 2024 14:18