Potential catch22 in incus admin restore #646

Closed
5 tasks done
markrattray opened this issue Mar 22, 2024 · 2 comments · Fixed by #678
Labels: Bug (Confirmed to be a bug), Easy (Good for new contributors)

@markrattray

Required information

  • Distribution: Ubuntu
  • Distribution version: 22.04.4
  • The output of "incus info" or if that fails:

Issue description

I've reinstalled the OS and Incus on a stand-alone physical server due to disk corruption (failed RAID member) and was using incus admin recover to get everything back from the other arrays still in the same server.

I think I've noticed a potential situation where this tool will not be able to restore: two or more instances on different pools that reference storage volumes on each other's pools, e.g.:

  storagepool01:
    instance01:
      disk00:
        pool: storagepool02
        source: instance01_disk00

  storagepool02:
    instance02:
      disk00:
        pool: storagepool01
        source: instance02_disk00

During recovery, the import process cannot deal with these dependencies. That is understandable, because it hasn't recovered the other pool yet, but at some point recovery might be difficult:

Error: Failed import request: Failed creating instance "instance01" record in project "default": Failed creating instance record: Failed initialising instance: Failed add validation for device "instance01_disk00": Failed to get storage pool "storagepool02": Storage pool not found

In my case this was not the exact scenario, so I just needed to restore the pools in a specific order.
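
As an illustration of why ordering helps in the one-way case but not in the mutual one, here is a minimal sketch that computes a pool recovery order from the cross-references. It is not part of Incus: the function name `pool_recovery_order` and the input layout (instance name mapped to its home pool and the pools its disks reference) are assumptions made for this example.

```python
from graphlib import CycleError, TopologicalSorter


def pool_recovery_order(instances):
    """Return a pool order in which every referenced pool comes before its dependents.

    `instances` maps an instance name to (home_pool, [pools referenced by its disks]).
    Returns None when the pools depend on each other in a cycle.
    """
    sorter = TopologicalSorter()
    for home_pool, referenced_pools in instances.values():
        # The home pool can only be imported once the referenced pools exist.
        sorter.add(home_pool, *(p for p in referenced_pools if p != home_pool))
    try:
        return list(sorter.static_order())
    except CycleError:
        # Mutual references: no pool can safely be recovered first.
        return None


# One-way reference: recovering storagepool02 first is enough.
one_way = {"instance01": ("storagepool01", ["storagepool02"])}

# Mutual references, as in the layout described above.
mutual = {
    "instance01": ("storagepool01", ["storagepool02"]),
    "instance02": ("storagepool02", ["storagepool01"]),
}

print(pool_recovery_order(one_way))
print(pool_recovery_order(mutual))
```

With the one-way layout the sketch returns a usable order; with the mutual layout it returns `None`, which is the catch-22: no per-pool ordering can satisfy both references.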

Information to attach

  • Any relevant kernel output (dmesg)
  • Container log (incus info NAME --show-log)
  • Container configuration (incus config show NAME --expanded)
  • Main daemon log (at /var/log/incus/incusd.log)
  • [Y] Output of the client with --debug
  • Output of the daemon with --debug (alternatively output of incus monitor --pretty while reproducing the issue)
@stgraber
Member

Shouldn't be too difficult to make the server-side logic consider all storage pools before validating references, which should take care of this.
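
The idea can be sketched as a two-phase import: record every pool first, then validate device references. This is a simplified model only, not the Incus implementation (the actual change is in the linked fix). The names `check_instances`, `recover_pool_by_pool`, `recover_pools_first`, and `PoolNotFound` are hypothetical, and the data layout (pool name mapped to its instances and the pools their devices reference) is an assumption.

```python
class PoolNotFound(Exception):
    """Raised when a device references a pool that has no record yet."""


def check_instances(instances, known_pools):
    """Check that every pool referenced by an instance device is already known."""
    for instance, referenced_pools in instances.items():
        for ref in referenced_pools:
            if ref not in known_pools:
                raise PoolNotFound(f'{instance}: storage pool "{ref}" not found')


def recover_pool_by_pool(pools):
    """Validate each pool's instances as soon as that pool is recorded."""
    known = set()
    for pool, instances in pools.items():
        known.add(pool)
        check_instances(instances, known)
    return known


def recover_pools_first(pools):
    """Record all pools up front, then validate the instances of every pool."""
    known = set(pools)  # phase 1: all pool records exist
    for instances in pools.values():  # phase 2: references can now resolve
        check_instances(instances, known)
    return known


# Hypothetical layout: container "a1" on vol1 uses a volume that lives on vol2.
pools = {"vol1": {"a1": ["vol2"]}, "vol2": {}}

try:
    recover_pool_by_pool(pools)
except PoolNotFound as err:
    print("pool by pool:", err)

print("pools first:", sorted(recover_pools_first(pools)))
```

In this model the pool-by-pool variant fails on the first cross-pool reference, while the pools-first variant succeeds regardless of the order in which pools are listed.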

@stgraber stgraber added Bug Confirmed to be a bug Easy Good for new contributors labels Mar 26, 2024
@stgraber stgraber added this to the incus-6.0 milestone Mar 26, 2024
@stgraber stgraber self-assigned this Mar 26, 2024
@stgraber
Member

Confirmed the issue here:

root@v1:~# incus admin recover
This server currently has the following storage pools:
Would you like to recover another storage pool? (yes/no) [default=no]: yes
Name of the storage pool: vol1
Name of the storage backend (dir, zfs): zfs
Source of the storage pool (block device, volume group, dataset, path, ... as applicable): vol1
Additional storage pool configuration property (KEY=VALUE, empty when done): 
Would you like to recover another storage pool? (yes/no) [default=no]: yes
Name of the storage pool: vol2
Name of the storage backend (dir, zfs): zfs
Source of the storage pool (block device, volume group, dataset, path, ... as applicable): vol2
Additional storage pool configuration property (KEY=VALUE, empty when done): 
Would you like to recover another storage pool? (yes/no) [default=no]: 
The recovery process will be scanning the following storage pools:
 - NEW: "vol1" (backend="zfs", source="vol1")
 - NEW: "vol2" (backend="zfs", source="vol2")
Would you like to continue with scanning for lost volumes? (yes/no) [default=yes]:    
Scanning for unknown volumes...
The following unknown storage pools have been found:
 - Storage pool "vol1" of type "zfs"
 - Storage pool "vol2" of type "zfs"
The following unknown volumes have been found:
 - Container "a1" on pool "vol1" in project "default" (includes 0 snapshots)
 - Volume "bar" on pool "vol2" in project "default" (includes 0 snapshots)
 - Volume "foo" on pool "vol2" in project "default" (includes 0 snapshots)
Would you like those to be recovered? (yes/no) [default=no]: yes
Starting recovery...
Error: Failed import request: Failed creating instance "a1" record in project "default": Failed creating instance record: Failed initializing instance: Failed add validation for device "bar": Failed to get storage pool "vol2": Storage pool not found
root@v1:~# 

stgraber added a commit to stgraber/incus that referenced this issue Mar 27, 2024
Closes lxc#646

Signed-off-by: Stéphane Graber <[email protected]>