[Error] Nvidia integration mounts files as read-only that prevent installing some packages inside the container. #1500

Closed
Butakus opened this issue Jul 29, 2024 · 10 comments
Labels
bug Something isn't working

Comments

@Butakus

Butakus commented Jul 29, 2024

Describe the bug
I found this problem when trying to install the libboost1.74-dev package in an Ubuntu 22.04 container, created with Nvidia integration (--nvidia flag) on an Ubuntu 24.04 host.

When a container is created with the --nvidia flag, distrobox-init mounts as read-only every file under /usr/ that has "nvidia" in its name.

In my case, the host (Ubuntu 24.04) had the libboost1.83-dev package installed, which puts some header files under /usr/include/boost/... that have the string "nvidia" in the file name but are unrelated to the Nvidia driver or libraries. The problem appears in the container (Ubuntu 22.04) when I try to install boost (libboost1.74-dev), because dpkg tries to overwrite those header files, which were mounted as read-only. This is the error from apt:

Unpacking libboost1.74-dev:amd64 (1.74.0-14ubuntu3) ...
dpkg: error processing archive /tmp/apt-dpkg-install-Zcdiwt/18-libboost1.74-dev_1.74.0-14ubuntu3_amd64.deb (--unpack):
unable to make backup link of './usr/include/boost/compute/detail/nvidia_compute_capability.hpp' before installing new version: Invalid cross-device link
dpkg-deb: error: paste subprocess was killed by signal (Broken pipe)

I believe those files should not be mounted in the container, even if they have "nvidia" in the file name.
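
To illustrate why a purely name-based match is too broad, here is a minimal sketch. The exact search expression used by distrobox-init is not quoted in this thread, so the `-iname '*nvidia*'` pattern, the `list_nvidia_named` helper and the `.so` file name are assumptions for illustration only; the boost header path is the one from the error above.

```shell
#!/bin/sh
# Assumed approximation of a name-based search (not distrobox's actual code):
# list every regular file under "$1/usr" whose name contains "nvidia".
list_nvidia_named() {
    (cd "$1" && find usr -type f -iname '*nvidia*' | sort)
}

# Scratch tree imitating a host /usr; the .so name is a made-up stand-in.
root=$(mktemp -d)
mkdir -p "$root/usr/lib" "$root/usr/include/boost/compute/detail"
touch "$root/usr/lib/libnvidia-example.so"
touch "$root/usr/include/boost/compute/detail/nvidia_compute_capability.hpp"

# Both files are listed, although only the first one imitates a driver file.
list_nvidia_named "$root"
rm -rf "$root"
```

The boost header is picked up only because of its file name, which is the behaviour described above.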

It is possible to pass an option to dpkg to ignore those files, but this is not a solution, because the files installed in the host and the container are from different versions of boost.

To Reproduce

  1. [Prerequisite] Have a host distro with boost installed. I only tested this with Ubuntu 24.04 in the host and 22.04 in the container, but it should be possible to replicate it with other distros.
  2. Create a new box with Nvidia integration:
distrobox create --image docker.io/library/ubuntu:22.04 --name boost_test --nvidia
  3. Try to install boost libraries inside the container:
distrobox enter boost_test
apt install -y libboost1.74-dev

The installation fails because of the files mounted by distrobox-init.
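
One way to confirm which files are affected is to look at the container's mount table. "Invalid cross-device link" is the message for EXDEV, which a hard link attempt returns when source and target are on different mounts, and a bind-mounted file is its own mount point. The sketch below assumes the standard Linux `/proc/self/mountinfo` layout (mount point in field 5, mount options in field 6); the `nvidia_ro_mounts` helper name is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print mount points whose path contains "nvidia"
# (case-insensitive) and whose mount options include "ro".
# Reads the file given as "$1", or /proc/self/mountinfo by default.
nvidia_ro_mounts() {
    awk 'tolower($5) ~ /nvidia/ && $6 ~ /(^|,)ro(,|$)/ { print $5 }' \
        "${1:-/proc/self/mountinfo}"
}
```

Run inside the container as `nvidia_ro_mounts`, it should list paths such as the boost header from the error message if that header was indeed bind-mounted read-only.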

Expected behavior
When using the --nvidia flag, the files unrelated to the nvidia driver / libraries should not be mounted in the container.

It should be possible to install boost in the container, even if boost is already installed in the host. This issue might also happen with different packages (not only boost) if they have any file with "nvidia" in the file name.

Desktop (please complete the following information):

  • Are you using podman, docker or lilipod? Which version of podman, docker or lilipod?
    podman 4.9.3
  • Which version of distrobox?
    1.7.0
  • Which host distribution?
    Ubuntu 24.04
  • How did you install distrobox?
    From Ubuntu's universe repository through apt.

Additional context
I am aware of issue #1054, but this is different, as I am not trying to install CUDA or any other Nvidia library, just the boost libraries. I think it might not be a good idea to mount every file that has "nvidia" in its name, but I have no idea what the best way to solve this would be.

@Butakus Butakus added the bug Something isn't working label Jul 29, 2024
@juanscelyg

Hi @Butakus

I'm in the same situation as you: the same host and the same image in the container.

Preparing to unpack .../libboost1.74-dev_1.74.0-14ubuntu3_amd64.deb ...
Unpacking libboost1.74-dev:amd64 (1.74.0-14ubuntu3) ...
dpkg: error processing archive /var/cache/apt/archives/libboost1.74-dev_1.74.0-14ubuntu3_amd64.deb (--unpack):
 unable to make backup link of './usr/include/boost/compute/detail/nvidia_compute_capability.hpp' before installing new version: Invalid cross-device link
dpkg-deb: error: paste subprocess was killed by signal (Broken pipe)
Errors were encountered while processing:
 /var/cache/apt/archives/libboost1.74-dev_1.74.0-14ubuntu3_amd64.deb
E: Sub-process /usr/bin/dpkg returned an error code (1)

@ToolmanP

ToolmanP commented Sep 10, 2024

I think we could add something like a .gitignore: a conf file in the user's config directory (e.g. .distrobox-ignore) that explicitly excludes files, using the same semantics as gitignore. That would make sense if we want to exclude some host files instead of globbing everything named "nvidia".
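
A rough sketch of how such an ignore file could be applied to a list of candidate paths is shown below. This is not distrobox code: the `filter_ignored` helper is hypothetical, and it uses plain shell glob matching via `case`, which is simpler than real gitignore semantics (no negation, no anchoring rules).

```shell
#!/bin/sh
# Hypothetical helper: read candidate paths on stdin and print only those
# that match none of the glob patterns listed in the ignore file "$1".
# Blank lines and lines starting with "#" in the ignore file are skipped.
filter_ignored() {
    ignore_file=$1
    while IFS= read -r path; do
        skip=0
        # "|| [ -n ... ]" also handles a last line without a trailing newline.
        while IFS= read -r pattern || [ -n "$pattern" ]; do
            case $pattern in ''|'#'*) continue ;; esac
            # Unquoted $pattern is treated as a glob; "*" also matches "/".
            case $path in $pattern) skip=1; break ;; esac
        done < "$ignore_file"
        if [ "$skip" -eq 0 ]; then
            printf '%s\n' "$path"
        fi
    done
    return 0
}
```

With an ignore file containing the single pattern `/usr/include/boost/*`, the boost header from this issue would be dropped from the list while other matches pass through.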

@ToolmanP

I've written a patch to address this with a new option:

--nvidia-exclude:	directories to be excluded when nvidia is integrated (e.g. /usr/share/cmake)

I think you guys can use --nvidia-exclude /usr/include/boost to exclude boost from nvidia import.

@Butakus
Author

Butakus commented Sep 19, 2024

Thanks for the patch, looks like a valid workaround. Does this mean that we will have to include the --nvidia-exclude flag on every distrobox enter command?

Also, as a side question, Would it be possible to implement this the opposite way? I mean, finding what files are created/used by the nvidia driver/libraries and selecting those nvidia files and directories for mounting, instead of using find to grab all "nvidia" files. No one should expect in the default configuration to get some random library mounted in the container because it has files with the "nvidia" name in it.
I don't know if these files and directories are consistent between distributions and/or Nvidia versions, which would make this approach impossible.

@ToolmanP

> Does this mean that we will have to include the --nvidia-exclude flag on every distrobox enter command?

No, actually. The flag only applies when the container is created. In fact, I couldn't find a podman command to change the mount options after creation.

@ToolmanP

> I don't know if these files and directories are consistent between distributions and/or Nvidia versions, which would make this approach impossible.

I also thought about this before. Maybe we could take advantage of the host's package manager where possible, but the problem is that the Nvidia driver can also be installed manually, without a package manager, and I don't know whether we can get a file list from that kind of installation.
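
As a sketch of the package-manager idea, the name-matched candidates could be intersected with a list of files that the host's driver packages actually own. On Debian or Ubuntu such a list could come from `dpkg -L <package>`, and on Fedora from `rpm -ql <package>`; which packages to query is left open here, and a manually installed driver would not be covered. The `keep_owned` helper is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: keep only the candidate paths (file "$1", one per line)
# that also appear in a list of package-owned files (file "$2").
# -F: fixed strings, -x: whole-line match, -f: read patterns from a file.
keep_owned() {
    grep -Fxf "$2" "$1"
}
```

A header such as nvidia_compute_capability.hpp would be filtered out this way, because it belongs to a boost package rather than to a driver package.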

@ToolmanP

ToolmanP commented Sep 23, 2024

> Also, as a side question, Would it be possible to implement this the opposite way? I mean, finding what files are created/used by the nvidia driver/libraries and selecting those nvidia files and directories for mounting, instead of using find to grab all "nvidia" files. No one should expect in the default configuration to get some random library mounted in the container because it has files with the "nvidia" name in it.

However, probably only a small portion of packages conflict with "nvidia" names, so I think a better way to handle this issue might be to display a widget that prints all the directories and lets users choose which ones to exclude.
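
A minimal sketch of the "print all the directories" part could group the matched paths by parent directory, so that a user can see at a glance which directories would be affected. The `summarize_dirs` helper is hypothetical and only covers the listing, not the interactive selection.

```shell
#!/bin/sh
# Hypothetical helper: read matched paths on stdin and print each parent
# directory once, prefixed with the number of matches found in it.
summarize_dirs() {
    while IFS= read -r path; do
        dirname "$path"
    done | sort | uniq -c | sed 's/^ *//'
}
```

Fed the output of a name-based search, this would show directories such as /usr/include/boost/compute/detail or /usr/share/cmake/Modules/Compiler next to the driver's own directories, which makes unrelated matches easier to spot.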

@2aecfff4

@89luca89
Hi,
The issue is not completely fixed.
The same issue happens with cmake-data on Fedora when a box is created with the --nvidia flag.

distrobox create --image fedora:latest --name fedora-dev --nvidia --home ~/.distrobox/fedora-dev
distrobox enter fedora-dev
sudo dnf install cmake
Error unpacking rpm package cmake-data-3.28.2-1.fc40.noarch
  Installing       : cmake-3.28.2-1.fc40.x86_64                                                                                                                                                                                 11/11 
error: unpacking of archive failed on file /usr/share/cmake/Modules/Compiler/NVIDIA-CUDA.cmake;670a5d86: cpio: rename failed - Device or resource busy
error: cmake-data-3.28.2-1.fc40.noarch: install failed

89luca89 added a commit that referenced this issue Oct 12, 2024
@89luca89
Owner

Thanks @2aecfff4, I've pushed a new commit that should fix all of these occurrences.

@2aecfff4

@89luca89
I can confirm that it works perfectly now.
Thank you!
