Invalid cross-device link (os error 18) when upgrading on a docker OverlayFS #1239

Closed
wraithan opened this issue Aug 21, 2017 · 18 comments
Labels
help wanted O-containers Not really OS-related, but container-specific

Comments

@wraithan

$ rustup update nightly
info: syncing channel updates for 'nightly-x86_64-unknown-linux-gnu'
info: latest update on 2017-08-21, rust version 1.21.0-nightly (8c303ed87 2017-08-20)
info: downloading component 'rustc'
info: downloading component 'rust-std'
info: downloading component 'cargo'
info: downloading component 'rust-docs'
info: removing component 'rustc'
info: rolling back changes
error: could not rename component directory from '/root/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/etc' to '/root/.rustup/tmp/x5u5mnp0hhtywco8_dir/bk'
info: caused by: Invalid cross-device link (os error 18)

As far as I can tell, std::fs::rename() basically doesn't work on OverlayFS. Looking at similar reports from other languages and projects that hit cross-device link errors on OverlayFS, it boils down to using the rename syscall.

I'd like to propose wrapping the std::fs::rename() calls and, on Linux, detecting os error 18 and falling back to a copy and delete. Similar errors are reported periodically on other platforms; the wrapper could handle those cases too if they have a similar error code (or maybe even the same one if this is standard, I'm not sure).
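A minimal sketch of the wrapper proposed above, using only the standard library. The names `rename_or_copy` and `copy_recursive` are hypothetical and are not rustup functions, and the constant 18 is taken from the Linux error in the log. The fallback is not atomic, follows symlinks while copying, and does not clean up a partial copy on failure.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// EXDEV on Linux: the "os error 18" shown in the log above.
const EXDEV: i32 = 18;

/// Recursively copy `src` to `dst`. Symlinks are followed, not preserved.
fn copy_recursive(src: &Path, dst: &Path) -> io::Result<()> {
    if src.is_dir() {
        fs::create_dir_all(dst)?;
        for entry in fs::read_dir(src)? {
            let entry = entry?;
            copy_recursive(&entry.path(), &dst.join(entry.file_name()))?;
        }
    } else {
        fs::copy(src, dst)?;
    }
    Ok(())
}

/// Hypothetical wrapper: try a plain rename first, and fall back to
/// copy + delete only when the OS reports a cross-device link error.
fn rename_or_copy(src: &Path, dst: &Path) -> io::Result<()> {
    match fs::rename(src, dst) {
        Err(e) if e.raw_os_error() == Some(EXDEV) => {
            copy_recursive(src, dst)?;
            if src.is_dir() {
                fs::remove_dir_all(src)
            } else {
                fs::remove_file(src)
            }
        }
        // Success, or any other error, is passed through unchanged.
        other => other,
    }
}
```

Any error other than EXDEV is returned as-is, so callers that rely on the usual rename behaviour see no change on filesystems where rename already works.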

Interestingly, there is a bootstrap problem here: folks who are experiencing this may be unable to update their rustup install, and so may not be able to get the update that fixes the problem once there is a solution. Those folks will need to be advised to reinstall rustup.

If the proposed solution to the problem works for the dev team, I'll attempt to provide a PR within a week of getting the go ahead.

This is relevant because some people use a common Docker image for their CI environments that may not be updated frequently enough for beta/nightly, and run rustup update $desired_env in their script, which is how I found this problem.

@wraithan
Author

Spoke with @nrc and @alexcrichton on IRC and they said this seemed reasonable. I'll put forward an implementation this week.

@cyplo
Contributor

cyplo commented Dec 17, 2017

Heyo! Any news on this one? I encounter this bug regularly when doing builds on dockerized CI. Let me know if there is any more info I can provide.

@cyplo
Contributor

cyplo commented Dec 17, 2017

Looking at the sources, there is already a wrapper function called utils::rename_file, used by components and transaction. Would that be a good candidate to replace every other call to fs::rename?

@ishitatsuyuki

For those affected by this bug, see the renaming section of the kernel documentation.

@wraithan fs::rename inside std implies atomicity. For a renaming operation that doesn't fail, we should put it in a separate crate, as copying will likely involve locking.
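One way to keep part of that atomicity in a copying fallback, sketched under assumptions: copy into a staging file next to the destination, then rename the staging file into place. This assumes a sibling path of `dst` is on the same filesystem as `dst`. The helper name `move_file_via_staging` and the `.partial` suffix are hypothetical. It handles single files only, does no locking or fsync, and the source file still exists until the last step.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Hypothetical helper: move one file so that `dst` only ever appears
/// fully written, even when `src` lives on a different filesystem.
fn move_file_via_staging(src: &Path, dst: &Path) -> io::Result<()> {
    // Build "<dst file name>.partial" in the same directory as `dst`.
    let mut staging_name = dst
        .file_name()
        .ok_or_else(|| io::Error::new(io::ErrorKind::InvalidInput, "dst has no file name"))?
        .to_os_string();
    staging_name.push(".partial");
    let staging = dst.with_file_name(staging_name);

    // Slow, non-atomic part: may cross filesystems.
    fs::copy(src, &staging)?;

    // Same-directory rename: this is the step readers of `dst` observe.
    if let Err(e) = fs::rename(&staging, dst) {
        let _ = fs::remove_file(&staging);
        return Err(e);
    }

    // Only drop the source once the destination is in place.
    fs::remove_file(src)
}
```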

@cyplo
Contributor

cyplo commented Apr 6, 2018

Heya, thank you to @nrc for taking a look :) (https://internals.rust-lang.org/t/contributing-to-rustup-help-with-code-structure-needed/7193). I'm thinking of trying to tackle this bug. I like writing the replication test first, so I would probably focus on that: trying to inject the fault in the test and see what's what. Let me know if someone else wants to look into this as well, we can combine forces :)

@cyplo
Contributor

cyplo commented Jun 8, 2019

Hi, I haven't had much time to finish working on this, and the issue is still present in the newest rustup.
Let me know if anyone would like to pick this one up.

@workingjubilee
Member

@rustbot label: +O-containers

rustbot added the O-containers label on May 21, 2021

@CatarinaPedreira

CatarinaPedreira commented Sep 14, 2021

Hey all! Any news on this? @cyplo @wraithan @workingjubilee
I'm building a docker image and get "Invalid cross-device link" in the RUN rustup update nightly instruction of my Dockerfile.

@ishitatsuyuki thanks for the documentation. I see that this problem has to do with "redirect_dir" being disabled. So any idea how to enable it through the Dockerfile?

@ishitatsuyuki

@CatarinaPedreira If you need to work around the issue, just remove the toolchain and install it again. I think that avoids renaming across an overlayfs boundary.

@CatarinaPedreira

@ishitatsuyuki Thanks for the quick reply. I'll do that then, thank you :)

@kinnison
Contributor

Copy+Delete would be exceedingly slow, because rename is used throughout our transactional filesystem access code. If we had to open+open+{read,write,loop}+close+close rather than rename, our toolchain update process would become immensely slow. Perhaps we can detect that particular OS error by attempting a rename on something innocuous first and, if that fails, refuse to update a toolchain on such a filesystem, though that would prevent the installation of new components/targets too. More thought needed, but in the short term the workaround is either to not include a toolchain in your underlying docker image, or else to remove and then install the toolchain in your CI.
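A rough sketch of the "attempt a rename on something innocuous first" idea, assuming a scratch directory is an acceptable probe. The function name `rename_works_between` and the probe directory name are hypothetical. One caveat: a freshly created directory lives in the writable upper layer of an overlay, so this probe may well succeed even where renaming a directory that came from a lower image layer fails. It is more likely to catch the case where the two paths are on separate mounts.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// EXDEV on Linux ("Invalid cross-device link").
const EXDEV: i32 = 18;

/// Hypothetical probe: create a scratch directory in `from_dir` and try to
/// rename it into `to_dir`. Returns Ok(false) only for EXDEV; any other
/// failure is reported as an error.
fn rename_works_between(from_dir: &Path, to_dir: &Path) -> io::Result<bool> {
    let name = format!(".rename-probe-{}", std::process::id());
    let src = from_dir.join(&name);
    let dst = to_dir.join(&name);

    fs::create_dir(&src)?;
    match fs::rename(&src, &dst) {
        Ok(()) => {
            // Clean up the probe at its new location.
            fs::remove_dir(&dst)?;
            Ok(true)
        }
        Err(e) => {
            // The probe never moved, so remove it where it was created.
            let _ = fs::remove_dir(&src);
            if e.raw_os_error() == Some(EXDEV) {
                Ok(false)
            } else {
                Err(e)
            }
        }
    }
}
```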

@CatarinaPedreira

Thank you @kinnison !

@ThatGeoGuy

As of rust 1.63.0 I seem to be encountering this issue again during the clippy stage. Posting the relevant log:

$ CARGO_HOME=/usr/local/cargo rustup update stable
info: syncing channel updates for 'stable-x86_64-unknown-linux-gnu'
info: latest update on 2022-08-11, rust version 1.63.0 (4b91a6ea7 2022-08-08)
info: downloading component 'clippy'
info: downloading component 'cargo'
info: downloading component 'rust-std'
info: downloading component 'rustc'
info: removing previous version of component 'clippy'
info: rolling back changes
error: could not rename component file from '/usr/local/rustup/toolchains/stable-x86_64-unknown-linux-gnu/share/doc/clippy' to '/usr/local/rustup/tmp/1vsy16kvdse0rwk9_dir/bk': Invalid cross-device link (os error 18)
Cleaning up file based variables 00:00
ERROR: Job failed: command terminated with exit code 1

Could this have crept back in somewhere?

@rbtcollins
Contributor

No, what is happening is that you are updating a toolchain across docker layers. Either install the correct toolchain in your docker build, or remove and reinstall your toolchains.

@bonigarcia

I'm experiencing this issue in Fedora Linux. The same logic works nicely in Debian-based systems like Ubuntu, but the error Invalid cross-device link (os error 18) happens in Fedora. The steps to reproduce it are:

  1. Download Edge from https://packages.microsoft.com/repos/edge/pool/main/m/microsoft-edge-stable/microsoft-edge-stable_123.0.2420.53-1_amd64.deb
  2. Extract the content of the DEB file
  3. Try to move the resulting parent folder to a different path using fs::rename()

@djc
Contributor

djc commented Mar 27, 2024

@bonigarcia what does your problem have to do with rustup?

@bonigarcia

@djc I believe this problem happens in fs::rename(). If this is not the right place to discuss it, do you know where I should report it?

@djc
Contributor

djc commented Mar 27, 2024

The rust-lang/rust issue tracker covers the standard library.
