Content-addressed cache? #964

Open
nikclayton opened this issue Feb 26, 2021 · 1 comment

Comments

@nikclayton

One of the caveats in the documentation is:

Absolute paths to files must match to get a cache hit. This means that even if you are using a shared cache, everyone will have to build at the same absolute path (i.e. not in $HOME) in order to benefit each other.

This is a problem for single-developer workflows that use git worktree. With git worktree I can check out multiple branches of the same repository to distinct local directories. But because they're distinct, there's no cache benefit.

For example, I have a complex project with hundreds of dependencies. Just building from a clean checkout takes about 3 minutes.

So if I do:

% cd /my/repo/root
% git worktree add -b branch1 /my/worktree/branch1
% cd /my/worktree/branch1
% cargo build

This takes 3 minutes.

If I then do

% cd /my/repo/root
% git worktree add -b branch2 /my/worktree/branch2
% cd /my/worktree/branch2
% cargo build

this also takes 3 minutes, and it's re-building everything I just built for branch1.

I went poking through the code, and it looks like https://github.com/mozilla/sccache/blob/master/src/compiler/rust.rs#L1458-L1462 is where the file names (source_files) are included in the computation of the hash key.

But earlier in that function, the file contents are also included in the hash key computation. So I think the filename is redundant here.

I changed that line in a local build to ignore source_files:

let inputs = abs_externs.into_iter().chain(abs_staticlibs).collect();

and now the example I gave above takes 3 minutes for a build with a cold cache (the branch1 example), but the branch2 example now takes 1m32s, because it's able to reuse the cached build artifacts.
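
To illustrate why dropping the paths lets two worktrees share a key, here is a toy sketch. It is not sccache's actual code: `SourceFile`, `content_key` and `path_and_content_key` are hypothetical names, and the standard library's `DefaultHasher` stands in for whatever digest sccache really uses.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Hypothetical stand-in for one compiler input: its absolute path and its bytes.
struct SourceFile<'a> {
    path: &'a str,
    contents: &'a [u8],
}

/// Key derived from file contents only (roughly the behavior after the local change).
fn content_key(files: &[SourceFile<'_>]) -> u64 {
    let mut hasher = DefaultHasher::new();
    for file in files {
        file.contents.hash(&mut hasher);
    }
    hasher.finish()
}

/// Key derived from file contents and then absolute paths
/// (roughly the behavior described for the unmodified code).
fn path_and_content_key(files: &[SourceFile<'_>]) -> u64 {
    let mut hasher = DefaultHasher::new();
    for file in files {
        file.contents.hash(&mut hasher);
    }
    for file in files {
        file.path.hash(&mut hasher);
    }
    hasher.finish()
}
```

Feeding both functions the same file contents under `/my/worktree/branch1/src/lib.rs` and `/my/worktree/branch2/src/lib.rs` gives equal values from `content_key` but different values from `path_and_content_key`, which matches the cache miss I see between the two worktrees.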

And all the sccache tests pass with this change.

I imagine it can't be this simple to fix the "Absolute paths must match" restriction and use sccache as a semi-content-addressed cache, so presumably there's some subtlety that I've missed.

Or is that all that's necessary?

I couldn't find anything else that discusses this. If it has already been covered in another forum and I just missed it, there's no need to repeat the discussion here; just link to it and I'll go and do some more reading.

@luser
Contributor

luser commented Feb 26, 2021

There's some discussion in #35. Mostly it comes down to whether the pathnames wind up in the output files. If debug info is enabled, then they generally do. Compiler options that allow path remapping could be used to mitigate this.
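
As one concrete instance of such an option, rustc has `--remap-path-prefix FROM=TO`, which rewrites a path prefix in the paths it records in its output. A minimal sketch of building that flag for a worktree follows; the helper name `remap_flag` and the neutral prefix `/build` are assumptions for illustration only.

```rust
use std::path::Path;

/// Hypothetical helper: build a rustc `--remap-path-prefix` flag that maps
/// a worktree's root directory to a fixed, location-independent prefix.
fn remap_flag(worktree_root: &Path, neutral_prefix: &str) -> String {
    format!(
        "--remap-path-prefix={}={}",
        worktree_root.display(),
        neutral_prefix
    )
}
```

For `/my/worktree/branch1` and `/build` this yields `--remap-path-prefix=/my/worktree/branch1=/build`, which could be passed via `RUSTFLAGS`. Note that the flag text itself still contains the worktree path, so this sketch only covers the paths embedded in the output; it does not show that the resulting cache keys would match.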
