Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

clean-unref: documentation & feat suggestion (recursivity) #116

Open
numero-744 opened this issue Jan 7, 2022 · 4 comments
Open

clean-unref: documentation & feat suggestion (recursivity) #116

numero-744 opened this issue Jan 7, 2022 · 4 comments
Labels

Comments

@numero-744
Copy link

Hi!

About documentation

cargo cache -h
...
    clean-unref    remove crates that are not referenced in a Cargo.toml from the cache

When I read this I guess that it looks around for Cargo.toml to list all the dependencies and remove data in the cache that are not used by any of them. In the README is mentioned:

  • purge cache entries not unused to build a specified crate (cargo cache clean-unref)

Then I understand that it does this only for the crate I am in. When I run the command, from my home directory, I get:

% cargo cache clean-unref
...
Failed to Cargo.toml manifest in /home/***** or downwards.

So I think the help description for this subcommand could be improved (for instance copying the README)

The suggestion

This feature as I currently understand it removes unused crates (but not binaries) that are not listed in the Cargo.toml of the crate I am currently in, therefore ignoring all the other crates. I think it can be good for CI use, but for personal use, I think that it could be interesting to have a command like clean-unref-user which (Rust-like pseudocode):

  • let used = Iterates over all Cargo.toml in user directory recursively (can take some time)
  • .map(|file| file. get the list of the dependencies ()).flatten();
  • Get the list of the repositories
  • for dependency in used { remove dependency from the list of repositories (skip if already removed or missing) }
  • The remaining repositories are not used so can be deleted.
@matthiaskrgr
Copy link
Owner

clean-unref is supposed to run inside a rust project (git repo or crate or whatever), so yeah, I need to improve the docs and perhaps the error a bit, thanks for pointing that out!

Yes, cargo-cache only touches any sort of crate sources, I don't want it to do anything with cargo installed binaries, I think that is out of scope.

Your suggestion is another clean-unref which walks the current directory (cwd) -tree and runs clean-unref not only a single crate but on all found rust projects that were found, as if they were a single project, did I understand that correctly?

@numero-744
Copy link
Author

Yes this is the idea: not keeping only the cache for dependencies of a single project, but for the union of the dependencies of all the projects of the user.

Also, I was thinking to do it user-wise to not miss dependencies, so instead of starting from cwd, start from the user home directory. If cwd were as a start point by default, if the user runs cargo cache clean-unref-user (or other given name) while in a crate, it would be the equivalent of cargo cache clean-unref and delete caches for dependencies of all the other projects of the user (unless they are also dependencies of the current project), which is probably not what the user wanted to do. Maybe the subcommand could take an optional argument which could be the start point, so the user could still start from cwd by adding . as an argument to the subcommand.

Side note: For reference, I would like to highlight why it is different from a simple shell script which would run cargo cache clean-unref in each found project. Doing so would leave in the cache only the dependencies that are in all the projects of the user, so probably nothing. The expected behavior is to retain the union of the dependencies, not the intersection.

@matthiaskrgr
Copy link
Owner

Right, so we could have something like cargo-cache clean-unref -r <path> / --recursive <path> where we can pass a directory or just "." for the cwd.

Some notes on how that could be implemented:
Use walkdir on path and crawl all the directories and search for cargo.tomls, collect them into a vec.
rayon par_iter() (I think we still don't have a better walkdir -> par_iter conversion today) over the vec and try to open the respective rust projects with cargo-metadata.
If we fail (for example if there is just some Cargo.toml of a test of a crate that is not a valid crate source) silently ignore it.
Make cargo-metadata query each project for a list of dependencies (the logic already exists and just need to be refactored a bit I hope).
Note: It might be that cargo-metadata extracts/downloads all the required dependencies in this steps, so the par_iter may not pay out a lot actually...? need to check that (the .cargo-cache might be locked for one crate at a time for this)
collect all the dependencies into one huge list and then prune all other entries from the cache (as we already do today, but only for a single crate.

@numero-744
Copy link
Author

I am not sure keeping it in the same subcommand is good for the user because if the user forgets the -r then almost everything will have to be downloaded again. I think that -r is often used where commands are more destructive with it than without it, there it is the opposite: omitting -r is more destructive than adding it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

2 participants