[path] Also match parent dirs in include/exclude #4256
NOTE: This is a major change in pattern matching for the `include` and `exclude` fields, and can result in additional inclusion/exclusion for some patterns.

Previously, for inclusion/exclusion purposes, Cargo only worked with the paths of files in a package/repository, and glob pattern matching was applied only to these file paths.

The old behavior produces some unexpected results. For example, having:

```toml
exclude = ["data"]
```

in a manifest next to a `data` directory will not exclude the directory. To make it work, a pattern must be provided that matches the *files* under this directory, like:

```toml
exclude = ["data/*"]
```

To make Cargo's inclusion/exclusion behavior more intuitive, and to bring it on par with similar systems like `gitignore`, we need to also match these patterns against the *directories*. The directories are seen internally as *parents* of the files. Therefore, this diff expands the pattern matching to files and their parent directories.

Now, it's easier to exclude all data files:

```toml
exclude = ["data"]
```

or include only the `src` directory:

```toml
include = ["src"]
```

Fixes <rust-lang#3578>
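As a rough illustration of the idea (not the code in this diff), the "file or any parent directory" check could look like the sketch below. The names `matches_file_or_parent` and `is_excluded` are hypothetical, and plain path equality stands in for real glob matching so that the example needs only the standard library.

```rust
use std::path::Path;

/// Returns true if `matches` accepts the file path itself or any of its
/// parent directories. Paths are assumed to be relative to the package root.
/// (Hypothetical helper, written for illustration only.)
fn matches_file_or_parent<F>(file: &Path, matches: F) -> bool
where
    F: Fn(&Path) -> bool,
{
    file.ancestors()
        // `ancestors()` ends with the empty path for relative inputs; skip it.
        .filter(|p| !p.as_os_str().is_empty())
        .any(|p| matches(p))
}

/// Assumption: exact path equality is used here as a stand-in for a glob
/// pattern. A real implementation would call a glob matcher instead.
fn is_excluded(file: &str, pattern: &str) -> bool {
    let pattern = Path::new(pattern);
    matches_file_or_parent(Path::new(file), |p| p == pattern)
}
```

Under these assumptions, `is_excluded("data/nested/b.bin", "data")` holds because `data` is a parent of the file, while `is_excluded("database.rs", "data")` does not, since paths are compared component by component rather than by string prefix.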
Thanks for the pull request, and welcome! The Rust team is excited to review your changes, and you should hear from @matklad (or someone else) soon. If any changes to this PR are deemed necessary, please add them as extra commits. This ensures that the reviewer can see what has changed since they last reviewed the code. Due to the way GitHub handles out-of-date commits, this should also make it reasonably obvious what issues have or haven't been addressed. Large or tricky changes may require several passes of review and changes. Please see the contribution instructions for more information. |
Making However, I am worried a bit that we write a bit of a custom matching algorithm here, so there's a high chance that it will work differently from |
I tested out Spec for gitignore: https://git-scm.com/docs/gitignore#_pattern_format Here's my test: The lines marked with XXX are the disagreements with gitignore matching. Basically, there are a few major differences between glob/globset and gitignore matching:
Do you think we should put the logic for these cases into one of the upstream repos, maybe behind their Options? @BurntSushi, @alexcrichton? Can we start with this diff, that puts the logic in one function here in Cargo, and later switch to the upstream functionality when available? If so, I can add to this implementations for both cases 2 and 3, to complete it. |
I have added a real |
Awesome investigation, @behnam! Looks like ripgrep does quite a bit of massaging of Taking a step back, the current docs say
So the current behavior matches documentation. What we actually want to do here is to change the behavior to "The syntax is the same as in .gitignore". So this change is not exactly a bug-fix. |
So that's what you meant, @matklad, by And good point about the behavior being documented. Although, it still doesn't explain why So, we have these options now, I suppose:
IMHO, the last option is better than the rest, because it not only solves the current problem but also uses a more mainstream solution for this specific problem (since the gitignore format is more specific to inclusion/exclusion than UNIX glob). And if we go with that, then we can move the gitignore logic of ripgrep into its own crate, to make it more available, like glob/globset. (UPDATE: Actually, the libraries already exist: https://crates.io/crates/gitignore (based on What do you think?
I think we should ask for @alexcrichton's opinion here :) First of all, a small clarification:
It already lives in a separate crate: https://crates.io/crates/ignore I think we do want to switch from "The syntax of each element in this array is what rust-lang/glob accepts" which is not very user friendly to "the syntax is the same as in .gitignore". Between |
Agreed, @matklad. So let's wait for @alexcrichton now. Also, if we go with option 3, I would like to propose two additions:
Thanks for all the investigation here! I personally really like the idea of moving to literally "gitignore syntax" as it's intuitive and well spec'd already. I think we should move to the I agree though with @matklad that we probably want to do this gradually. I think we should start by warning if globs exclude more than gitignore directives, and then eventually we can switch to gitignore directives warning if globs are different, and then finally later we can remove the |
Awesome! I filed a new issue to track the steps and progress: #4268. I wrote it based on everything we discussed here, but please check it if you can and make edits if needed. So I'll close this and submit a new PR for the new issue.