Move DirEntry::ino method to an extension trait, Support more metadata for sort_by #53
Thanks @jeremielate! No need to update the version number yet. You're right this is a breaking change, but I think we'll update the version number later, because there are a few breaking changes in other PRs to work through. Just so we've got some links between issues, you've covered:
EDIT: My mistake, you've referenced the issues in commits, I didn't notice :) I'll spend some time digging through this change (there's a fair bit there), but @BurntSushi will probably come along at some point with feedback. |
Thanks for the hard work @jeremielate!
I think your `sort_by` implementation is a great reference for discussion. We can work on more details in this PR conversation.
src/lib.rs
Outdated
@@ -842,6 +843,15 @@ impl DirEntry {
    }
}

#[cfg(unix)]
impl std::os::unix::fs::DirEntryExt for DirEntry {
You'd think this is a perfect reason to implement `DirEntryExt` in the standard library but we shouldn't do that :( Those traits are going away at some point, so we need to define our own `DirEntryExt` with an `ino` method and implement that.
It'll be easier to put this trait and its implementation for `DirEntry` in a separate `unix.rs` file.
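As a rough illustration of the suggestion above, a crate-local extension trait might look like the following. This is only a sketch under stated assumptions: the `DirEntry` here is a hypothetical stand-in rather than walkdir's real type, `from_path` and `path` are invented helpers, and the inode is read from `fs::Metadata` instead of the underlying `dirent`, purely so the example is self-contained.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Hypothetical stand-in for `walkdir::DirEntry` (not the real type).
pub struct DirEntry {
    path: PathBuf,
    ino: u64,
}

#[cfg(unix)]
impl DirEntry {
    /// Hypothetical constructor: records the inode number up front.
    /// Assumption: it is read via `fs::metadata` here, not from `dirent`.
    pub fn from_path(path: &Path) -> io::Result<DirEntry> {
        use std::os::unix::fs::MetadataExt;
        let ino = fs::metadata(path)?.ino();
        Ok(DirEntry { path: path.to_path_buf(), ino })
    }

    /// Hypothetical accessor for the entry's path.
    pub fn path(&self) -> &Path {
        &self.path
    }
}

/// Crate-local extension trait, defined here instead of implementing
/// `std::os::unix::fs::DirEntryExt` for our own type.
#[cfg(unix)]
pub trait DirEntryExt {
    /// Returns the inode number recorded for this entry.
    fn ino(&self) -> u64;
}

#[cfg(unix)]
impl DirEntryExt for DirEntry {
    fn ino(&self) -> u64 {
        self.ino
    }
}
```

In the real crate the trait and its impl would presumably live in `unix.rs`; callers on Unix would then import the trait to get access to `ino()`.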
src/lib.rs
Outdated
@@ -191,7 +190,8 @@ struct WalkDirOptions {
    max_open: usize,
    min_depth: usize,
    max_depth: usize,
    sorter: Option<Box<FnMut(&OsString,&OsString) -> Ordering + 'static>>,
    sorter: Option<Box<FnMut((&OsStr, Option<Metadata>),
Since we haven't got the `Metadata` on hand already I think we'll need to consider the implications of fetching it multiple times. It might be a non-issue, but it might also be expensive.
At the very least, I think it's worth pulling these arguments into a `struct`, something like:
pub struct DirEntrySort {
    file_name: OsString,
    metadata: Option<fs::Metadata>,
}

impl DirEntrySort {
    pub fn file_name(&self) -> &OsStr {
        &self.file_name
    }

    pub fn metadata(&self) -> Option<&fs::Metadata> {
        self.metadata.as_ref()
    }
}
cc: @BurntSushi
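To make the proposal more concrete, here is a minimal sketch of how a comparator might consume such a struct. The `new` constructor and the `by_len_then_name` comparator are hypothetical names introduced only for illustration; they are not part of the PR.

```rust
use std::cmp::Ordering;
use std::ffi::{OsStr, OsString};
use std::fs;

pub struct DirEntrySort {
    file_name: OsString,
    metadata: Option<fs::Metadata>,
}

impl DirEntrySort {
    /// Hypothetical constructor, added only so the sketch can be exercised.
    pub fn new(file_name: OsString, metadata: Option<fs::Metadata>) -> DirEntrySort {
        DirEntrySort { file_name, metadata }
    }

    pub fn file_name(&self) -> &OsStr {
        &self.file_name
    }

    pub fn metadata(&self) -> Option<&fs::Metadata> {
        self.metadata.as_ref()
    }
}

/// Hypothetical comparator: order by file length, then by name.
/// Entries without metadata compare as `None`, which sorts before `Some`.
pub fn by_len_then_name(a: &DirEntrySort, b: &DirEntrySort) -> Ordering {
    let len_a = a.metadata().map(|m| m.len());
    let len_b = b.metadata().map(|m| m.len());
    len_a
        .cmp(&len_b)
        .then_with(|| a.file_name().cmp(b.file_name()))
}
```

Because the metadata is fetched once and stored in the struct, the comparator sees the same value every time it is called for a given entry.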
@@ -792,3 +792,18 @@ fn walk_dir_sort_small_fd_max() {
    assert_eq!(got,
               ["", "/foo", "/foo/abc", "/foo/abc/fit", "/foo/bar", "/foo/faz"]);
}

#[test]
fn walk_dir_send_sync_traits() {
Nice! 👍
…rt_by callback. Trait DirEntryExt copied and implemented from std::os::unix::fs to a local module.
Looks good to me, thanks again! 👍
I've just left 2 really minor comments.
I'm happy with your new `sort_by` implementation, but will leave the rest to @BurntSushi
src/lib.rs
Outdated
@@ -842,6 +861,15 @@ impl DirEntry {
    }
}

#[cfg(unix)]
Let's move this impl into the new `unix` module too.
src/unix.rs
Outdated
@@ -0,0 +1,8 @@

/// Unix specific extension methods.
nit: Do you think we should mirror the docs on the `DirEntryExt` trait in the standard library?
I don't know how to properly link to the std docs. Could a link to this trait break in the future if std changes?
I just meant copying the docs in the standard `DirEntryExt`, rather than linking to that trait. So our `walkdir::unix::DirEntryExt` trait looks like:
/// Unix-specific extension methods for `walkdir::DirEntry`
pub trait DirEntryExt {
    /// Returns the underlying `d_ino` field in the contained `dirent` structure.
    fn ino(&self) -> u64;
}
I have added a commit referencing #38, but an older pull request has already fixed that issue. Do I need to remove this last commit?
@jeremielate That's all good, we'll coordinate it all when merging PRs in 👍 Let's keep the rest of this PR focused on the new Please feel free to open other PRs for any other issues you want to tackle! You've done some awesome work already. |
…DirEntryExt. Moved trait implementation for DirEntry in module unix.
@jeremielate Thanks so much for working on this!
I think this API is going to be tricky to get right. There are many competing concerns, and I am pretty sure that the implementation in this PR is probably not the right balance. Here are my thoughts:
- This implementation ignores errors if the `metadata` call fails and instead just returns `None`. This probably isn't acceptable.
- Even if the caller only sorts by `OsStr`, they are still forced to pay the cost of an extra stat call because `metadata` is called unconditionally.
- This introduces a new public type just for the sorting API, but there's no real obvious reason why we can't just use `DirEntry` instead. If we passed `&DirEntry` to the sort callback, then we could remove the extra public type, and the caller could avoid the extra stat call when they don't need it. The downside of this approach is that if the caller calls `metadata` in the comparator, then each call to `metadata` will result in another stat call, which is unfortunate for performance reasons, but also because each call to `metadata` could yield a different result. And therefore, you'll end up sorting data that's changing during the sort, which is bad.
So I'm not sure what the right path is here. Technically, we could solve all of the problems by introducing a new public type that lightly wraps `DirEntry` by caching any `metadata` retrievals. But this seems clunky and unfortunate.
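For reference, the caching wrapper idea could be sketched roughly as below. Everything here is an assumption for illustration: `CachedEntry`, its `stat_calls` counter, and the use of `std::cell::OnceCell` (which needs a reasonably recent Rust toolchain) are not from the PR, and the wrapper holds a plain path rather than a real `DirEntry`.

```rust
use std::cell::{Cell, OnceCell};
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Hypothetical wrapper that performs at most one stat call per entry.
pub struct CachedEntry {
    path: PathBuf,
    // Filled lazily on the first `metadata()` call; errors are kept too.
    metadata: OnceCell<io::Result<fs::Metadata>>,
    // Counts real file system lookups, only to make the caching observable.
    stat_calls: Cell<usize>,
}

impl CachedEntry {
    pub fn new<P: Into<PathBuf>>(path: P) -> CachedEntry {
        CachedEntry {
            path: path.into(),
            metadata: OnceCell::new(),
            stat_calls: Cell::new(0),
        }
    }

    pub fn path(&self) -> &Path {
        &self.path
    }

    /// Returns the cached metadata, fetching it on first use only.
    /// A failed lookup is cached as well, so the error is not swallowed.
    pub fn metadata(&self) -> Result<&fs::Metadata, &io::Error> {
        self.metadata
            .get_or_init(|| {
                self.stat_calls.set(self.stat_calls.get() + 1);
                fs::metadata(&self.path)
            })
            .as_ref()
    }

    pub fn stat_calls(&self) -> usize {
        self.stat_calls.get()
    }
}
```

A wrapper along these lines would address the three concerns listed above (errors are preserved, the stat call is lazy, and repeated calls return the same result), at the cost of the extra public type that the comment calls clunky.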
On top of all of that, passing `&DirEntry` to the comparator will require some not-so-nice refactoring inside `walkdir`. Right now, the actual `DirEntry` is constructed as late as possible. To make this work, we probably want to change `DirList::Closed` to hold a `vec::IntoIter<Result<DirEntry>>` instead of a `vec::IntoIter<Result<fs::DirEntry>>`. And that in turn probably means threading the current depth through some more code.
Thoughts? Ideas? Other simple solutions that I'm missing?
        &self.file_name
    }

    /// Optional metadata.
Could you insert a blank `///` line after this?
@BurntSushi Maybe we can start by listing some requirements for the From what I can gather:
I need to go back and refresh my memory of how this all works, but it sounds like either caching I'm saying this out of ignorance of |
@KodrAus Sounds like you've got the requirements right, although I'm actually not too concerned about internal complexity. I think we can deal with that. What I'm worried about is introducing new public types that complicate the API that end users need to understand. The code should be able to be refactored such that we build the
Assuming we're OK with (2), does that mean we need to convince ourselves to be OK with (1)? |
Ah my mistake, for some reason I was thinking So I was basically already expecting (2), but I guess it's a bit of a subtle deviation from For (1) I'd be more surprised about So at this stage I don't find myself too surprised by those points, but I don't think I'm really representative of |
@alexcrichton Do you have any thoughts here? I think you did a good chunk of The overall problem we're trying to solve here is to support the Today, we pass a pair of Note that the plan is to release |
Yeah I agree that my expectation here is that a pair of If I understand it right then calling |
Right. That, and also performance concerns. The I guess I'm OK just moving forward and passing |
In theory the user could have a cache in this closure, right? That way you could use |
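A user-side cache inside the comparator closure might look something like the sketch below. It sorts a plain `Vec<PathBuf>` with the standard `sort_by` rather than going through walkdir, and `sort_by_len_cached` is a hypothetical helper. For brevity it collapses lookup errors into `None`, which is the behavior questioned earlier in this thread, so treat it as an illustration of the caching only.

```rust
use std::collections::HashMap;
use std::fs;
use std::path::{Path, PathBuf};

/// Hypothetical helper: sorts `paths` by file length (then by path),
/// looking each path up on the file system at most once.
/// Returns the number of real metadata lookups performed.
pub fn sort_by_len_cached(paths: &mut Vec<PathBuf>) -> usize {
    let mut cache: HashMap<PathBuf, Option<u64>> = HashMap::new();
    let mut stat_calls = 0usize;

    // The cache lives outside the comparator and is captured mutably by it.
    let mut len_of = |p: &Path| -> Option<u64> {
        if let Some(len) = cache.get(p) {
            return *len;
        }
        stat_calls += 1;
        // Assumption for brevity: errors are flattened to `None`.
        let len = fs::metadata(p).ok().map(|m| m.len());
        cache.insert(p.to_path_buf(), len);
        len
    };

    paths.sort_by(|a, b| {
        len_of(a.as_path())
            .cmp(&len_of(b.as_path()))
            .then_with(|| a.cmp(b))
    });

    stat_calls
}
```

Since every lookup after the first is served from the map, the comparator sees a stable value per path for the duration of the sort.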
Yeah, I suppose they could.
With #70 merged, I think we might want to pare this PR down to just the extension trait. What do you think @jeremielate?
Hi @jeremielate, do you still have time to work on this? If not that's totally fine, but we appreciate your contribution and want you to have the chance to get it merged in. |
Is it OK to break the API when making improvements? If so, do I need to change the version number?