Strategizing about opendir/readdir #4440
Comments
Another issue that I came across while looking into this was #1770. The existing So an API that looks nice to me is to have Are those issues open to revive discussion on?
I wrote above:
I should have said In any case, even though I misdescribed things a bit, I'd still like to raise and discuss the question of whether we should be doing that extra syscall for every directory entry returned. (This is potentially even more wasteful since we're not yet returning a stream of directory entries.)
Thanks for the detailed and thorough write up @dubiousjim, this all seems reasonable to me. I'd prefer the sparse info with
I'd like to implement a `Dir` class, and functions `opendir`/`opendirSync` that construct it. Most of this looks to be easy to implement, but there are some design choices it'd be helpful to get feedback on first (rather than implementing things one way, then needing to go back and redo it).

As a reminder, here are some things currently in master:
PRs I have queued in the #4017 series will propose these changes:
The `chown` functions are commented out because I haven't implemented them yet, and I'm not sure whether the machinery is there to let me do so.

For `opendir` and friends, I was thinking something like this:

The factory functions (`opendir`/`opendirSync`) would throw an error if `path` didn't refer to a directory. (I'd suggest that at the same time `open` and `openSync` should be made to throw an error if `path` did refer to a directory, though perhaps not if it refers to other things that are neither directories nor files.) We could use the existing `op_open` to implement all of these.

If that all sounds reasonable, the main strategic questions remaining have to do with `readdir` and `FileInfo`.

Ideally, there'd be a version of `std::fs::read_dir` in the Rust libs that took a `RawFd` as argument rather than a `Path`, and also a tokio-ized version of the same. But sadly there is neither. I tried poking around in `libstd/sys/unix/fs.rs` in the Rust sources to see how hard this would be to implement, but to get it to compile, I had to keep pulling in duplicates of more and more of the internal machinery of that file. In the end this didn't seem a promising way to go. Plus it would involve lots of calling into the `libc` crate with `unsafe` code blocks (as the Rust std libs must).

We could work on getting Rust to include such functionality in their std libs, and even offer them an implementation, but I suspect that even if they're open to merging it in the near future, it would take a while before it's exposed in a version of Rust we're using.
We're already using the `nix` package for some things on the Unix side (Linux/Mac), and it has an interface `nix::dir::Dir`, which can be constructed using either paths or fds. That looks good for our purposes. (Downsides: it's Unix-only, and also blocking, so we'd have to continue using `blocking_json` from `cli/ops/dispatch_json.rs` here, rather than moving to async functions from `tokio::fs`, as proposed in "refactor: ops/fs.rs should use tokio::fs where applicable" #4188.) I think this is the most promising way to go.

Some issues about the `nix::dir::Dir` implementation though. The iterator over directory entries that it provides is sparser than the one provided by `std::fs::read_dir` (and its counterpart in `tokio::fs`). It only provides something like this for each entry:

The `file_type` information is only populated by `nix::dir::Dir` if it's available without making an extra `lstat` system call. (Some Linux filesystems would require that extra call. I don't know which ones.) So what's naturally available to us if we go this way would only be a sparser version of our existing `FileInfo`. At the same time, it would be more efficient than the `std::fs::read_dir` version, which goes ahead and does the extra `lstat` call to fill in the additional fields, which won't always be needed. Of course, we could go ahead and do the same with the results of `nix`'s directory iterator (using `nix::dir::Dir::openat(dirfd, filename, ...)`). But that seems wasteful. Why not instead just provide the minimal information, and give the caller the means to retrieve more info if they want it? (Using calls like `fstatat`, `openat`, etc. that are also present in `nix`. I'm assuming that from a `Dir` class, our implementation will only have a `rid` saved from which it can get an open dirfd. It won't have, and there need not even exist any longer, a valid path that could be used to reach that directory.)

If the strategy just sketched seems attractive for `opendir(path).readdir()`, perhaps we should also consider it for `readdir(path)`. That is, have this not provide a full `FileInfo` like one gets from `stat`/`lstat`, but a sparser thing like that described above.

That is in fact what other languages/hosts do in their std libs.
- Node.js's `fs.Dirent` has the fields `name: string|Buffer`, `isFile(): boolean`, `isDirectory(): boolean`, and so on.
- Python's `os.DirEntry` has the fields `name: bytes|str`, `path: bytes|str`, `inode()`, `is_dir(follow_symlinks=False)`, `is_file(follow_symlinks=False)`, `is_symlink()`, and `stat(follow_symlinks=False)`.
- Go's `os.Readdir` returns a list of `FileInfo` structures, the same as is returned by its `os.Lstat` and `os.Stat`. But this is somewhat sparser than ours, and they also have the function `os.Readdirnames` that returns just a list of names.

I'm ignoring in all of this the issues raised in "[Discussion] Decide on reading directory API for 1.0 release" #4277 and "Make `Deno.readDir` actually streaming" #4218, which have to do with whether `readdir` provides its results all at once or as a stream. The latter seems ultimately desirable, but I don't know how to make the Deno Rust guts save the state needed to implement an iterator. In any case, I think those issues are orthogonal to these.
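To make the sparser-entry idea above a bit more concrete, here is a minimal TypeScript sketch of a directory entry that carries only a name and an optional file type, and fetches full information on demand. Everything in it is an assumption for illustration: `FileKind`, `FullInfo`, `StatFn`, `SparseDirEntry`, `makeEntry`, and `kindOf` are hypothetical names, not part of Deno's API, and the `statAt` callback merely stands in for whatever `fstatat`-style lookup the implementation might expose relative to an open directory.

```typescript
// Hypothetical shapes for illustration only; none of these names exist in Deno.
type FileKind = "file" | "directory" | "symlink" | "other";

// Stand-in for the fuller information a stat/lstat-style call would return.
interface FullInfo {
  kind: FileKind;
  size: number;
}

// Stand-in for an fstatat-style lookup relative to an open directory.
type StatFn = (name: string) => FullInfo;

interface SparseDirEntry {
  name: string;
  // Undefined when the file type was not available with the entry itself.
  kind?: FileKind;
  // Fetches full info lazily; the result is cached after the first call.
  stat(): FullInfo;
}

function makeEntry(
  name: string,
  kind: FileKind | undefined,
  statAt: StatFn,
): SparseDirEntry {
  let cached: FullInfo | undefined;
  return {
    name,
    kind,
    stat(): FullInfo {
      if (cached === undefined) {
        // The only place the extra lookup happens.
        cached = statAt(name);
      }
      return cached;
    },
  };
}

// Resolves an entry's kind, paying for a stat only when the type is unknown.
function kindOf(entry: SparseDirEntry): FileKind {
  return entry.kind ?? entry.stat().kind;
}
```

With this shape, a caller that only wants names, or whose filesystem reports types along with the entries, never triggers the extra lookup; a caller that needs sizes or other details calls `stat()` on just the entries it cares about.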