Allow colons in Windows host paths #235
Note: this notation could lead to errors like |
This looks good! I'm kinda worried about the possibility of errors you mentioned above, but I think I like the idea of having a |
I think there's no need for another PR with just a changed separator. We should decide which one we want (@sunfishcode ?), update this PR and then merge it. If we decide to stay with a single colon, then we could think about some warning, but we'd need to think about the check itself (at first glance, it doesn't look like something that's easy to check exhaustively).
Another note which I should have put here earlier. I'm still confused. Since, if I remember correctly, call like |
Agreed: this should definitely work. An interesting question (for WASI standardization) is whether it should be allowed to use Windows-style paths inside a WASI module at all. The alternative being to standardize on *nix-style paths: no drive letters, As for which separator to use, one option would be to let the user choose: require the separator to also be mentioned at the beginning of the mapping pattern, and maybe at the end: |
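The user-chosen separator idea could be sketched roughly as follows. This is only an illustration under stated assumptions, not code from this PR: the function name `parse_with_leading_separator` is hypothetical, the first character of the pattern is taken as the separator, a single trailing separator is optional, and the guest path is assumed to come before the host path.

```rust
/// Hypothetical helper (not part of wasmtime): the first character of the
/// pattern names the separator, e.g. `|guest|host` or `|guest|host|`.
/// Assumption: the guest path comes first, the host path second.
fn parse_with_leading_separator(pattern: &str) -> Option<(&str, &str)> {
    // The separator is whatever character the pattern starts with.
    let sep = pattern.chars().next()?;
    let rest = &pattern[sep.len_utf8()..];
    // Drop one optional trailing separator.
    let rest = rest.strip_suffix(sep).unwrap_or(rest);
    // Split the remainder at the first occurrence of the separator.
    let (guest, host) = rest.split_once(sep)?;
    if guest.is_empty() || host.is_empty() {
        return None;
    }
    Some((guest, host))
}
```

With `|` as the chosen separator, a host path such as `C:\Users\user` can contain colons without ambiguity, at the cost of a slightly noisier command line.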
@tschneidereit excellent point! As I’ve already discussed offline with @mrowqa, with CraneStation/wasi-common#41 landing, any path with slashes works as expected:
And:
Note that both are equivalent on Windows. As @mrowqa aptly pointed out, paths with backslashes, however, are somewhat of a mystery ATM, but we’re currently trying to figure it out.
@tschneidereit I also considered standardization of the paths. However, I don't know much about what programs we are currently compiling successfully and what may break, so it's hard for me to favor some specific option.
Another option is to have syntax like So, I suggest: temporarily use '::' just for simplicity, and once we change the command-line argument parser, we can try using optional arguments with two values, if that's possible.
They are not. Look at [1]. The second one is actually a relative path. Windows remembers current drive letter (which is used if none is specified), and for each drive, it remembers current path (so [1] https://docs.microsoft.com/en-us/windows/win32/fileio/naming-a-file |
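The drive-letter distinction described above can be sketched as a small classifier. This is a minimal sketch based on the naming rules in [1], not code from wasmtime or wasi-common; `WinPathKind` and `classify` are hypothetical names, and the function only inspects the leading characters of the string.

```rust
/// Hypothetical classification of Windows path strings (illustration only).
#[derive(Debug, PartialEq)]
enum WinPathKind {
    /// `\\server\share\...` or `\\?\C:\...`: UNC or device-style prefix.
    UncOrDevice,
    /// `C:\dir\file`: drive letter plus root, fully qualified.
    DriveAbsolute,
    /// `C:dir\file`: resolved against the current directory of drive C.
    DriveRelative,
    /// `\dir\file`: rooted, but on whatever the current drive is.
    CurrentDriveRooted,
    /// `dir\file`: relative to the current directory of the current drive.
    Relative,
}

fn classify(path: &str) -> WinPathKind {
    let bytes = path.as_bytes();
    // Windows accepts both separators in most contexts.
    let is_sep = |b: u8| b == b'\\' || b == b'/';

    if bytes.len() >= 2 && is_sep(bytes[0]) && is_sep(bytes[1]) {
        return WinPathKind::UncOrDevice;
    }
    let has_drive =
        bytes.len() >= 2 && bytes[0].is_ascii_alphabetic() && bytes[1] == b':';
    if has_drive {
        // A separator right after `X:` makes the path absolute;
        // anything else is relative to that drive's current directory.
        if bytes.len() >= 3 && is_sep(bytes[2]) {
            WinPathKind::DriveAbsolute
        } else {
            WinPathKind::DriveRelative
        }
    } else if !bytes.is_empty() && is_sep(bytes[0]) {
        WinPathKind::CurrentDriveRooted
    } else {
        WinPathKind::Relative
    }
}
```

For instance, `classify("C:Users")` yields `DriveRelative` while `classify(r"C:\Users")` yields `DriveAbsolute`, which is why the two spellings cannot be treated as interchangeable.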
I think we just fundamentally have to do something here. We can't support OS-specific paths, at least for preopened files, i.e. where the path is set before runtime. Whether we also need to normalize all dynamic paths is a different question. If a user gets a list of all files in a directory, do we try to normalize their names? I'm not sure that's really viable.
That sounds good to me, yes. Thank you for the explanation, and for the research! |
I haven't worked on WASI, so I don't know how exactly it's implemented. @kubkon ?
You're welcome :) . I'll update the PR in just a moment. |
Ah, that's actually good to know, thanks for clarifying this! |
That's because I'm not an expert in this domain and am probably not using the right terminology myself :) I'll explain what I meant more specifically:
That's what I meant, yes, as opposed to a preopened path. (See this comment for some more thoughts on how to handle those.)
Good question! A lot, perhaps all, of this only matters if we're talking about both, really: how do host and guest paths map to each other.
What I meant by "normalization" is a way to ensure that you're talking about the same path regardless of the host system. Unfortunately I think that's just not possible. For absolute paths, we might be able to handle that by treating them as opaque and just leaving it up to the host to map them to something useful. However, for relative paths used in Say you have two calls to openat:
openat(dir_fd, "FileName");
openat(dir_fd, "filename");
These will result in either the same or two different files being opened, depending on whether the underlying file system is case sensitive or not. That means code that uses both of these, perhaps at entirely different places in the code base, will behave differently e.g. on (stock-) Windows and macOS vs (stock-) Linux.
I think we should land this. If we need to change things later, we always can. One thing that'd be good is to have some kinds of tests, ideally ones that work across all supported OSs. Not sure how doable that is with our current test harness.
macOS can be made to behave case sensitive too using |
What happens if such a process tries to create two files with the names In any case, that doesn't seem to help us much, given that WASI needs to work across a wide range of different OSs, with different filesystem semantics (and quirks.) |
I don't know. You also have to be root to use |
This PR looks good to me; please also update the documentation in docs/WASI-tutorial.md for the new command-line syntax. |
@tschneidereit thank you for the explanation! Now it sounds familiar: I had discussed parts of it with @sunfishcode and @kubkon (even opened https://github.com/CraneStation/wasi-common/issues/44). What exactly would you like to test? The @sunfishcode oh, I missed that one occurrence. I have updated it.
one (probably not very good) solution is to manually check if there are files with the same case-insensitive name and either prohibit creating a differently-cased name, rename to the case on disk, or rename to the case specified in the filesystem call, with a check to bypass the case-checking/case-conversion if there are already 2 or more files with matching names. when opening an existing file, it would always fail if the case doesn't match. this would be kinda the intersection between case-sensitive and case-insensitive filesystems. |
@programmerjake Yeah, that's a good point. We could do some amount of emulation if we wanted a greater degree of portability. However we'd also pay a fair amount of overhead -- if I understand what you're saying, we'd have to try 2^N files where N is the number of alphabetical characters, or scan all the filenames in the directory, and we'd also have to think about racing with other processes. @tschneidereit I agree that we want tests, though in this case we don't have tests for the old |
at a minimum, we could have open fail when the case doesn't match. that's relatively inexpensive to implement and would probably catch most non-portable cases. |
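The "fail when the case doesn't match" idea could look roughly like the check below. This is a minimal sketch, not WASI or wasi-common code: `CaseCheck` and `check_case` are hypothetical names, the directory listing is passed in as a plain slice, and `to_lowercase` stands in for whatever case folding a real file system applies. It also scans the listing once, so it does nothing about races with other processes.

```rust
/// Hypothetical result of a strict, case-aware lookup (illustration only).
#[derive(Debug, PartialEq)]
enum CaseCheck {
    /// The requested name exists with exactly this spelling.
    ExactMatch,
    /// Only a differently-cased entry exists; carries the on-disk spelling.
    CaseMismatch(String),
    /// No entry matches, even ignoring case.
    NotFound,
}

/// Assumption: `entries` is the list of names in the directory being opened.
fn check_case(entries: &[&str], requested: &str) -> CaseCheck {
    // An exact match always wins, even if other case variants exist.
    if entries.iter().any(|e| *e == requested) {
        return CaseCheck::ExactMatch;
    }
    // Otherwise look for an entry that differs only by case.
    let folded = requested.to_lowercase();
    match entries.iter().find(|e| e.to_lowercase() == folded) {
        Some(e) => CaseCheck::CaseMismatch((*e).to_owned()),
        None => CaseCheck::NotFound,
    }
}
```

An open call built on this would succeed only for `ExactMatch` and report an error for `CaseMismatch`, so a wrongly-cased name fails the same way on case-insensitive and case-sensitive hosts.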
@programmerjake do you mean we'd fail if we try to open |
@tschneidereit yes, open would fail in that case. I think it'll help portability since, in my (admittedly limited) experience, the most common case-sensitivity issue I've encountered is code that refers to other files using the wrong case when opening for reading. Common examples include resource paths and |
@tschneidereit one additional reason I think it will help is that it will catch issues during development on case-insensitive filesystems as opposed to when the code is tested on a case-sensitive filesystem (which it may not be, some people only ever test on Windows) |
created an issue for case-sensitivity: https://github.com/WebAssembly/WASI/issues/72 |
@programmerjake excellent, thank you! |
Note that the runtime behavior of checking for case-folded filenames is not Wine and Samba run into issues with deeply nested directories: |
This PR allows calls like:
wasmtime --mapdir=.:\\?\C:\Users\user\workspace cp.wasm a.txt b.txt
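One way to accept such a value is to split at the first colon only, so that any later colons stay in the host path. The sketch below illustrates that idea and is not the actual parser from this PR: `parse_mapdir` is a hypothetical name, and the guest-then-host order is read off the example command above.

```rust
/// Hypothetical helper (not the wasmtime implementation): split a
/// `--mapdir` value of the form `GUEST:HOST` at the first colon only,
/// so colons in the host part (e.g. drive letters) are preserved.
fn parse_mapdir(value: &str) -> Result<(&str, &str), String> {
    match value.split_once(':') {
        // Both sides must be non-empty to form a usable mapping.
        Some((guest, host)) if !guest.is_empty() && !host.is_empty() => {
            Ok((guest, host))
        }
        _ => Err(format!("expected GUEST:HOST, got `{}`", value)),
    }
}
```

For the value `.:\\?\C:\Users\user\workspace` this gives the guest path `.` and the host path `\\?\C:\Users\user\workspace`. The limitation is that a guest path which itself contains a colon would be split in the wrong place.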