use cases for value_of/values_of #751
No worries, so far all the issues you've written have been superb and are pushing great things! Keep them coming! I think you're absolutely correct in that reasoning, and I actually agree the I can't remember exactly, but I believe my reasoning behind making invalid UTF-8 allowed by default was....whelp no I can't remember. I know I had a reason though! Maybe it was prior to adding the Changing that default is technically a breaking change, although like Rust I do believe some breaking changes should be allowed if they fix [logic] bugs, prevent security holes, or otherwise correct other unsound behavior. It should be simple to at least find github users who have used the Thoughts? Changing the docs is a quick fix too 😉
Ok, so on Github there are actually more than I thought using the
I think it's perfectly fine to wait for
One thing to consider though is this fact: if I'm writing a CLI tool on Unix and my tool accepts file paths as arguments, then I probably always want to allow invalid UTF-8 because I'm otherwise preventing my tool from working on all file paths. This is kind of a bigger picture question to answer that's probably out-of-scope for a small footgun, because it means putting UTF-8 handling front and center, which obviously sacrifices a bit of ergonomics. It's a hard line to walk and probably requiring them to explicitly use the
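To make the file-path point concrete, here is a minimal sketch using only the Rust standard library (Unix-only, since it relies on `std::os::unix`). The file name and the helper `path_from_arg` are made up for illustration and are not clap APIs. It shows that a byte string which is not valid UTF-8 is still a perfectly usable path, while its `&str` form does not exist:

```rust
use std::ffi::OsString;
use std::os::unix::ffi::{OsStrExt, OsStringExt};
use std::path::PathBuf;

/// Hypothetical helper: turn a raw command-line argument into a path
/// without ever requiring it to be valid UTF-8.
fn path_from_arg(arg: OsString) -> PathBuf {
    PathBuf::from(arg)
}

fn main() {
    // 0xE9 is "é" in Latin-1, but on its own it is not valid UTF-8.
    let raw = OsString::from_vec(b"report-\xE9.txt".to_vec());
    let path = path_from_arg(raw);

    // There is no `&str` view of this path...
    assert!(path.to_str().is_none());
    // ...but the original bytes survive unchanged, so the file can still be opened.
    assert_eq!(path.as_os_str().as_bytes(), &b"report-\xE9.txt"[..]);

    // `display()` is lossy and only suitable for messages shown to the user.
    println!("would open: {}", path.display());
}
```

A tool that insisted on `String` arguments could not be pointed at such a file at all, which is the trade-off being weighed here.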
Exactly, I wouldn't make a change like that unless I could be reasonably sure it either breaks no code or I was able to contact the majority (if not all) users of said feature to get their approval first. And in this case I don't think that's reasonable. I have been thinking of a 3.x for a while now, so hopefully the wait won't be too long! :)
I've tried to fix panics in Cargo (w/clap 2.x), but Cargo frequently uses such a pattern (and its
Changing it to Without breaking changes, it could be improved by adding Perhaps it could be fixed at the source, by changing definition of arguments? Currently How about having
This properly implements the behavior I was expecting where:
- the list of paths will not contain duplicate entries
- it will append entries from STDIN if explicitly requested (I couldn't figure out the implicit logic to work in an intuitive way)
- it will default to just the current working directory if no other paths have been specified

I believe that paths are not guaranteed to be valid UTF-8, so the proper way to handle that is to use OsString values instead of String values. Note that I've also dropped the external bstr crate dependency, since I'm not 100% confident on how to convert between the two, or if it will even be beneficial/necessary. I can revisit that in the future once the real parsing functionality is implemented.

See:
- clap-rs/clap#751
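The three behaviors listed in that commit message can be sketched roughly as follows. This is not the commit's actual code: the function name `collect_paths`, its signature, and the use of `"."` to stand for the current working directory are all assumptions made for illustration. Only the standard library is used, and values stay as `OsString`/`PathBuf` so invalid UTF-8 is never a problem:

```rust
use std::collections::HashSet;
use std::ffi::OsString;
use std::path::PathBuf;

/// Hypothetical sketch: merge paths from the command line with paths read
/// from STDIN (pass an empty Vec when STDIN was not requested).
fn collect_paths(args: Vec<OsString>, from_stdin: Vec<OsString>) -> Vec<PathBuf> {
    let mut seen = HashSet::new();
    let mut paths = Vec::new();

    for raw in args.into_iter().chain(from_stdin) {
        let path = PathBuf::from(raw);
        // `insert` returns false for a duplicate, so each path is kept once,
        // in first-seen order.
        if seen.insert(path.clone()) {
            paths.push(path);
        }
    }

    // Assumption: "." stands in for the current working directory.
    if paths.is_empty() {
        paths.push(PathBuf::from("."));
    }
    paths
}

fn main() {
    let args = vec![OsString::from("src"), OsString::from("docs"), OsString::from("src")];
    let stdin = vec![OsString::from("docs"), OsString::from("tests")];
    for path in collect_paths(args, stdin) {
        println!("{}", path.display());
    }
}
```

Note that `PathBuf` equality is component-based, so `src` and `src/` count as the same entry in this sketch.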
It does seem like the vast majority of non-UTF-8 offenders in CLI args are filesystem paths, and otherwise there's little sense in being on guard against invalid UTF-8, correct? Maybe introducing some sort of path-specific getter (as described in #1723 for instance) is a way to mitigate the issue?
That might not be a bad idea, but it's not really feasible implementation-wise. That's just very far from how the current implementation works; we'd have to rewrite a good part of clap for it. Maybe in the next major version 🤷
Opinion: Using
Is it planned to fix this for clap 3? It looks like the current API as of clap 3 beta 2 still panics on invalid UTF-8: https://docs.rs/clap/3.0.0-beta.2/clap/struct.ArgMatches.html#method.value_of
I assume so since it's tagged with the 3.0 milestone. At minimum, I'd like to see hidden callers use a non-panicking API (derive, Options
I lean towards
There's a little bit of holistic clean up that needs to be done regarding these API methods and how they are used in derive. This is related to a few other issues too. It definitely needs to be done before 3.0
@pksunkara how does my proposal sound for a path to move forward?
Based on another thread (#2627), By making
Yup. So, we basically need to change the
@pksunkara just want to make sure you read what my comment was associated with; it's not just inverting it but also moving it from an app setting to an arg setting
Missed that, but still sounds good. This change might be a bit big for @hosseind88 though in #2623
Before, validating UTF-8 was all-or-nothing and would cause a `panic` if someone used the wrong API with non-UTF-8 input. Now, all arguments are validated for UTF-8, unless opted-out. This ensures a non-panicking path forward, at the cost that people using the builder API who previously called `value_of_os` now need to set this flag. Fixes clap-rs#751
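The idea in that commit message, validating each argument up front unless it has opted out, can be modeled in a few lines. The sketch below is a toy model and not clap's implementation: `ArgSpec`, its `allow_invalid_utf8` field, and `validate` are hypothetical names, and the example is Unix-only because it constructs an invalid value via `std::os::unix`:

```rust
use std::ffi::OsString;
use std::os::unix::ffi::OsStringExt;

/// Hypothetical stand-in for a per-argument setting; not clap's actual type.
struct ArgSpec {
    name: &'static str,
    allow_invalid_utf8: bool,
}

/// Check one raw value at parse time. If this succeeds for an argument that
/// has not opted out, a later `&str` getter for it cannot hit invalid UTF-8.
fn validate(spec: &ArgSpec, raw: &OsString) -> Result<(), String> {
    if spec.allow_invalid_utf8 || raw.to_str().is_some() {
        Ok(())
    } else {
        Err(format!("invalid UTF-8 in value for '{}'", spec.name))
    }
}

fn main() {
    let name = ArgSpec { name: "name", allow_invalid_utf8: false };
    let file = ArgSpec { name: "file", allow_invalid_utf8: true };
    let bad = OsString::from_vec(vec![b'a', 0xFF]);

    // The default argument rejects the value with an error instead of a panic.
    println!("{:?}", validate(&name, &bad));
    // The opted-out argument accepts it; its value must then be read as an OsStr.
    println!("{:?}", validate(&file, &bad));
}
```

The design choice is that the error surfaces once, during parsing, as a message for the end user, rather than later as a panic inside a getter.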
Sorry to keep opening issues, and this may indeed be more of a question because I've missed something, but what are the intended use cases of `value_of`/`values_of`? In particular, I notice that this is part of its contract:

(Pedant: I think this should say "invalid UTF-8." There is no such thing as a "UTF-8 codepoint.")

Since the value is typed by the user, this means that the user can cause your program to panic if you use `value_of`/`values_of` when the user gives a value that isn't valid UTF-8. In my view, if an end user sees a panic, then it ought to be considered a bug. If my interpretation is right, then that means I should never use `value_of`/`values_of`, right?

I do see one possible use case: if a caller disables the `AllowInvalidUtf8` setting, then I believe clap will return a nice error if it detects invalid UTF-8, and therefore, `value_of`/`values_of` will never panic. However, according to the docs, `AllowInvalidUtf8` is enabled by default, so this seems like a potential footgun.

(This is a good instance of the pot calling the kettle black because `docopt.rs`, for example, can't handle invalid UTF-8 at all! :-) So clap is already leagues better in this regard.)
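The underlying hazard does not depend on clap: on Unix an argument is an arbitrary byte string, so any conversion to `&str` can fail, and the only question is whether that failure becomes a panic or an error. A minimal sketch with the standard library only (Unix-only; `arg_as_str` is a hypothetical helper, not a clap API):

```rust
use std::ffi::{OsStr, OsString};
use std::os::unix::ffi::OsStringExt;

/// Hypothetical helper: a non-panicking conversion that reports invalid
/// UTF-8 as an error the caller can show to the end user.
fn arg_as_str(arg: &OsStr) -> Result<&str, String> {
    arg.to_str()
        .ok_or_else(|| format!("argument {:?} is not valid UTF-8", arg))
}

fn main() {
    let good = OsString::from("foo");
    // 0xFF can never appear in well-formed UTF-8.
    let bad = OsString::from_vec(vec![b'f', b'o', 0xFF]);

    for arg in [&good, &bad] {
        match arg_as_str(arg) {
            Ok(s) => println!("ok: {}", s),
            // Calling `.unwrap()` here instead would reproduce the panic
            // that the issue is concerned about.
            Err(e) => eprintln!("error: {}", e),
        }
    }

    // When only a printable form is needed, the lossy conversion substitutes
    // U+FFFD for the invalid byte instead of failing.
    println!("lossy: {}", bad.to_string_lossy());
}
```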