Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

std::os - Add join_paths, make setenv non-utf8 capable #15280

Closed
wants to merge 1 commit into from

Conversation

aturon
Copy link
Member

@aturon aturon commented Jun 30, 2014

This commit changes os in two ways:

  • It adds a join_paths function that is the converse to split_paths,
    easing manipulation of the PATH environment variable according to
    platform conventions.
  • It changes setenv to take a BytesContainer rather than a &str
    value, since environment variables may have non-utf8 values on some
    platforms. Since &str is a BytesContainer, this is not a
    breaking change.

Along the way, it also refactors the split_paths function so that
cfg switches are applied internally (and the function header is given
only once). This fixes a bug: the doc comment had an example for only
one platform.

@lilyball
Copy link
Contributor

You forgot to remove the #[cfg(unix)] on line 451 that modified fn split_paths().

@aturon
Copy link
Member Author

aturon commented Jun 30, 2014

@kballard Thanks, fixed.


/// Given a collection of paths, join them together into a single bytestring
/// according to the platform's conventions for the `PATH` environment
/// variable. Drops empty paths.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is an excessively long summary for the function. Can you write up a shorter one-line summary first? Remember, the first line is displayed as a one-line summary in the API documentation, and the rest of the docstring is only displayed when looking at the function itself.

Similarly, split_paths() has an excessively long summary and could benefit from a rewrite there as well.

@lilyball
Copy link
Contributor

I don't think dropping empty paths is the correct behavior, not for join_paths(), and not for the existing split_paths(). Notably, Bash treats an empty path segment in PATH as referring to the current directory (whether it's due to a leading or trailing colon, or two adjacent colons).

@lilyball
Copy link
Contributor

Pondering on this a bit more, split_paths() returns a vector of Path objects, which can't strictly speaking represent an empty path. But it can represent ., and that's actually how it interprets constructing a Path from "" (e.g. Path::new("") returns the same thing as Path::new(".")). Given that, get rid of the filter for empty paths, and it should just work.

@lilyball
Copy link
Contributor

Note that the above applies to unix. I don't know how Windows handles empty path segments.


for path in paths.iter().map(|p| p.container_as_bytes()) {
if path.len() == 0 { continue }
if joined.len() != 0 { joined.push(sep) }
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This would look better as

for (i, path) in paths.iter().map(|p| p.container_as_bytes()).filter(|b| !b.is_empty()).enumerate() {
    if i > 0 { joined.push(sep) }

(modulo the removal of empty path filtering; see my PR comments).

@aturon
Copy link
Member Author

aturon commented Jun 30, 2014

@kballard Thanks for the thorough review, as always! JFYI: Windows always includes the current working directory on the search path. But I still think you're right about dropping empty paths.

@lilyball
Copy link
Contributor

If the current working directory is always included, then keeping empty paths should be harmless. The user can filter them out if they want. This also makes it more general (for use in splitting other env vars, not just PATH).

@aturon
Copy link
Member Author

aturon commented Jun 30, 2014

@kballard good points and agreed. I'm taking the suggestion and getting rid of the filter. I'll ping you once tests pass here and I've pushed.

@aturon
Copy link
Member Author

aturon commented Jun 30, 2014

@kballard Finally got back to this. I've updated the PR, and I believe addressed all of your comments, but please let me know if I missed anything.

///
/// Returns an `Err` result if one of the input `Path`s contains an
/// invalid character for constructing the `PATH` variable (a double
/// quote on Windows or a colon on Unix).
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could this also mention what the &'static str returned contains? (I presume an error message?)

This commit changes `os` in three ways:

* It adds a `join_paths` function that is the converse to `split_paths`,
  easing manipulation of the `PATH` environment variable according to
  platform conventions.

* **Breaking change**: It changes `split_paths` to no longer drop empty paths, since they are
  meaningful to some shells (where they are synonymous with the current
  working directory).

* It changes `setenv` to take a `BytesContainer` rather than a `&str`
  value, since environment variables may have non-utf8 values on some
  platforms. Since `&str` is a `BytesContainer`, this is *not* a
  breaking change.

Along the way, it also refactors the `split_paths` function so that
`cfg` switches are applied internally (and the function header is given
only once). This fixes a bug: the doc comment had an example for only
one platform.

[breaking-change]
@aturon
Copy link
Member Author

aturon commented Jul 1, 2014

@kballard OK, I think I've fixed all the embarrassing bugs you found. See if you can find some more?

let n: Vec<u16> = n.utf16_units().collect();
let n = n.append_one(0);
let v: Vec<u16> = v.utf16_units().collect();
let v: Vec<u16> = str::from_utf8(v).unwrap().utf16_units().collect();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

str::from_utf8(v).unwrap() introduces failure if the input is not valid utf-8.

Perhaps setenv() needs a bool return value, with documentation stating that it may return false on Windows (and only Windows).

in_progress.push(*b);
}
parsed.push(Path::new(in_progress));
parsed
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Regarding the above notes on how bash splits paths, I am curious how Windows handles the same circumstances (empty paths, paths containing just semicolons, etc).

bors added a commit that referenced this pull request Jul 3, 2014
Closes #15276 (Guide: if)
Closes #15280 (std::os - Add join_paths, make setenv non-utf8 capable)
Closes #15314 (Guide: functions)
Closes #15327 (Simplify PatIdent to contain an Ident rather than a Path)
Closes #15340 (Guide: add mutable binding section)
Closes #15342 (Fix ICE with nested macro_rules!-style macros)
Closes #15350 (Remove duplicated slash in install script path)
Closes #15351 (correct a few spelling mistakes in the tutorial)
Closes #15352 (librustc: Have the kind checker check sub-bounds in trait casts.)
Closes #15359 (Fix spelling errors.)
Closes #15361 (Rename set_broadast() to set_broadcast().)
Closes #15366 (Simplify creating a parser from a token tree)
Closes #15367 (Add examples for StrVector methods)
Closes #15372 (Vec::grow should use reserve_additional, Vec::reserve should check against capacity)
Closes #15373 (Fix minor issues in the documentation of libtime.)
@bors bors closed this in #15377 Jul 3, 2014
bors added a commit to rust-lang-ci/rust that referenced this pull request Jul 17, 2023
minor: Bump rust-version to 1.70 and use the workspace one in xtask

CC rust-lang/rust-analyzer#15279
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants