Unsoundness in crate anstream #156

Closed
burakemir opened this issue Jan 11, 2024 · 0 comments · Fixed by #158

There is an unsoundness issue with multibyte sequences.

When I give the valid UTF-8 string "ö\x1b😀" as input to crates/anstream/src/adapter/strip.rs, the code segments the bytes incorrectly.
The UTF-8 bytes are \xc3\xb6, then \x1b, then \xf0\x9f\x98\x80.

When we loop over "non-printable bytes", \x1b\xf0 is treated as a non-printable sequence, which cuts the four-byte encoding of 😀 after its first byte.

I do not know whether it is a valid escape sequence or not, but it does not matter: we will produce a broken str from the incorrectly segmented bytes via str::from_utf8_unchecked, and that should never happen.
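To make the byte arithmetic concrete, here is a minimal sketch using only safe standard-library calls. The helper name `remainder_after_skip` is hypothetical and is not part of anstream; it only models "skip some bytes, then treat the rest as a str", with the checked `std::str::from_utf8` standing in for the unchecked call so the failure is observable instead of undefined behaviour.

```rust
/// Hypothetical helper (not anstream code): drop the first `skip` bytes of
/// `input` and try to view the remainder as a `str`.
/// Returns `None` if `skip` is out of range or the remainder is not valid UTF-8.
fn remainder_after_skip(input: &str, skip: usize) -> Option<&str> {
    // Work on raw bytes, the way a byte-oriented stripper would.
    let rest = input.as_bytes().get(skip..)?;
    // Checked conversion: this is the validation that
    // `str::from_utf8_unchecked` skips.
    std::str::from_utf8(rest).ok()
}
```

For the input "ö\x1b😀", skipping 3 bytes (\xc3\xb6\x1b) leaves the complete emoji, so the call returns `Some("😀")`. Skipping 4 bytes (also consuming \xf0) leaves \x9f\x98\x80, which is not valid UTF-8, so the call returns `None`. The unchecked conversion would have produced a broken str at that point.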

I have a tentative fix that makes the code sound, which I will reference after filing the issue (so I can reference the issue in the PR).

Full credit goes to @Ralith who reviewed this code and asked me to follow up.

epage added a commit to epage/anstyle that referenced this issue Jan 12, 2024
This fixes a soundness issue where we create invalid UTF-8 data and then
do a `str::from_utf8_unchecked` on release builds.

This ensures we ignore up-to the start of UTF-8 sequences and not
mid-way through.

Fixes rust-cli#156
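
One way to picture "ignore up to the start of UTF-8 sequences and not mid-way through" is to pull a proposed split point back to the nearest character boundary. The following is only an illustrative sketch under that assumption, not the code from the referenced commit; the function name `split_point_on_boundary` is hypothetical.

```rust
/// Hypothetical helper (not the actual fix): given a proposed end index for a
/// skipped run, move it backwards until it sits on a UTF-8 character boundary,
/// so the skipped run never ends in the middle of a multibyte sequence.
fn split_point_on_boundary(input: &str, proposed: usize) -> usize {
    // Anything at or past the end is clamped to the end of the string.
    if proposed >= input.len() {
        return input.len();
    }
    let mut index = proposed;
    // Index 0 is always a boundary, so this loop terminates.
    while !input.is_char_boundary(index) {
        index -= 1;
    }
    index
}
```

With "ö\x1b😀", a proposed split at byte 4 (just after \xf0) is moved back to byte 3, so the whole emoji stays in the remainder and the remainder is valid UTF-8.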