Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Escaping multipart/form-data #7575

Open
michaelweiser opened this issue Feb 3, 2022 · 4 comments
Open

Escaping multipart/form-data #7575

michaelweiser opened this issue Feb 3, 2022 · 4 comments

Comments

@michaelweiser
Copy link

https://html.spec.whatwg.org/multipage/form-control-infrastructure.html#multipart-form-data

For field names and filenames for file fields, the result of the encoding in the previous bullet point must be escaped by replacing any 0x0A (LF) bytes with the byte sequence %0A, 0x0D (CR) with %0D and 0x22 (") with %22. The user agent must not perform any other escapes.

Doesn't that mean that no file (or field) name can ever legitimately contain character sequences that encode to byte sequences %0A, %0D or %22? (For most if not all encodings it'll be these character sequences exactly.)

The sender will apply no escaping but the receiving end will unescape them into bytes 0x0A, 0x0D and 0x22 respectively. From there they'll not decode back into the original character sequences, i.e. the filename will end up corrupted. For example, 100%0Applejuice.pdf would become 100<linefeed>pplejuice.pdf.

Should escaping of the escape character (0x25, %) as %25 be added here?

@annevk
Copy link
Member

annevk commented Feb 16, 2022

cc @andreubotella

@andreubotella
Copy link
Member

Chromium and WebKit have both had this behavior for many years, and they don't escape the percent sign. Given the relative rarity of such filenames, I suspect this hasn't been an issue in practice (although it seems like WebKit might consider this worth looking into).

@cdumez @mfreed7 @smaug----

@smaug----
Copy link

Sounds like an edge case, but something which probably should and could be fixed.

@mfreed7
Copy link
Contributor

mfreed7 commented Mar 21, 2022

This was reported to Chromium a few days ago, based on a bug filed against PHP.

I do agree that the current spec (and behavior) seems a bit odd here. It feels like we should add just one more explicit escape for filenames, from % --> %25. That would allow folks to distinguish between ".txt and %22.txt.

I suspect the compat story wouldn't be terrible in making this change, but there are no guarantees that it won't break something. I suppose we could add a use counter for finding % in a filename.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Development

No branches or pull requests

6 participants