Make the HTML parser's scripting flag not depend on "scripting was enabled" #9426

zcorpan · 2023-06-14T18:57:54Z

https://html.spec.whatwg.org/multipage/parsing.html#other-parsing-state-flags says

The scripting flag is set to "enabled" if scripting was enabled for the Document with which the parser is associated when the parser was created, and "disabled" otherwise.

This flag affects parsing of noscript, and "scripting was enabled" is disabled for an <iframe sandbox> document (without allow-scripts) as well as these:

scripts in XMLHttpRequest's responseXML documents, scripts in DOMParser-created documents, scripts in documents created by XSLTProcessor's transformToDocument feature, and scripts that are first inserted by a script into a Document that was created using the createDocument() API.

It's also disabled for <template>'s template contents owner document, at least as implemented in WebKit and Chromium (but Gecko parses with scripting flag enabled), see http://software.hixie.ch/utilities/js/live-dom-viewer/saved/11796

This has caused mutation XSS issues in the past, see https://www.acunetix.com/blog/web-security-zone/mutation-xss-in-google-search/

I think it's not great that noscript is parsed differently for different documents that can be accessed from script. noscript was intended to show content to users when they had scripting disabled in the browser. Introducing parsing differences in different places invites XSS bugs.

Would it be web compatible to make the HTML parser's scripting flag only depend on whether the user has disabled scripting?

Alternatively, drop the scripting flag completely and always parse as if scripting is enabled, but this might regress the UX for users who disable script.

@whatwg/html-parser @whatwg/security

The text was updated successfully, but these errors were encountered:

gsnedders · 2023-06-15T09:45:06Z

Alternatively, drop the scripting flag completely and always parse as if scripting is enabled, but this might regress the UX for users who disable script.

That would be a significant UX regression, given it would mean noscript becomes meaningless, no?

The even more radical suggestion would be to parse as if scripting is disabled (so noscript still gets parsed correctly!) but with script execution. I doubt we can manage to do this (given it'll cause a performance regression for every noscript in the common case, and it has difficult-to-judge web compatibility implications), but it would be pretty great if the only thing that scripting being enabled changed was script execution itself. (And of course you can still have differential parser attacks in that case, thanks to document.write, but from a sanitizer point-of-view that's "just" a matter of removing all scripts.)

As a mostly-absentee maintainer of a library containing a sanitizer (html5lib), the noscript parsing has often scared me (because it is practically inevitable it will lead to attacks like this), and it feels like it would be a decent security win to the overall ecosystem.

zcorpan · 2023-06-15T10:28:11Z

Right it would be mostly useless. Maybe the display could still be flipped depending on scripting disabled (by the user?), but showing markup as text seems broken.

Always parsing noscript as if scripting is disabled seems like it would break web content e.g. using a meta refresh in noscript, or show duplicate content (scripted and non-scripted widgets), etc.

mozfreddyb · 2023-06-15T12:50:48Z

It would be nice if <noscript> had different display based on a flag (instead of different parsing). Not sure this can be made compatible though..

zcorpan · 2023-06-15T13:37:11Z

@mozfreddyb it does that today as well as different parsing. https://html.spec.whatwg.org/multipage/rendering.html#hidden-elements

It's not clear to me if @media (scripting) is 1:1 with HTML's "scripting is enabled"

annevk · 2023-06-15T13:40:03Z

I could see making HTML fragment parsing better account for the state of the context node's node document. If we want to completely revisit the way this works this needs some more historical context. E.g., I'm pretty sure it was fairly recently that we did the work around sandboxing and that making that work the same as scripting being disabled was an intentional design choice.

Note that changing the way <noscript> parses has other side effects beyond rendering. E.g., it can impact whether a subresource gets fetched.

zcorpan added the topic: parser label Jun 14, 2023

zcorpan mentioned this issue Jul 6, 2023

Allow self-closing tags everywhere #9491

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Make the HTML parser's scripting flag not depend on "scripting was enabled" #9426

Make the HTML parser's scripting flag not depend on "scripting was enabled" #9426

zcorpan commented Jun 14, 2023 •

edited

Loading

gsnedders commented Jun 15, 2023

zcorpan commented Jun 15, 2023

mozfreddyb commented Jun 15, 2023

zcorpan commented Jun 15, 2023

annevk commented Jun 15, 2023 •

edited

Loading

Make the HTML parser's scripting flag not depend on "scripting was enabled" #9426

Make the HTML parser's scripting flag not depend on "scripting was enabled" #9426

Comments

zcorpan commented Jun 14, 2023 • edited Loading

gsnedders commented Jun 15, 2023

zcorpan commented Jun 15, 2023

mozfreddyb commented Jun 15, 2023

zcorpan commented Jun 15, 2023

annevk commented Jun 15, 2023 • edited Loading

zcorpan commented Jun 14, 2023 •

edited

Loading

annevk commented Jun 15, 2023 •

edited

Loading