Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: DOM Clobbering CVE #22688

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

Sec-ant
Copy link

@Sec-ant Sec-ant commented Oct 7, 2024

The shell.js script has the similar DOM Clobbering vulnerability as described in CVE-2024-47068. This PR fixes it following the patches applied in the rollup repo.

The test_emsize.js is genereted and affected by the shell.js patch, so the test fixtures are regenerated and rebaselined using the following command as suggested in test_other.py:

emcc test/hello_world.c -Oz --closure 1 -o test/other/test_emsize.js

@Sec-ant
Copy link
Author

Sec-ant commented Oct 7, 2024

The failed test seems due to an unresponsiveness of the test harness browser. The test should pass if rerun.

@Sec-ant Sec-ant changed the title fix: DOM Clobbering CVE, similar to CVE-2024-47068 fix: DOM Clobbering CVE Oct 7, 2024
@juj
Copy link
Collaborator

juj commented Oct 7, 2024

Can you describe the vulnerability in detail in the context of Emscripten? I tried to read the linked CVE, but I was not quite able to grok it. :/

@juj
Copy link
Collaborator

juj commented Oct 7, 2024

My impression from the code change is that Emscripten should not depend on document.currentScript.src containing safe information, as an attacker could have assigned

document.currentScript.src = 'http://rogueserver.com/that/steals/your/data/';

Is that what is happening?

Although if that is the case, couldn't an attacker do

document.currentScript = {
  src: 'http://rogueserver.com/that/steals/your/data/',
  tagName: 'SCRIPT'
};

that would also pass the changed code in this PR?

@Sec-ant
Copy link
Author

Sec-ant commented Oct 8, 2024

My impression from the code change is that Emscripten should not depend on document.currentScript.src containing safe information, as an attacker could have assigned

document.currentScript.src = 'http://rogueserver.com/that/steals/your/data/';

Is that what is happening?

Not exactly. The issue here isn't that attackers can directly change document.currentScript. Instead, they can trick the script into using the wrong data. For example, an attacker could add an HTML element, like an <img> tag, with a name attribute set to "currentScript".

<img name="currentScript" src="http://rogueserver.com/malicious.js">

When the shell.js script runs, it could mistakenly use the img element's src (which might point to a malicious server) instead of the real script's source.

The DOM element insertion (like the <img> above) can usually be done through a website's feature that allows users to embed certain script-less HTML (e.g., markdown renderers, web email clients, forums) or via an HTML injection vulnerability in third-party JavaScript loaded on the page.

@juj
Copy link
Collaborator

juj commented Oct 8, 2024

Ok, today I learned that if I add an element to the DOM with a name='foo' attribute, then that element becomes available as document.foo in JavaScript.

And this happens regardless of what the string foo actually is: that mechanism happily overwrites any and every variable name on document. Testing this out, Emscripten currently accesses all the following members on document:

<html><body>
<img name='currentScript' src='http://google.com/foo.js'>
<img name='elements' src='http://this.destroys.elements/'>
<img name='getElementById' src='http://this.destroys.getElementById.function/'>
<img name='createElement' src='http://this.destroys.createElement.function/'>
<img name='createTextNode' src='http://this.destroys.createTextNode.function/'>
<img name='head' src='http://this.destroys.document.head.element/'>
<img name='body' src='http://this.destroys.document.body.element/'>
<img name='documentElement' src='http://this.destroys.document.documentElement/'>
<img name='querySelector' src='http://this.destroys.document.querySelector/'>
<img name='URL' src='http://this.destroys.document.URL/'>
<img name='addEventListener' src='http://this.destroys.document.addEventListener/'>
<img name='removeEventListener' src='http://this.destroys.document.removeEventListener/'>
<img name='callEventListeners' src='http://this.destroys.document.callEventListeners/'>
<img name='pointerLockElement' src='http://this.destroys.document.pointerLockElement/'>
<img name='fullscreenElement' src='http://this.destroys.document.fullscreenElement/'>
<img name='title' src='http://this.destroys.document.title/'>
<img name='styleSheets' src='http://this.destroys.document.styleSheets/'>
<img name='createEvent' src='http://this.destroys.document.createEvent/'>
<img name='fullscreenEnabled' src='http://this.destroys.document.fullscreenEnabled/'>
<img name='exitPointerLock' src='http://this.destroys.document.exitPointerLock/'>
<img name='visibilityState' src='http://this.destroys.document.visibilityState/'>
<img name='hidden' src='http://this.destroys.document.hidden/'>
<img name='hasFocus' src='http://this.destroys.document.hasFocus/'>
<img name='fireEvent' src='http://this.destroys.document.fireEvent/'>
<script>
console.log(document.currentScript.src);
console.dir(document.elements);           // All these console.dir()s give an "img" element above
console.dir(document.getElementById);
console.dir(document.createElement);
console.dir(document.createTextNode);
console.dir(document.head);
console.dir(document.body);
console.dir(document.documentElement);
console.dir(document.querySelector);
console.dir(document.URL);
console.dir(document.addEventListener);
console.dir(document.removeEventListener);
console.dir(document.callEventListeners);
console.dir(document.pointerLockElement);
console.dir(document.fullscreenElement);
console.dir(document.title);
console.dir(document.styleSheets);
console.dir(document.createEvent);
console.dir(document.fullscreenEnabled);
console.dir(document.exitPointerLock);
console.dir(document.visibilityState);
console.dir(document.hidden);
console.dir(document.hasFocus);
console.dir(document.fireEvent);
</script></body></html>

that can all be replaced in that manner, if an attacker has the ability to inject unsanitized DOM elements to a web page.

This raises some thoughts:

  1. Surely the above behavior should be treated as a browser bug? I.e. the fact that
<html><body>
<img name='currentScript' src='http://google.com/foo.js'>
<script>
console.log(document.currentScript.src);
</script></body></html>

prints out "http://google.com/foo.js" and not whatever ad hoc web server I'm using to host that test page should be fixed in the browsers? And ditto for all the other attribute names with prereserved meaning on document? Do you know if there are existing bug reports to browsers about changing the above behavior?

I tried searching if there is anything but couldn't find a conversation, so ended up writing whatwg/dom#1315 . Maybe that will get closed as a duplicate if it has been discussed before.

  1. If a developer allows end users to insert DOM elements to a web page, they definitely will need to sanitize against all name attributes, and not just name='currentScript'? Given that there are so many document.xxx variables that can be clobbered. Sure, document.currentScript is apparently the only one that has the possibility of escalating DOM injection powers into XSS powers, but this is to say that any developer who is allowing end users to inject DOM elements should already be very careful about sanitizing the name='' attribute very widely.

  2. It does seem like a good idea to sanitize these accesses given that the current browser behavior around name='' is so nonchalant. We do want however to create an opt-out for developers who are building very small self-contained pages with Emscripten: there will be many users who can statically reason that nobody will ever be injecting DOM elements into their web page. So we should cater to both types, naturally defaulting to having the secure access enabled.

@Sec-ant
Copy link
Author

Sec-ant commented Oct 8, 2024


Do you know if there are existing bug reports to browsers about changing the above behavior?

IMO those behaviors are not bugs but HTML living standards:

According to wikipedia:

Over 2020 and 2021, proposals were made at various web standard groups detailing defenses against DOM clobbering by disallowing named access to DOM elements at the browser level. However, these proposals were dismissed since after investigating Chrome telemetry data, it was found that over 10.5% of the web relies on the features working as per their current behaviour.

Discussions and proposals:

A comprehensive review of DOM clobbering: https://doi.org/10.1109/SP46215.2023.10179403


If a developer allows end users to insert DOM elements to a web page, they definitely will need to sanitize against all name attributes, and not just name='currentScript'?

Yes, not only the name attribute but also id. DOMPurify is a such library to sanitize HTML, and preventing DOM clobbering attacks is one of its goals. There is also a Sanitizer API WICG proposal that aims to bring protections from DOM clobbering attacks to the web.


It does seem like a good idea to sanitize these accesses given that the current browser behavior around name='' is so nonchalant. We do want however to create an opt-out for developers who are building very small self-contained pages with Emscripten: there will be many users who can statically reason that nobody will ever be injecting DOM elements into their web page. So we should cater to both types, naturally defaulting to having the secure access enabled.

Considered that I'm not very familiar with the Emscripten codebase, and I'm also just a regular user that bumped into this discovery while checking CVE-2024-47068 alerts across my projects, I respect your decisions in every way :)

@juj
Copy link
Collaborator

juj commented Oct 8, 2024

Discussions and proposals:

Thanks - reading these links makes this feel like an example of perfect is the enemy of good. I opened a separate ticket to discuss narrower against this document.currentScript alone, which is the common security vector in all these reported CVEs. Let's mull on this a little bit.

CODE 1083 129
DATA 72 1214
JS 4799 0
Total 6056
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Its nice to see the progress we've made on overall code size here since this test was last rebaselined

@@ -175,7 +175,7 @@ var quit_ = (status, toThrow) => {
#if SHARED_MEMORY && !MODULARIZE
// In MODULARIZE mode _scriptName needs to be captured already at the very top of the page immediately when the page is parsed, so it is generated there
// before the page load. In non-MODULARIZE modes generate it here.
var _scriptName = (typeof document != 'undefined') ? document.currentScript?.src : undefined;
var _scriptName = (typeof document != 'undefined' && document.currentScript?.tagName.toUpperCase() === 'SCRIPT') ? document.currentScript.src : undefined;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Man.. this is kind of ugly, and seems like it will increase code size.

I'm not necessarily against landing this, but I wonder if we can further limit the cases where _scriptName is used.. maybe we can even eliminate it somehow.

Copy link
Collaborator

@juj juj Oct 9, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is pretty ugly :/ I wonder if the .toUpperCase() part at least is not needed? When testing, I am getting all uppercase letters in .tagName already. Is there a known browser where tagName not in upper case, or why is the .toUpperCase() there?

I think I would write a function into parseTools.mjs:

// Return document.currentScript.src in a fashion that guards against DOM clobbering attacks.
// See https://github.com/emscripten-core/emscripten/pull/22688
function secureGetCurrentScriptSrc() {
  /* For future test for browsers that have implemented https://github.com/whatwg/html/issues/10687
  if (MIN_CHROME_VERSION >= xyz && MIN_FIREFOX_VERSION >= abc && MIN_..._VERSION >= def) {
    return 'globalThis.document?.currentScript?.src';
  } else */ if (SUPPORTS_GLOBALTHIS) {
    return "(globalThis.document?.currentScript?.tagName == 'SCRIPT' && document.currentScript.src)";
  } else {
    return "(typeof document != 'undefined' && document.currentScript?.tagName == 'SCRIPT' && document.currentScript.src)";
  }
}

and then in all these places that read document.currentScript.src, do

var _scriptName = {{{ secureGetCurrentScriptSrc() }}};

This way we have clean JS code and a way to evolve out of this awkwardness for users in the future.

@ericcornelissen
Copy link

ericcornelissen commented Oct 28, 2024

Hey, just wanted to jump in here and share that there is a theoretical bypass for the document.currentScript?.tagName.toUpperCase() === 'SCRIPT' mitigation being suggested here (with or without .toUpperCase()). 1

In particular, when document.currentScript is clobbered to an <iframe name="currentScript"> it will reference the window object of the child page. When the srcdoc property can be used it can be used directly for a bypass. Otherwise, a child page on the same origin as the parent is necessary (at least in modern browsers), which makes the attack significantly harder. Not to mention that <iframe>s (or <script> for the first case) are often removed when HTML is sanitized...

This gap can be fixed by additionally checking that document.currentScript is a DOM node (e.g. document.currentScript instanceof Node) because the window object isn't a DOM node, so e.g.:

- var _scriptName = (typeof document != 'undefined' && document.currentScript?.tagName === 'SCRIPT') ? document.currentScript.src : undefined
+ var _scriptName = (typeof document != 'undefined' && document.currentScript instanceof Node && document.currentScript?.tagName === 'SCRIPT') ? document.currentScript.src : undefined
Here's a PoC of the problem:

A standalone using srcdoc can be used:

<html><body>
<iframe name="currentScript" srcdoc="<script>window.tagName='SCRIPT';window.src='http://evil.com';</script>"></iframe>
<script>
setTimeout(() =>
  alert((typeof document != 'undefined' && document.currentScript?.tagName === 'SCRIPT') ? document.currentScript.src : undefined)
, 100);
</script>
</body></html>

Alternatively, put the following files in the same directory and run python3 -m http.server 8080 in that directory, then visit http://localhost:8080/parent.html.

<!-- parent.html -->
<html><body>
<iframe name="currentScript" src="./child.html"></iframe>
<script>
setTimeout(() => 
  alert((typeof document != 'undefined' && document.currentScript?.tagName === 'SCRIPT') ? document.currentScript.src : undefined)
, 100);
</script>
</body></html>
<!-- child.html -->
<html><head><script>
window.tagName = "SCRIPT";
window.src = "http://evil.com";
</script></head></html>

Footnotes

  1. I'm fairly new to the topic of DOM Clobbering and just came across this issue first, the problem might apply to the mitigations in other projects too.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants