-
Notifications
You must be signed in to change notification settings - Fork 46.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[RFC] Trusted sources for React elements. #3583
Conversation
There's also an argument to be made for the default behavior being not trusting unknown sources from day 1 of this landing. That's technically a breaking change from v0.13, but actually matches the behavior of using instanceof with v0.12. It would remove all the warnings and just go straight for safety. I'm curious what you all think of that. |
Maybe warn in 0.13.2 with opt-in errors, then always error in 0.14? I don't want to keep dangerouslyTrustAllSources long-term. |
To be clear - there are no errors at present (maybe there will be in v0.14 for other reasons). Simply isValidElement returns false, so foreign elements are seen as arbitrary object literals. dangerouslyTrustAllSources has it's place beyond whether or not we do warnings. It provides a quick fix if you're doing hyjinx and this change screws you over. I agree that long term, it makes far less sense. |
This is a proposal for tightening React's XSS security model as referenced by facebook#3473. React Elements are simply JS objects of a particular shape. Because React applications are frequently assembling large trees of these elements and including user data, it's possible for a malicious user to insert data that appears like a React Element and therefore render arbitrary and potentially dangerous markup. Previous versions of React used instanceof, but that limited the ability to use multiple Reacts and tightly coupled JSX to React which we wanted to avoid. This proposal replaces `{_isReactElement: true}` with `{_source: "randomstring"}`. Using React.createElement() automatically adds this _source, but JSX could generate regular object bodies with `{_source: React.getSourceID()}`. In order to use multiple Reacts, a new API `React.trustSource(sourceID)` is added. You can imagine using a different React instance in a webworker and using `React.trustSource(sourceIDFromCallingWorkersReactGetSourceID)`. To preserve back-compat, the default behavior proposed is to dangerously trust all sources, but to warn! If this proposal lands, then I imagine a future version of React will make the default behavior to not trust unknown sources, removing warnings.
The more I think about it, the more I agree with you @spicyj that this makes sense as a 0.13.2 as a partial security fix (introduces the warnings and gives you new tools) and by 0.14 the default is explicit trusting and no more warnings. |
I like this idea, it is very much similar to how CSP's script-source nonce ("randomstring") and draws a clear boundary between the library generated data and untrusted user input. |
It's obviously a question of how far you want to take it, but this only prevents exploitation of React elements. It does not prevent other future sources, say if a users writes a component that takes HTML as children rather than a prop I get that explicitly separating values from elements is rather invasive (it's implicit), but it really only applies to inheriting/markup frontends and mostly codebases that do not follow the React recommendation of "isolating components" so that they can be reused in any context. Explicitly separating renderable elements (values of otherwise) from non-renderable primitives also has a number of additional benefits and most non-HTML frontends would be minimally affected. I get that it's invasive and a hard sell when HTML is our current frontend, but the problem is introduced by our current implicit behavior and can be definitively solved if we made explicit. |
@leebyron I think that it is important that compilers can generate ReactElements without coupling to the React instance. It removes the annoying need for JSX to have React in scope and these optimizations can work: #3228 To do this, the transpiler would need to have access to a source ID in its runtime instead of the React instance. <div className="foo" /> could turn into: { type: 'div', props: { className: 'foo' }, sourceID: $runtime.jsxSourceID } However, to do this I would have to wire up my transpiler and my React instance somehow. It might also require multiple different sourceIDs to be active at any time which would be slower than the ideal by-ref always-true comparison. Therefore I was thinking about maybe storing some source ID on the global that can be shared by transpilers and React, or if it is available, then React picks it up or something. What do you think? |
@syranide Your comment is a bit unclear in the context of this PR. I think you're referring to your comment in #3473 that you would prefer primitives to be wrapped. I'll comment there to keep this thread about the merits of the sourceID approach on its own. We can use #3473 to discuss alternative approaches. |
@sebmarkbage My bad, after more consideration I don't think my line of thinking is right approach for this immediate XSS issue, it's about going back and re-evaluating and that's an entirely separate discussion. This PR seems like the way to go 👍 (even if we could |
If the goal is to prevent JSON deserializing to react elements, you could use one of the two types not allowed in JSON. These are also cross realm and indifferent to the number of react versions, including 0. el = { type: 'div', props: { className: 'foo' }, unsafe: undefined };
invariant(hasOwn.call(el, 'unsafe') && el.unsafe === undefined, '...');
el = { type: 'div', props: { className: 'foo' }, safe: Math.max /* pick your favorite */ };
invariant(hasOwn.call(el, 'safe') && typeof el.safe === 'function', '...';)
JSON.stringify(el) === '{"type":"div","props":{"className":"foo"}}'; |
@sebmarkbage #3228 optimization can still work in this case - the inlined object body just needs to include a I'm not sure I understand why having |
@brigand - we do want JSON serialization. This would be much easier to solve otherwise. JSON serialization gives us the ability to move React components across web workers and between server and client. |
@leebyron JSX is a language feature so it should be able to execute without any other context. For example, array literals doesn't require the var Array = null;
var arr = []; // still works Likewise, I shouldn't need to do: import React from "react";
var element = <div />; It even looks like it is unused. Even if I did want to use React for something else I should be able to call it what I want without JSX dictating the variable name: import $R from "react";
import { Component, PropTypes } from "react"; Another neat feature is that React components doesn't even depend on React's base class necessarily. function Bar(props) {
return <div />;
}
class Foo {
render() {
return <Bar />;
}
} This should be perfectly valid. You can even load the modules without loading the dependency on the rest of React. You can run these in isolated unit tests without React and without any mocking. They're also agnostic to multiple instances of React. If you end up with both It also decouples it from React itself so that other VDOM libraries can use these elements and the same JSX transpiler. Now if a transpiler decides to require React automatically and bring it into scope to use |
@sebmarkbage This would be awesome 👍 |
@sebmarkbage 👍 should someone open a new issue to change transpiler to work like that? It would make so much more sense. |
We still have some work to make that a reality. In <0.14 a lot of things depend on "owner" which is global state inside React at the time the element is created. In 0.14 that is still true for refs.
|
Yes, however the I think we might beyond describing semantics of JSX as I imagine whatever we do here would be a transpiler detail and not a user-facing API change for those who use JSX. What I'm actually unsure about is how the order of operations will work. Putting this "source token" in a global somewhere like Perhaps we can propose something like |
@leebyron While JSON serialization is useful, I think @brigand's suggestion is interesting and potentially worth further thought/discussion. His solution doesn't prohibit moving components across a serialization boundary (@brigand's suggestion merely requires that an additional explicit action must be taken to say "Yes, I mean for this to be deserialized as a component", namely that the developer set the unserializable field on the component after deserialization). It also meets Sebastian's desired language-feature rule/syntax without resorting to smelly globals. It has the advantage that even if the attacker has full knowledge of the user's browser state (including the |
I'm not sure if I understand the full proposal from @brigand. I was under the impression that JSON serialization isn't just potentially useful - it's our goal. It unlocks the ability for React to operate across Realms, server, webworker, etc. In my opinion |
His suggestion, as I understand it, is to use a value which is impossible to describe using JSON (eg. a cyclic reference). Component (de)serialization is still possible, because the deserializer can explicitly declare their intent to receive a component by re-setting the value which was "lost" during the json serialization. So you don't loose the ability to pass components across the serialization boundary (to webworkers or whatever) but you make it impossible to "accidentally" deserialize a component because components require that special value. It means you have to know when you are expecting a The only downside is that deserialization of components requires "an extra step", but I don't see that as a bad thing. If you're really expecting to deserialize a component, you pass it into a function that rehydrates the component. |
My problem with that solution is that is limited to the current JSON limitations. It says nothing about other serialization mechanisms such as postMessage, etc. Most serialization boundaries don't support functions atm but there's nothing permanent about that. If you add a new feature to the serialization boundary, you change the security implications of the whole app. One way you can illustrate this is using the second ( One reason we care about serializability, in general, is because we don't want to traverse and copy an entire object hierarchy to make it React-compatible. One of the most powerful features of React is that you can put ReactElements in arbitrary data structures. Not just a single "children" array. var data = [
{
first: [
new Immutable.Vector([ element1, element2 ]),
new Immutable.Map().set('foo', element3)
],
second: element4
},
element5
];
return <Component data={data} />; // Component does something fancy with this deep data structure Even if your serializer knew how to properly transfer these data structures, it would also have to know how to properly deserialize a React element for this realm. A lot of serialization is, in practice, not pluggable because of layers of indirection. E.g. a data fetching library. The alternative is that you have a secondary step that recursively goes through this data structure, after data fetching, and creates local React elements in place of these elements. That requires marshalling of every single node in this potentially large tree. So the ideal scenario would go directly from deserialization into the React component.
@leebyron Well we could drop the underscore and make it possible for anyone to use this. Then it is not React specific.
Well, to properly decouple the transpiler and React, I think they would probably need to communicate through a third-party. Otherwise, you couple React to the transpiler instead. One possible third-party is a global like Standard Value DOMMy hope was that we would find a way to "standardize" (at some level, not necessarily ECMAScript) both JSX and the data structure together as a generic "Virtual DOM" (Value Type DOM). We would couple the If we implemented this on top of something like the Value Type Proposal. Then we could potentially do something like Custom value types can be passed between realms with their identity preserved by sharing a Symbol between realms. That way this at least works between non-serialized multiple realms (like iframes). Presumably, anyone that wants to share custom value types between realms would be in a similar situation where they need to find a way to share some kind of ID. This is primarily why I like the ID solution because it is essentially a polyfill of the shared symbol solution for realms. Of course, this "Value DOM" type is not really JSON serializable, unless it becomes defacto standard and therefore part of common JSON protocols or common revivers above JSON. Which sort of brings us back to the situation that we can be shared between realms but not serializable boundaries. |
The current thinking around Value Types in the spec is that they will use a single Symbol that is passed between realms. This is a coordination issue that is problematic for many systems like module systems and npm versioning. You also have to coordinate it with build systems like Babel. It is currently an unsolved problem. Therefore I would like to punt on this and make "dangerouslyTrustAllSources" the default for now, just like in 0.13. We can still have the opt-in solution IF you have the ability to coordinate this in your system. Another consequence of the Value Types system is that you're expected to only have ONE symbol to describe these types. I think that therefore we can get rid of the multi-source system which should simplify things a bit. We have also separated ReactDOMClient/ReactDOMServer/ReactIsomorphic so we already have a need to coordinate between these packages. How about we only expose |
(Can we call it a sourceToken? I think that sounds a little better.) |
Eventually went with #4832 instead. Pushes the burden of managing this to the Symbol system. |
This is a proposal for tightening React's XSS security model as referenced by #3473.
React Elements are simply JS objects of a particular shape. Because React applications are frequently assembling large trees of these elements and including user data, it's possible for a malicious user to insert data that appears like a React Element and therefore render arbitrary and potentially dangerous markup.
Previous versions of React used instanceof, but that limited the ability to use multiple Reacts and tightly coupled JSX to React which we wanted to avoid.
This proposal augments
{_isReactElement: true}
with{_source: "randomstring"}
. Using React.createElement() automatically adds this _source, but JSX could generate regular object bodies with{_source: React.getSourceID()}
.In order to use multiple Reacts, a new API
React.trustSource(sourceID)
is added. You can imagine using a different React instance in a webworker and usingReact.trustSource(sourceIDFromCallingWorkersReactGetSourceID)
.To preserve back-compat, the default behavior proposed is to dangerously trust all sources, but to warn! If this proposal lands, then I imagine a future version of React will make the default behavior to not trust unknown sources, removing warnings.