Naming of the innerHTML setter (e.g., `setInnerHTML`.) #100

mozfreddyb · 2021-06-15T07:27:26Z

I don't like the naming of setInnerHTML. In the security industry, we said so much about avoid using innerHTML, that library like React use the name called dangerouslySetInnerHTML. It's super hard to tell people now that "Hey! setInnerHTML is safe!". At the very least, it's confusing for developers. setHTML (or setSanitizedHTML) sounds better.

Originally posted by @shhnjk in #42 (comment)

The text was updated successfully, but these errors were encountered:

koto · 2021-06-15T08:12:19Z

If we need a new function to set innerHTML, it should support extensions in the future not to be tied to the sanitizer only. I am not sure if the setInnerHTML(value, {optionBag}) signature is sufficient (or the most ergonomic), but only for one reason, and a speculative one.

I think it might be valuable for the platform in the future to distinguish whether innerHTML value to be set is a literal, or a dynamic value (and e.g. block the dynamic ones). This needs some plumbing in EcmaScript. We've explored that subject in the past (https://github.com/mikewest/tc39-proposal-literals), and it seems like the most viable solution is to rely on template literals and tag functions, i.e. it might only be possible to reject a value if a following syntax is used:

assertThisIsALiteral`i am not dynamic`

If the innerHTML setter signature uses the option bag, it makes it hard to add support for the above - it would have to use something like:

el.setInnerHTML(ignoreMe, {literal: assertThisIsALiteral`the actual value`})

One option to allow for this extension would be to have dedicated setters instead of one, e.g.:

el.setInnerHTML.sanitized(dirty, {sanitizer})
el.setInnerHTML.literal`i am constant`
el.setInnerHTML.unsafe("please block me, CSP!")

That way the argument list, and its options can be freely chosen by the appropriate setter variant. That way it's also probably easier to reason about the code statically.

// cc @mikewest

mozfreddyb · 2021-06-15T08:53:27Z

I like @shhnjk's point, that an innerHTML-alike name might confuse people who remember that "innerhtml is dangerous".
I'm also very sympathetic to the point you are making (which is also a concern within TypeScript) where you couldn't easily decide whether the optionBag contains a sanitizer or not.

@otherdaniel, and Mario (@cure53) talked about it and in lack of a better term arrived at content, which is already an existing property name for a template element.

So, how about Node.setContent(DOMString input, Sanitizer mySanitizer)?

koto · 2021-06-15T09:28:47Z

how about Node.setContent(DOMString input, Sanitizer mySanitizer)?

That assumes that the sanitizer will be the only API ever allowed to set the content. How about Node.content.setSanitized(DOMString input, Sanitizer mySanitizer) ?

mozfreddyb · 2021-06-15T09:43:28Z

what is a bare Node.content supposed to return here?
We thought about an adjective or a describer for the set part, but the sanitizer is non-optional, so it's maybe redundant?
We also thought about setSanitized but then maybe the question is "set sanitized WHAT"? (inner, outer, text, content, ...)

setContentSanitized or setSanitizedContent are also terribly long and boring..

koto · 2021-06-15T10:29:50Z

what is a bare Node.content supposed to return here?

{setSanitized: Function} ? That way you can feature test all the future extensions of the API.

We thought about an adjective or a describer for the set part, but the sanitizer is non-optional, so it's maybe redundant?

It's only required for the setter that's plumbed to the sanitizer (and there are already requests for other setters e.g. whatwg/dom#912). I suspect there would be setters that have a combination of required and optional arguments - the most ergonomic way of supporting that, I think, is to make each of those variants a separate function with its own signature. We could group them under Node.content or not.

We also thought about setSanitized but then maybe the question is "set sanitized WHAT"? (inner, outer, text, content, ...)

setContentSanitized or setSanitizedContent are also terribly long and boring..

That would work, I think. I have a slight preference towards precise API surface rather than a generic endpoint that can do many things depending on arguments. Predictability is boring, and yet we want the API to intuitively describe what it's doing, no?

shhnjk · 2021-06-16T00:01:23Z

Isn't setSanitizedHTML better? The innerHTML is bad naming, but HTML isn't.

mozfreddyb · 2021-06-17T12:45:28Z

Thinking about this, I do like setSanitizedHTML() better than setContent().
However, I understand that having a .content might help exposing other use-cases the DOM is having (or is really going to have soon), that's something we should try to answer. Paging @domenic and @annevk again.

annevk · 2021-06-17T13:10:15Z

setHTML() seems nice in that it's both clear it will invoke the HTML parser and it beats innerHTML assignment by one character (being generous toward innerHTML here by omitting spaces):

e.setHTML("test");
e.innerHTML="test";

I'm not sure we need to make it clear it's sanitized. All new parser APIs should be safe-by-default from now on I think.

I'm also not sure I understand @koto's concern. It seems that requiring literals would either be a higher-level switch, a change to the API, or perhaps something you enforce through an option? Any of these would be compatible as far as I can tell. But I'd love to learn more as it does seem like something that would be good to get right from the start.

mozfreddyb · 2021-06-17T13:23:45Z

the function needs to accept a sanitizer instance @annevk. The sanitizer is not bound to some per-global or per-document state.
so we're either at el.setSanitizedHTML(input, sanitizerObj) or at el.setHTML(input, sanitizerObj).

annevk · 2021-06-17T13:32:49Z

Of course, but that's an optional argument right? I'm not sure how that relates to my comment.

otherdaniel · 2021-06-17T13:48:16Z

Of course, but that's an optional argument right?

I don't think that can be an optional argument. The point of the API is to give developers an easy-to-use setter without any built-in footguns. A function that gives hard security guarantees if the 2nd parameter is present and guaranteed insecurity if it's missing doesn't really meet that goal. The safe setter - whatever it ends up being called - should either be safe, or fail.

annevk · 2021-06-17T14:01:45Z

Why can the sanitizer not have safe defaults? What exactly do you need to configure there for it to work?

koto · 2021-06-17T14:05:53Z

It seems that requiring literals would either be a higher-level switch, a change to the API, or perhaps something you enforce through an option?

As far as I can tell, the most viable option for the platform APIs to distinguish literals from other values, is if the function accepting the value is a tag template function - which ties our hands w.r.t. to the function signature (it's Function(templateStringsArray, ...argumentsToInterpolate)). We could, for example, do:

 el.setHTML(input, sanitizer)
 // and, in the future
 el.setHTML.literal`this_is_not_injected_for_sure`

I think what's important is that the new setter (and all variants of it) is strictly more secure than innerHTML. Which is why e.g. the sanitizer cannot be optional in a variant that sanitizes (at least in the version put forward, there is no default sanitizer that is always available and called implicitly). And in general I would prefer if there's no assumption that the setHTML setter is only paired with the sanitizer. Sanitizing HTML is not the only way to make it safe (w.r.t. XSS or otherwise).

otherdaniel · 2021-06-17T14:43:05Z

Why can the sanitizer not have safe defaults? What exactly do you need to configure there for it to work?

The sanitizer has safe defaults. Was your intent that .SetHTML("xxx") would imply that a default-constructed Santizer instance is applied? That would be viable, as it meets the safe-or-fail goal.

I personally would prefer an explicit instance over an implied one, as I find that less surprising. On the plus side, Element.SetHTML with built-in Sanitizer might increase usage of the Sanitizer.

annevk · 2021-06-17T14:51:30Z

Yeah, I was expecting the second argument to merely be some options that have safe defaults, but can be overridden if you need more control over sanitization. I don't think we want everyone that has to parse a string to have to explicitly consider and configure a sanitizer.

@koto given that we cannot distinguish that type from a string it's still unclear to me what that means. And even once we can distinguish it, what it would mean then.

koto · 2021-06-17T15:03:08Z

@koto given that we cannot distinguish that type from a string it's still unclear to me what that means.

We can, from within the tag template function. The first argument is a frozen array of strings, and the whole array is in the Realm's [[TemplateMap]] slot. https://github.com/tc39/proposal-array-is-template-object proposes to expose this check to regular JS, but even without this, one can distinguish template literals from strings using the above machinery. Again, there's a whole history of how we got there, starting from https://github.com/mikewest/tc39-proposal-literals.

And even once we can distinguish it, what it would mean then.

I think it would be huge for combating injections, if platform was able to distinguish literals from dynamic values, as the first ones are by definition non-user-controllable without a very specific kind of server-side injection.

shhnjk · 2021-06-17T18:55:14Z

Regarding the naming, setHTML LGTM 😊

given that we cannot distinguish that type from a string it's still unclear to me what that means. And even once we can distinguish it, what it would mean then.

To give more high-level description, Web Platform currently doesn't have a way to say developers are assigning static string or dynamic string.
That is,

elem.innerHTML = "<b>hello</b>";

AND

elem.innerHTML = "<b>" + location.hash.slice(1) + "</b>";

Are both string assignment (even though one is static and one is dynamic). What @koto suggested is to give a primitive to developers, where the platform can understand the notion of static string assignment. This primitive is important in the platform, because if we know something is static, it is safe to append without sanitization.

BTW, elem.setHTML.withLiterals`this_is_not_injected_for_sure` sounds better (remember, this is the thread to decide namings 😋)

Sora2455 · 2021-06-18T08:46:28Z

I think it would be huge for combating injections, if platform was able to distinguish literals from dynamic values, as the first ones are by definition non-user-controllable without a very specific kind of server-side injection.

Very specific, but very common. (Though if anyone has actual stats for Reflected XSS (I couldn't find any with a cursory search) that would be great).

koto · 2021-06-18T11:30:23Z

Very specific, but very common.

I realize we're diverging from the thread, but I highly doubt that reflections inside template strings inside a script are common. That would require something like this on the server side:

<script>
foo = `bar<?php echo $_GET['xss'] ?>`
</script>

Additionally, this would only be risky if the injection was not able to escape from the template, but was able to inject HTML, which seems quire rare. Regardless, the best the platform can do to combat this, and other server-side XSSes is CSP script-src.

I don't want this issue to become about the literals, I only want to highlight that other setters might be worthy creating, so let's not just assume that sanitizer-bound setter is the only one to exist, so perhaps (.setHtml.sanitized(input, sanitizer) or el.setSanitizedHTML(input, sanitizer)) are better entry points for it.

mozfreddyb · 2021-06-23T08:04:47Z

I understand that you might want a function that can be easily used with a tagged template string, @koto, and I think it's really worth discussing but let's please move that to a different issue.

Are there any objections to .setHTML(input, sanitizer)?
Anne says we can consider the sanitizer optional and use the default sanitizer config, which would be very fine.
Note that the default config is promised to be XSS-safe within the boundaries of our Security Considerations, i.e., no protections against gadget XSS.

koto · 2021-06-23T08:14:54Z

Are there any objections to .setHTML(input, sanitizer)?

Yes. For the reasons outlined, I prefer .setSanitizedHTML or setHTML.sanitize(d) or .html.setSanitized. Creating a setter with a generic name setHTML and hooking it to the sanitizer (explicitly, or implicitly) limits how this could be extended in the future.

koto · 2021-06-23T08:21:31Z

Looking at the #102, Element.setSanitizedHTML(input, sanitizer?) and Element.setRawHTML would share the same naming scheme. What's the advantage of a generic setHTML?

annevk · 2021-06-23T08:38:23Z

Mainly that it's shorter and can then be more easily advertised (possibly together with raw) as the replacement for innerHTML. It would also mean that any future "set HTML" options would not feel out-of-place and would be a natural addition to that API.

koto · 2021-06-23T08:50:04Z

I see, those are valid advantages. The disadvantage I see is a slight surprise factor - it doesn't just "set" HTML, it's modifying it. Another variant might be HTMLElement.sanitize(input, sanitizer?)? It's the same length as innerHTML=, and its name describes exactly what would happen - with a downside of not being generic.

With the option for setRawHTML it's mostly cosmetic choice at this point - if we want to go with the one generic function and option bags, or a few specific per-purpose functions. I prefer the latter shape, but don't want to block on this.

otherdaniel · 2021-06-23T12:54:38Z

Some thoughts:

I can live with any of the names.
I like SetHTML with an optionally implied Sanitizer from an 'advertising' perspective, in that we can explain it as a better .innerHTML. I find it questionable from the principle-of-least-surprise perspective, since I wouldn't expect a generically named function to perform this fairly specific operation. I'd (slightly) prefer to be explicit, so that if a user wants sanitization, they should have to say so.
I'm skeptical on the extensibility. I wonder if there's a list of candidate extensions people have in mind?

mozfreddyb · 2021-06-23T13:02:47Z

As of now, the only other option that is currently discussed is a parsing option whether to parse a declarative shadow DOM (whatwg/dom#912).

If that parsing option is to become a thing, we can't implement our parsing invocation without taking options for the parser.
Which brings us to whateverTheMethodIsCalled(input, optionsBag) where the optionsBag contains a sanitizer property (e.g, {sanitizer: mySanitizer } (or the default sanitizer will be used).

That said, it would be a bit surprising if someone wants to setHTML according to the declarative shadow DOM use-case only and find their string to be sanitized? I mean, I like an always-secure setter, but I can also see this as very confusing :|

otherdaniel · 2021-06-23T13:30:21Z

As of now, the only other option that is currently discussed is a parsing option whether to parse a declarative shadow DOM (whatwg/dom#912).

Thanks, I hadn't seen that. If that indeed becomes a thing, then I presume any parsing Sanitizer function would want the option, in which case it'd be a natural fit for the Sanitizer config. (See also #45.)

That said, it would be a bit surprising if [...]

IMHO, that speaks for explicit-ness in some form. Either by having a required (not optional) Sanitizer argument, or by referencing it in the name. (That said, I acknowledge this is a matter of taste, and other people will see it differently. I can live with it either way.)

annevk · 2021-06-23T14:04:31Z

It could be reasonable to support { sanitizer: "unsafe-disable" } at some point if that is something that is needed (perhaps if it then required a literal it wouldn't even have to be named unsafe). What I'm mainly looking for in the innerHTML replacement:

Safe-by-default.
As short or shorter than innerHTML so it doesn't lose out on ergonomics. For something that you need to use often this is more important than it may seem.
Extensible through the second argument in some way. This could be overloading with a dictionary at some point or having a dictionary there from the start. But we shouldn't have the innerHTML replacement constrained in such a way that this is not possible in the future.

otherdaniel · 2021-06-23T14:29:44Z

Hmm. If I take Anne's requirements, and my (admittedly rather sketchy) understanding of WebIDL's overloading rules, then we should have two overloads

Element.SetHTML(DOMString)
Element.SetHTML(DOMString, Sanitizer)

(Merging those by making Sanitizer optional would make it difficult to define future overloads. Although I take it that's a tricky topic.)

The single param version would be whatever the platform considers to be sensible default behaviour; I assume with default sanitization with whatever current or future defaults we pick. Future overloads could offer any combination of additional arguments or different types, since a (non-nullable) interface is distinguishable from pretty much anything else.

shhnjk · 2021-06-24T04:56:34Z

If we are going to this route, I would prefer to have the default option as literal.

That is:

setHTML`<script>alert(1)</script>` (just works because static HTML assignment).
setHTML(“<script>alert(1)</script>”, {/*use sanitizer*/ true, /*optional sanitizer Object*/ sanitizer})

This way, the latter object can have room for extension in the future.

mozfreddyb · 2021-07-13T08:41:47Z

setHTML 🎉

koto · 2021-07-13T09:14:00Z

+1 to setHTML !

The tag functions have a specific signature (TemplateArray tpl, any... interpolates), that doesn't match the signature of the second variant (string payload, object optionBag). To merge those into a single setHTML, we'd need something like this maybe?

interface mixin InnerHTML {
  [CEReactions] attribute [LegacyNullToEmptyString] DOMString innerHTML;
  undefined setHTML(object|DOMString innerHTML, optional any... arguments);
};

setHTML algorithm would have to branch on the type of the first argument. It's a bit ugly as the typing is mostly lost, but the shape exposed to the webdevs - a single function - is very useful I think. Perhaps even using

undefined setHTML(object|DOMString innerHTML, optional object options);

would work, since we're not expecting any interpolation anyway.

mozfreddyb · 2021-07-13T09:25:04Z

This issue is merely about the name. The template thing was discussed in #102 ;)

mozfreddyb mentioned this issue Jun 15, 2021

Sanitizer API creating mock context-element can cause XSS when used in different context #42

Closed

mozfreddyb mentioned this issue Jun 23, 2021

proposed api: innerHTML setter that takes template strings #102

Closed

mozfreddyb closed this as completed Jul 13, 2021

shhnjk mentioned this issue Aug 4, 2021

Create [TrustedTypes].fromLiteral method w3c/trusted-types#347

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Naming of the innerHTML setter (e.g., `setInnerHTML`.) #100

Naming of the innerHTML setter (e.g., `setInnerHTML`.) #100

mozfreddyb commented Jun 15, 2021

koto commented Jun 15, 2021

mozfreddyb commented Jun 15, 2021

koto commented Jun 15, 2021

mozfreddyb commented Jun 15, 2021

koto commented Jun 15, 2021

shhnjk commented Jun 16, 2021

mozfreddyb commented Jun 17, 2021

annevk commented Jun 17, 2021

mozfreddyb commented Jun 17, 2021

annevk commented Jun 17, 2021

otherdaniel commented Jun 17, 2021

annevk commented Jun 17, 2021

koto commented Jun 17, 2021

otherdaniel commented Jun 17, 2021

annevk commented Jun 17, 2021

koto commented Jun 17, 2021

shhnjk commented Jun 17, 2021 •

edited

Loading

Sora2455 commented Jun 18, 2021

koto commented Jun 18, 2021 •

edited

Loading

mozfreddyb commented Jun 23, 2021

koto commented Jun 23, 2021

koto commented Jun 23, 2021

annevk commented Jun 23, 2021

koto commented Jun 23, 2021

otherdaniel commented Jun 23, 2021

mozfreddyb commented Jun 23, 2021

otherdaniel commented Jun 23, 2021

annevk commented Jun 23, 2021

otherdaniel commented Jun 23, 2021

shhnjk commented Jun 24, 2021 •

edited

Loading

mozfreddyb commented Jul 13, 2021

koto commented Jul 13, 2021

mozfreddyb commented Jul 13, 2021

Naming of the innerHTML setter (e.g., setInnerHTML.) #100

Naming of the innerHTML setter (e.g., setInnerHTML.) #100

Comments

mozfreddyb commented Jun 15, 2021

koto commented Jun 15, 2021

mozfreddyb commented Jun 15, 2021

koto commented Jun 15, 2021

mozfreddyb commented Jun 15, 2021

koto commented Jun 15, 2021

shhnjk commented Jun 16, 2021

mozfreddyb commented Jun 17, 2021

annevk commented Jun 17, 2021

mozfreddyb commented Jun 17, 2021

annevk commented Jun 17, 2021

otherdaniel commented Jun 17, 2021

annevk commented Jun 17, 2021

koto commented Jun 17, 2021

otherdaniel commented Jun 17, 2021

annevk commented Jun 17, 2021

koto commented Jun 17, 2021

shhnjk commented Jun 17, 2021 • edited Loading

Sora2455 commented Jun 18, 2021

koto commented Jun 18, 2021 • edited Loading

mozfreddyb commented Jun 23, 2021

koto commented Jun 23, 2021

koto commented Jun 23, 2021

annevk commented Jun 23, 2021

koto commented Jun 23, 2021

otherdaniel commented Jun 23, 2021

mozfreddyb commented Jun 23, 2021

otherdaniel commented Jun 23, 2021

annevk commented Jun 23, 2021

otherdaniel commented Jun 23, 2021

shhnjk commented Jun 24, 2021 • edited Loading

mozfreddyb commented Jul 13, 2021

koto commented Jul 13, 2021

mozfreddyb commented Jul 13, 2021

Naming of the innerHTML setter (e.g., `setInnerHTML`.) #100

Naming of the innerHTML setter (e.g., `setInnerHTML`.) #100

shhnjk commented Jun 17, 2021 •

edited

Loading

koto commented Jun 18, 2021 •

edited

Loading

shhnjk commented Jun 24, 2021 •

edited

Loading