Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

x/net: regression in x/net/html package: The <!--[if mso]--> email HTML comments are now being escaped #58246

Closed
VojtechVitek opened this issue Feb 2, 2023 · 10 comments
Labels
FrozenDueToAge NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one.
Milestone

Comments

@VojtechVitek
Copy link

VojtechVitek commented Feb 2, 2023

Hi,

We're rendering HTML emails in Go. We're using github.com/aymerick/douceur to do CSS inlining / HTML sanitization, which in turn uses golang.org/x/net/html package under the hood.

Recently, after upgrading x/net package, we have noticed that our HTML emails broke in certain email clients, ie. Windows / Microsoft Outlook, and rendered as a blank page.

We're using some Outlook-specific conditional comments in the HTML of our emails:

Example:

<!--[if gte mso 12]>
  <xml>
      <o:OfficeDocumentSettings>
      <o:AllowPNG/>
      <o:PixelsPerInch>96</o:PixelsPerInch>
      </o:OfficeDocumentSettings>
    </xml>
<![endif]-->

However, after this commit (#48237), the content of the HTML comments changed and is now being escaped:

<!--[if gte mso 12]&gt;
  &lt;xml&gt;
      &lt;o:OfficeDocumentSettings&gt;
      &lt;o:AllowPNG/&gt;
      &lt;o:PixelsPerInch&gt;96&lt;/o:PixelsPerInch&gt;
      &lt;/o:OfficeDocumentSettings&gt;
    &lt;/xml&gt;
&lt;![endif]-->

Emails are hard and it's quite normal to rely on these special <!--[if mso]> HTML comments to support Microsoft Outlook properly. See Targeting specific Outlook versions.

Questions:

  1. Is there any chance we could revert this change?
  2. Is there any chance we could provide a setting to turn off HTML comment escaping?
  3. Or should we consider this a breaking change "by design" and try to work around the issue (ie. by forking x/net/html)?

Thank you for your input.

Best,
Vojtech

@gopherbot gopherbot added this to the Unreleased milestone Feb 2, 2023
@mknyszek mknyszek added the NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. label Feb 2, 2023
@mknyszek mknyszek changed the title x/net: Regression in x/net/html package: The <!--[if mso]--> email HTML comments are now being escaped x/net: regression in x/net/html package: The <!--[if mso]--> email HTML comments are now being escaped Feb 2, 2023
@mknyszek
Copy link
Contributor

mknyszek commented Feb 2, 2023

CC @nigeltao

@mknyszek
Copy link
Contributor

mknyszek commented Feb 2, 2023

CC @namusyaka

@antichris
Copy link

antichris commented Feb 2, 2023

It seems logical that only the --> sequence should be escaped in the Data of a CommentToken. But escape() does not look at sequences, it just escapes every rune that's in escapedChars. Maybe a separate escapeCommentData() could be implemented?

@nigeltao
Copy link
Contributor

nigeltao commented Feb 4, 2023

I think it's unlikely that we'll (1) revert or (2) provide a setting. The behavior was changed because it was a bug that an "html.Parse and html.Render" round trip could change the semantics of the HTML. There was a potential "comment injection" security issue where a parse+render of malformed comment could inject non-comment content.

If you need to inject meaningful comments, and want control over escaping, perhaps an html.RawNode would help.

However, I will quote myself from another issue:

I'm not keen on using golang.org/x/net/html for server-side rendering. Use the standard library's html/template instead. Yes, the former has nodes and the latter is text based, but the latter is explicitly designed to render HTML securely.

@nigeltao
Copy link
Contributor

nigeltao commented Feb 4, 2023

Having said that...

Maybe a separate escapeCommentData() could be implemented?

This could possibly work. We'd have to think carefully about any security implications, though. For example, there are non-standard ways to create a CommentToken that stops at the first ">", not the first "-->".

@nigeltao
Copy link
Contributor

nigeltao commented Feb 4, 2023

Here's what the HTML spec says about parsing comments:

Given that second link... I don't know about you, but I'm going to need to think for a bit about when it's safe to not-escape > characters.

@antichris
Copy link

antichris commented Feb 4, 2023

It does look ominous at the first glance, but I'm pretty sure that the gist of it is that

  • when you're in the <!-- state (13.2.5.43), reads that turn it into <!--> or <!---> (transitioning from 13.2.5.43 and 13.2.5.44, respectively), end the comment with abrupt-closing-of-empty-comment parse error.
  • When in comment state (13.2.5.45), reading
    • <!--> simply ends the comment (after 13.2.5.49); the <! content is retained in comment token's data.
    • --!--> also ends the comment (after transitioning from 13.2.5.52 through 13.2.5.50 and 13.2.5.51); --! is retained in comment token's data.
    • --!> ends the comment with incorrectly-closed-comment parse error (13.2.5.52).
      This is probably another instance where > in the data of a comment token should be escaped.

Anything else between <!-- and --> gets appended to comment token's data.

That looks about it. Or am I missing some fine point?

@gopherbot
Copy link
Contributor

Change https://go.dev/cl/466776 mentions this issue: html: parse comments per HTML spec

gopherbot pushed a commit to golang/net that referenced this issue Feb 10, 2023
Updates golang/go#58246

Change-Id: Iaba5ed65f5d244fd47372ef0c08fc4cdb5ed90f9
Reviewed-on: https://go-review.googlesource.com/c/net/+/466776
TryBot-Result: Gopher Robot <[email protected]>
Auto-Submit: Nigel Tao <[email protected]>
Reviewed-by: Damien Neil <[email protected]>
Run-TryBot: Nigel Tao <[email protected]>
Reviewed-by: Nigel Tao (INACTIVE; USE @golang.org INSTEAD) <[email protected]>
@gopherbot
Copy link
Contributor

Change https://go.dev/cl/469055 mentions this issue: html: add "Microsoft Outlook comment" tests

@gopherbot
Copy link
Contributor

Change https://go.dev/cl/469056 mentions this issue: html: have Render escape comments less often

gopherbot pushed a commit to golang/net that referenced this issue Feb 23, 2023
This only adds new tests. A follow-up commit will change behavior.

Updates golang/go#58246

Change-Id: I6adf5941d5cfd3c28f7b9328882ac280109ee028
Reviewed-on: https://go-review.googlesource.com/c/net/+/469055
TryBot-Result: Gopher Robot <[email protected]>
Run-TryBot: Nigel Tao <[email protected]>
Reviewed-by: Kunpei Sakai <[email protected]>
Reviewed-by: Damien Neil <[email protected]>
Reviewed-by: Bryan Mills <[email protected]>
@golang golang locked and limited conversation to collaborators Feb 28, 2024
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
FrozenDueToAge NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one.
Projects
None yet
Development

No branches or pull requests

5 participants