Code blocks should be editable after saving the post and reloading the editor #15780

davilera · 2019-05-22T15:42:25Z

Description

How has this been tested?

Created the block described in #13218.

Types of changes

The PR solves the issue by making sure the content of a code block is pulled as HTML during the parsing process and escaping/unescaping tag delimiters (i.e. < and >) properly.

~~Unfortunately, this seems to be (yet another) breaking change, as it requires < and > to be escaped before saving.~~ Edit: 1ac7cfa prevents this PR from being a breaking change.

Checklist:

My code is tested.
My code follows the WordPress code style.
My code follows the accessibility standards.
My code has proper inline documentation.
I've included developer documentation if appropriate.

packages/block-library/src/code/index.js

aduth · 2019-05-22T15:46:45Z

> Unfortunately, this seems to be (yet another) breaking change, as it requires < and > to be escaped before saving.

Can you elaborate on what specifically is breaking? Will existing content be shown as "Invalid"?

davilera · 2019-05-22T16:35:52Z

> Can you elaborate on what specifically is breaking? Will existing content be shown as "Invalid"?

I think it would, yes. However, there's an option to make a block compatible with previous versions, right? If so, I can explore this...

aduth · 2019-05-22T17:07:22Z

It'd be worth testing to see if it is an issue. The block validation logic explicitly includes some leniency where differences are purely on differences in encoding (source), so it might not be a problem.

If it does turn out to be breaking, we can try to use the deprecated feature for transitioning (reference).

davilera · 2019-05-23T07:16:31Z

OK so... I think I got this (without introducing breaking changes!). The problem was, Gutenberg (or the browser/JavaScript itself, I don't know) escapes "less than" characters (<) when the block is saved in the database. That is, something like the example in #13218:

<a href="http://wordpress.org/">WordPress</a>  &lt;strong&gt;

becomes this (after my original PR #13996 is applied):

&lt;a href="http://wordpress.org/">WordPress&lt;/a>  &amp;lt;strong&amp;gt;

The problem is, when loading that block from the post content, characters like < aren't properly parsed and the problems described in #13218 arise. I tweaked the block so that:

1. Content is retrieved in HTML, so that the block can work with HTML entities.
2. The unescape function we use to set the value in the PlainText component also unescapes < and > characters, which might be there because of the save process.

This seems to solve issue #13218, and all tests pass. I would recommend some more testing, though.
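
As a rough illustration of the round trip described in this comment, here is a minimal sketch. The helper names (`escapeBlockContent`, `serialize`, `unescapeBlockContent`) are hypothetical and simplified stand-ins, not the actual Gutenberg source; in particular, the sketch assumes the block's own escape step only handles ampersands and that the serializer only adds "less than" escaping.

```javascript
// Simplified sketch; helper names are hypothetical, not the Gutenberg source.

// Assumed block-level escape applied before saving: ampersands only.
function escapeBlockContent( content ) {
	return content.replace( /&/g, '&amp;' );
}

// Stand-in for the serializer, assumed here to escape "less than" on save.
function serialize( content ) {
	return content.replace( /</g, '&lt;' );
}

// Unescape applied when the saved HTML is loaded back into the editor.
// Tag delimiters are decoded before ampersands, so an entity the user typed
// (stored as "&amp;lt;") is decoded only one level, back to "&lt;".
function unescapeBlockContent( content ) {
	return content
		.replace( /&lt;/g, '<' )
		.replace( /&gt;/g, '>' )
		.replace( /&amp;/g, '&' );
}

const typed = '<a href="http://wordpress.org/">WordPress</a>  &lt;strong&gt;';
const saved = serialize( escapeBlockContent( typed ) );
const reloaded = unescapeBlockContent( saved );

console.log( saved );
console.log( reloaded === typed ); // true
```

Under these assumptions, `saved` matches the stored form quoted above, and loading it back yields exactly what was typed.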

davilera · 2019-05-28T09:11:55Z

Were you able to review this, @aduth? Is there anything else I can do to help?

aduth · 2019-05-28T21:01:31Z

> 2. The unescape function we use to set the value in the PlainText component also unescapes < and > characters, which might be there because of the save process.

Yep, wondering about why we wouldn't need to have a corresponding escaper in the setter, it occurs to me that this is already effectively escaped when the block is serialized. So I think the change is sensible in restoring a balance. It's also better for it to occur during the save than in escape since if we were to escape the value as maintained in the block attributes, it would wrongly be displayed as escaped in the PlainText textarea.

I'd wondered about whether we really need to unescape here, vs. setting the innerHTML of the textarea directly, since this is a common technique for "safely" decoding entities. That said, I'm inclined to avoid dangerouslySetInnerHTML as much as possible for reasons which should be self-evident by its name 😄 My only concern then is if there are other encoded entities which we need to be considering.

This looks good to me. I'll want to give it another pass in the morning with a fresher mind.

aduth · 2019-05-29T14:47:23Z

> Yep, wondering about why we wouldn't need to have a corresponding escaper in the setter, it occurs to me that this is already effectively escaped when the block is serialized. So I think the change is sensible in restoring a balance.

In further confirming this, I expect it should be effectively the exact opposite of:

gutenberg/packages/escape-html/src/index.js

Lines 95 to 97 in 13e5851

    
export function escapeHTML( value ) {
	return escapeLessThan( escapeAmpersand( value ) );
}

There are two observations:

We need ampersand unescaping. This already existed prior to the pull request.
We technically don't need "greater than" unescaping. I expect it should cause no harm to keep it, however, since this unescaping is only meant to account for escaping we only expect to occur from the above serialization logic anyways.
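
To make the "exact opposite" point concrete, here is a small sketch of why the inverse has to undo the steps in reverse order, and why the extra "greater than" unescaping is harmless in this model. The escape functions below are simplified stand-ins that borrow the names from the snippet above; the real implementations may differ in detail, and the `unescape*` helpers are hypothetical.

```javascript
// Simplified stand-ins; the real escape-html functions may differ in detail.
const escapeAmpersand = ( value ) => value.replace( /&/g, '&amp;' );
const escapeLessThan = ( value ) => value.replace( /</g, '&lt;' );
const escapeHTML = ( value ) => escapeLessThan( escapeAmpersand( value ) );

// Hypothetical inverse helpers.
const unescapeLessThan = ( value ) => value.replace( /&lt;/g, '<' );
const unescapeGreaterThan = ( value ) => value.replace( /&gt;/g, '>' );
const unescapeAmpersand = ( value ) => value.replace( /&amp;/g, '&' );

// Inverse of escapeHTML: undo the outermost step (less-than) first.
const unescapeHTML = ( value ) =>
	unescapeAmpersand( unescapeLessThan( value ) );

// Decoding ampersands first turns a typed "&lt;" into "<" by mistake.
const unescapeWrongOrder = ( value ) =>
	unescapeLessThan( unescapeAmpersand( value ) );

const raw = 'if ( a < b && c > d ) // typed: &lt; &gt;';
const stored = escapeHTML( raw );

console.log( unescapeHTML( stored ) === raw ); // true
console.log( unescapeWrongOrder( stored ) === raw ); // false
// ">" is never escaped here, so "&gt;" never appears in the stored form
// and unescaping it changes nothing.
console.log( unescapeGreaterThan( stored ) === stored ); // true
```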

I'd also wondered about server-side escaping we might need to anticipate, and specifically in cases of unfiltered_html for non-privileged users. In investigating how this is applied, it seems most relevant for tag-stripping and not for character escaping (reference).

I expect we'd used source: 'text' previously as a naive way to achieve this reverse behavior of the serialization, but since innerText will unescape much more than what is escaped by the serializer, it would result in the unexpected breakage described in situations like those of #13218.

aduth · 2019-05-29T16:22:05Z

@davilera I have some small concern about trying to keep this in sync with the specific implementation of the serializer, and I'm curious your thoughts on an alternative I have in mind:

The idea would be to change the source to 'html' (as proposed) and use dangerouslySetInnerHTML in the save implementation to bypass the serializer escaping. To avoid having characters shown as escaped in the textarea, we would need to either use dangerouslySetInnerHTML in the edit implementation (which is a bit less "safe" in my mind than in save), or another way to decode the entities before assigning the value. There is a @wordpress/html-entities package for this purpose (its implementation is admittedly not far from the alternative). The attribute value would need to be maintained in its escaped form, so as to align with the new save implementation. We could use the textarea's innerHTML for this, which as far as I can tell will take care of this.

var e = document.createElement( 'textarea' ); e.textContent = '&'; console.log( e.innerHTML );
// '&amp;'

It seems this alternative might pose some more risk in trying to assure safety of value escaping, so I may be comfortable as well with the current proposal, even if it needs to account for factors outside its own consideration (the serializer behavior).

davilera · 2019-05-30T07:28:17Z

> To avoid having characters shown as escaped in the textarea, we would need to either use dangerouslySetInnerHTML in the edit implementation (which is a bit less "safe" in my mind than in save), or another way to decode the entities before assigning the value.

To be honest, I don't have a strong opinion on any solution, as none is perfect and each poses its own problems. That said, I feel like using dangerouslySetInnerHTML is slightly worse (I'm not sure we'd be able to control what it does).

Regarding the current proposal, I also don't like using asymmetric escape and unescape functions in utils.js because, as @aduth said, “it needs to account for factors outside its own consideration (the serializer behavior)”. If we make them symmetric, the block should work (I think), but we'd be introducing a breaking change (as we'd be changing what's saved in the DB).

What do you think, @aduth? Should we introduce a breaking change? I think the risk the current proposal poses is minimal: in principle, our block would only receive a &lt; entity in content if somebody else (e.g. the serializer) inserted it. If it was the user who wrote the &lt; entity, this would be saved as &amp;lt; (because of our escape method), which means things should work as expected. But I'd rather have a self-contained solution... Complicated decisions here!

davilera · 2019-12-17T14:08:11Z

If I'm not mistaken, the issue this PR addressed has been fixed by @ellatrix in #17994, so I'm closing it.

Fixes bug that prevents code block from being re-edited

1e85f2b

davilera requested review from ajitbohra, gziolo, jorgefilipecosta, notnownikki, Soean, talldan and youknowriad as code owners May 22, 2019 15:42

aduth reviewed May 22, 2019

Unescapes < and > characters (so that “load” works), but doesn't …

1ac7cfa

…actually escape them (if anybody does, it's Gutenberg itself, implicitly, when saving)

swissspidy added the [Package] Block library /packages/block-library label May 28, 2019

aduth added the [Block] Code Affects the Code Block label May 29, 2019

aduth mentioned this pull request Nov 6, 2019

Escape Editable HTML #17994

Merged

davilera closed this Dec 17, 2019
