Add examples for non-roundtrippable HTML
Fixes #1280.
zcorpan committed Jun 7, 2016
1 parent 270b5f3 commit ebcfce6
Showing 1 changed file with 46 additions and 0 deletions.
46 changes: 46 additions & 0 deletions source
@@ -107633,6 +107633,52 @@ document.body.appendChild(text);

</div>

<p class="note">Tree structures that do not roundtrip a serialise and reparse step can also be
produced by the <span>HTML parser</span> itself, although such cases are non-conforming.</p>

<div class="example">

<p>For example, consider the following markup:</p>

<pre>&lt;form id="outer">&lt;div>&lt;/form>&lt;form id="inner">&lt;input></pre>

<p>This will be parsed into:</p>

<ul class="domTree"><li class="t1"><code>html</code><ul><li class="t1"><code>head</code></li><li class="t1"><code>body</code><ul><li class="t1"><code>form</code> <span class="t2" data-x=""><code class="attribute name" data-x="attr-id">id</code>="<code class="attribute value" data-x="">outer</code>"</span><ul><li class="t1"><code>div</code><ul><li class="t1"><code>form</code> <span class="t2" data-x=""><code class="attribute name" data-x="attr-id">id</code>="<code class="attribute value" data-x="">inner</code>"</span><ul><li class="t1"><code>input</code></li></ul></li></ul></li></ul></li></ul></li></ul></li></ul>

<p>The <code>input</code> element will be associated with the inner <code>form</code> element.
Now, if this tree structure is serialised and reparsed, the <code data-x="">&lt;form
id="inner"></code> start tag will be ignored, and so the <code>input</code> element will be
associated with the outer <code>form</code> element instead.</p>

<pre>&lt;html>&lt;head>&lt;/head>&lt;body>&lt;form id="outer">&lt;div><mark>&lt;form id="inner"></mark>&lt;input>&lt;/form>&lt;/div>&lt;/form>&lt;/body>&lt;/html></pre>

<ul class="domTree"><li class="t1"><code>html</code><ul><li class="t1"><code>head</code></li><li class="t1"><code>body</code><ul><li class="t1"><code>form</code> <span class="t2" data-x=""><code class="attribute name" data-x="attr-id">id</code>="<code class="attribute value" data-x="">outer</code>"</span><ul><li class="t1"><code>div</code><ul><li class="t1"><code>input</code></li></ul></li></ul></li></ul></li></ul></li></ul>

</div>

<div class="example">

<p>As another example, consider the following markup:</p>

<pre>&lt;a>&lt;table>&lt;a></pre>

<p>This will be parsed into:</p>

<ul class="domTree"><li class="t1"><code>html</code><ul><li class="t1"><code>head</code></li><li class="t1"><code>body</code><ul><li class="t1"><code>a</code><ul><li class="t1"><code>a</code></li><li class="t1"><code>table</code></li></ul></li></ul></li></ul></li></ul>

<p>That is, the <code>a</code> elements are nested, because the second <code>a</code> element is
<span data-x="foster parent">foster parented</span>. After a serialise-reparse roundtrip, the
<code>a</code> elements and the <code>table</code> element would all be siblings, because the
second <code data-x="">&lt;a></code> start tag implicitly closes the first <code>a</code>
element.</p>

<pre>&lt;html>&lt;head>&lt;/head>&lt;body>&lt;a><mark>&lt;a></mark>&lt;/a>&lt;table>&lt;/table>&lt;/a>&lt;/body>&lt;/html></pre>

<ul class="domTree"><li class="t1"><code>html</code><ul><li class="t1"><code>head</code></li><li class="t1"><code>body</code><ul><li class="t1"><code>a</code></li><li class="t1"><code>a</code></li><li class="t1"><code>table</code></li></ul></li></ul></li></ul>

</div>
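
Reviewer sketch (not part of the committed source): the serialise half of the first example above can be reproduced with a few lines of Python. The `Element` class, `serialise` function, and `VOID_ELEMENTS` set are hypothetical names made up for illustration, and the serialiser is a simplification that handles only elements and attributes. It shows that the nested-form tree serialises to the markup shown in the example; the reparse half needs a real HTML parser and is not attempted here.

```python
from dataclasses import dataclass, field

# Assumption: a reduced list of void elements, enough for this example.
# Void elements are serialised without children and without an end tag.
VOID_ELEMENTS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
                 "link", "meta", "param", "source", "track", "wbr"}


@dataclass
class Element:
    """Hypothetical minimal element node: a name, attributes, and child elements."""
    name: str
    attrs: dict = field(default_factory=dict)
    children: list = field(default_factory=list)


def escape_attribute(value: str) -> str:
    # Simplified attribute-mode escaping: "&", U+00A0, and the double quote.
    return (value.replace("&", "&amp;")
                 .replace("\u00a0", "&nbsp;")
                 .replace('"', "&quot;"))


def serialise(node: Element) -> str:
    # Emit the start tag, then the children in tree order, then the end tag.
    attrs = "".join(f' {k}="{escape_attribute(v)}"' for k, v in node.attrs.items())
    start = f"<{node.name}{attrs}>"
    if node.name in VOID_ELEMENTS:
        return start
    inner = "".join(serialise(child) for child in node.children)
    return f"{start}{inner}</{node.name}>"


# The tree the parser produces for the first example: form#inner nested in form#outer.
tree = Element("html", children=[
    Element("head"),
    Element("body", children=[
        Element("form", {"id": "outer"}, [
            Element("div", children=[
                Element("form", {"id": "inner"}, [Element("input")]),
            ]),
        ]),
    ]),
])

markup = serialise(tree)
print(markup)
```

The printed string contains the `<form id="inner">` start tag inside the still-open outer form, which is the tag a conforming parser ignores on reparse.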

<p><dfn id="escapingString">Escaping a string</dfn> (for the purposes of the algorithm above)
consists of running the following steps:</p>
