I've noticed that including script tags together with closing body or html tags causes the parsed HTML document to come out malformed. Without script tags, or when the closing body and html tags are omitted, parsing works as expected. A slight modification of the HTML chunks in chunks_high_level.c illustrates the issue. It outputs the following:
Parse chunk: <!DOCT
Parse chunk: YPE htm
Parse chunk: l>
Parse chunk: <html><head>
Parse chunk: <script>console.log('Hello, world!');</script>
Parse chunk: <ti
Parse chunk: tle>HTML chun
Parse chunk: ks parsing</
Parse chunk: title>
Parse chunk: </head><bod
Parse chunk: y><div cla
Parse chunk: ss=
Parse chunk: "bestof
Parse chunk: class
Parse chunk: ">
Parse chunk: good for me
Parse chunk: </div>
Parse chunk: </body>
Parse chunk: </html>
<!DOCTYPE html><html><head><script>console.log('Hello, world!');</script><title>HTML chunks parsing</title></head><body><div class="bestofclass">good for me</div></body></html></script></head><body></body></html>
Process finished with exit code 0
You'll notice the extraneous </script></head><body></body></html> string at the end, as though the initial script tag was never closed.
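To make the extra tail explicit, here is a minimal standalone sketch that does not call myhtml at all. It joins the chunks exactly as they appear in the log above and checks whether the serialized output begins with that joined input; whatever follows is the part the parser added. The helper names (join_chunks, extraneous_tail) and the CHUNKS and OBSERVED constants are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Join n chunks into one newly allocated string; the caller frees it. */
char *join_chunks(const char *const chunks[], size_t n)
{
    size_t total = 0;
    for (size_t i = 0; i < n; i++)
        total += strlen(chunks[i]);

    char *joined = malloc(total + 1);
    if (joined == NULL)
        return NULL;

    char *p = joined;
    for (size_t i = 0; i < n; i++) {
        size_t len = strlen(chunks[i]);
        memcpy(p, chunks[i], len);
        p += len;
    }
    *p = '\0';
    return joined;
}

/* If `observed` starts with `input`, return a pointer to whatever follows
 * it inside `observed`; otherwise return NULL. */
const char *extraneous_tail(const char *input, const char *observed)
{
    assert(input != NULL && observed != NULL);
    size_t len = strlen(input);
    if (strncmp(observed, input, len) != 0)
        return NULL;
    return observed + len;
}

/* The chunks as listed in the "Parse chunk:" log. */
static const char *const CHUNKS[] = {
    "<!DOCT", "YPE htm", "l>",
    "<html><head>",
    "<script>console.log('Hello, world!');</script>",
    "<ti", "tle>HTML chun", "ks parsing</", "title>",
    "</head><bod", "y><div cla", "ss=", "\"bestof", "class", "\">",
    "good for me", "</div>", "</body>", "</html>"
};
static const size_t CHUNK_COUNT = sizeof CHUNKS / sizeof CHUNKS[0];

/* The serialized document as printed after parsing. */
static const char OBSERVED[] =
    "<!DOCTYPE html><html><head>"
    "<script>console.log('Hello, world!');</script>"
    "<title>HTML chunks parsing</title></head>"
    "<body><div class=\"bestofclass\">good for me</div></body></html>"
    "</script></head><body></body></html>";
```

Passing join_chunks(CHUNKS, CHUNK_COUNT) and OBSERVED to extraneous_tail returns a pointer to </script></head><body></body></html>. In other words, the input contains exactly one closing script tag and the output matches the input byte for byte up to the final </html>, so the extra closing tags are introduced during parsing or serialization rather than by the chunks themselves.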
It's also quite possible that I'm misunderstanding how myhtml_parse_chunk is supposed to be used - if so, clarification would be greatly appreciated.
Thanks in advance for your time and attention!
schrodingersket changed the title from "Script Tags Cause Incorrect Chunked Parsing With Trailing" to "Script Tags Cause Incorrect Chunked Parsing With Closing Body and HTML Tags" on Jul 26, 2018