Hi,
We are seeing a difference in parser behaviour between 1.11.3, 1.15.4 and 1.16.1-SNAPSHOT with respect to whitespace and newlines between HTML tags and text. We are using the XML parser with prettyPrint set to true.
We are a bit confused about which behaviour is correct, as we are upgrading jsoup from 1.11.3 to 1.15.4.
To illustrate the difference, I have highlighted it in the image below, captured in Notepad++, which shows how the whitespace differs between the jsoup versions.
In 1.11.3 the <div> tag is on a line of its own, but with 1.15.4 the newline between the <div> and <b> tags is lost.
With 1.11.3 there is whitespace after every <br /> and after the text "v3yfygygyg", but with 1.15.4 and 1.16.1-SNAPSHOT that whitespace is gone.
With 1.11.3 there is whitespace after <div align="left">, which is not present with 1.15.4 or 1.16.1-SNAPSHOT.
With 1.11.3 there is whitespace after the </div> tag, which is not present with 1.15.4 or 1.16.1-SNAPSHOT.
@jhy We are a bit confused about what the correct behaviour of the parser is. Was there an issue in 1.11.3 that has since been fixed, or is this a new issue? Could you please let us know what the correct behaviour should be?
Thank you.
Reeniya changed the title from "Difference in Parser behaviour between 1.11.3, 1.15.4 and 1.16.1-snapshot with respenct to whitespace and newline" to "Difference in Parser behaviour between 1.11.3, 1.15.4 and 1.16.1-snapshot with respect to whitespace and newline" on Mar 9, 2023
Reeniya changed the title from "Difference in Parser behaviour between 1.11.3, 1.15.4 and 1.16.1-snapshot with respect to whitespace and newline" to "Difference in Parser behaviour between 1.11.3, 1.15.4 and 1.16.1-SNAPSHOT with respect to whitespace and newline" on Mar 9, 2023
The output of the pretty-printer is subject to change as we make improvements.
If the output of the printer causes a change to the way a browser renders the HTML, I would generally consider it a bug. E.g. see #1926. But there will be changes between releases. I believe the current output is better than the previous output, and so am inclined to keep it as-is.
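Since the pretty-printer's layout can change between releases, tests that compare serialized output byte-for-byte are likely to break on upgrade. One way to make such comparisons less brittle is to normalize whitespace before comparing. The following is a minimal sketch under that assumption; it uses only the Java standard library, and the class and method names (`WhitespaceInsensitiveCompare`, `normalize`, `sameIgnoringLayout`) are hypothetical helpers, not part of jsoup. Note that the normalization is deliberately lossy: whitespace between inline tags can affect rendering, so this is only suitable for test comparisons, not for rewriting output.

```java
import java.util.regex.Pattern;

class WhitespaceInsensitiveCompare {
    // Whitespace that sits directly between a closing '>' and the next '<'.
    private static final Pattern BETWEEN_TAGS = Pattern.compile(">\\s+<");
    // Any remaining run of whitespace (spaces, tabs, newlines).
    private static final Pattern WHITESPACE_RUN = Pattern.compile("\\s+");

    // Hypothetical helper: drops whitespace between adjacent tags,
    // collapses other whitespace runs to one space, and trims the ends.
    static String normalize(String markup) {
        String s = BETWEEN_TAGS.matcher(markup).replaceAll("><");
        s = WHITESPACE_RUN.matcher(s).replaceAll(" ");
        return s.trim();
    }

    // True when the two strings differ only in the layout whitespace
    // that normalize() removes.
    static boolean sameIgnoringLayout(String a, String b) {
        return normalize(a).equals(normalize(b));
    }

    public static void main(String[] args) {
        // Illustrative strings only, not actual jsoup output.
        String older = "<div>\n <b>text</b> \n</div> ";
        String newer = "<div><b>text</b></div>";
        System.out.println(sameIgnoringLayout(older, newer)); // prints true
    }
}
```

This sketch does not remove whitespace between text and a following tag (for example `text <br />` versus `text<br />`), because that difference can be significant; a test suite that wants to ignore it would need a looser rule chosen with the rendering consequences in mind.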
Hi,
We are seeing a difference in parser behaviour between 1.11.3, 1.15.4 and 1.16.1-SNAPSHOT with respect to whitespace and newlines between HTML tags and text. We are using the XML parser with prettyPrint set to true.
For example:
Input:
With jsoup version 1.11.3 the parser output is:
With jsoup version 1.15.4 the parser output is:
We can re-create this issue using: https://try.jsoup.org/~HLY5GwlDvfC8Fn8tiGSpgCyIFFo
I noticed that some fixes were made around whitespace and newline handling, so I also tried the 1.16.1-SNAPSHOT version.
With jsoup version 1.16.1-SNAPSHOT the parser output is: