Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix(#16458): regression in xml syntax parsing #19522

Merged
merged 3 commits into from
Feb 15, 2024

Conversation

i10416
Copy link
Contributor

@i10416 i10416 commented Jan 23, 2024

close #16458
xLiteral mistakenly assumed that the element just after < is always non-special element, but it is not true. It could be xCharData, comment, xProcInstr.

xLiteral mistakenly assumed that the element just after `<` is always
non-special element, but it is not true. It could be xCharData, comment,
xProcInstr.
@@ -0,0 +1,37 @@
def x = <div>FooBar</div><!-- /.modal-content -->

package scala.xml {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I copy-pasted and modifed code from tests/run/xml.scala. Without this, the test fails due to missing xml module.

@@ -371,10 +371,17 @@ object MarkupParsers {
// parse more XML?
if (charComingAfter(xSpaceOpt()) == '<') {
while {
xSpaceOpt()
nextch()
ts.append(element)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand why my previous fix 624a6e0 used ts.append(element) instead of content_LT(ts) which is what scala 2 has. (I would expect the codebases not to diverge much.)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know why too, but I suspect it was just a mistake as element cannot consume XML special nodes.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor

@som-snytt som-snytt Jan 25, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that is what I was comparing. oh, my previous comment says just that.

It was a quick fix, but the scala 2 pr was also my quick fix, so I have no idea what happened that day. Thanks for following it up.

@som-snytt
Copy link
Contributor

som-snytt commented Jan 23, 2024

I copied your test and just changed ts.append(element) to content_LT(ts) and it passed.

I didn't put in any brain power, but I see content_LT handles the empty element.

--- a/compiler/src/dotty/tools/dotc/parsing/xml/MarkupParsers.scala
+++ b/compiler/src/dotty/tools/dotc/parsing/xml/MarkupParsers.scala
@@ -373,7 +373,8 @@ object MarkupParsers {
           while {
             xSpaceOpt()
             nextch()
-            ts.append(element)
+            //ts.append(element)
+            content_LT(ts)
             charComingAfter(xSpaceOpt()) == '<'

i10416 added a commit to i10416/dotty that referenced this pull request Jan 24, 2024
Just replacing element with content_LT, it works.

See scala#19522 (comment)
@i10416
Copy link
Contributor Author

i10416 commented Jan 24, 2024

I copied your test and just changed ts.append(element) to content_LT(ts) and it passed.

Thank you. I checked it and it works!

I added some precondition check in the previous commit because I suspected the xSpaceOpt();nextch();content_LT(ts); would mistakenly parse < !-- comment -->, but it does not. It properly fails with in XML literal: name expected, but char ' ' cannot start a name.

Just replacing element with content_LT, it works.

See scala#19522 (comment)
@i10416 i10416 force-pushed the fix/16458-xml-syntax-parse-regressions branch from 2a88851 to 27046fe Compare January 24, 2024 04:13
object Test {
import scala.xml.*
def main(args: Array[String]): Unit = {
val xml = <div>FooBar</div><!-- /.modal-content -->
Copy link
Contributor Author

@i10416 i10416 Jan 24, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Replaced tests/pos/i16458.scala with tests/run/i16458.scala to make sure we get expected parse result.

@i10416 i10416 force-pushed the fix/16458-xml-syntax-parse-regressions branch from 788fc43 to 8bdbe77 Compare January 24, 2024 05:24
To check parsing properly, it is better to run a test and assert parse
result.
@i10416 i10416 force-pushed the fix/16458-xml-syntax-parse-regressions branch from 8bdbe77 to 9de9d57 Compare January 24, 2024 05:25
@i10416 i10416 requested a review from som-snytt January 24, 2024 07:50
Copy link
Contributor

@odersky odersky left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know very little about XML parsing but judging from the discussion everybody agrees this is OK.

@odersky odersky merged commit 50d62f7 into scala:main Feb 15, 2024
19 checks passed
@Kordyjan Kordyjan added this to the 3.4.2 milestone Mar 28, 2024
WojciechMazur pushed a commit that referenced this pull request Jul 1, 2024
Just replacing element with content_LT, it works.

See #19522 (comment)

[Cherry-picked 27046fe]
WojciechMazur pushed a commit that referenced this pull request Jul 2, 2024
Just replacing element with content_LT, it works.

See #19522 (comment)

[Cherry-picked 27046fe]
WojciechMazur added a commit that referenced this pull request Jul 2, 2024
Backports #19522 to the LTS branch.

PR submitted by the release tooling.
[skip ci]
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Regression in parsing XML syntax
4 participants