Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

XML parser reverses attribute order #65

Closed
cogpp opened this issue Jul 23, 2015 · 22 comments · Fixed by #172
Closed

XML parser reverses attribute order #65

cogpp opened this issue Jul 23, 2015 · 22 comments · Fixed by #172
Milestone

Comments

@cogpp
Copy link

cogpp commented Jul 23, 2015

This issue is copied across from https://issues.scala-lang.org/browse/SI-6341

The original person who raised the issue attached a patch to add a failing test case.

When parsing XML and then writing it out again, the order of attributes is reversed.

This prevents certain use cases where one wants to modify an XML file and have a minimal amount of changes. E.g. adding a new element to a file causes all existing attributes to get reversed.

The cause is here:

https://github.com/scala/scala/blob/81e3121eb09010375783d13e8c4c853686a34ffd/src/library/scala/xml/parsing/MarkupParser.scala#L302

aMap: MetaData is built by prepending new attributes to the linked list, causing the reversal.

The attached patch adds a failing test case. (It failed when I wrote it originally, now I can't test it because the build fails).

commit 272ed40345c9f8f4af878f15299155cecfd8db2e
Author: Robin Stocker [email protected]
Date: Sat Sep 8 19:11:50 2012 +0200

Add failing test for scala.xml attribute order problem

diff --git a/test/files/run/xml-attribute-order-parse.check b/test/files/run/xml-attribute-order-parse.check
new file mode 100644
index 0000000..6637a4f
--- /dev/null
+++ b/test/files/run/xml-attribute-order-parse.check
@@ -0,0 +1 @@
+
diff --git a/test/files/run/xml-attribute-order-parse.scala b/test/files/run/xml-attribute-order-parse.scala
new file mode 100644
index 0000000..3be386c
--- /dev/null
+++ b/test/files/run/xml-attribute-order-parse.scala
@@ -0,0 +1,9 @@
+import xml.XML
+
+object Test {

  • def main(args: Array[String]): Unit = {
  • val input = """"""
  • val doc = XML.loadString(input)
  • println(doc)
  • }
    +}
@biswanaths
Copy link
Contributor

Thanks Graeme for taking time to re-reporting this issue from scala JIRA. We are in the process of migrating all the open issues from JIRA to github. I have left my test work on the migration as comment so that community can review it. ( #62)

Thanks again.

@nikhilsinha
Copy link

If no one's picked this up yet, I'd be happy to take a wack at it.

@biswanaths
Copy link
Contributor

Glad that you would like to help. We will be more than happy if we can get some help from you. Please let me know if you need any help from me.

Good luck.

@mbeckerle
Copy link

I wanted to comment on this issue.

Technically, order of attributes is not significant in XML. A proper equality comparison of two XML objects would find them equal even if the attributes are in different orders.

However, I understand the motivation for wanting to reduce deltas. The library shouldn't change things that don't need to be changed.

A related issue is changes of namespace prefixes for attributes or elements. The actual prefixes used are part of the XML Infoset, so techically <a:b xmlns:a="urn:foo"/> is not always equal to <c:b xmlns:c="urn:foo"/> because their infoset contents are different. XML is allowed to be sensitive to the specific characters used in the prefixes of the QNames.

This suggests that there are really several different levels of equality and correspondingly preservation (which is what this issue is about).

  • preserve order of attributes (and do comparison/equality where order of attributes matters)
  • preserve XML infoset (and equality where everything about XML infoset is compared, and only that - so attribute order doesn't matter, but specific prefix characters do matter)
  • preserve XML namespace-aware infoset (and equality where namespace prefixes are interpreted as their namespaces and the specific characters of the prefix are disregarded, and attribute order doesn't matter)

ashawley added a commit to ashawley/scala-xml that referenced this issue Sep 17, 2015
ashawley added a commit to ashawley/scala-xml that referenced this issue Sep 17, 2015
@ashawley
Copy link
Member

@biswanaths Do I need to make a pull request out of my branch, or can the commit (only the 2nd one, sorry) get pulled in here as a first-class citizen by you (as the admin)?

@hhimanshu
Copy link

any updates on this pull request?

@SethTisue
Copy link
Member

What if you use ConstructingParser, is the order still scrambled?

@hhimanshu
Copy link

hhimanshu commented Dec 2, 2016

hey @SethTisue
I guess I am not able to load my document using ConsructingParser

My file

 $cat /tmp/person.xml
 <root xmlns="http://www.nyorg.com/metadata/config" xmlns:core="http://www.myorg.com/metadata/commonschematypes" xmlns:xi="http://www.w3.org/2001/XInclude" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:noNamespaceSchemaLocation="../xschema/AppConfig.xsd">
    <!-- This is important comment about person-->
    <person>
      <firstName>John</firstName>
      <lastName>Doe</lastName>
      <email>[email protected]</email>
      <city>SomeWhere</city>
      <property name="myName" value="someValue" />
    </person>
  </root>

This is how I am using it

val source = io.Source.fromURL(new java.io.File("/tmp/person.xml").toURL)
val d = scala.xml.parsing.ConstructingParser.fromSource(source, preserveWS = true)
val doc = d.document()
println(doc)

What I get back?

:1:1: < expected<root xmlns="http://www.nyorg.com/metadata/config" xmlns:core="http://www.myorg.com/metadata/commonschematypes" xmlns:xi="http://www.w3.org/2001/XInclude" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:noNamespaceSchemaLocation="../xschema/AppConfig.xsd">^
doc: scala.xml.Document = null
null

On this other hand when I load the same file using XML.loadFile as

val doc = XML.loadFile("/tmp/person.xml")
println(doc)

It seems to load it correctly

<root xsi:noNamespaceSchemaLocation="../xschema/AppConfig.xsd" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xi="http://www.w3.org/2001/XInclude" xmlns:core="http://www.myorg.com/metadata/commonschematypes" xmlns="http://www.nyorg.com/metadata/config">
    
    <person>
      <firstName>John</firstName>
      <lastName>Doe</lastName>
      <email>[email protected]</email>
      <city>SomeWhere</city>
      <property value="someValue" name="myName"/>
    </person>
  </root>

but semantically has different orders and removed comments as documented http://stackoverflow.com/questions/40941628/scala-loadfile-xml-reorders-original-xml-and-removes-comments

any help resolving this is appreciated

@SethTisue
Copy link
Member

SethTisue commented Dec 2, 2016

@hhimanshu I'm sorry but this is an issue tracker, not a support forum. if you have a question about ConstructingParser, ask on Stack Overflow. if you become convinced there is a bug in ConstructingParser, you can open a new issue here.

@ashawley
Copy link
Member

ashawley commented Dec 2, 2016

Looks like Constructing Parser has the same issue. I'll add tests to the previous tests and make them in to a PR.

@hhimanshu
Copy link

@hhimanshu
Copy link

I was able to use ConstructingParser, but I can prove that even ConstructingParser reorders attributes. For details, see http://stackoverflow.com/a/40942534/379235

@SethTisue
Copy link
Member

good to know, thanks @hhimanshu

@hhimanshu
Copy link

Shall I open a Scala Issue for that?

@SethTisue
Copy link
Member

Shall I open a Scala Issue for that?

for what?

@hhimanshu
Copy link

ConstructingParser not preserving the order of element attributes

@SethTisue
Copy link
Member

SethTisue commented Dec 2, 2016

oh, I don't think we need a separate ticket for that, it's fine just as comments here on this one.

@hhimanshu
Copy link

ok Thanks

@hhimanshu
Copy link

I am thinking if there is anything I could do at the moment to unblock myself

@hhimanshu
Copy link

The other way (not the best way, but probably acceptable for my use case) is following

  • Load the XML (using ConstructingParser)
  • Perform Manipulation(s)
  • f1 = Persist XML (intermediate representation with reversed attributes)
  • Load the XML (using ConstructingParser)
  • f1 = persist XML (final representation)

This is what I did

import scala.xml._
import java.io.File
import scala.xml.parsing.ConstructingParser



val path = "/tmp/"
val filePath = path + "person.xml"
val newFilePath = path + "person.intermediate.xml"
val finalFilePath = path + "person.final.xml"
val source = io.Source.fromURL(new File(filePath).toURL)

val cp = ConstructingParser.fromSource(source, true)
XML.save(newFilePath, cp.document().head)

val source1 = io.Source.fromURL(new File(newFilePath).toURL)
val cp1 = ConstructingParser.fromSource(source1, true)
XML.save(finalFilePath, cp1.document().head)

and on command-line, I see

$ /tmp cat person.xml; echo "----------"; cat person.final.xml
<?xml version="1.0" encoding="UTF-8" ?>
<root>
    <!-- This is important comment -->
    <person>
        <fName>John</fName>
        <lName>Doe</lName>
        <property name="someName" value="someValue">p1</property>
    </person>
</root>
----------
<root>
    <!-- This is important comment -->
    <person>
        <fName>John</fName>
        <lName>Doe</lName>
        <property name="someName" value="someValue">p1</property>
    </person>
</root>

@ashawley
Copy link
Member

ashawley commented Apr 5, 2018

I was able to use ConstructingParser, but I can prove that even ConstructingParser reorders attributes.

The proposed fix in #172 also fixes ConstructingParser as well.

@ashawley
Copy link
Member

ashawley commented Apr 5, 2018

New comment to scala-user on Aug-14-2017:

http://www.scala-archive.org/Scala-XML-and-attribute-reordering-td3083550.html

@ashawley ashawley modified the milestones: 1.1.1, 1.2.0 May 24, 2018
@ashawley ashawley modified the milestones: 1.2.0, 2.0 Mar 2, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

7 participants