fails to load XML with incorrect DTD #2725

Closed
scabug opened this issue Nov 28, 2009 · 8 comments

Comments

@scabug

scabug commented Nov 28, 2009

XML:

<?xml version='1.0'?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/404040404.dtd">
<html/>

Code:

scala.xml.XML.loadFile(new java.io.File("1.xml"))

fails:

java.io.FileNotFoundException: http://www.w3.org/404040404.dtd
	at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1288)
	at com.sun.org.apache.xerces.internal.impl.XMLEntityManager.setupCurrentEntity(XMLEntityManager.java:677)
	at com.sun.org.apache.xerces.internal.impl.XMLEntityManager.startEntity(XMLEntityManager.java:1315)
	at com.sun.org.apache.xerces.internal.impl.XMLEntityManager.startDTDEntity(XMLEntityManager.java:1282)
	at com.sun.org.apache.xerces.internal.impl.XMLDTDScannerImpl.setInputSource(XMLDTDScannerImpl.java:283)
	at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl$DTDDriver.dispatch(XMLDocumentScannerImpl.java:1192)
	at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl$DTDDriver.next(XMLDocumentScannerImpl.java:1089)
	at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl$PrologDriver.next(XMLDocumentScannerImpl.java:1002)
	at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl.next(XMLDocumentScannerImpl.java:648)
	at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanDocument(XMLDocumentFragmentScannerImpl.java:510)
	at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(XML11Configuration.java:807)
	at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(XML11Configuration.java:737)
	at com.sun.org.apache.xerces.internal.parsers.XMLParser.parse(XMLParser.java:107)
	at com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser.parse(AbstractSAXParser.java:1205)
	at com.sun.org.apache.xerces.internal.jaxp.SAXParserImpl$JAXPSAXParser.parse(SAXParserImpl.java:522)
	at javax.xml.parsers.SAXParser.parse(SAXParser.java:395)
	at scala.xml.factory.XMLLoader$class.loadXML(XMLLoader.scala:41)
	at scala.xml.XML$.loadXML(XML.scala:43)
	at scala.xml.factory.XMLLoader$class.loadFile(XMLLoader.scala:48)
	at scala.xml.XML$.loadFile(XML.scala:43)

Doctype declaration should be ignored when parsing XML.

@scabug
Author

scabug commented Nov 28, 2009

Imported From: https://issues.scala-lang.org/browse/SI-2725?orig=1
Reporter: @stepancheg

@scabug
Author

scabug commented Nov 28, 2009

@stepancheg said:
scala-2.8.0.r19920-b20091128020206

@scabug
Author

scabug commented Nov 28, 2009

@paulp said:
OMG, the way to disable that was hard to find. None of the obvious stuff with settings, nor the less obvious stuff with entity resolvers, had any effect. Eventually I found it, although it appears to be specific to this SAX parser. I am once again blown away at how hard XML makes easy things. We could really use a (motivated) XML expert around here; I have a lot of issues for them to straighten out. Fixed in r19926.
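
For readers wondering what such a parser-specific switch looks like: the sketch below is not the change made in r19926 (the thread does not show it). It assumes the feature in question is the `load-external-dtd` one named in the error message quoted further down, and it uses only JDK classes. `SkipExternalDtd` and `rootElementName` are hypothetical names for illustration.

```scala
import java.io.StringReader
import javax.xml.parsers.SAXParserFactory
import org.xml.sax.{Attributes, InputSource}
import org.xml.sax.helpers.DefaultHandler

object SkipExternalDtd {
  // Xerces-specific feature name, taken from the error message in this thread.
  // Assumption: this is the switch that stops the external DTD from being fetched.
  val LoadExternalDtd =
    "http://apache.org/xml/features/nonvalidating/load-external-dtd"

  // Parses `xml` with external DTD loading turned off and returns the
  // qualified name of the root element (empty string if none was seen).
  def rootElementName(xml: String): String = {
    val factory = SAXParserFactory.newInstance()
    factory.setNamespaceAware(false)
    // Throws if the underlying parser does not recognize the feature.
    factory.setFeature(LoadExternalDtd, false)

    var root: Option[String] = None
    val handler = new DefaultHandler {
      override def startElement(uri: String, localName: String,
                                qName: String, attrs: Attributes): Unit =
        if (root.isEmpty) root = Some(qName)
    }
    factory.newSAXParser().parse(new InputSource(new StringReader(xml)), handler)
    root.getOrElse("")
  }
}
```

Calling `SkipExternalDtd.rootElementName` on the document from the report should parse it without trying to download `404040404.dtd`, provided the parser in use is Xerces-based and accepts the feature.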

@scabug
Author

scabug commented May 10, 2010

@paulp said:
It was reported on a mailing list that this crashes under OpenJDK. I assume there is some way to check whether a feature exists first.

Whenever I try to use the RC2 (or RC1) REPL it crashes with the following
error:
    
$ scala
Welcome to Scala version 2.8.0.RC2 (OpenJDK 64-Bit Server VM, Java 1.6.0_17).   
Type in expressions to have them evaluated.
Type :help for more information.
...
Caused by: org.xml.sax.SAXNotRecognizedException: Feature 'http://apache.org/xml/features/nonvalidating/load-external-dtd' is not recognized.
        at org.apache.xerces.parsers.AbstractSAXParser.setFeature(AbstractSAXParser.java:1666)
        at org.apache.xerces.jaxp.SAXParserImpl$JAXPSAXParser.setFeature0(SAXParserImpl.java:542)
        at
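
One way to "check if a feature exists first" is simply to attempt the call and treat the exception as a "no". The sketch below is an assumption about how that could be done with the standard JAXP API, not what any Scala revision actually does. `FeatureProbe` and `trySetFeature` are hypothetical names.

```scala
import javax.xml.parsers.{ParserConfigurationException, SAXParserFactory}
import org.xml.sax.{SAXNotRecognizedException, SAXNotSupportedException}

object FeatureProbe {
  // Tries to set `feature` on `factory`. Returns true if the parser accepted
  // it and false if the parser does not recognize or support it, so callers
  // can fall back instead of crashing at startup.
  def trySetFeature(factory: SAXParserFactory, feature: String, value: Boolean): Boolean =
    try {
      factory.setFeature(feature, value)
      true
    } catch {
      case _: SAXNotRecognizedException |
           _: SAXNotSupportedException |
           _: ParserConfigurationException => false
    }
}
```

With a helper like this, the `load-external-dtd` feature could be applied where the parser knows it and silently skipped elsewhere, at the cost of the original DTD-fetching behaviour remaining on parsers that reject it.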

@scabug
Author

scabug commented May 22, 2010

@paulp said:
Reverted in r22014. Reassigning to xml team.

@scabug
Author

scabug commented May 22, 2010

@paulp said:
Sorry, in r22013.

@scabug
Author

scabug commented Mar 15, 2012

@dcsobral said:
Funny that I never saw this ticket. I use the following, which sets a different feature. I'd need to test it against OpenJDK, though.

import scala.xml.Elem
import scala.xml.factory.XMLLoader
import javax.xml.parsers.{SAXParser, SAXParserFactory}

// Custom loader whose SAX parser is configured with a Xerces-specific feature.
object MyXML extends XMLLoader[Elem] {
  override def parser: SAXParser = {
    val f = SAXParserFactory.newInstance()
    f.setNamespaceAware(false)
    // With this feature on, Xerces rejects any document containing a DOCTYPE declaration.
    f.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true)
    f.newSAXParser()
  }
}
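
A caveat on the `disallow-doctype-decl` feature: as far as I know, on Xerces-based parsers it does not skip the DOCTYPE but makes the parser fail on any document that contains one. The sketch below illustrates that behaviour using only JDK classes. `DoctypeRejected` and `isRejected` are hypothetical names, and the result assumes the default JDK parser.

```scala
import java.io.StringReader
import javax.xml.parsers.SAXParserFactory
import org.xml.sax.{InputSource, SAXParseException}
import org.xml.sax.helpers.DefaultHandler

object DoctypeRejected {
  val DisallowDoctype = "http://apache.org/xml/features/disallow-doctype-decl"

  // Returns true when parsing `xml` fails with a SAXParseException while
  // the disallow-doctype-decl feature is enabled, false when it parses.
  def isRejected(xml: String): Boolean = {
    val f = SAXParserFactory.newInstance()
    f.setNamespaceAware(false)
    f.setFeature(DisallowDoctype, true)
    try {
      // DefaultHandler rethrows fatal errors, so a rejected DOCTYPE surfaces here.
      f.newSAXParser().parse(new InputSource(new StringReader(xml)), new DefaultHandler)
      false
    } catch {
      case _: SAXParseException => true
    }
  }
}
```

So this setting avoids the network fetch by refusing the document outright, which is useful for untrusted input but does not make the XML from the original report loadable.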

@scabug
Author

scabug commented Jul 17, 2015

@SethTisue said:
The scala-xml library is now community-maintained. Issues with it are now tracked at https://github.com/scala/scala-xml/issues instead of here in the Scala JIRA.

Interested community members: if you consider this issue significant, feel free to open a new issue for it on GitHub, with links in both directions.
