Christian,

This is not a bug. "file:entity.xml" begins with a scheme ("file:"), so
it is already an absolute URI rather than a relative reference.
Resolving it against a base URI will always result in "file:entity.xml".

See the definition of an absolute URI [1] and the algorithm for relative
resolution [2] described in RFC 3986.
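
As a rough illustration, here is a minimal sketch using java.net.URI
from the JDK. It is not the code path Xerces itself takes, and the base
location /home/user/docs/ is a made-up example, but it follows the same
rule: a reference with a scheme is returned unchanged.

```java
import java.net.URI;

class ResolveDemo {
    // Resolve a system identifier against the base URI of the document.
    static URI resolve(String base, String systemId) {
        return URI.create(base).resolve(systemId);
    }

    public static void main(String[] args) {
        // Hypothetical location of the instance document (an assumption).
        String base = "file:/home/user/docs/frame-bad.xml";

        // No scheme: a relative reference, merged with the base path.
        System.out.println(resolve(base, "entity.xml").getPath());

        // Has a scheme: already absolute, so the base is ignored.
        System.out.println(resolve(base, "file:entity.xml"));
    }
}
```

If the entity should be found next to the document, the system
identifier needs to be either the plain relative reference "entity.xml"
or a complete file URI that spells out the full path.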

Thanks.

[1] http://tools.ietf.org/html/rfc3986#section-4.3
[2] http://tools.ietf.org/html/rfc3986#section-5.2

Michael Glavassevich
XML Parser Development
IBM Toronto Lab
E-mail: [email protected]

Christian Roth <[email protected]> wrote on 07/19/2011 06:28:00 AM:

> Hello,
>
> I am having an issue with relative URLs that specify their protocol
> in external entity declarations.
>
> In short,
>
>   <!ENTITY ent SYSTEM "entity.xml">
>
> resolves correctly, the semantically identical
>
>   <!ENTITY ent SYSTEM "file:entity.xml">
>
> does not.
>
> In the first case, Xerces correctly calculates the absolute path to
> entity.xml as being relative to the instance document's base path.
>
> In the second case, Xerces does not - it looks like it assumes
> "file:entity.xml" is an absolute path and hands it verbatim to the
> system's entity resolver. This looks like a bug to me.
>
> Here's a sample file set to reproduce the issue (put all four files
> at the same directory level):
>
>
> -- "frame-good.xml" : the document which works (no protocol specified) --
> <?xml version="1.0" encoding="UTF-8"?>
> <!DOCTYPE doc SYSTEM "doc.dtd"
> [
> <!ENTITY ent SYSTEM "entity.xml">
> ]>
> <doc>&ent;</doc>
> -- eof --
>
>
> -- "frame-bad.xml" : the document which does NOT work (protocol specified) --
> <?xml version="1.0" encoding="UTF-8"?>
> <!DOCTYPE doc SYSTEM "doc.dtd"
> [
> <!ENTITY ent SYSTEM "file:entity.xml">
> ]>
> <doc>&ent;</doc>
> -- eof --
>
>
> -- "entity.xml" : the file included via entity ref --
> <?xml version="1.0" encoding="UTF-8"?>
> <dummy/>
> -- eof --
>
>
> -- "doc.dtd" : the DTD file to validate against --
> <?xml version="1.0" encoding="UTF-8"?>
> <!ELEMENT doc (dummy) >
> <!ELEMENT dummy EMPTY >
> -- eof --
>
>
> I am testing with Xerces J 2.11.0 and am using its samples.jar as
> follows:
>
> java -classpath xercesImpl.jar:xercesSamples.jar:xml-apis.jar
> sax.Counter -v frame-good.xml
>
> works,
>
> java -classpath xercesImpl.jar:xercesSamples.jar:xml-apis.jar
> sax.Counter -v frame-bad.xml
>
> does not but instead gives the following error:
>
> error: Parse error occurred - entity.xml (No such file or directory)
> java.io.FileNotFoundException: entity.xml (No such file or directory)
>    at java.io.FileInputStream.open(Native Method)
>    at java.io.FileInputStream.<init>(FileInputStream.java:120)
>    at java.io.FileInputStream.<init>(FileInputStream.java:79)
>    at sun.net.www.protocol.file.FileURLConnection.connect(FileURLConnection.java:70)
>    at sun.net.www.protocol.file.FileURLConnection.getInputStream(FileURLConnection.java:161)
>    at org.apache.xerces.impl.XMLEntityManager.setupCurrentEntity(Unknown Source)
>    at org.apache.xerces.impl.XMLEntityManager.startEntity(Unknown Source)
>    at org.apache.xerces.impl.XMLEntityManager.startEntity(Unknown Source)
>    at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl.scanEntityReference(Unknown Source)
>    at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl$FragmentContentDispatcher.dispatch(Unknown Source)
>    at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl.scanDocument(Unknown Source)
>    at org.apache.xerces.parsers.XML11Configuration.parse(Unknown Source)
>    at org.apache.xerces.parsers.XML11Configuration.parse(Unknown Source)
>    at org.apache.xerces.parsers.XMLParser.parse(Unknown Source)
>    at org.apache.xerces.parsers.AbstractSAXParser.parse(Unknown Source)
>    at sax.Counter.main(Unknown Source)
>
>
>
> Am I wrong or is Xerces wrong?
>
> Kind regards
> Christian