Ok, can someone explain to me what the method 'loadGrammar' actually
does, and when and why I would use it? Likewise, the same questions for
the method 'useCachedGrammarInParse'. Hopefully, this will help me to
understand how the parser validates against a DTD. What is the meaning
of 'grammar' in this context, and how does the parser generate and use
it?
BTW... I read the header file docs on these methods, but could not make
sense of what is meant by 'Preparse' schema/DTD grammar. See below.
Thanks
/**
  * Preparse schema grammar (XML Schema, DTD, etc.) via a file path or URL
  *
  * This method invokes the preparsing process on a schema grammar XML
  * file specified by the file path parameter. If the 'toCache' flag
  * is enabled, the parser will cache the grammars for re-use. If a grammar
  * key is found in the pool, no caching of any grammar will take place.
  *
  * <p><b>"Experimental - subject to change"</b></p>
  *
  * @param systemId    A const char pointer to a native string which contains
  *                    the path to the XML grammar file to be preparsed.
  * @param grammarType The grammar type (Schema or DTD).
  * @param toCache     If <code>true</code>, we cache the preparsed grammar,
  *                    otherwise, no caching. Default is <code>false</code>.
  * @return The preparsed schema grammar object (SchemaGrammar or
  *         DTDGrammar). That grammar object is owned by the parser.
  *
  * @exception SAXException Any SAX exception, possibly
  *            wrapping another exception.
  * @exception XMLException An exception from the parser or client
  *            handler code.
  * @exception DOMException A DOM exception as per DOM spec.
  */
Grammar* loadGrammar(const char* const systemId,
                     const short grammarType,
                     const bool toCache = false);
/** Set the 'Use cached grammar' flag
  *
  * This method allows users to enable or disable the use of cached
  * grammars. When set to true, the parser will use the cached grammar,
  * instead of building the grammar from scratch, to validate XML
  * documents.
  */
void useCachedGrammarInParse(const bool newState);
DeWayne Dantzler
206-544-3658
-----Original Message-----
From: David Bertoni [mailto:[email protected]]
Sent: Friday, January 23, 2009 2:06 PM
To: [email protected]
Subject: Re: How do you validate against an external DTD
Dantzler, DeWayne C wrote:
> Hello
>
> I'm currently using Xerces C++ 2.7.0 DOM Parser. I want to validate
> against an external DTD specified by the caller and not the DTD called
> out in the XML document. How is this done with the Xerces API?
The easiest way is with an EntityResolver or a DOMLSResourceResolver,
depending on which parsing interface you're using.
>
> Here's what I've done.
>
> Code snippet:
> ==========================================================
> //perform the necessary DOM Parser init and setup ... omitting all the
> //gory details
>
> //setup the Entity Resolver to redirect the parser to use the external
> //DTD to resolve external ENTITY's and not the DTD called out in the
> //XML file by the DOCTYPE declaration
> //The entire XML file will also be validated against the external DTD
> //as well
>
> csCtkXmlEntityResolver *resolver = new csCtkXmlEntityResolver(dtd); // <<--- caller specified dtd
> _XmlDOMParser->setEntityResolver(resolver);
>
> //Turn on the necessary features to allow validation by an
> //external DTD
> _XmlDOMParser->setValidationScheme(_XmlDOMParser->Val_Always);
> _XmlDOMParser->setDoSchema(false);
> _XmlDOMParser->setLoadExternalDTD(true);
This value is true by default.
>
> //TODO - How do you load an external DTD???
> //Not sure if this routine does it, but I'll give it a try and it
> //didn't!!
>
> _XmlDOMParser->loadGrammar(dtd, Grammar::DTDGrammarType, false);
You've loaded the grammar, but told the parser not to cache it, so it
knows nothing about it. Please pass 'true' as the last parameter.
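
As a rough mental model only, the interaction between the 'toCache'
flag and 'useCachedGrammarInParse' can be sketched as below. This is not
Xerces code: ToyParser, ToyGrammar, wouldUseCachedGrammar and the choice
to key the pool by system ID are all made up for illustration, and the
real parser's grammar pool works differently in its details.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <memory>
#include <string>

// Hypothetical stand-in for a preparsed DTD or schema grammar.
struct ToyGrammar {
    std::string systemId;
};

// Hypothetical stand-in for a parser that owns a grammar pool.
class ToyParser {
public:
    // Models loadGrammar(): the grammar is always "preparsed", but it is
    // only stored in the pool when toCache is true and the key is new.
    const ToyGrammar* loadGrammar(const std::string& systemId,
                                  bool toCache = false) {
        assert(!systemId.empty());
        if (toCache) {
            auto found = pool_.find(systemId);
            if (found != pool_.end()) {
                // Key already in the pool: nothing new is cached.
                return found->second.get();
            }
            std::unique_ptr<ToyGrammar>& slot = pool_[systemId];
            slot.reset(new ToyGrammar{systemId});
            return slot.get();
        }
        // Not cached: the parser owns the result, but the pool is untouched.
        scratch_.reset(new ToyGrammar{systemId});
        return scratch_.get();
    }

    // Models useCachedGrammarInParse(): just flips a flag.
    void useCachedGrammarInParse(bool newState) { useCached_ = newState; }

    // True only if the flag is on AND a grammar was actually pooled.
    bool wouldUseCachedGrammar(const std::string& systemId) const {
        return useCached_ && pool_.count(systemId) > 0;
    }

    std::size_t pooledGrammarCount() const { return pool_.size(); }

private:
    std::map<std::string, std::unique_ptr<ToyGrammar>> pool_;
    std::unique_ptr<ToyGrammar> scratch_;
    bool useCached_ = false;
};
```

In this model, calling loadGrammar("my.dtd", false) and then
useCachedGrammarInParse(true) leaves wouldUseCachedGrammar("my.dtd")
false, because the pool is still empty. Passing true as the last
argument is what puts the grammar where a later parse could find it.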
> _XmlDOMParser->setSkipDTDValidation(false);
> _XmlDOMParser->useCachedGrammarInParse(true);
>
> ....
> //attempt to parse the XML and get pages of errors
> }
> ==============================================================
>
> Results: Lots of Parser Errors but several of the elements are defined
> in the DTD
>
> <?xml version="1.0" encoding="UTF-8"?>
> line 2::column 6:
> <bcme xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
>
> Parser Error Message: Unknown element 'bcme'
>
> <?xml version="1.0" encoding="UTF-8"?>
> line 2::column 17:
> <bcme xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
>
> Parser Error Message: Attribute 'xmlns:xsi' is not declared for
> element 'bcme'
>
> <?xml version="1.0" encoding="UTF-8"?>
> <bcme xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
> line 3::column 33:
> xsi:noNamespaceSchemaLocation="http://www.boeing.com/commercial/cas_xml/s1000d/3-0BE1-3/schema/bcme.xsd">
>
> Parser Error Message: Attribute 'xsi:noNamespaceSchemaLocation' is not
> declared for element 'bcme'
>
> ==============================================================
> XML Doc Snippet
>
> <?xml version="1.0" encoding="UTF-8"?>
> <bcme xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
>       xsi:noNamespaceSchemaLocation="http://www.mycompany.com/cas_xml/schema/acme.xsd">
> <edition>
> <editionid>BCMM-21-21-03-81205_20080301.1207260301</editionid>
> <editionType>New</editionType>
> <editionNumber>1207260301</editionNumber>
> </edition>
> ==================================================================
This document looks like it uses a schema for validation, not a DTD. In
general, documents that are structured to use XML schema for validation
can be difficult to validate using a DTD. You might still get some
validation errors, depending on the instance document, and how flexible
the DTD is.
Dave