[
http://jira.codehaus.org/browse/DOXIA-226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=169064#action_169064
]
Lukas Theussl commented on DOXIA-226:
-------------------------------------
In addition, whitespace is never ignorable, collapsible, or trimmable within
verbatim blocks, i.e. within <source></source> or <pre></pre> in xdocs.
> Make XML based parsers better handle whitespace
> -----------------------------------------------
>
> Key: DOXIA-226
> URL: http://jira.codehaus.org/browse/DOXIA-226
> Project: Maven Doxia
> Issue Type: Improvement
> Reporter: Benjamin Bentmann
> Fix For: 1.2
>
>
> Regarding whitespace in XML documents, one needs to consider the following
> aspects:
> - ignorable whitespace, i.e. view "{{<tr> <td/> </tr>}}" and
> "{{<tr><td/></tr>}}" as equivalent
> - collapsible whitespace, i.e. view "{{Text Text}}" and "{{Text
> Text}}" as equivalent
> - trimmable whitespace, i.e. view "{{<p> Text </p>}}" and "{{<p>Text</p>}}"
> as equivalent
> Those distinctions require a DTD/XSD in combination with a validating parser
> and/or application-specific knowledge. For robustness, Doxia parsers for
> XML-based formats should not depend on the existence of a schema definition
> in order to reliably deliver events into the sinks. Hence I suggest
> hard-coding the required logic for proper whitespace handling into each parser.
> Currently, whitespace handling is rather static, e.g. {{XhtmlBaseParser}}
> pushes all input whitespace into the sink. This might cause trouble with
> sinks that do not expect to receive ignorable whitespace. To address this
> issue, it seems helpful if {{AbstractXmlParser}} provided a default
> implementation of {{handleText()}} that subclasses can simply control via
> state flags instead of implementing {{handleText()}} from scratch in each
> parser. Copy&Paste - which caused DOXIA-225 - needs to be avoided.
> More precisely, I imagine the following changes:
> - Have {{AbstractXmlParser}} maintain a stack of tuples (ignorable,
> collapsible, trimmable) where each tuple describes the whitespace handling
> for the currently parsed element
> - Have {{AbstractXmlParser}} push a tuple onto this stack before calling
> {{handleStartTag()}} and pop it after calling {{handleEndTag()}}
> - Have {{AbstractXmlParser}} provide setters to allow subclasses to control
> the desired whitespace handling in their {{handleStartTag()}} implementation
> - Have {{AbstractXmlParser}} implement {{handleText()}} where it evaluates the
> top-most tuple from the stack
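
The changes proposed above could be sketched roughly as follows. This is only
an illustration, not Doxia code: the class WhitespaceStackSketch, the Policy
tuple, the setWhitespace() setter, and the list-based sink are all hypothetical
stand-ins, and the per-element policies are assumptions chosen to match the
examples in the description (including the verbatim rule for <source> and
<pre>).

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Hypothetical stand-in for the proposed AbstractXmlParser logic.
class WhitespaceStackSketch {
    /** The (ignorable, collapsible, trimmable) tuple for one element. */
    static final class Policy {
        final boolean ignorable, collapsible, trimmable;
        Policy(boolean ignorable, boolean collapsible, boolean trimmable) {
            this.ignorable = ignorable;
            this.collapsible = collapsible;
            this.trimmable = trimmable;
        }
    }

    private final Deque<Policy> stack = new ArrayDeque<>();
    final List<String> sink = new ArrayList<>(); // stand-in for a real sink

    /** Base-class side: push a default tuple, then let the subclass adjust it. */
    void startTag(String name) {
        stack.push(new Policy(false, true, false)); // assumed default
        handleStartTag(name);
    }

    /** Base-class side: drop the tuple once the element has been closed. */
    void endTag() {
        stack.pop();
    }

    /** Setter a subclass would call from its handleStartTag(). */
    void setWhitespace(boolean ignorable, boolean collapsible, boolean trimmable) {
        stack.pop();
        stack.push(new Policy(ignorable, collapsible, trimmable));
    }

    /** Subclass side: application-specific knowledge, hard-coded per element. */
    void handleStartTag(String name) {
        switch (name) {
            case "tr":
                setWhitespace(true, true, true);
                break;
            case "p":
                setWhitespace(false, true, true);
                break;
            case "pre":
            case "source":
                setWhitespace(false, false, false); // verbatim: keep everything
                break;
            default:
                break; // keep the default tuple
        }
    }

    /** Shared handleText(): evaluates the top-most tuple from the stack. */
    void handleText(String text) {
        Policy p = stack.isEmpty() ? new Policy(true, true, true) : stack.peek();
        if (p.ignorable && text.trim().isEmpty()) {
            return; // whitespace-only text between child elements
        }
        if (p.collapsible) {
            text = text.replaceAll("\\s+", " ");
        }
        if (p.trimmable) {
            text = text.trim();
        }
        if (!text.isEmpty()) {
            sink.add(text);
        }
    }
}
```

With this sketch, whitespace-only text inside <tr> is dropped, text inside <p>
is collapsed and trimmed, and text inside <pre> or <source> reaches the sink
unchanged. One simplification to note: trimming is applied per text event, so a
real implementation would need extra care around inline child elements, where a
space next to the child tag can be significant.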