Hi Shalin,

I wasn't talking about the behavior of parsers in the wild, but rather about 
the XML specification (paraphrasing):

1. An XML document is not well-formed unless it matches the production labeled 
document.
2. Violations of well-formedness constraints are fatal errors.
3. Once a fatal error is detected, an XML parser MUST NOT continue normal 
processing.

So although there are undoubtedly parsers that will parse '<' in attribute 
values, in so doing, these parsers are non-conformant with the XML 
specification.  This is important only to the extent that people who create 
documents that target non-conforming features of parsers can't reliably expect 
these documents to be parsed by conformant parsers; XML's 
write-once-parse-anywhere promise thereby inexorably evaporates.
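
For anyone who wants to check this themselves, here is a minimal sketch using 
Python's standard-library ElementTree, which is backed by expat.  The element 
name, attribute names, and SQL below are made up for illustration (loosely 
modelled on a DIH data-config entity) and are not taken from anyone's actual 
config:

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text: str) -> bool:
    """Return True if the expat-backed parser accepts xml_text."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        # expat reports a well-formedness violation and stops parsing.
        return False


# Hypothetical entity definitions; only the escaping of '<' differs.
raw = '<entity name="item" query="select * from item where id < 10"/>'
escaped = '<entity name="item" query="select * from item where id &lt; 10"/>'

print(is_well_formed(raw))       # raw '<' in the attribute value: rejected
print(is_well_formed(escaped))   # '&lt;' in the attribute value: accepted

# The parser hands the application the unescaped value.
print(ET.fromstring(escaped).get("query"))
```

The escaped form should come back to the application as a plain '<', so 
nothing is lost by writing '&lt;' in the document.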

Telling people that it's not a problem (or that it's required!) to write 
non-well-formed XML because a particular XML parser can't accept well-formed 
XML is kind of insidious.  I for one will not stand idly by and permit this 
outrage to remain unchallenged!!!

:)

Steve

On 10/22/2008 at 4:01 AM, Shalin Shekhar Mangar wrote:
> Actually, most XML parsers don't require you to escape such
> characters in attributes. You are welcome to try this out,
> just look at the example-DIH :)
> 
> On Tue, Oct 21, 2008 at 11:11 PM, Steven A Rowe
> <[EMAIL PROTECTED]> wrote:
> 
> > Wow, I really should read more closely before I respond - I see now,
> > Noble, that you were talking about DIH's ability to parse escaped '<'s
> > in attribute values, rather than about whether '<' was an acceptable
> > character in attribute values.
> > 
> > I should repurpose my remarks to note to Shalin, though, that all
> > (conformant) XML parsers have to be able to handle escaped '<'s in
> > attribute values, since an XML document with a '<' in an attribute
> > value is not well-formed.
> > 
> > Steve
> > 
> > On 10/21/2008 at 1:10 PM, Steven A Rowe wrote:
> > > On 10/21/2008 at 12:14 AM, Noble Paul നോബിള്‍ नोब्ळ् wrote:
> > > > On Tue, Oct 21, 2008 at 12:56 AM, Shalin Shekhar Mangar
> > > > <[EMAIL PROTECTED]> wrote:
> > > > > Your data-config looks fine except for one thing -- you do not need to
> > > > > escape the '<' character in an XML attribute. It may be throwing off the
> > > > > parsing code in DataImportHandler.
> > > > 
> > > > Not really; '<' is fine in an attribute.
> > > 
> > > Noble, I think you're wrong - AFAICT from the XML spec., '<' is *not*
> > > fine in an attribute value - from
> > > <http://www.w3.org/TR/REC-xml/#NT-AttValue>:
> > > 
> > >   [10]  AttValue ::= '"' ([^<&"] | Reference)* '"'
> > >                  |   "'" ([^<&'] | Reference)* "'"
> > > 
> > > where an attribute <http://www.w3.org/TR/REC-xml/#dt-stag> is:
> > > 
> > >   [41] Attribute ::= Name Eq AttValue
> > > 
> > > Steve
