Hi Shalin,

I wasn't talking about the behavior of parsers in the wild, but rather about the XML specification (paraphrasing):

1. An XML document is not well-formed unless it matches the production labeled document.
2. Violations of well-formedness constraints are fatal errors.
3. Once a fatal error is detected, an XML parser MUST NOT continue normal processing.

So although there are undoubtedly parsers that will parse '<' in attribute values, in so doing, these parsers are non-conformant with the XML specification.

This is important only to the extent that people who create documents that target non-conforming features of parsers can't reliably expect these documents to be parsed by conformant parsers; XML's write-once-parse-anywhere promise thereby inexorably evaporates.

Telling people that it's not a problem (or required!) to write non-well-formed XML, because a particular XML parser can't accept well-formed XML, is kind of insidious. I for one will not stand idly by and permit this outrage to remain unchallenged!!! :)

Steve

On 10/22/2008 at 4:01 AM, Shalin Shekhar Mangar wrote:
> Actually, most XML parsers don't require you to escape such
> characters in attributes. You are welcome to try this out,
> just look at the example-DIH :)
>
> On Tue, Oct 21, 2008 at 11:11 PM, Steven A Rowe
> <[EMAIL PROTECTED]> wrote:
>
> > Wow, I really should read more closely before I respond - I see now,
> > Noble, that you were talking about DIH's ability to parse escaped '<'s
> > in attribute values, rather than about whether '<' was an acceptable
> > character in attribute values.
> >
> > I should repurpose my remarks to note to Shalin, though, that all
> > (conformant) XML parsers have to be able to handle escaped '<'s in
> > attribute values, since an XML document with a '<' in an attribute
> > value is not well-formed.
> >
> > Steve
> >
> > On 10/21/2008 at 1:10 PM, Steven A Rowe wrote:
> > > On 10/21/2008 at 12:14 AM, Noble Paul നോബിള് नोब्ळ् wrote:
> > > > On Tue, Oct 21, 2008 at 12:56 AM, Shalin Shekhar Mangar
> > > > <[EMAIL PROTECTED]> wrote:
> > > > > Your data-config looks fine except for one thing -- you do not need to
> > > > > escape '<' character in an XML attribute. It maybe throwing off the
> > > > > parsing code in DataImportHandler.
> > > >
> > > > not really '<' is fine in attribute
> > >
> > > Noble, I think you're wrong - AFAICT from the XML spec., '<' is *not*
> > > fine in an attribute value - from
> > > <http://www.w3.org/TR/REC-xml/#NT-AttValue>:
> > >
> > > [10] AttValue ::= '"' ([^<&"] | Reference)* '"'
> > >                 | "'" ([^<&'] | Reference)* "'"
> > >
> > > where an attribute <http://www.w3.org/TR/REC-xml/#dt-stag> is:
> > >
> > > [41] Attribute ::= Name Eq AttValue
> > >
> > > Steve
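
The well-formedness point argued in this thread can be checked with a short sketch. It assumes Python's standard xml.etree.ElementTree (backed by expat) as the parser; the element name, attribute names, and query text are hypothetical and are not taken from any real data-config. It illustrates the AttValue rule only and says nothing about how DataImportHandler itself parses its configuration.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text: str) -> bool:
    """Return True if the parser accepts xml_text, False on a parse error."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError:
        return False
    return True


# Hypothetical attribute values, made up for illustration.
# A literal '<' inside an attribute value violates the AttValue production.
raw = '<entity name="item" query="select * from item where id < 10"/>'
# The same value with '<' written as the predefined entity reference.
escaped = '<entity name="item" query="select * from item where id &lt; 10"/>'

print(is_well_formed(raw))      # False
print(is_well_formed(escaped))  # True

# The parser resolves the reference, so the application sees a plain '<'.
print(ET.fromstring(escaped).get("query"))  # select * from item where id < 10
```

The escaped form reaches the application with the reference already resolved to '<', so escaping costs nothing in the attribute value that the consuming code sees.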