Yonik Seeley wrote:
> On Sat, Mar 1, 2008 at 6:47 PM, Leonardo Santagada <[EMAIL PROTECTED]> wrote:
>> Can't he put this code on the server, before the XML parsing, somehow? I
>> would do it on the client, like you said, but just out of curiosity:
>> is this really impossible?
> We'd have to implement our own XML parser (or a subset of one) for that.
I am not sure this is such a good idea. After all, XML does not allow these characters, so if you write your own parser, it would not be a standards-compliant XML parser, and you would need to more or less re-invent the whole toolchain for your slightly-modified-but-not-quite-XML format. A better strategy, I think, would be to put the responsibility on the client to send correct XML if it claims to send XML. If necessary, a different escaping mechanism, like the \u<codepoint> notation used in many programming languages, could be applied at the XML transport layer.
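Something along these lines could live on the indexing client. This is only a sketch of the idea, not anything Solr ships; the class name and the \\u<codepoint> scheme are mine, and the receiving application would of course have to unescape the values again after retrieval:

public final class XmlSafeEscaper {

    // True for code points allowed by the XML 1.0 Char production.
    private static boolean isXmlChar(int c) {
        return c == 0x9 || c == 0xA || c == 0xD
            || (c >= 0x20 && c <= 0xD7FF)
            || (c >= 0xE000 && c <= 0xFFFD)
            || (c >= 0x10000 && c <= 0x10FFFF);
    }

    // Replaces each character XML cannot carry with a \\u escape,
    // applied before the value is placed into the XML document.
    public static String escape(String value) {
        StringBuilder sb = new StringBuilder(value.length());
        int i = 0;
        while (i < value.length()) {
            int cp = value.codePointAt(i);
            if (isXmlChar(cp)) {
                sb.appendCodePoint(cp);
            } else {
                sb.append(String.format("\\u%04X", cp));
            }
            i += Character.charCount(cp);
        }
        return sb.toString();
    }
}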

> A simple search+replace of &#xx; could do the wrong thing I think
> (might be an actual literal in a CDATA block for example).
This would also not get you past the XML parser: to the parser, &#6; looks exactly the same as the character expressed directly by its binary value.
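For illustration, feeding either form to a standard JAXP parser produces the same kind of fatal error (class name is mine, and the exact message depends on the parser implementation):

import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;

public class InvalidCharRefDemo {
    public static void main(String[] args) throws Exception {
        // A raw control character and its &#6; reference denote the same
        // code point, which the XML 1.0 Char production excludes, so a
        // conforming parser rejects both documents.
        String[] docs = { "<doc>\u0006</doc>", "<doc>&#6;</doc>" };
        for (String doc : docs) {
            try {
                DocumentBuilderFactory.newInstance().newDocumentBuilder()
                        .parse(new InputSource(new StringReader(doc)));
                System.out.println("parsed (unexpected)");
            } catch (SAXParseException e) {
                System.out.println("rejected: " + e.getMessage());
            }
        }
    }
}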
> The easiest place to fix it is before the field values are serialized
> into XML.
Indeed!
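
For the archives, a minimal sketch of what such client-side sanitizing could look like, applied to each field value just before serialization (stripInvalidXmlChars is a hypothetical helper, not part of any Solr API):

public static String stripInvalidXmlChars(String value) {
    StringBuilder sb = new StringBuilder(value.length());
    int i = 0;
    while (i < value.length()) {
        int cp = value.codePointAt(i);
        // Keep only code points matching the XML 1.0 Char production.
        boolean valid = cp == 0x9 || cp == 0xA || cp == 0xD
                || (cp >= 0x20 && cp <= 0xD7FF)
                || (cp >= 0xE000 && cp <= 0xFFFD)
                || (cp >= 0x10000 && cp <= 0x10FFFF);
        if (valid) {
            sb.appendCodePoint(cp);
        }
        i += Character.charCount(cp);
    }
    return sb.toString();
}

Dropping the characters loses information, of course; if they need to survive the round trip, an escaping scheme like the one above is the alternative.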

All the best,

Christian

--

Christian Wittern
Institute for Research in Humanities, Kyoto University
47 Higashiogura-cho, Kitashirakawa, Sakyo-ku, Kyoto 606-8265, JAPAN
