As far as I know, XML 1.0 documents may contain only the following
Unicode characters: U+0009, U+000A, U+000D, [U+0020-U+D7FF],
[U+E000-U+FFFD], and [U+10000-U+10FFFF].
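One of your documents contains a control character with code 22 at
row 1676, column 86. I believe Woodstox reports that code in decimal,
which would make it U+0016. Either way it is a control character
outside the ranges above, so it is not a legal XML character.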
http://www.unicode.org/unicode/reports/tr20/#White
If your data is all text, you can probably safely remove the
disallowed control characters.
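
Here is a minimal sketch of that stripping step in Python (you mentioned
Perl, and the same character class should carry over to a Perl regex
with little change). The function name strip_illegal_xml_chars is my own
invention, not part of any library, and I am assuming the text has
already been decoded from UTF-8 into a Unicode string.

```python
import re

# Matches any character NOT in the XML 1.0 legal set:
# U+0009, U+000A, U+000D, U+0020-U+D7FF, U+E000-U+FFFD, U+10000-U+10FFFF.
_ILLEGAL_XML_CHARS = re.compile(
    r"[^\u0009\u000A\u000D\u0020-\uD7FF\uE000-\uFFFD\U00010000-\U0010FFFF]"
)


def strip_illegal_xml_chars(text: str) -> str:
    """Return `text` with every character XML 1.0 disallows removed.

    Hypothetical helper for illustration; it only drops characters and
    does not escape markup characters such as '&' or '<'.
    """
    return _ILLEGAL_XML_CHARS.sub("", text)


if __name__ == "__main__":
    # U+0016 is the control character with decimal code 22.
    sample = "legal text\u0016 with a stray control character"
    print(strip_illegal_xml_chars(sample))
```

Run it over each field value before you write the value into the XML
file. Tabs, newlines, and carriage returns are kept, since they are in
the legal set.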
-Bryan
On Dec 23, 2008, at 5:50 AM, rohit arora wrote:
Hi,
When I run the post command to build my index from my (database / XML)
file, it gives me
the following error:
com.ctc.wstx.exc.WstxUnexpectedCharException: Illegal character
((CTRL-CHAR, code 22))
at [row,col {unknown-source}]: [1676,86]
I found a built-in function in Perl to convert all my character data
to UTF-8,
but I see that there are many Unicode characters that are not legal XML
characters.
Can anyone help me find the list of all the legal XML characters,
so that
I can strip every character except those?
With regards,
Rohit Arora