I believe the following Unicode characters are the only ones allowed in XML 1.0 documents: U+0009, U+000A, U+000D, [U+0020-U+D7FF], [U+E000-U+FFFD], and [U+10000-U+10FFFF]

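One of your documents contains a control character that XML does not allow. The parser reports it as code 22, which appears to be decimal, i.e. U+0016. (U+0022 would be the ordinary double-quote character, which is legal.)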

http://www.unicode.org/unicode/reports/tr20/#White

If your data is all text, you can probably safely remove the disallowed control characters.
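
Here is a minimal sketch of that stripping step. It is in Python rather than Perl purely for illustration, and the names strip_illegal_xml_chars and _ILLEGAL_XML_CHARS are made up for this example. It assumes the text has already been decoded into a Unicode string, and it simply deletes anything outside the ranges listed above:

```python
import re

# Matches any character NOT in the XML 1.0 legal set:
# U+0009, U+000A, U+000D, U+0020-U+D7FF, U+E000-U+FFFD, U+10000-U+10FFFF
_ILLEGAL_XML_CHARS = re.compile(
    "[^\u0009\u000A\u000D\u0020-\uD7FF\uE000-\uFFFD\U00010000-\U0010FFFF]"
)


def strip_illegal_xml_chars(text: str) -> str:
    """Return text with every character not allowed in XML 1.0 removed."""
    return _ILLEGAL_XML_CHARS.sub("", text)


if __name__ == "__main__":
    dirty = "field value\x16 with a stray control character"
    print(repr(strip_illegal_xml_chars(dirty)))
```

Run this over the field values before they are written into the XML file. Replacing the illegal characters with a space instead of deleting them may be preferable if they were acting as separators in the original data.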


-Bryan




On Dec 23, 2008, at 5:50 AM, rohit arora wrote:



Hi,

When I run the post command to build my index from my (database / XML) file, it gives me
an error like this:

com.ctc.wstx.exc.WstxUnexpectedCharException: Illegal character ((CTRL-CHAR, code 22))
 at [row,col {unknown-source}]: [1676,86]

I found a built-in function in Perl to convert all my character data to UTF-8, but I find that there are still many Unicode characters that are not legal XML characters.

Can anyone help me find the list of all legal XML characters so that
I can strip every character except those?


with regards
 Rohit Arora



