Hello again,

I have tried using the XMLFormatter class:
http://xerces.apache.org/xerces-c/apiDocs-2/classXMLFormatter.html#_details
with the flags NoEscapes and UnRep_Replace, and the problematic character was replaced by ^Z.

But that still does not solve the problem: the Oracle DB XML parser cannot parse this XML. I get this error:

ORA-31011: XML parsing failed
ORA-19202: Error occurred in XML processing
LPX-00216: invalid character 26 (0x1A)
Error at line 22

I would like to replace the unrepresentable character with a character of my own choosing that the parser accepts (for example "?" or "_"). How can I change the replacement character that is used by default?

Thanks to anybody for any ideas.
Have a nice day,
Jan

> ------------ Original message ------------
> From: Jan Suchý <[email protected]>
> Subject: RE: xerces/ICU unicode alias for weak encoding when
> serializing/converting to CP
> Date: 16.12.2008 09:35:40
> ----------------------------------------
> Hello Jesse,
> thank you for your answer :-) It seems promising. I'll look at it.
> Jan
>
> > ------------ Original message ------------
> > From: Jesse Pelton <[email protected]>
> > Subject: RE: xerces/ICU unicode alias for weak encoding when
> > serializing/converting to CP
> > Date: 15.12.2008 18:15:49
> > ----------------------------------------
> > The constructors for the Xerces XMLFormatter object all take an UnRepFlags
> > argument that allows you to specify how to handle unrepresentable
> > characters. So does XMLFormatter::formatBuf(). It appears that the
> > transcoder gets to decide what character to replace unrepresentable
> > characters with.
> >
> > Hope that helps.
> >
> > -----Original Message-----
> > From: Jan Suchý [mailto:[email protected]]
> > Sent: Monday, December 15, 2008 4:25 AM
> > To: [email protected]
> > Subject: xerces/ICU unicode alias for weak encoding when
> > serializing/converting to CP
> >
> > Hello all,
> > I need to obtain output XML in iso-8859-2 encoding.
> > I am using UTF-8 as input encoding.
> > There is some character, in UTF-8 xml, which is not representable in
> > iso-8859-2.
> > I am using ICU 3.8, xerces 2.8 and Xqilla svn 702.
> >
> > After serializing XML to iso-8859-2 the problematic character is
> > serialized by ICU/xerces/xq to:
> >
> > –
> >
> > The problem is that if I send a message in iso-8859-2 with the character
> > – inside to Oracle DB, the Oracle parser does not like this character
> > and this error is obtained:
> >
> > ORA-31011: XML parsing failed, LPX-00217: invalid character 8211 (U+2013)
> >
> > So, what I am looking for is some way to tell ICU or Xerces or XQ that
> > the Unicode character must not be included in the result and must
> > instead be replaced, for example by the character "?", so that the
> > Oracle parser never has to process it.
> >
> > I would like to find a clean solution, like telling ICU not to call the
> > callback function, or defining my own alias or behavior for this
> > situation. Is it possible?
> > Any ideas?
> > Thank you
> > Jan Suchy
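
On the open question in the latest message (getting "?" or "_" instead of the ^Z byte): one possible workaround, independent of Xerces and ICU, is to post-process the serialized buffer before it goes to Oracle. The sketch below is only an illustration under stated assumptions. `replaceSubstitutionByte` is a hypothetical helper name, not a Xerces or ICU function, and it assumes the output is in a single-byte encoding such as iso-8859-2. Since 0x1A is not a legal XML 1.0 character, a well-formed document should never contain that byte for any other reason, so a blanket replacement should not touch real content.

```cpp
#include <algorithm>
#include <cassert>
#include <string>

// Hypothetical helper, not part of Xerces-C++ or ICU.
// Replaces every SUB byte (0x1A) in an already-serialized buffer with
// `replacement`. Assumes a single-byte encoding such as ISO-8859-2, where
// the byte 0x1A can only ever mean the SUB control character.
std::string replaceSubstitutionByte(std::string bytes, char replacement = '?') {
    const char kSub = '\x1A';
    // Replacing SUB with SUB would leave the document unparseable.
    assert(replacement != kSub);
    std::replace(bytes.begin(), bytes.end(), kSub, replacement);
    return bytes;
}
```

Calling `replaceSubstitutionByte(serialized, '_')` on the output buffer would turn each ^Z into `_`. For a converter you open yourself, ICU also has `ucnv_setSubstChars()`, but whether the converter that Xerces creates internally can be reached that way is not something this thread establishes.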
