On Sat, Dec 13, 2008 at 1:45 PM, Ryan McKinley <ryan...@gmail.com> wrote:
> Is there any standard way to escape invalid xml control characters?

Not that I know of... it's a shame that XML can't carry the full Unicode range. Good reason to get a binary or JSON indexing interface at some point... I think Noble was working on one.

> If so, we should add that to XML.escapeCharData() -- this gets called from
> ClientUtils.writeXML()
>
> It looks like the XML class already has something set for 22, so I'm not
> sure what could be happening.
>
> I have also tried:
>
>   StringBuilder body = new StringBuilder( content.toString() );
>   for( int i=0; i<body.length(); i++ ) {
>     int c = body.charAt( i );
>     // 9 = TAB, 10 = New Line, 13 = CR
>     if( c < ' ' && c != 9 && c != 10 && c != 13 ) {
>       log.warn( "Contains invalid character: '"+c+"' " );
>       // replace control character with space
>       body.setCharAt( i, ' ' );
>     }
>   }
>   doc.setField( "body", body.toString() );
>
> but that still gives the same error.

Hmmm, do you know what chars it's choking on now?

-Yonik
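
One thing that might be worth checking: the loop quoted above only catches characters below 0x20, while the XML 1.0 Char production also excludes U+FFFE, U+FFFF and unpaired surrogates. Below is a minimal sketch of a filter that covers the whole production. The class and method names (XmlCharFilter, isValidXmlChar, replaceInvalidXmlChars) are hypothetical, not part of Solr's XML class, and this is only an illustration of the idea, not necessarily the fix for the error in this thread.

```java
class XmlCharFilter {
    // Hypothetical helper: true if the code point matches the XML 1.0
    // Char production (#x9 | #xA | #xD | [#x20-#xD7FF] |
    // [#xE000-#xFFFD] | [#x10000-#x10FFFF]).
    static boolean isValidXmlChar(int cp) {
        return cp == 0x9 || cp == 0xA || cp == 0xD
            || (cp >= 0x20 && cp <= 0xD7FF)
            || (cp >= 0xE000 && cp <= 0xFFFD)
            || (cp >= 0x10000 && cp <= 0x10FFFF);
    }

    // Hypothetical helper: replaces every code point that XML 1.0
    // cannot carry with a space. Iterates by code point so that valid
    // surrogate pairs are kept and unpaired surrogates are replaced.
    static String replaceInvalidXmlChars(String in) {
        StringBuilder out = new StringBuilder(in.length());
        for (int i = 0; i < in.length(); ) {
            int cp = in.codePointAt(i);
            i += Character.charCount(cp);
            out.appendCodePoint(isValidXmlChar(cp) ? cp : ' ');
        }
        return out.toString();
    }
}
```

With something like this, the body could be set as doc.setField( "body", XmlCharFilter.replaceInvalidXmlChars( content.toString() ) ), assuming that replacing the offending characters with spaces is acceptable for the field.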