On Sat, Dec 13, 2008 at 1:45 PM, Ryan McKinley <ryan...@gmail.com> wrote:
> Is there any standard way to escape invalid xml control characters?

Not that I know of... it's a shame that XML can't carry the full Unicode range.
Good reason to get a binary or JSON indexing interface at some point...
I think Noble was working on one.

> If so, we should add that to XML.escapeCharData() -- this gets called from
>  ClientUtils.writeXML()
>
> It looks like the XML class already has something set for 22, so I'm not
> sure what could be happening.
>
> I have also tried:
>
>    StringBuilder body = new StringBuilder( content.toString() );
>    for( int i=0; i<body.length(); i++ ) {
>      int c = body.charAt( i );
>      // 9 = TAB, 10 = New Line, 13 = CR
>      if( c < ' ' && c != 9 && c != 10 && c != 13 ) {
>        log.warn( "Contains invalid character: '"+c+"' " );
>        // replace control character with space
>        body.setCharAt( i, ' ' );
>      }
>    }
>    doc.setField( "body", body.toString() );
>
> but that still gives the same error.

Hmmm, do you know what chars it's choking on now?
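
One thing that might be worth checking (a guess, not something confirmed in
this thread): the loop above only catches chars below 0x20, but the XML 1.0
Char production also excludes 0xFFFE, 0xFFFF and unpaired surrogates. A
minimal sketch of a filter covering the whole production follows. The class
and method names (XmlCharFilter, isValidXml10, replaceInvalid) are
hypothetical and not part of Solr's XML class.

```java
public final class XmlCharFilter {
    private XmlCharFilter() {}

    /**
     * True if the code point is allowed by the XML 1.0 Char production:
     * #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
     */
    public static boolean isValidXml10(int cp) {
        return cp == 0x9 || cp == 0xA || cp == 0xD
            || (cp >= 0x20 && cp <= 0xD7FF)
            || (cp >= 0xE000 && cp <= 0xFFFD)
            || (cp >= 0x10000 && cp <= 0x10FFFF);
    }

    /** Replaces every code point outside the XML 1.0 Char range with a space. */
    public static String replaceInvalid(String in) {
        StringBuilder out = new StringBuilder(in.length());
        for (int i = 0; i < in.length(); ) {
            // codePointAt returns the lone surrogate value for an unpaired
            // surrogate, which isValidXml10 then rejects.
            int cp = in.codePointAt(i);
            i += Character.charCount(cp);
            if (isValidXml10(cp)) {
                out.appendCodePoint(cp);
            } else {
                out.append(' ');
            }
        }
        return out.toString();
    }
}
```

With something like this, the field could be set via
doc.setField( "body", XmlCharFilter.replaceInvalid( content.toString() ) ),
and logging the rejected code points would show which chars are the problem.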

-Yonik
