Looking at this a second time, maybe we have an XY problem. Why was that symbol in there in the first place?
Was it a field separator, used instead of multiple fields? Was it a character in an encoding other than UTF-8? My guess is that the character will not make sense to Solr during either indexing or search, so what's the reason for trying to get it in?

Regards,
   Alex.
Personal blog: http://blog.outerthoughts.com/
LinkedIn: http://www.linkedin.com/in/alexandrerafalovitch
- Time is the quality of nature that keeps events from happening all at once. Lately, it doesn't seem to be working. (Anonymous - via GTD book)

On Wed, Jan 16, 2013 at 9:18 AM, Yonik Seeley <yo...@lucidworks.com> wrote:

> On Tue, Jan 15, 2013 at 3:55 PM, Alexandre Rafalovitch
> <arafa...@gmail.com> wrote:
> > Basically, the
> > recommendation is to avoid CDATA and automatically encode characters such
> > as yours, as well as less/more and ampersand.
>
> Unfortunately that doesn't even work. Just as a raw control character
> like a 0 byte is invalid XML, so is an encoded 0 byte like &#0;
> XML on its own is simply incapable of representing all unicode code
> points (without some further encoding on top like base64 or whatever).
>
> You could always use JSON...
>
> -Yonik
> http://lucidworks.com
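
For anyone who wants to check the point in the quoted message, here is a minimal sketch using only Python's standard library. It assumes the character in question is a NUL (code point 0), as in Yonik's example; the element name `field` and the helper `xml_accepts` are made up for illustration. It says nothing about what Solr itself would do with such a character arriving over JSON, only that the XML parser rejects it while the JSON encoder can round-trip it.

```python
import json
import xml.etree.ElementTree as ET


def xml_accepts(doc) -> bool:
    """Return True if the standard-library XML parser accepts the document."""
    try:
        ET.fromstring(doc)
        return True
    except ET.ParseError:
        return False


# A character reference to code point 0 is not well-formed XML 1.0.
nul_reference_ok = xml_accepts("<field>a&#0;b</field>")

# A raw NUL byte in the document is rejected as well.
raw_nul_ok = xml_accepts(b"<field>a\x00b</field>")

# Ordinary escaped characters such as '<' and '&' are fine.
escaped_ok = xml_accepts("<field>a &lt; b &amp; c</field>")

# JSON, by contrast, can carry the same code point as a \u0000 escape.
encoded = json.dumps({"field": "a\x00b"})
decoded = json.loads(encoded)["field"]

print(nul_reference_ok, raw_nul_ok, escaped_ok)
print(encoded)
```

Escaping or CDATA does not help here because the restriction is on which code points XML 1.0 allows at all, not on how they are written.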