Looking at this a second time, maybe we have an XY problem. Why was
that symbol in there in the first place?

Was it a field separator instead of using multiple fields? Was it a
character in an encoding other than UTF-8?

My guess is that the character will not make sense to Solr during either
indexing or searching, so what's the reason for trying to get it in?

Regards,
   Alex.

Personal blog: http://blog.outerthoughts.com/
LinkedIn: http://www.linkedin.com/in/alexandrerafalovitch
- Time is the quality of nature that keeps events from happening all at
once. Lately, it doesn't seem to be working.  (Anonymous  - via GTD book)


On Wed, Jan 16, 2013 at 9:18 AM, Yonik Seeley <yo...@lucidworks.com> wrote:

> On Tue, Jan 15, 2013 at 3:55 PM, Alexandre Rafalovitch
> <arafa...@gmail.com> wrote:
> > Basically, the
> > recommendation is to avoid CDATA and automatically encode characters such
> > as yours, as well as less/more and ampersand.
>
> Unfortunately that doesn't even work.  Just as a raw control character
> like a 0 byte is invalid XML, so is an encoded 0 byte like &#0;
> XML on its own is simply incapable of representing all Unicode code
> points (without some further encoding on top like base64 or whatever).
>
> You could always use JSON...
>
> -Yonik
> http://lucidworks.com
>
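
Yonik's point quoted above can be checked with a quick sketch. This uses
Python's standard-library XML parser and JSON module, not Solr's own update
handlers, so treat it as an illustration of the XML 1.0 rule rather than of
Solr's behavior. The helper name xml_accepts and the "field" element are
made up for the example.

```python
import json
import xml.etree.ElementTree as ET


def xml_accepts(doc: str) -> bool:
    """Return True if the standard-library parser accepts doc as well-formed XML.

    Hypothetical helper for this illustration only.
    """
    try:
        ET.fromstring(doc)
        return True
    except ET.ParseError:
        return False


# A numeric character reference to U+0000 is rejected, just like a raw 0 byte.
nul_ref_ok = xml_accepts("<field>&#0;</field>")

# Escaping works fine for characters XML can represent, such as ampersand.
amp_ok = xml_accepts("<field>a &amp; b</field>")

# JSON can carry the same code point as a \u0000 escape and round-trip it.
encoded = json.dumps({"field": "a\u0000b"})
round_tripped = json.loads(encoded)["field"]

print(nul_ref_ok, amp_ok, encoded, repr(round_tripped))
```

Whether Solr can do anything useful with such a character once it arrives
is a separate question, which is why the original reason for sending it
still matters.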
