Re: Invalid UTF-8 character 0xfffe during shard update

Raymond Wiker Mon, 05 Aug 2013 23:46:48 -0700

Ok, let me rephrase that slightly: does your database extraction include
BLOBs or CLOBs that are actually complete documents, that might be UTF-8
encoded text?


From the stack trace in your second post, it seems that the error occurs
while parsing an XML file uploaded via the UpdateRequestHandler. I'm
guessing (please note) that Solr is using an XML representation of your
documents (records) for communication between replicas; it could be that
the code that constructs the XML request does not check for the BOM
"character".
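
As a rough illustration of the kind of check meant above, here is a minimal
sketch in Python. It is not Solr's code, and the function names are made up
for this example. It relies on the XML 1.0 "Char" production, which excludes
U+FFFE and U+FFFF; U+FFFE is the byte-swapped form of the BOM (U+FEFF), while
U+FEFF itself is a legal XML character. Scanning field values before they are
serialized shows where such a character sits in the content:

```python
import re

# Anything outside the XML 1.0 "Char" production:
#   #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
_INVALID_XML_CHAR = re.compile(
    r"[^\u0009\u000A\u000D\u0020-\uD7FF\uE000-\uFFFD\U00010000-\U0010FFFF]"
)


def find_invalid_xml_chars(text):
    """Return (offset, code point) pairs for characters XML 1.0 forbids.

    Hypothetical helper, named for this example only.
    """
    return [
        (match.start(), "U+%04X" % ord(match.group()))
        for match in _INVALID_XML_CHAR.finditer(text)
    ]


def strip_invalid_xml_chars(text):
    """Drop the forbidden characters; replacing them is an equally valid choice."""
    return _INVALID_XML_CHAR.sub("", text)


if __name__ == "__main__":
    sample = "some content\ufffemore content"
    print(find_invalid_xml_chars(sample))
    print(strip_invalid_xml_chars(sample))
```

Running something like this over the extracted field values before import
would at least confirm which records carry the character, independent of
which node receives the update.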


On Mon, Aug 5, 2013 at 11:10 PM, Federico Chiacchiaretta <
federico.c...@gmail.com> wrote:

> No, the content has no XML tags included (hope I understood what you were
> asking here).
>
> Federico
>
>
> 2013/8/5 Raymond Wiker <rwi...@gmail.com>
>
> > On Aug 5, 2013, at 20:12 , Federico Chiacchiaretta <
> > federico.c...@gmail.com> wrote:
> > > Hi Raymond,
> > > I agree with you, 0xfffe is a special character, that is why I was
> > > asking how it's handled in solr.
> > > In my document, 0xfffe does not appear at the beginning, it's in the
> > > content.
> > >
> > > Just an update about testing I'm doing: in a SolrCloud two shards
> > > environment, if I launch dataimport on one node of the shard that will
> > > be target for that doc, all the docs got written properly; if I launch
> > > dataimport on one node of the other shard and then it forwards to the
> > > target, I get the error.
> >
> > Does your content include entire XML documents? It could be that the
> > process of packaging the content could create a structure that includes
> > an entire document (with BOM) somewhere inside a compound document (just
> > guessing here.)
>
