Re: Unable to index rich-text documents in Solr Cloud

Shawn Heisey Wed, 18 Mar 2015 22:20:13 -0700

On 3/18/2015 1:22 AM, Zheng Lin Edwin Yeo wrote:
> I'm having some issues with indexing rich-text documents from the Solr
> Cloud. When I tried to index a pdf or word document, I get the following
> error:
> 
> 
> org.apache.solr.common.SolrException: Bad Request
> 
> 
> 
> request: 
> http://192.168.2.2:8984/solr/logmill/update?update.distrib=TOLEADER&distrib.from=http%3A%2F%2F192.168.2.2%3A8983%2Fsolr%2Flogmill%2F&wt=javabin&version=2


This request appears to be one of the requests that SolrCloud makes
between its different nodes, but it is using the /update handler.  I
assume that when you sent the request, you sent it to the
/update/extract handler because it's a rich-text document?  The /update
handler can't process rich-text documents; it only accepts documents in
JSON, XML, CSV, javabin, etc. that are formatted in specific ways.
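
To make the distinction between the two handlers concrete, here is a rough
Python sketch (standard library only, nothing is sent over the network).
It pulls the handler and parameters out of the URL in the error message,
then builds the kind of URL a client would use for the extracting handler.
The helper names and the document id "doc1" are made up for illustration,
and using the node named in distrib.from as the client-facing node is an
assumption.  literal.id and commit are ordinary parameters for that
handler, but check which ones your setup needs.

```python
from urllib.parse import urlsplit, parse_qs, urlencode

# URL copied from the error message in the quoted post.
LOGGED_REQUEST = (
    "http://192.168.2.2:8984/solr/logmill/update"
    "?update.distrib=TOLEADER"
    "&distrib.from=http%3A%2F%2F192.168.2.2%3A8983%2Fsolr%2Flogmill%2F"
    "&wt=javabin&version=2"
)


def describe_request(url):
    """Return (handler, params) for a URL shaped like /solr/<collection>/<handler>."""
    parts = urlsplit(url)
    segments = parts.path.strip("/").split("/")
    # Everything after "solr/<collection>" is the request handler path.
    handler = "/" + "/".join(segments[2:])
    # parse_qs percent-decodes values; keep the first value of each parameter.
    params = {key: values[0] for key, values in parse_qs(parts.query).items()}
    return handler, params


def build_extract_url(collection_url, doc_id, commit=True):
    """Build an /update/extract URL (hypothetical helper, illustration only)."""
    query = urlencode({
        "literal.id": doc_id,  # unique key value for the extracted document
        "commit": "true" if commit else "false",
    })
    return collection_url.rstrip("/") + "/update/extract?" + query


handler, params = describe_request(LOGGED_REQUEST)
print(handler)                    # handler used by the inter-node request
print(params["update.distrib"])   # marks a forward to the shard leader
print(params["distrib.from"])     # node that forwarded the request

# Assumption: the client talks to the node named in distrib.from.
print(build_extract_url(params["distrib.from"], "doc1"))
```

The parsed handler is /update, while a client indexing a PDF or Word file
would normally be posting the file itself to /update/extract.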

One thing I'm wondering is whether the Extracting handler requires a
shards.qt parameter, also set to /update/extract, to work right with
SolrCloud.  I have never used that handler myself, so I've got no idea
what is required to make it work right.

Thanks,
Shawn
