Shot in the dark, but is the PDF file significantly larger than the others? Perhaps you're simply exceeding the packet limits for the servlet container?
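
A quick way to test that guess is to compare the sizes of the files that fail against the ones that index cleanly. The sketch below is only an illustration: the helper name `oversized` and the 2 MB threshold are placeholders I made up, not limits documented for your setup, so substitute whatever your servlet container and solrconfig.xml actually enforce.

```python
import os

# Hypothetical threshold for illustration only; replace it with the limit
# your servlet container / solrconfig.xml actually enforces.
ASSUMED_LIMIT_BYTES = 2 * 1024 * 1024


def oversized(paths, limit=ASSUMED_LIMIT_BYTES):
    """Return (path, size_in_bytes) pairs for files larger than `limit`."""
    flagged = []
    for path in paths:
        size = os.path.getsize(path)
        if size > limit:
            flagged.append((path, size))
    return flagged


if __name__ == "__main__":
    import sys

    # Usage: python check_sizes.py failing.pdf working.xml working.csv
    for path, size in oversized(sys.argv[1:]):
        print(f"{path}: {size} bytes exceeds the assumed limit")
```

If only the failing documents show up in the output, a size limit becomes a more plausible culprit; if nothing is flagged, the problem is probably elsewhere.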

Best,
Erick

On Wed, Mar 18, 2015 at 12:22 AM, Zheng Lin Edwin Yeo <edwinye...@gmail.com> wrote:
> Hi everyone,
>
> I'm having some issues with indexing rich-text documents from the Solr
> Cloud. When I tried to index a pdf or word document, I get the following
> error:
>
> org.apache.solr.common.SolrException: Bad Request
>
> request:
> http://192.168.2.2:8984/solr/logmill/update?update.distrib=TOLEADER&distrib.from=http%3A%2F%2F192.168.2.2%3A8983%2Fsolr%2Flogmill%2F&wt=javabin&version=2
>     at org.apache.solr.client.solrj.impl.ConcurrentUpdateSolrClient$Runner.run(ConcurrentUpdateSolrClient.java:241)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
>     at java.lang.Thread.run(Unknown Source)
>
> I'm able to index .xml and .csv files in Solr Cloud with the same
> configuration.
>
> I have setup Solr Cloud using the default zookeeper in Solr 5.0.0, and
> I have 2 shards with the following details:
> Shard1: 192.168.2.2:8983
> Shard2: 192.168.2.2:8984
>
> Prior to this, I'm already able to index rich-text documents without
> the Solr Cloud, and I'm using the same solrconfig.xml and schema.xml,
> so my ExtractRequestHandler is already defined.
>
> Is there other settings required in order to index rich-text documents
> in Solr Cloud?
>
> Regards,
> Edwin