Hi Erick,

No, the PDF file is a test file that contains only one sentence.
I've managed to get it to work by removing startup="lazy" from the
ExtractingRequestHandler and adding the following lines:

<str name="uprefix">ignored_</str>
<str name="captureAttr">true</str>
<str name="fmap.a">links</str>
<str name="fmap.div">ignored_</str>

Does the presence of startup="lazy" affect how the
ExtractingRequestHandler works, or was it one of the str name values
that made the difference?

Regards,
Edwin

On 18 March 2015 at 23:19, Erick Erickson <erickerick...@gmail.com> wrote:

> Shot in the dark, but is the PDF file significantly larger than the
> others? Perhaps you're simply exceeding the packet limits for the
> servlet container?
>
> Best,
> Erick
>
> On Wed, Mar 18, 2015 at 12:22 AM, Zheng Lin Edwin Yeo
> <edwinye...@gmail.com> wrote:
> > Hi everyone,
> >
> > I'm having some issues with indexing rich-text documents in Solr
> > Cloud. When I try to index a PDF or Word document, I get the following
> > error:
> >
> > org.apache.solr.common.SolrException: Bad Request
> >
> > request:
> > http://192.168.2.2:8984/solr/logmill/update?update.distrib=TOLEADER&distrib.from=http%3A%2F%2F192.168.2.2%3A8983%2Fsolr%2Flogmill%2F&wt=javabin&version=2
> >     at org.apache.solr.client.solrj.impl.ConcurrentUpdateSolrClient$Runner.run(ConcurrentUpdateSolrClient.java:241)
> >     at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
> >     at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
> >     at java.lang.Thread.run(Unknown Source)
> >
> > I'm able to index .xml and .csv files in Solr Cloud with the same
> > configuration.
> >
> > I have set up Solr Cloud using the default ZooKeeper in Solr 5.0.0, and
> > I have 2 shards with the following details:
> > Shard1: 192.168.2.2:8983
> > Shard2: 192.168.2.2:8984
> >
> > Prior to this, I was already able to index rich-text documents without
> > Solr Cloud, and I'm using the same solrconfig.xml and schema.xml,
> > so my ExtractingRequestHandler is already defined.
> >
> > Are there other settings required in order to index rich-text documents
> > in Solr Cloud?
> >
> > Regards,
> > Edwin
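
For reference, the handler change described at the top of this message
might look roughly like the fragment below in solrconfig.xml. This is a
sketch only, not the configuration actually used in this thread: the
handler name "/update/extract", the class name, and the enclosing
"defaults" list are assumed from the stock Solr example configuration,
and only the four str entries and the absence of startup="lazy" come
from the message itself.

```xml
<!-- Assumed handler name and class (from the stock Solr example config);
     note that startup="lazy" is intentionally not set here. -->
<requestHandler name="/update/extract"
                class="solr.extraction.ExtractingRequestHandler">
  <!-- Assumed placement: request parameter defaults for this handler -->
  <lst name="defaults">
    <!-- Prefix applied to extracted fields that the schema does not define -->
    <str name="uprefix">ignored_</str>
    <!-- Index attributes of the extracted XHTML elements into their own fields -->
    <str name="captureAttr">true</str>
    <!-- Map content of <a> elements to the "links" field -->
    <str name="fmap.a">links</str>
    <!-- Map content of <div> elements to a field with the ignored_ prefix -->
    <str name="fmap.div">ignored_</str>
  </lst>
</requestHandler>
```

A setup like this presumably also relies on schema.xml defining a
"links" field and a dynamic field matching ignored_*, so that the
remapped and prefixed fields have somewhere to go. That part is an
assumption as well, since the schema is not shown in this thread.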