If it actually happens with replicationFactor=1, it likely has nothing to do 
with the update handler issue I'm referring to. In some cases like this, people 
have better luck with Jetty than Tomcat - we test it much more. For instance, 
it's set up to help avoid search-side distributed deadlocks.

In any case, there is something special about your case - I have done and seen 
a lot of heavy indexing to SolrCloud, by me and others, without running into 
this, both with replicationFactor=1 and greater. So there is something specific 
in how the load is being generated, or in which features/methods are being 
used, that likely causes it or makes it easier to trigger.

But again, the issue I know about involves threads that are not even created in 
the replicationFactor=1 case, so as far as I know this could be a first report.

- Mark

On Jun 17, 2013, at 5:52 PM, Rishi Easwaran <rishi.easwa...@aol.com> wrote:

> Update!!
> 
> This happens with replicationFactor=1.
> Just for kicks I created a collection with 24 shards and replicationFactor=1 
> on my existing benchmark env.
> Same behaviour, SOLR cloud just hangs. Nothing in the logs; top, heap, CPU and 
> most other metrics look fine.
> The only indication seems to be netstat showing incoming requests not being 
> read in.
> 
> Yago,
> 
> I saw your previous post 
> (http://lucene.472066.n3.nabble.com/updating-docs-in-solr-cloud-hangs-td4067388.html#a4067631)
> Following it, last week I upgraded to SOLR 4.3 to see if the issue gets 
> fixed, but no luck.
> Looks like this is a common and easily reproducible issue on SOLR cloud.
> 
> 
> Thanks,
> 
> Rishi. 
> 
> -----Original Message-----
> From: Yago Riveiro <yago.rive...@gmail.com>
> To: solr-user <solr-user@lucene.apache.org>
> Sent: Mon, Jun 17, 2013 5:15 pm
> Subject: Re: Solr Cloud Hangs consistently .
> 
> 
> I can confirm that the deadlock happens with only 2 replicas per shard. I need 
> to shut down one node that hosts a replica of the shard to recover indexing 
> capability.
> 
> -- 
> Yago Riveiro
> Sent with Sparrow (http://www.sparrowmailapp.com/?sig)
> 
> 
> On Monday, June 17, 2013 at 6:44 PM, Rishi Easwaran wrote:
> 
>> 
>> 
>> Hi All,
>> 
>> I am trying to benchmark SOLR Cloud and it consistently hangs. 
>> Nothing in the logs, no stack trace, no errors, no warnings, just seems 
>> stuck.
>> 
>> A little bit about my set up. 
>> I have 3 benchmark hosts, each with 96GB RAM, 24 CPUs and 1TB SSD. Each 
>> host is configured to have 8 SOLR cloud nodes running at 4GB each.
>> JVM configs: http://apaste.info/57Ai
>> 
>> My cluster has 12 shards with replication factor 2 - http://apaste.info/09sA
>> 
>> I originally started with SOLR 4.2, tomcat 5 and jdk 6, as we are already 
>> running this configuration in production in Non-Cloud form. 
>> It got stuck repeatedly.
>> 
>> I decided to upgrade to the latest and greatest of everything, SOLR 4.3, 
>> JDK7 and tomcat7. 
>> It still shows the same behaviour and hangs during the test.
>> 
>> My test schema and config.
>> Schema.xml - http://apaste.info/imah
>> SolrConfig.xml - http://apaste.info/ku4F
>> 
>> The test is pretty simple. It's a jmeter test with update commands via SOAP 
>> rpc (round robin requests across every node), adding in 5 fields from a csv 
>> file - id, guid, subject, body, compositeID (guid!id).
>> Number of jmeter threads = 150, loop count = 20, number of messages to add 
>> per guid = 3; total 150*3*20 = 9000 documents. 
>> 
>> When the cloud gets stuck, I don't get anything in the logs, but when I run 
>> netstat I see the following.
>> Sample netstat on a stuck run: http://apaste.info/hr0O 
>> hycl-d20 is my jmeter host. ssd-d01/2/3 are my cloud hosts.
>> 
>> At the moment my benchmarking efforts are at a standstill.
>> 
>> Any help from the community would be great. I got some heap dumps and stack 
>> dumps, but haven't found a smoking gun yet.
>> If I can provide anything else to diagnose this issue, just let me know.
>> 
>> Thanks,
>> 
>> Rishi. 
> 
> 
> 
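For anyone trying to reproduce the load described in the original post, here is 
a minimal sketch of the test data only: 5 fields per document, with compositeID 
built as guid!id, and 150 threads * 20 loops * 3 messages per guid. It assumes 
one guid per thread per loop iteration; the guid, id, subject and body values 
and the function names are hypothetical placeholders, not the actual benchmark 
data or harness.

```python
import csv
import io

THREADS = 150        # jmeter threads, as stated in the post
LOOPS = 20           # loop count, as stated in the post
MSGS_PER_GUID = 3    # messages added per guid, as stated in the post

FIELDS = ["id", "guid", "subject", "body", "compositeID"]


def build_rows(threads=THREADS, loops=LOOPS, msgs_per_guid=MSGS_PER_GUID):
    """Build one row per document to be indexed.

    Assumption: each (thread, loop) pair uses its own guid. All values
    below are made-up placeholders.
    """
    rows = []
    for t in range(threads):
        for i in range(loops):
            guid = f"guid-{t}-{i}"          # hypothetical guid format
            for m in range(msgs_per_guid):
                doc_id = f"msg-{m}"         # hypothetical id format
                rows.append({
                    "id": doc_id,
                    "guid": guid,
                    "subject": f"subject {t}-{i}-{m}",
                    "body": f"body {t}-{i}-{m}",
                    # compositeID is guid!id, as described in the post
                    "compositeID": f"{guid}!{doc_id}",
                })
    return rows


def to_csv(rows):
    """Serialize rows to CSV text with a header line."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


rows = build_rows()
print(len(rows))  # total number of documents in one full run
```

The resulting CSV text from to_csv(rows) could be fed to whatever load tool is 
in use; the sketch does not send anything to Solr itself.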
