If it actually happens with replicationFactor=1, it likely has nothing to do with the update handler issue I'm referring to. In cases like these, people have had better luck with Jetty than with Tomcat; we test it much more. For instance, it's set up to help avoid search-side distributed deadlocks.
In any case, there is something special about it. I have seen, and still see, a lot of heavy indexing to SolrCloud, by me and others, without running into this, both with replicationFactor=1 and greater. So there is something specific in how the load is being done, or in what features/methods are being used, that likely causes it or makes it easier to trigger. But again, the issue I know about involves threads that are not even created in the replicationFactor=1 case, so that could be a first report afaik.

- Mark

On Jun 17, 2013, at 5:52 PM, Rishi Easwaran <rishi.easwa...@aol.com> wrote:

> Update!!
>
> This happens with replicationFactor=1.
> Just for kicks I created a collection with 24 shards, replicationFactor=1,
> on my existing benchmark env.
> Same behaviour, SOLR cloud just hangs. Nothing in the logs; top/heap/cpu,
> most metrics look fine.
> Only indication seems to be netstat showing incoming requests not being
> read in.
>
> Yago,
>
> I saw your previous post
> (http://lucene.472066.n3.nabble.com/updating-docs-in-solr-cloud-hangs-td4067388.html#a4067631)
> Following it, last week I upgraded to SOLR 4.3 to see if the issue gets
> fixed, but no luck.
> Looks like this is a dominant and easily reproducible issue on SOLR cloud.
>
> Thanks,
>
> Rishi.
>
> -----Original Message-----
> From: Yago Riveiro <yago.rive...@gmail.com>
> To: solr-user <solr-user@lucene.apache.org>
> Sent: Mon, Jun 17, 2013 5:15 pm
> Subject: Re: Solr Cloud Hangs consistently .
>
> I can confirm that the deadlock happens with only 2 replicas per shard.
> I need to shut down one node that hosts a replica of the shard to recover
> the indexing capability.
>
> --
> Yago Riveiro
> Sent with Sparrow (http://www.sparrowmailapp.com/?sig)
>
> On Monday, June 17, 2013 at 6:44 PM, Rishi Easwaran wrote:
>
>> Hi All,
>>
>> I am trying to benchmark SOLR Cloud and it consistently hangs.
>> Nothing in the logs, no stack trace, no errors, no warnings, just seems
>> stuck.
>>
>> A little bit about my setup.
>> I have 3 benchmark hosts, each with 96GB RAM, 24 CPUs and 1TB SSD. Each
>> host is configured to have 8 SOLR cloud nodes running at 4GB each.
>> JVM configs: http://apaste.info/57Ai
>>
>> My cluster has 12 shards with replication factor 2 - http://apaste.info/09sA
>>
>> I originally started with SOLR 4.2, tomcat 5 and jdk 6, as we are already
>> running this configuration in production in Non-Cloud form.
>> It got stuck repeatedly.
>>
>> I decided to upgrade to the latest and greatest of everything: SOLR 4.3,
>> JDK7 and tomcat7.
>> It still shows the same behaviour and hangs through the test.
>>
>> My test schema and config.
>> Schema.xml - http://apaste.info/imah
>> SolrConfig.xml - http://apaste.info/ku4F
>>
>> The test is pretty simple. It's a jmeter test with an update command via
>> SOAP rpc (round-robin requests across every node), adding in 5 fields
>> from a csv file - id, guid, subject, body, compositeID (guid!id).
>> Number of jmeter threads = 150, loop count = 20, num of messages to add
>> per guid = 3; total 150*3*20 = 9000 documents.
>>
>> When cloud gets stuck, I don't get anything in the logs, but when I run
>> netstat I see the following.
>> Sample netstat on a stuck run: http://apaste.info/hr0O
>> hycl-d20 is my jmeter host. ssd-d01/2/3 are my cloud hosts.
>>
>> At the moment my benchmarking efforts are at a standstill.
>>
>> Any help from the community would be great. I got some heap dumps and
>> stack dumps, but haven't found a smoking gun yet.
>> If I can provide anything else to diagnose this issue, just let me know.
>>
>> Thanks,
>>
>> Rishi.
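
For anyone trying to reproduce the load shape described in the original message, here is a minimal Python sketch of the document set only. It sends nothing to Solr. The field values are placeholders, the function name `build_docs` is hypothetical, and it assumes each thread generates one new guid per loop iteration with 3 documents under it, which is one reading of "num of messages to add per guid = 3".

```python
import uuid
from itertools import count

# Numbers taken from the original message.
THREADS = 150
LOOPS = 20
DOCS_PER_GUID = 3


def build_docs(threads=THREADS, loops=LOOPS, docs_per_guid=DOCS_PER_GUID):
    """Yield one dict per document with the five fields named in the thread.

    Assumption: one fresh guid per (thread, loop) pair, shared by
    docs_per_guid documents. Subject and body are placeholder strings.
    """
    ids = count(1)
    for _ in range(threads):
        for _ in range(loops):
            guid = uuid.uuid4().hex
            for _ in range(docs_per_guid):
                doc_id = str(next(ids))
                yield {
                    "id": doc_id,
                    "guid": guid,
                    "subject": f"subject {doc_id}",
                    "body": f"body {doc_id}",
                    # Composite ID in the guid!id form from the message.
                    "compositeID": f"{guid}!{doc_id}",
                }


docs = list(build_docs())
print(len(docs))  # 150 * 20 * 3 = 9000
```

Generating the documents separately from sending them makes it easier to replay exactly the same input against Jetty and Tomcat, or against different replicationFactor settings.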