Re: Dealing with bad apples in a SolrCloud cluster

2014-11-26 Thread Ramkumar R. Aiyengar
As Erick mentions, his change to have a state where indexing happens but querying doesn't surely helps in this case. But these are still boolean decisions of send vs don't send. In general, it would be nice to abstract the routing policy so that it is pluggable. You could then do stuff like have a

Re: Dealing with bad apples in a SolrCloud cluster

2014-11-21 Thread Erick Erickson
> no replica then extending shards.tolerant concept to use some > timeout/acceptable-latency value sounds interesting. > > -Mohsin > > - Original Message - > From: thelabd...@gmail.com > To: solr-user@lucene.apache.org > Sent: Friday, November 21, 2014 10:56:51 AM GMT -08:0

Re: Dealing with bad apples in a SolrCloud cluster

2014-11-21 Thread Mohsin Beg Beg
Message - From: thelabd...@gmail.com To: solr-user@lucene.apache.org Sent: Friday, November 21, 2014 10:56:51 AM GMT -08:00 US/Canada Pacific Subject: Dealing with bad apples in a SolrCloud cluster Just soliciting some advice from the community ... Let's say I have a 10-node SolrCloud cl

Re: Dealing with bad apples in a SolrCloud cluster

2014-11-21 Thread ralph tice
bq. We ran into one of the failure modes that only AWS can dream up recently, where for an extended amount of time, two nodes in the same placement group couldn't talk to one another, but they could both see Zookeeper, so nothing was marked as down. I had something similar happen with one of my SolrCl

RE: Dealing with bad apples in a SolrCloud cluster

2014-11-21 Thread steve
"Last Gasp" is the last message that Sun Storage controllers would send to each other when things went sideways... For what it's worth. > Date: Fri, 21 Nov 2014 14:07:12 -0500 > From: michael.della.bi...@appinions.com > To: solr-user@lucene.apache.org > Subject: Re: D

Re: Dealing with bad apples in a SolrCloud cluster

2014-11-21 Thread Michael Della Bitta
Good discussion topic. I'm wondering if Solr doesn't need some sort of "shoot the other node in the head" functionality. We ran into one of the failure modes that only AWS can dream up recently, where for an extended amount of time, two nodes in the same placement group couldn't talk to one anot

Re: Dealing with bad apples in a SolrCloud cluster

2014-11-21 Thread Mark Miller
bq. esp. since we've set max threads so high to avoid distributed dead-lock. We should fix this for 5.0 - add a second thread pool that is used for internal requests. We can make it optional if necessary (simpler default container support), but it's a fairly easy improvement I think. - Mark On

Dealing with bad apples in a SolrCloud cluster

2014-11-21 Thread Timothy Potter
Just soliciting some advice from the community ... Let's say I have a 10-node SolrCloud cluster and have a single collection with 2 shards with replication factor 10, so basically each shard has one replica on each of my nodes. Now imagine one of those nodes starts getting into a bad state and st