As Eric mentions, his change to have a state where indexing happens but
querying doesn't surely helps in this case.
But these are still boolean decisions of send vs don't send. In general, it
would be nice to abstract the routing policy so that it is pluggable. You
could then do stuff like have a
; no replica then extending shards.tolerant concept to use some
> timeout/acceptable-latency value sounds interesting.
>
> -Mohsin
>
> - Original Message -
> From: thelabd...@gmail.com
> To: solr-user@lucene.apache.org
> Sent: Friday, November 21, 2014 10:56:51 AM GMT -08:0
Message -
From: thelabd...@gmail.com
To: solr-user@lucene.apache.org
Sent: Friday, November 21, 2014 10:56:51 AM GMT -08:00 US/Canada Pacific
Subject: Dealing with bad apples in a SolrCloud cluster
Just soliciting some advice from the community ...
Let's say I have a 10-node SolrCloud cl
bq. We ran into one of the failure modes that only AWS can dream up
recently, where for an extended amount of time, two nodes in the same
placement group couldn't talk to one another, but they could both see
Zookeeper, so nothing was marked as down.
I had something similar happen with one of my SolrCl
"Last Gasp" is the last message that Sun Storage controllers would send to each
other when things went sideways...
For what it's worth.
> Date: Fri, 21 Nov 2014 14:07:12 -0500
> From: michael.della.bi...@appinions.com
> To: solr-user@lucene.apache.org
> Subject: Re: D
Good discussion topic.
I'm wondering if Solr doesn't need some sort of "shoot the other node in
the head" functionality.
We ran into one of the failure modes that only AWS can dream up recently,
where for an extended amount of time, two nodes in the same placement
group couldn't talk to one anot
bq. esp. since we've set max threads so high to avoid distributed
dead-lock.
We should fix this for 5.0 by adding a second thread pool that is used for
internal requests. We can make it optional if necessary (simpler default
container support), but it's a fairly easy improvement, I think.
- Mark
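
For readers following along, here is a minimal sketch of the two-pool idea
described above. It is not Solr code: the pool sizes, the shard names, and the
functions search_shard, handle_client_query and run are all hypothetical, and
the "shards" are simulated in-process. The point is only that top-level
requests, which block while waiting on per-shard sub-requests, run on one pool,
while the sub-requests run on another, so blocked top-level threads can never
use up the threads the sub-requests need.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical sizes for illustration; these are not Solr settings.
EXTERNAL_THREADS = 2
INTERNAL_THREADS = 2

# Separate pools: client-facing requests vs. internal (per-shard) requests.
external_pool = ThreadPoolExecutor(max_workers=EXTERNAL_THREADS)
internal_pool = ThreadPoolExecutor(max_workers=INTERNAL_THREADS)


def search_shard(shard, query):
    """Stand-in for the internal, per-shard part of a distributed query."""
    return f"{shard}:{query}"


def handle_client_query(query, shards=("shard1", "shard2")):
    """Top-level request: fan out to every shard, then block until all reply."""
    futures = [internal_pool.submit(search_shard, s, query) for s in shards]
    # This thread blocks here, but it belongs to external_pool, so it does
    # not occupy any of the threads that the sub-requests are waiting for.
    return [f.result(timeout=5) for f in futures]


def run(queries):
    """Submit several client queries at once and collect their results."""
    futures = [external_pool.submit(handle_client_query, q) for q in queries]
    return [f.result(timeout=10) for f in futures]


if __name__ == "__main__":
    print(run(["a", "b", "c"]))
```

If both kinds of request shared a single bounded pool, enough concurrent
top-level requests could hold every thread while waiting on sub-requests that
have no thread left to run on, which is the distributed deadlock that a very
high max-threads setting works around.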
On
Just soliciting some advice from the community ...
Let's say I have a 10-node SolrCloud cluster and have a single collection
with 2 shards with replication factor 10, so basically each shard has one
replica on each of my nodes.
Now imagine one of those nodes starts getting into a bad state and st