Re: Cascading failures with replicas

2017-03-18 Thread Walter Underwood
6.3.0. No idea how it is happening, but I got two replicas on the same host after one host went down. wunder Walter Underwood wun...@wunderwood.org http://observer.wunderwood.org/ (my blog) > On Mar 18, 2017, at 8:35 PM, Erick Erickson wrote: > > Hmmm, I'm totally mystified about how Solr is

Re: Cascading failures with replicas

2017-03-18 Thread Erick Erickson
Hmmm, I'm totally mystified about how Solr is "creating a new replica when one host is down". Are you saying this is happening automagically? You're right the autoAddReplica bit is HDFS so having replicas just show up is completely completely weird. In days past, when a replica was discovered on di

Re: Cascading failures with replicas

2017-03-18 Thread Walter Underwood
Thanks. This is a very CPU-heavy workload, with ngram fields and very long queries. 16.7 million docs. The whole cascading failure thing in search engines is hard. The first time I hit this was at Infoseek, over twenty years ago. > On Mar 18, 2017, at 12:46 PM, Erick Erickson wrote: > > bug#

Re: Cascading failures with replicas

2017-03-18 Thread Erick Erickson
bug# 2, Solr shouldn't be adding replicas by itself unless you specified autoAddReplicas=true when you created the collection. It default to "false". So I'm not sure what's going on here. bug #3. The internal load balancers are round-robin, so this is expected. Not optimal I'll grant but expected.

Cascading failures with replicas

2017-03-17 Thread Walter Underwood
I’m running a 4x4 cluster (4 shards, replication factor 4) on 16 hosts. I shut down Solr on one host because it got into some kind of bad, can’t-recover state where it was causing timeouts across the whole cluster (bug #1). I ran a load benchmark near the capacity of the cluster. This had run fi