bug #2. Solr shouldn't be adding replicas by itself unless you
specified autoAddReplicas=true when you created the collection. It
defaults to "false", so I'm not sure what's going on here.
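
If you want to rule that out, one option is to pass the parameter
explicitly when creating the collection. A minimal sketch that only
builds the Collections API CREATE URL (the base URL and the collection
name "mycollection" are placeholders I made up, not your setup):

```python
from urllib.parse import urlencode

def create_collection_url(base_url, name, num_shards, replication_factor):
    """Build a Collections API CREATE request with autoAddReplicas spelled out."""
    params = {
        "action": "CREATE",
        "name": name,                      # hypothetical collection name
        "numShards": num_shards,
        "replicationFactor": replication_factor,
        "autoAddReplicas": "false",        # explicit, rather than relying on the default
    }
    return f"{base_url}/admin/collections?{urlencode(params)}"

# Placeholder host and port; substitute one of your own Solr nodes.
url = create_collection_url("http://localhost:8983/solr", "mycollection", 4, 4)
print(url)
```

This does not send the request; it just shows where the parameter goes.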

bug #3. The internal load balancers are round-robin, so this is
expected. Not optimal, I'll grant, but expected.
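
To illustrate why a host carrying two cores sees double the traffic,
here is a toy model of round-robin over cores. It is not Solr's actual
load balancer code, and the host and core names are made up:

```python
from collections import Counter
from itertools import cycle

def hits_per_host(cores, n_requests):
    """Hand requests to cores in strict rotation and tally them per host."""
    counts = Counter()
    rotation = cycle(cores)
    for _ in range(n_requests):
        host, _core = next(rotation)
        counts[host] += 1
    return counts

# Hypothetical layout: host3 carries two cores, the other hosts one each.
cores = [
    ("host1", "core_a"),
    ("host2", "core_b"),
    ("host3", "core_c"),
    ("host3", "core_d"),
]
counts = hits_per_host(cores, 1200)
print(dict(counts))  # {'host1': 300, 'host2': 300, 'host3': 600}
```

Each core gets an equal share, so the host with two cores ends up with
twice the requests of the others.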

bug #4. What shard placement rules are you using? There are a series
of rules for replica placement and one of the criteria (IIRC) is
exactly to try to distribute replicas to different hosts. Although
there was some glitchiness about whether two JVMs on the same _host_
were considered "the same host" or not.
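
For reference, a placement rule can be passed at collection creation
with the "rule" parameter. A sketch that builds such a URL, assuming
the rule-based replica placement syntax from the reference guide
(the rule below is meant to keep fewer than 2 replicas of any shard on
one node; the base URL and collection name are placeholders):

```python
from urllib.parse import urlencode, urlsplit, parse_qs

def create_with_rule_url(base_url, name, num_shards, replication_factor, rule):
    """Build a Collections API CREATE request carrying a placement rule."""
    params = {
        "action": "CREATE",
        "name": name,                      # hypothetical collection name
        "numShards": num_shards,
        "replicationFactor": replication_factor,
        "rule": rule,                      # urlencode escapes the ':', '*', ',' and '<'
    }
    return f"{base_url}/admin/collections?{urlencode(params)}"

# For every shard, fewer than 2 replicas on any single node.
rule = "shard:*,replica:<2,node:*"
rule_url = create_with_rule_url("http://localhost:8983/solr", "mycollection", 4, 4, rule)

# Decode the query string again to confirm the rule survives escaping.
decoded_rule = parse_qs(urlsplit(rule_url).query)["rule"][0]
print(rule_url)
print(decoded_rule)
```

Note that a "node" here is one Solr instance, so this rule by itself
says nothing about two JVMs sharing a physical host.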

bug #1 has been more or less of a pain for quite a while, work is ongoing there.

FWIW,
Erick

On Fri, Mar 17, 2017 at 5:40 PM, Walter Underwood <wun...@wunderwood.org> wrote:
> I’m running a 4x4 cluster (4 shards, replication factor 4) on 16 hosts. I 
> shut down Solr on one host because it got into some kind of bad, 
> can’t-recover state where it was causing timeouts across the whole cluster 
> (bug #1).
>
> I ran a load benchmark near the capacity of the cluster. This had run fine in 
> test, this was the prod cluster.
>
> Solr Cloud added a replica to replace the down node. The node with two cores 
> got double the traffic and started slowly flapping in and out of service. The 
> 95th percentile response spiked from 3 seconds to 100 seconds. At some point, 
> another replica was made, with two replicas from the same shard on the same 
> instance. Naturally, that was overloaded, and I killed the benchmark out of 
> charity.
>
> Bug #2 is creating a new replica when one host is down. This should be an 
> option and default to “false”, because it causes the cascade.
>
> Bug #3 is sending equal traffic to each core without considering the host. 
> Each host should get equal traffic, not each core.
>
> Bug #4 is putting two replicas from the same shard on one instance. That is 
> just asking for trouble.
>
> When it works, this cluster is awesome.
>
> wunder
> Walter Underwood
> wun...@wunderwood.org
> http://observer.wunderwood.org/  (my blog)
>
>
