[jira] [Commented] (SOLR-14123) autoAddReplicas is not reliable when multiple nodes go down.

David Hunt (Jira) Thu, 19 Dec 2019 11:17:39 -0800


    [ 
https://issues.apache.org/jira/browse/SOLR-14123?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17000314#comment-17000314
 ]


David Hunt commented on SOLR-14123:
-----------------------------------

Also, in my scenario many shards/replicas are impacted. Do events get 
propagated, or is each event supposed to trigger an add-replica operation for 
each replica that was on the node?
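
For context, the kind of trigger under discussion can be sketched as a Solr 8.x autoscaling set-trigger command. This is a minimal sketch, not the reporter's actual configuration: the helper name auto_add_replicas_trigger is hypothetical, and the trigger name and action classes are assumed to mirror the default .auto_add_replicas trigger described in the Solr Reference Guide. Only the nodeLost event and the 10-minute waitFor come from the issue itself.

```python
import json


def auto_add_replicas_trigger(wait_for="10m"):
    """Build a set-trigger command for a nodeLost trigger (hypothetical helper).

    Assumption: name and action classes follow Solr's default
    .auto_add_replicas trigger; the reporter's real config is not shown.
    """
    return {
        "set-trigger": {
            "name": ".auto_add_replicas",
            "event": "nodeLost",      # fire when a node leaves the cluster
            "waitFor": wait_for,      # grace period before the event is processed
            "enabled": True,
            "actions": [
                # Compute which replicas to re-create, then execute that plan.
                {"name": "auto_add_replicas_plan",
                 "class": "solr.AutoAddReplicasPlanAction"},
                {"name": "execute_plan",
                 "class": "solr.ExecutePlanAction"},
            ],
        }
    }


if __name__ == "__main__":
    print(json.dumps(auto_add_replicas_trigger(), indent=2))
```

The resulting JSON would typically be POSTed to the cluster's autoscaling endpoint (assumed here to be /api/cluster/autoscaling on a Solr 8.x node).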

> autoAddReplicas is not reliable when multiple nodes go down.
> ------------------------------------------------------------
>
>                 Key: SOLR-14123
>                 URL: https://issues.apache.org/jira/browse/SOLR-14123
>             Project: Solr
>          Issue Type: Bug
>      Security Level: Public(Default Security Level. Issues are Public) 
>          Components: AutoScaling
>    Affects Versions: 8.3
>            Reporter: David Hunt
>            Priority: Major
>              Labels: autoscale
>
> I started noticing problems in our production environment with indexing being 
> blocked due to a minimum replication factor not being met.  We have 
> autoAddReplicas triggers in place to add replicas when nodes are lost, but they 
> don't seem to correctly add all of the replicas that were lost along with those 
> nodes. I’ve been able to reproduce this behavior consistently in a development 
> environment.
> Repro:
>  # Set up a 10-node SolrCloud cluster.
>  # Create an autoAddReplicas trigger on nodeLost with waitFor set to 10 
> minutes.
>  # Create 15 collections with 2 shards and 4 replicas.
>  # Kill 3 Solr nodes.
>  # 15 minutes later, kill 1 more Solr node.
> Results:
> Monitor your shards/replicas.  You’ll see some replicas added to make up for 
> the lost replicas but not all.  An hour later many shards are still missing 
> replicas.
> Expected:
> All lost replicas should be added on the 6 remaining healthy nodes.
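
To give a sense of the scale implied by the repro above, the arithmetic can be sketched as follows. This is a rough estimate under stated assumptions, not measured data: it assumes "4 replicas" means 4 replicas per shard and that replicas are spread evenly across nodes. The function name repro_replica_counts is hypothetical.

```python
def repro_replica_counts(nodes=10, collections=15, shards=2,
                         replicas_per_shard=4, nodes_killed=4):
    """Estimate replica counts for the repro (assumes even placement)."""
    total = collections * shards * replicas_per_shard   # replicas in the cluster
    avg_per_node = total / nodes                         # before any node is lost
    lost = avg_per_node * nodes_killed                   # replicas to be re-added
    remaining_nodes = nodes - nodes_killed
    avg_after_recovery = total / remaining_nodes         # if every replica is restored
    return {
        "total": total,
        "avg_per_node": avg_per_node,
        "lost": lost,
        "remaining_nodes": remaining_nodes,
        "avg_after_recovery": avg_after_recovery,
    }


if __name__ == "__main__":
    for key, value in repro_replica_counts().items():
        print(f"{key}: {value}")
```

Under these assumptions the cluster holds 120 replicas (about 12 per node), roughly 48 are lost across the 4 killed nodes, and full recovery would leave about 20 replicas on each of the 6 remaining nodes.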



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

