[ 
https://issues.apache.org/jira/browse/SOLR-14820?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17189564#comment-17189564
 ] 

Christine Poerschke commented on SOLR-14820:
--------------------------------------------

Thanks [~thomas.woeckinger] for reporting this issue and for pointing out that 
https://stackoverflow.com/questions/58528855/solr-cloud-add-replica-fails-on-node-thats-available-on-clusterstatus-live
 describes the same problem.

Nothing obviously jumps out from a quick look at the code, e.g. 
https://github.com/apache/lucene-solr/blob/releases/lucene-solr/8.6.2/solr/core/src/java/org/apache/solr/cloud/api/collections/Assign.java#L427-L428
 except perhaps that the logging makes it impossible to tell how 
{{nodeNameVsShardCount.keySet()}} came to be empty: was it because 
{{clusterState.getLiveNodes()}} was empty, or because the 
{{nodeList.retainAll(createNodeList)}} intersection resulted in an empty list?
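
To illustrate the distinction, here is a minimal sketch of how the two cases could be told apart. It is not the actual {{Assign.java}} code: the class name, the {{diagnose}} helper and its messages are hypothetical, it uses only plain Java collections, and it assumes that a null {{createNodeList}} means "no restriction".

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;
import java.util.Set;

// Hypothetical stand-alone sketch; none of these names exist in Solr.
public class EmptyNodeListDiagnosis {

    /**
     * Reports why the list of candidate nodes is empty (or that it is not).
     * Assumption: a null createNodeList means the caller did not restrict the nodes.
     */
    static String diagnose(Set<String> liveNodes, Collection<String> createNodeList) {
        if (liveNodes.isEmpty()) {
            // Case 1: there were no live nodes to begin with.
            return "no live nodes";
        }
        // Copy first so the caller's live-node set is not modified.
        List<String> nodeList = new ArrayList<>(liveNodes);
        if (createNodeList != null) {
            // Keep only the live nodes that were also requested.
            nodeList.retainAll(createNodeList);
        }
        if (nodeList.isEmpty()) {
            // Case 2: live nodes exist, but none matched the requested nodes.
            return "empty intersection: liveNodes=" + liveNodes
                + " createNodeList=" + createNodeList;
        }
        return "candidates: " + nodeList;
    }

    public static void main(String[] args) {
        // Requested node name differs from the live node name (different port).
        System.out.println(diagnose(Set.of("host1:8983_solr"), List.of("host1:8984_solr")));
        // No live nodes at all.
        System.out.println(diagnose(Set.of(), List.of("host1:8983_solr")));
    }
}
```

Logging the two inputs separately like this would show, for example, whether the requested node name simply does not match the form in which the node is registered as live.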

Thinking out loud:
* The Stack Overflow question mentions n Solr instances and m ZK instances. Are 
you experiencing the issue in that configuration too, and/or does it also 
happen when the ZKs are in the same JVM as Solr?
* When there are n Solr instances, does it matter which one is sent the command? 
Logically it shouldn't matter, but any differences might provide clues as to why 
it is happening.
* Is it possible to reproduce this, e.g. with the tech products example or other 
examples? If it happens only with external ZK instances then that might be a 
bit fiddly but not impossible; e.g. in SOLR-12454 the 
{{overseer-scenario-run.sh}} script was useful for exploring scenarios.

> Create replica is broken
> ------------------------
>
>                 Key: SOLR-14820
>                 URL: https://issues.apache.org/jira/browse/SOLR-14820
>             Project: Solr
>          Issue Type: Bug
>      Security Level: Public(Default Security Level. Issues are Public) 
>          Components: replication (java), Server
>    Affects Versions: 8.6, 8.5.1, 8.5.2, 8.6.1, 8.6.2
>            Reporter: Thomas Wöckinger
>            Priority: Critical
>
> Creating an additional replica for a collection fails because live nodes are 
> not valid



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

