Mike Drob created LUCENE-9630:
---------------------------------

             Summary: Allow Shard Leader to give up leadership gracefully via shard terms
                 Key: LUCENE-9630
                 URL: https://issues.apache.org/jira/browse/LUCENE-9630
             Project: Lucene - Core
          Issue Type: Bug
            Reporter: Mike Drob

Currently we have (via SOLR-12412) that when a leader sees an index writing error during an update, it gives up leadership by deleting the replica and adding a new replica. One stated benefit of this was that, because we are using the overseer and a known code path, this is done asynchronously and very efficiently.

I would argue that this approach is too heavy-handed.

In the case of a corrupt index exception, it makes some sense to completely delete the index dir and attempt to sync from a good peer. Even in this case, however, it might be better to let fingerprinting and other index delta mechanisms take over and allow for a more efficient data transfer.

In an alternate case where the index error arises from a disconnected file system (possible with shared file systems, e.g. S3, HDFS, some k8s systems) and the required solution is some kind of reconnect, this approach has several shortcomings. The core deletion and creation are going to fail, leaving dangling replicas. Further, the data is still present, so there is no need to make so many extra copies.

I propose that we bring in a mechanism to give up leadership via the existing shard terms language. I believe we would be able to set all replicas currently equal to leader term T to T+1, and then trigger a new leader election. The current leader, still at term T, would know it is ineligible, while the other replicas that were current before the failed update would be eligible.

This improvement would entail adding an additional possible operation to the terms state machine.
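To make the proposed operation concrete, here is a minimal sketch of the term bump in Python. It models shard terms as a plain dict from replica name to term and is not Solr's actual implementation; the function names `give_up_leadership` and `eligible_for_leadership` and the replica names are hypothetical, and it assumes that only replicas holding the highest term may stand for election.

```python
from typing import Dict, List


def give_up_leadership(terms: Dict[str, int], leader: str) -> Dict[str, int]:
    """Return new terms where every replica tied with the leader moves to T+1.

    The leader itself stays at T, so it no longer holds the highest term.
    Replicas that were already behind the leader are left untouched.
    (Hypothetical operation; a sketch of the proposal, not an existing API.)
    """
    leader_term = terms[leader]
    return {
        replica: term + 1 if replica != leader and term == leader_term else term
        for replica, term in terms.items()
    }


def eligible_for_leadership(terms: Dict[str, int]) -> List[str]:
    """Assumption: only replicas holding the highest term are candidates."""
    highest = max(terms.values())
    return sorted(r for r, t in terms.items() if t == highest)


# core_node1 is the leader at term 3, core_node2 is in sync, core_node3 lags.
terms = {"core_node1": 3, "core_node2": 3, "core_node3": 2}
after = give_up_leadership(terms, "core_node1")
print(after)
print(eligible_for_leadership(after))
```

In this sketch, if no other replica shares the leader's term, nothing changes and the leader remains the only candidate. The proposal above does not say how that case should be handled.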