[ https://issues.apache.org/jira/browse/SOLR-13813?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Yonik Seeley updated SOLR-13813:
--------------------------------
    Attachment: SOLR-13813.patch
        Status: Open  (was: Open)

Attaching updated patch with fixed test.

This updated test succeeds with NRT replicas, but often fails with SHARED replicas. It simply kills the leader while the split is taking place and leaves it down.

NOTES:
- Sometimes there is a failure because of bad replica placement (both replicas of a sub-shard being placed on the same node that is being split, leading to both replicas being down at the end of the test). If this is the case, you get a "No live SolrServers available to handle this request" error message.
- Interestingly, when there are missing documents (search for "MISSING" in output), if one looks at the final cluster state, the original shard is inactive and the new shards are active! Although the test doesn't enforce that the split needs to still be in progress when the leader is brought down, it's unlikely that the split could always be finishing this fast. It's more likely that the split should have failed.

> Shared storage online split support
> -----------------------------------
>
>                 Key: SOLR-13813
>                 URL: https://issues.apache.org/jira/browse/SOLR-13813
>             Project: Solr
>          Issue Type: Sub-task
>            Reporter: Yonik Seeley
>            Priority: Major
>         Attachments: SOLR-13813.patch, SOLR-13813.patch
>
>          Time Spent: 1h
>  Remaining Estimate: 0h
>
> The strategy for online shard splitting is the same as that for normal
> (non-SHARED) shards.
> During a split, the leader will forward updates to sub-shard leaders, those
> updates will be buffered by the transaction log while the split is in
> progress, and then the buffered updates are replayed.
> One change that was added was to push the local index to blob store after
> buffered updates are applied (but before it is marked as ACTIVE):
> See
> https://github.com/apache/lucene-solr/commit/fe17c813f5fe6773c0527f639b9e5c598b98c7d4#diff-081b7c2242d674bb175b41b6afc21663
> This issue is about adding tests and ensuring that online shard splitting
> (while updates are flowing) works reliably.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
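
The split sequence described in the issue (forward updates to sub-shard leaders, buffer them in the transaction log, replay them, push the index to blob store, then mark the sub-shard ACTIVE) can be sketched as a toy model. This is only an illustration under stated assumptions: `SubShard`, `route`, `run_split`, `missing_docs`, the `BUFFERING` state name, and the modulo routing are all hypothetical and are not Solr classes or APIs. The `push_after_replay` flag models a wrong ordering purely for contrast; it is not a claim about the cause of the test failures reported above.

```python
from dataclasses import dataclass, field
from enum import Enum


class State(Enum):
    # Illustrative state names for this sketch only.
    BUFFERING = "buffering"  # split in progress: incoming updates go to the log
    ACTIVE = "active"        # split finished: sub-shard serves requests


@dataclass
class SubShard:
    """Toy stand-in for a sub-shard leader (hypothetical, not a Solr class)."""
    name: str
    state: State = State.BUFFERING
    index: set = field(default_factory=set)   # doc ids in the local index
    tlog: list = field(default_factory=list)  # updates buffered during the split
    blob: set = field(default_factory=set)    # copy held in shared (blob) storage

    def receive(self, doc_id: int) -> None:
        # While the split is in progress, updates are buffered, not applied.
        if self.state is State.BUFFERING:
            self.tlog.append(doc_id)
        else:
            self.index.add(doc_id)

    def replay(self) -> None:
        # Apply every buffered update to the local index, then empty the log.
        self.index.update(self.tlog)
        self.tlog.clear()

    def push(self) -> None:
        # Copy the current local index to shared storage.
        self.blob = set(self.index)


def route(doc_id: int, subs: list) -> SubShard:
    # Assumption: simple modulo routing stands in for hash-range routing.
    return subs[doc_id % len(subs)]


def run_split(parent_docs, updates, push_after_replay=True):
    """Split a parent shard into two sub-shards while updates keep arriving."""
    subs = [SubShard("sub_0"), SubShard("sub_1")]
    # Existing documents are divided between the sub-shards.
    for doc_id in parent_docs:
        route(doc_id, subs).index.add(doc_id)
    # The parent leader forwards updates that arrive during the split.
    for doc_id in updates:
        route(doc_id, subs).receive(doc_id)
    for sub in subs:
        if push_after_replay:
            sub.replay()  # ordering described in the issue
            sub.push()
        else:
            sub.push()    # wrong ordering, shown only for contrast
            sub.replay()
        sub.state = State.ACTIVE
    return subs


def missing_docs(expected, subs):
    """Return the expected doc ids that no sub-shard holds in shared storage."""
    stored = set().union(*(sub.blob for sub in subs))
    return set(expected) - stored
```

In this model, `run_split({1, 2, 3, 4}, [5, 6, 7])` leaves no document missing from shared storage, while passing `push_after_replay=False` leaves 5, 6 and 7 missing even though both sub-shards end up ACTIVE.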