[ https://issues.apache.org/jira/browse/SOLR-14778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17184754#comment-17184754 ]
David Smiley edited comment on SOLR-14778 at 8/25/20, 10:03 PM:
----------------------------------------------------------------

It's possible to have durability without an updateLog, provided that every update request does a (hard) commit, without necessarily opening a new searcher. If you mostly do big batch updates, with only a few small one-off changes now and then, this is perfectly viable. I think that when the updateLog is disabled, Solr should let you specify (via a boolean) that updates should be durable by automatically treating them as commit=true&openSearcher=false if the user specifies neither (so the user can always override), -and furthermore disable autoSoftCommit & commitWithin-.

Put differently, imagine a "durability strategy": are we durable via the updateLog, or via always/implicitly hard-committing? Different trade-offs. Another benefit is that disabling the updateLog avoids this problem: SOLR-8030 (updateLog doesn't store request params; problematic with custom URPs).

I'm skeptical we actually need new "replica states" to express the distinction between BUFFERING, REJECT, and BLOCKING. Maybe that's what's best, but there may be lots of code to change, plus compatibility concerns, given that the current enum has been stable since at least 2015. Instead, I'm imagining a simple solrconfig.xml setting. I don't pretend to think the whole matter is simple, though.

> Disabling UpdateLog leads to silently lost updates
> --------------------------------------------------
>
>                 Key: SOLR-14778
>                 URL: https://issues.apache.org/jira/browse/SOLR-14778
>             Project: Solr
>          Issue Type: Bug
>      Security Level: Public(Default Security Level. Issues are Public)
>          Components: SolrCloud, update
>    Affects Versions: 8.6.1
>            Reporter: Megan Carey
>            Priority: Minor
>
> Solr currently "supports" disabling the UpdateLog, though it is "required"
> for NRT replicas (per the [docs|#transaction-log]]). However, when the update
> log is disabled and a replica is in BUFFERING state (e.g. during MigrateCmd
> or SplitShardCmd), updates are *lost silently*. While most users will likely
> never consider disabling the updateLog, it seems pertinent to provide a
> better support option.
> Options as discussed in [ASF
> Slack|[https://the-asf.slack.com/archives/CEKUCUNE9/p1598373062262300]]:
> # No longer support disabling the updateLog as it is considered a pertinent
> feature in SolrCloud. This might be undesirable for use cases where some data
> loss is acceptable and the updateLog takes up too much space.
> # Improve Solr documentation to explicitly outline the risks of disabling
> the updateLog.
> # Add logging to indicate when an update is swallowed in this state.
> # _My preferred option:_ Support disabling the updateLog by providing
> additional replica states besides BUFFERING, so that there is no data loss
> when updateLog is disabled and replica goes offline for an operation like
> split. Some ideas:
> ## REJECTING: Fail updates so that the client can retry again once the
> operation is complete.
> ## BLOCKING: Stall update until operation is complete, and then execute
> update.
> Feedback is welcome; once we establish a path forward I'd be happy to pick it
> up. If others are interested I can document my findings as well.
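
To make the implicit hard-commit idea from the comment at the top of this message concrete, here is a minimal sketch of the defaulting rule it describes: if an update request sets neither commit nor openSearcher, treat it as commit=true&openSearcher=false; if the user sets either one, leave the request alone. This is only an illustration of that rule, not Solr code. The function name apply_durable_defaults is hypothetical, and it assumes request parameters can be modeled as a flat string-to-string mapping.

```python
from typing import Dict, Mapping
from urllib.parse import urlencode


def apply_durable_defaults(params: Mapping[str, str]) -> Dict[str, str]:
    """Return a copy of the update params with implicit hard-commit defaults.

    Hypothetical helper (not a Solr API). The defaults are added only when
    the caller specified neither "commit" nor "openSearcher", so an explicit
    choice by the user always wins.
    """
    out = dict(params)
    if "commit" not in params and "openSearcher" not in params:
        out["commit"] = "true"          # hard commit: flush the update to disk
        out["openSearcher"] = "false"   # but do not open a new searcher
    return out


if __name__ == "__main__":
    # No commit-related params given: the durable defaults are applied.
    print(urlencode(apply_durable_defaults({})))
    # The user made an explicit choice: the request is left untouched.
    print(urlencode(apply_durable_defaults({"commit": "false"})))
```

Whether such a rule would live in an update request processor, in the update handler, or behind the solrconfig.xml setting the comment imagines is left open here; the sketch covers only the "user can always override" behavior.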