1> "mentions that for soft commit, 'new segments are created that will
be merged'"

Wait, how did that get in there? Ignore it, I'm taking it out.

2> It doesn't matter whether segments are fsync'd or not. There will
be a series of segments written to disk when ramBufferSizeMB is
exceeded without a hard commit. But they are _not_ recorded in the
segments file. So even though they may be perfectly intact, they
aren't "found" if Solr has to restart. The sequence for hard commit
is:
a> flush the segment (actually, fsync)
b> write the names of current valid segments to the segments file.

When a searcher is opened, it gets the list of current valid segments
from the segments file. Since <b> doesn't get done on soft commit, any
segments written to disk since the last hard commit are invisible
after a restart.
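
For reference, a minimal sketch of how the two commit types are usually
configured in the <updateHandler> section of solrconfig.xml, using the
hard commit settings Emir suggests in the quoted message below. The
30000 and 60000 values are only illustrative assumptions; pick whatever
fits your indexing volume and visibility requirements:

```xml
<updateHandler class="solr.DirectUpdateHandler2">
  <!-- Hard commit: fsync segments and record them in the segments file.
       openSearcher=false means no new searcher is opened, so caches survive. -->
  <autoCommit>
    <maxTime>30000</maxTime> <!-- milliseconds; illustrative value -->
    <openSearcher>false</openSearcher>
  </autoCommit>

  <!-- Soft commit: makes recent updates visible to searches. -->
  <autoSoftCommit>
    <maxTime>60000</maxTime> <!-- milliseconds; illustrative value -->
  </autoSoftCommit>
</updateHandler>
```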

We haven't found much, if any, improvement from setting ramBufferSizeMB
greater than the default. There'll be less merging if you raise it,
but it's generally best to just leave it alone.
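
If you do want to experiment, the setting goes in the <indexConfig>
section of solrconfig.xml. This is a sketch only; 100 is, as far as I
know, the default, so this fragment by itself changes nothing:

```xml
<indexConfig>
  <!-- In-memory buffer for indexed documents; a segment is flushed
       to disk when the buffer fills up. -->
  <ramBufferSizeMB>100</ramBufferSizeMB>
</indexConfig>
```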

3> Yes and Yes. If you disable transaction logs (the <updateLog> entry
in solrconfig.xml), then PeerSync won't work. Replication will work
since that only replicates segment files, but that's old-style
master/slave replication.
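
For reference, the entry in question typically looks something like
this in the sample configs. The ${solr.ulog.dir:} property is what the
stock solrconfig.xml uses; treat the exact form as an assumption and
check it against your version. Removing or commenting out the
<updateLog> element is what disables the transaction log:

```xml
<updateHandler class="solr.DirectUpdateHandler2">
  <!-- Transaction log; PeerSync depends on it. -->
  <updateLog>
    <str name="dir">${solr.ulog.dir:}</str>
  </updateLog>
</updateHandler>
```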

Best,
Erick

On Fri, Sep 29, 2017 at 1:06 PM, Wei <weiwan...@gmail.com> wrote:
> Thanks Emir and Erick!  Helps me a lot to understand the commit process. A
> few more questions:
>
> 1.  https://lucidworks.com/2013/08/23/understanding-transaction-logs-softcommit-and-commit-in-sorlcloud/
> mentions that for soft commit, "new segments are created that will be
> merged".  Does that mean that without hard commits, soft commits will create
> many small segments in memory, and that this could also slow down queries?
> As I understand it, the merge policy only kicks in with a hard commit.
>
> 2.  Without hard commits configured, will the segments still be fsynced to
> disk when accumulated updates exceed ramBufferSizeMB?  Is there any concern
> with increasing ramBufferSizeMB to a large value?
>
> 3. Can transaction logs be disabled in SolrCloud? Will any functionality
> (replication, peer sync) break without transaction logs?
>
> Thanks,
> Wei
>
>
> On Fri, Sep 29, 2017 at 8:33 AM, Erick Erickson <erickerick...@gmail.com>
> wrote:
>
>> More than you want to know about hard and soft commits here:
>> https://lucidworks.com/2013/08/23/understanding-transaction-logs-softcommit-and-commit-in-sorlcloud/
>>
>> You don't need to read it though, Emir did an admirable job of telling
>> you why turning off hard commits is a terrible idea.
>>
>> Best,
>> Erick
>>
>> On Fri, Sep 29, 2017 at 1:07 AM, Emir Arnautović
>> <emir.arnauto...@sematext.com> wrote:
>> > Hi Wei,
>> > Hard commits are about data durability. A hard commit rolls over the
>> > transaction log and creates a new index segment. If configured with
>> > openSearcher=false, hard commits do not affect query performance much
>> > (other than taking some resources) since they do not invalidate caches.
>> > If you have transaction logs enabled, without hard commits they would
>> > grow indefinitely and can fill the disk. With heavy indexing, infrequent
>> > hard commits can result in large transaction logs, so a Solr restart
>> > after a crash takes a while because the transaction logs are replayed.
>> > Soft commits are the ones that affect query performance and should
>> > be as rare as your requirements allow. They invalidate caches, causing
>> > cold searches or, if you have warming set up, taking resources to do
>> > the warming.
>> >
>> > I would recommend keeping hard commits, set to every 20-60 seconds
>> > (depending on indexing volume), and making sure openSearcher is set to
>> > false.
>> >
>> > HTH,
>> > Emir
>> >
>> >> On 29 Sep 2017, at 06:55, Wei <weiwan...@gmail.com> wrote:
>> >>
>> >> Hello All,
>> >>
>> >> What are the impacts if solr cloud is configured to have only soft
>> >> commits but no hard commits? In this way if a non-leader node crashes,
>> >> will it still be able to recover from the leader? Basically we are
>> >> wondering in a read heavy & write heavy scenario, whether taking hard
>> >> commit out could help to improve query performance and what are the
>> >> consequences.
>> >>
>> >> Thanks,
>> >> Wei
>> >
>>
