Shawn,
thank you very much for your answer.

On Wed, May 2, 2018 at 6:27 PM, Shawn Heisey <apa...@elyograg.org> wrote:

> On 5/2/2018 4:54 AM, Patrick Recchia wrote:
> > I'm seeing way too many commits on our solr cluster, and I don't know
> > why.
>
> Are you sure there are commits happening?  Do you have logs actually
> saying that a commit is occurring?  The creation of a new segment does
> not necessarily mean a commit happened -- this can happen even without a
> commit.
>

You're right, I assumed a new segment would be created only as part of a
commit; but I realize now that there can be other situations.

Is there any logging I can turn on to know when a commit happens and/or
when a segment is flushed?

I would be very interested in that.
I've already enabled InfoStream logging from the IndexWriter, but have
found nothing there yet that helps me understand this.



> > - IndexConfig is set to autoCommit every minute:
> >
> > <autoCommit>
> >   <maxTime>${solr.autoCommit.maxTime:60000}</maxTime>
> >   <openSearcher>true</openSearcher>
> > </autoCommit>
> >
> > (solr.autoCommit.maxTime is not set)
>
> It's recommended to set openSearcher to false on autoCommit.  Do you
> have autoSoftCommit configured?
>

autoSoftCommit is left at its default '-1' (which means infinity, I
suppose).
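
If I apply your recommendation, I assume the relevant part of
solrconfig.xml would end up looking roughly like the sketch below. The
autoSoftCommit element only spells out the -1 default; it is not something
I have set explicitly.

    <autoCommit>
      <maxTime>${solr.autoCommit.maxTime:60000}</maxTime>
      <!-- hard commit flushes to disk but does not open a new searcher -->
      <openSearcher>false</openSearcher>
    </autoCommit>
    <autoSoftCommit>
      <!-- -1 leaves automatic soft commits disabled -->
      <maxTime>${solr.autoSoftCommit.maxTime:-1}</maxTime>
    </autoSoftCommit>

With openSearcher set to false and automatic soft commits disabled, my
understanding is that document visibility would then depend on the
commitWithin we send, or on giving autoSoftCommit a real interval.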



>
> > There is nothing else customized (when it comes to IndexWriter, at least)
> > within solrconfig.xml
> >
> > The data is sent without commit, but with commitWithin=500000 ms.
> >
> > All that said, I would have expected a rate of about 1 segment created
> > per minute, of about 100MB.
>
> One of the events that can cause a new segment to be flushed is the ram
> buffer filling up.  Solr defaults to a ramBufferSizeMB value of 100.
> But that does not translate to a segment size of 100MB -- it's merely
> the size of the ram buffer that Lucene uses for all the work related to
> building a segment.  A segment resulting from a full memory buffer is
> going to be smaller than the buffer.  I do not know how MUCH smaller, or
> what causes variations in that size.
>
> The general advice is to leave the buffer size alone.  But with the high
> volume you've got, you might want to increase it so segments are not
> flushed as frequently.  Be aware that increasing it will have an impact
> on how much heap memory gets used.  Every Solr core (shard replica in
> SolrCloud terminology) that does indexing is going to need one of these
> ram buffers.
>

I will definitely investigate ramBufferSizeMB, and look through the
Lucene code to see when a segment is flushed.
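
As a starting point, I assume the change belongs in the <indexConfig>
section of solrconfig.xml, something like the sketch below. The value 256
is only a placeholder to experiment with, not a recommendation.

    <indexConfig>
      <!-- default is 100; a larger buffer flushes segments less often
           but uses more heap for every core that indexes -->
      <ramBufferSizeMB>256</ramBufferSizeMB>
      <!-- IndexWriter InfoStream logging, already enabled on our side -->
      <infoStream>true</infoStream>
    </indexConfig>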

Again, many thanks.
Patrick
