You can turn on "infostream", but that is _very_ voluminous. The regular Solr logs at INFO level should show commits, though.
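
For reference, a minimal sketch of how infostream is typically switched on in solrconfig.xml. The surrounding `<config>` wrapper is only there to make the fragment well-formed; the rest of your file stays as it is. I believe the output then goes through Solr's normal logging under the `org.apache.solr.update.LoggingInfoStream` logger, but treat that logger name as an assumption to verify against your version.

```xml
<!-- Sketch only: the infoStream switch inside indexConfig. -->
<config>
  <indexConfig>
    <!-- Emits Lucene IndexWriter diagnostics (flushes, merges); very verbose. -->
    <infoStream>true</infoStream>
  </indexConfig>
</config>
```

Given the volume, it is probably worth enabling this only for a short window while you are chasing the segment flushes.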

On Wed, May 2, 2018 at 10:45 AM, Patrick Recchia <patrick.recc...@gmail.com> wrote:

> Shawn,
> thank you very much for your answer.
>
>
> On Wed, May 2, 2018 at 6:27 PM, Shawn Heisey <apa...@elyograg.org> wrote:
>
>> On 5/2/2018 4:54 AM, Patrick Recchia wrote:
>> > I'm seeing way too many commits on our solr cluster, and I don't know
>> > why.
>>
>> Are you sure there are commits happening? Do you have logs actually
>> saying that a commit is occurring? The creation of a new segment does
>> not necessarily mean a commit happened -- this can happen even without a
>> commit.
>>
>
> You're right, I assumed a new segment would be created only as part of a
> commit; but I realize now that there can be other situations.
>
> Is there any logging I can turn on to know when a commit happens and/or
> when a segment is flushed?
>
> I would be very interested in that.
> I've already enabled InfoStream logging from the IndexWriter, but have
> found nothing yet there to help me understand that.
>
>
>> > - IndexConfig is set to autoCommit every minute:
>> >
>> > <autoCommit> <maxTime>${solr.autoCommit.maxTime:60000}</maxTime>
>> > <openSearcher>true</openSearcher> </autoCommit>
>> >
>> > (solr.autoCommit.maxTime is not set)
>>
>> It's recommended to set openSearcher to false on autoCommit. Do you
>> have autoSoftCommit configured?
>>
>
> autoSoftCommit is left at its default '-1' (which means infinity, I
> suppose).
>
>
>> > There is nothing else customized (when it comes to IndexWriter, at least)
>> > within solrconfig.xml
>> >
>> > The data is sent without commit, but with commitWithin=500000 ms.
>> >
>> > All that said, I would have expected a rate of about 1 segment created
>> > per minute; of about 100MB.
>>
>> One of the events that can cause a new segment to be flushed is the ram
>> buffer filling up. Solr defaults to a ramBufferSizeMB value of 100.
>> But that does not translate to a segment size of 100MB -- it's merely
>> the size of the ram buffer that Lucene uses for all the work related to
>> building a segment. A segment resulting from a full memory buffer is
>> going to be smaller than the buffer. I do not know how MUCH smaller, or
>> what causes variations in that size.
>>
>> The general advice is to leave the buffer size alone. But with the high
>> volume you've got, you might want to increase it so segments are not
>> flushed as frequently. Be aware that increasing it will have an impact
>> on how much heap memory gets used. Every Solr core (shard replica in
>> SolrCloud terminology) that does indexing is going to need one of these
>> ram buffers.
>>
>
> I will definitely investigate this ramBufferSizeMB.
> I will also look through the Lucene code to see when a segment is flushed.
>
> Again, many thanks.
> Patrick
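
To make the two suggestions in the quoted thread concrete, here is a rough sketch of the relevant solrconfig.xml sections. The `ramBufferSizeMB` value of 256 is an arbitrary illustration, not a recommendation; the thread only says the default is 100 and that raising it costs heap per indexing core. The `<config>` wrapper is just there so the fragment is well-formed.

```xml
<!-- Sketch only: values marked "example" are assumptions, not tuned settings. -->
<config>
  <indexConfig>
    <!-- Default is 100. 256 is an example value; each indexing core gets its own buffer. -->
    <ramBufferSizeMB>256</ramBufferSizeMB>
  </indexConfig>

  <updateHandler class="solr.DirectUpdateHandler2">
    <autoCommit>
      <!-- Same one-minute hard commit as in the original configuration. -->
      <maxTime>${solr.autoCommit.maxTime:60000}</maxTime>
      <!-- Hard commits flush to disk without opening a new searcher. -->
      <openSearcher>false</openSearcher>
    </autoCommit>
  </updateHandler>
</config>
```

With `openSearcher` set to false, something else has to open a searcher before new documents become visible to queries (a soft commit, for instance), so check how that interacts with the commitWithin you are sending.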