On 7/3/2019 1:36 AM, Avi Steiner wrote:
> We had some cases with customers (Solr 5.3.1, one search node, one shard) with
> huge tlog files (more than 1 GB).
With autoCommit set to 30 seconds, that should not be happening.
When a hard commit fires, the current tlog is closed and a new one
starts. Solr only keeps enough tlogs to meet certain minimum
requirements. If the tlogs never rotate, then Solr has to keep the huge
one to meet the requirements.
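One way to see whether the tlogs are rotating is to list them with their sizes. What follows is only a rough sketch, not something that ships with Solr: the function name and the 1 GB default threshold are made up for illustration, and it assumes the transaction logs sit in a "tlog" directory under the core's data directory with file names that begin with "tlog.".

```python
import os

def find_large_tlogs(tlog_dir, threshold_bytes=1024 ** 3):
    """Return (name, size) pairs for tlog files at or above the threshold.

    Hypothetical helper: tlog_dir is assumed to be the core's tlog
    directory, and only files whose names start with "tlog." are counted.
    Results are sorted largest first.
    """
    results = []
    with os.scandir(tlog_dir) as entries:
        for entry in entries:
            if entry.is_file() and entry.name.startswith("tlog."):
                size = entry.stat().st_size
                if size >= threshold_bytes:
                    results.append((entry.name, size))
    results.sort(key=lambda pair: pair[1], reverse=True)
    return results
```

If this keeps reporting a single file that grows while indexing is under way, the hard commits are probably not completing.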
I have heard of one situation that causes huge tlogs even with
autoCommit. That is a misconfigured SolrCloud feature called Cross Data
Center Replication (CDCR) ... but CDCR did not exist in version 5.3.1.
It was added in 6.0.0.
Do you have a solr.log file covering a significant period of time? At
least several minutes while indexing occurs.
> I don't have enough logs so I don't know if commit failed or not. I just
> remember there were OOM messages.
What OS is Solr running on? On most platforms other than Windows, an
OOM will cause Solr to self-terminate. On Windows, that wouldn't
happen; Solr would most likely keep running.
The reason that we configured Solr to self-terminate on OOM is that
program operation is completely unpredictable once OOM happens. Index
corruption is only one of the possible side effects. It is far safer to
terminate.
When Solr self-terminates, it will NOT automatically restart with the
out-of-the-box setup. You would have to create that functionality yourself.
If you have the actual OOM message ... what resource does it say was
depleted? It is not always heap memory.
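If the logs do turn up, the text that follows "java.lang.OutOfMemoryError:" is what names the depleted resource, for example "Java heap space" or "unable to create new native thread". A small sketch for pulling that text out of log lines is below. The function name is hypothetical, and it assumes the exception appears on a single line in the usual Java format.

```python
import re

# Matches the exception class name and, when present, the detail after the colon.
OOM_PATTERN = re.compile(r"java\.lang\.OutOfMemoryError(?::\s*(?P<detail>.*\S))?")

def oom_details(log_lines):
    """Return the detail text of every OutOfMemoryError found in log_lines.

    Hypothetical helper: log_lines is any iterable of strings, such as an
    open solr.log file. Lines without an OOM are skipped.
    """
    details = []
    for line in log_lines:
        match = OOM_PATTERN.search(line)
        if match:
            details.append(match.group("detail") or "(no detail)")
    return details
```

Passing an open log file to oom_details() would give a list of the resources the JVM reported as exhausted, in the order they were logged.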
Thanks,
Shawn