Hi,

We are running SolrCloud 7.6.0 in containers. The cluster has around 30
collections and 7 Solr nodes. Although Solr is containerized, each host
runs only one ZooKeeper container and one Solr container.
Each Solr container has 49GB of memory in total, of which 24GB is heap,
which leaves 25GB for off-heap use. We have set the following limits:

max user processes              (-u) unlimited

virtual memory          (kbytes, -v) unlimited

file locks                      (-x) unlimited

max memory size         (kbytes, -m) unlimited


Solr hits an OOM roughly every 5 days. When we examined the heap dumps, the
heap was only around 700MB, but off-heap memory was 29GB.
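
For anyone who wants to compare numbers, here is a minimal sketch of how the JVM's own view of off-heap buffers could be read, assuming the standard java.lang.management API. It lists the buffer pools the JDK exposes (normally "direct" and "mapped") and the bytes accounted to each. The class name BufferPoolReport is only illustrative, and the sketch reports on the JVM it runs in; for Solr the same MXBeans would have to be read from the Solr JVM, for example over JMX.

```java
import java.lang.management.BufferPoolMXBean;
import java.lang.management.ManagementFactory;
import java.nio.ByteBuffer;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class BufferPoolReport {

    /** Returns the bytes used per JVM buffer pool, keyed by pool name. */
    public static Map<String, Long> poolUsage() {
        Map<String, Long> usage = new LinkedHashMap<>();
        List<BufferPoolMXBean> pools =
                ManagementFactory.getPlatformMXBeans(BufferPoolMXBean.class);
        for (BufferPoolMXBean pool : pools) {
            usage.put(pool.getName(), pool.getMemoryUsed());
        }
        return usage;
    }

    public static void main(String[] args) {
        // Hold a small direct buffer so the "direct" pool is visibly non-empty.
        ByteBuffer demo = ByteBuffer.allocateDirect(1024 * 1024);
        poolUsage().forEach((name, bytes) ->
                System.out.printf("%-10s %,d bytes%n", name, bytes));
        System.out.println("demo buffer capacity: " + demo.capacity());
    }
}
```

Comparing the "direct" and "mapped" figures against the container's off-heap budget may help tell allocated direct memory apart from memory-mapped index files.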

The major consumer is java.nio.DirectByteBufferR.


Major reference chains:



8,820,117Kb (1462.3%): *java.nio.DirectByteBufferR*: 64 objects

↖*sun.misc.Cleaner**.referent*

↖*sun.misc.Cleaner**.{prev}*

↖*java.nio.DirectByteBuffer**.cleaner*

↖*java.nio.ByteBuffer[]*

↖*sun.nio.ch.Util$BufferCache**.buffers*

↖*j.l.ThreadLocal$ThreadLocalMap$Entry**.value*

↖*j.l.ThreadLocal$ThreadLocalMap$Entry[]*

↖*j.l.ThreadLocal$ThreadLocalMap**.table*

↖*j.l.Thread**.threadLocals*

↖*j.l.Thread[]*

↖*j.l.ThreadGroup**.threads*

↖*j.l.ThreadGroup[]*

↖*j.l.ThreadGroup**.groups*

↖Java Static *sun.rmi.runtime.NewThreadAction**.systemThreadGroup*

3,534,863Kb (586.0%): *java.nio.DirectByteBufferR*: 22 objects

↖*sun.misc.Cleaner**.referent*

↖*sun.misc.Cleaner**.{next}*

↖*sun.nio.fs.NativeBuffer**.cleaner*

↖*sun.nio.fs.NativeBuffer[]*

↖*j.l.ThreadLocal$ThreadLocalMap$Entry**.value*

↖*j.l.ThreadLocal$ThreadLocalMap$Entry[]*

↖*j.l.ThreadLocal$ThreadLocalMap**.table*

↖*j.l.Thread**.threadLocals*

↖*j.l.Thread[]*

↖*j.l.ThreadGroup**.threads*

↖*j.l.ThreadGroup[]*

↖*j.l.ThreadGroup**.groups*

↖Java Static *sun.rmi.runtime.NewThreadAction**.systemThreadGroup*

3,145,728Kb (521.5%): *java.nio.DirectByteBufferR*: 3 objects

↖*java.nio.ByteBuffer[]*

↖*org.apache.lucene.store.ByteBufferIndexInput$MultiBufferImpl**.buffers*

↖*org.apache.lucene.codecs.compressing.CompressingStoredFieldsReader**.fieldsStream*

↖*org.apache.lucene.index.SegmentCoreReaders**.fieldsReaderOrig*

↖*org.apache.lucene.index.SegmentReader**.core*

↖*org.apache.lucene.index.SegmentReader[]*

↖*org.apache.lucene.index.StandardDirectoryReader**.subReaders*

↖*org.apache.solr.search.SolrIndexSearcher**.rawReader*

↖*{j.u.concurrent.ConcurrentHashMap}**.values*

↖*org.apache.solr.core.SolrCore**.infoRegistry*

↖*{j.u.LinkedHashMap}**.values*

↖*org.apache.solr.core.SolrCores**.cores*

↖*org.apache.solr.core.CoreContainer**.solrCores*

↖*org.apache.solr.cloud.RecoveringCoreTermWatcher**.coreContainer*

↖*{j.u.HashSet}*

↖*org.apache.solr.cloud.ZkShardTerms**.listeners*

↖*{j.u.concurrent.ConcurrentHashMap}**.keys*

↖Java Static *org.apache.solr.common.util.ObjectReleaseTracker**.OBJECTS*

2,605,258Kb (431.9%): *java.nio.DirectByteBufferR*: 184 objects

↖*org.apache.lucene.store.ByteBufferIndexInput$SingleBufferImpl**.curBuf*

↖*org.apache.lucene.codecs.compressing.CompressingStoredFieldsReader**.fieldsStream*

↖*org.apache.lucene.index.SegmentCoreReaders**.fieldsReaderOrig*

↖*org.apache.lucene.index.SegmentReader**.core*

↖*org.apache.lucene.index.SegmentReader[]*

↖*org.apache.lucene.index.StandardDirectoryReader**.subReaders*

↖*org.apache.solr.search.SolrIndexSearcher**.rawReader*

↖*{j.u.concurrent.ConcurrentHashMap}**.values*

↖*org.apache.solr.core.SolrCore**.infoRegistry*

↖*{j.u.LinkedHashMap}**.values*

↖*org.apache.solr.core.SolrCores**.cores*

↖*org.apache.solr.core.CoreContainer**.solrCores*

↖*org.apache.solr.cloud.RecoveringCoreTermWatcher**.coreContainer*

↖*{j.u.HashSet}*

↖*org.apache.solr.cloud.ZkShardTerms**.listeners*

↖*{j.u.concurrent.ConcurrentHashMap}**.keys*

↖Java Static *org.apache.solr.common.util.ObjectReleaseTracker**.OBJECTS*

1,790,441Kb (296.8%): *java.nio.DirectByteBufferR*: 70 objects

↖*org.apache.lucene.store.ByteBufferIndexInput$SingleBufferImpl**.curBuf*

↖*org.apache.lucene.codecs.lucene50.Lucene50CompoundReader**.handle*

↖*org.apache.lucene.index.SegmentCoreReaders**.cfsReader*

↖*org.apache.lucene.index.SegmentReader**.core*

↖*org.apache.lucene.index.SegmentReader[]*

↖*org.apache.lucene.index.StandardDirectoryReader**.subReaders*

↖*org.apache.solr.search.SolrIndexSearcher**.rawReader*

↖*{j.u.concurrent.ConcurrentHashMap}**.values*

↖*org.apache.solr.core.SolrCore**.infoRegistry*

↖*{j.u.LinkedHashMap}**.values*

↖*org.apache.solr.core.SolrCores**.cores*

↖*org.apache.solr.core.CoreContainer**.solrCores*

↖*org.apache.solr.cloud.RecoveringCoreTermWatcher**.coreContainer*

↖*{j.u.HashSet}*

↖*org.apache.solr.cloud.ZkShardTerms**.listeners*

↖*{j.u.concurrent.ConcurrentHashMap}**.keys*

↖Java Static *org.apache.solr.common.util.ObjectReleaseTracker**.OBJECTS*

1,385,471Kb (229.7%): *java.nio.DirectByteBufferR*: 85 objects

↖*sun.misc.Cleaner**.referent*

↖*sun.misc.Cleaner**.{next}*

↖*java.nio.DirectByteBuffer**.cleaner*

↖*java.nio.ByteBuffer[]*

↖*sun.nio.ch.Util$BufferCache**.buffers*

↖*j.l.ThreadLocal$ThreadLocalMap$Entry**.value*

↖*j.l.ThreadLocal$ThreadLocalMap$Entry[]*

↖*j.l.ThreadLocal$ThreadLocalMap**.table*

↖*j.l.Thread**.threadLocals*

↖*j.l.Thread[]*

↖*j.l.ThreadGroup**.threads*

↖*j.l.ThreadGroup[]*

↖*j.l.ThreadGroup**.groups*

↖Java Static *sun.rmi.runtime.NewThreadAction**.systemThreadGroup*

1,358,286Kb (225.2%): *java.nio.DirectByteBufferR*: 3 objects

↖*org.apache.lucene.store.ByteBufferIndexInput$MultiBufferImpl**.curBuf*

1,184,137Kb (196.3%): *java.nio.DirectByteBufferR*: 95 objects

↖*org.apache.lucene.store.ByteBufferIndexInput$SingleBufferImpl**.curBuf*

773,799Kb (128.3%): *java.nio.DirectByteBufferR*: 92 objects

↖*org.apache.lucene.store.ByteBufferIndexInput$SingleBufferImpl**.curBuf*

744,089Kb (123.4%): *java.nio.DirectByteBufferR*: 11 objects

↖*sun.misc.Cleaner**.referent*

659,739Kb (109.4%): *java.nio.DirectByteBufferR*: 66 objects

↖*org.apache.lucene.store.ByteBufferIndexInput$SingleBufferImpl**.curBuf*

588,605Kb (97.6%): *java.nio.DirectByteBufferR*: 9 objects

↖*sun.misc.Cleaner**.referent*

485,104Kb (80.4%): *java.nio.DirectByteBufferR*: 59 objects

↖*org.apache.lucene.store.ByteBufferIndexInput$SingleBufferImpl**.curBuf*

395,376Kb (65.5%): *java.nio.DirectByteBufferR*: 46 objects

↖*org.apache.lucene.store.ByteBufferIndexInput$SingleBufferImpl**.curBuf*

317,410Kb (52.6%): *java.nio.DirectByteBufferR*: 60 objects

↖*org.apache.lucene.store.ByteBufferIndexInput$SingleBufferImpl**.curBuf*

314,946Kb (52.2%): *java.nio.DirectByteBufferR*: 56 objects

↖*org.apache.lucene.store.ByteBufferIndexInput$SingleBufferImpl**.curBuf*

211,577Kb (35.1%): *java.nio.DirectByteBufferR*: 44 objects

↖*org.apache.lucene.store.ByteBufferIndexInput$SingleBufferImpl**.curBuf*

195,447Kb (32.4%): *java.nio.DirectByteBufferR*: 57 objects

↖*org.apache.lucene.store.ByteBufferIndexInput$SingleBufferImpl**.curBuf*

121,962Kb (20.2%): *java.nio.DirectByteBufferR*: 4 objects

↖*org.apache.lucene.store.ByteBufferIndexInput$SingleBufferImpl**.curBuf*

100,200Kb (16.6%): *java.nio.DirectByteBufferR*: 185 objects

↖*org.apache.lucene.store.ByteBufferIndexInput$SingleBufferImpl**.curBuf*

51,464Kb (8.5%): *java.nio.DirectByteBufferR*: 64 objects

↖*org.apache.lucene.store.ByteBufferIndexInput$SingleBufferImpl**.curBuf*

49,435Kb (8.2%): *java.nio.DirectByteBufferR*: 7 objects

↖*org.apache.lucene.store.ByteBufferIndexInput$SingleBufferImpl**.curBuf*

32,748Kb (5.4%): *java.nio.DirectByteBufferR*: 3 objects

↖*org.apache.lucene.store.ByteBufferIndexInput$SingleBufferImpl**.curBuf*

30,610Kb (5.1%): *java.nio.DirectByteBufferR*: 4 objects

↖*org.apache.lucene.store.ByteBufferIndexInput$SingleBufferImpl**.curBuf*

27,766Kb (4.6%): *java.nio.DirectByteBufferR*: 46 objects

↖*org.apache.lucene.store.ByteBufferIndexInput$SingleBufferImpl**.curBuf*

26,620Kb (4.4%): *java.nio.DirectByteBufferR*: 4 objects

↖*org.apache.lucene.store.ByteBufferIndexInput$SingleBufferImpl**.curBuf*

25,570Kb (4.2%): *java.nio.DirectByteBufferR*: 3 objects

↖*org.apache.lucene.store.ByteBufferIndexInput$SingleBufferImpl**.curBuf*

23,507Kb (3.9%): *java.nio.DirectByteBufferR*: 4 objects

↖*org.apache.lucene.store.ByteBufferIndexInput$SingleBufferImpl**.curBuf*

23,095Kb (3.8%): *java.nio.DirectByteBufferR*: 3 objects

↖*org.apache.lucene.store.ByteBufferIndexInput$SingleBufferImpl**.curBuf*

3,396Kb (0.6%): *java.nio.DirectByteBufferR*: 5 objects

↖*org.apache.lucene.store.ByteBufferIndexInput$SingleBufferImpl**.curBuf*

1,745Kb (0.3%): *java.nio.DirectByteBufferR*: 3 objects

↖*sun.misc.Cleaner**.referent*

1,457Kb (0.2%): *java.nio.DirectByteBuffer*: 55 objects

↖*java.util.concurrent.atomic.AtomicReference**.value*

1,450Kb (0.2%): *java.nio.DirectByteBufferR*: 7 objects

↖*org.apache.lucene.store.ByteBufferIndexInput$SingleBufferImpl**.curBuf*

1,309Kb (0.2%): *java.nio.DirectByteBufferR*: 2 objects

↖*org.apache.lucene.store.ByteBufferIndexInput$SingleBufferImpl**.curBuf*

715Kb (0.1%): *java.nio.DirectByteBufferR*: 11 objects (78% of all objects
referenced here) 610Kb (0.1%), *java.nio.DirectByteBuffer*: 2 objects (14%
of all objects referenced here) 104Kb (< 0.1%)

↖*sun.misc.Cleaner**.referent*
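
Several of the chains above go through sun.nio.ch.Util$BufferCache held in a thread-local. As far as I understand, that is the JDK's per-thread cache of temporary direct buffers, used when a heap ByteBuffer is passed to an NIO channel. The sketch below is an assumption-laden illustration of that mechanism, not something taken from Solr: it writes a heap buffer to a temporary file and prints the "direct" pool size before and after. The class and method names are made up for the example.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.lang.management.BufferPoolMXBean;
import java.lang.management.ManagementFactory;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class TempDirectBufferDemo {

    /** Bytes currently accounted to the JVM's "direct" buffer pool, or -1 if absent. */
    public static long directPoolBytes() {
        for (BufferPoolMXBean pool :
                ManagementFactory.getPlatformMXBeans(BufferPoolMXBean.class)) {
            if ("direct".equals(pool.getName())) {
                return pool.getMemoryUsed();
            }
        }
        return -1L;
    }

    /** Writes a heap-backed buffer of the given size to a temp file; returns bytes written. */
    public static int writeHeapBuffer(int size) {
        try {
            Path tmp = Files.createTempFile("heapbuf-demo", ".bin");
            try (FileChannel ch = FileChannel.open(tmp, StandardOpenOption.WRITE)) {
                ByteBuffer heap = ByteBuffer.allocate(size); // heap-backed, not direct
                int written = 0;
                while (heap.hasRemaining()) {
                    written += ch.write(heap);
                }
                return written;
            } finally {
                Files.deleteIfExists(tmp);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        long before = directPoolBytes();
        int written = writeHeapBuffer(8 * 1024 * 1024);
        long after = directPoolBytes();
        System.out.println("written=" + written
                + " directBefore=" + before + " directAfter=" + after);
    }
}
```

If the direct pool figure grows by about the size of the heap buffer and stays there, that would be consistent with the cached temporary buffers seen in the chains, though the exact behavior depends on the JDK version and settings.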



We do a hard commit with openSearcher=false every 5 seconds and a soft
commit every 2 minutes.


Has anyone encountered an off-heap OOM like this? We are thinking of
reducing the heap further and increasing the hard commit interval. Any
other suggestions? Please share your thoughts.


Thanks,

Raji
