It used to occur every 3 days; after we reduced the heap, it started occurring every 5 days. We can't get much from the logs. Sometimes we see "unable to create new native thread" in the logs, and many times there are no exceptions at all. When the "unable to create new native thread" error does appear, we get the exception below, because we use CDCR. To rule CDCR out as the cause, we also disabled it, but we still get the OOM.

WARN (cdcr-update-log-synchronizer-93-thread-1) [ ] o.a.s.h.CdcrUpdateLogSynchronizer Caught unexpected exception
java.lang.OutOfMemoryError: unable to create new native thread
        at java.lang.Thread.start0(Native Method) ~[?:1.8.0_211]
        at java.lang.Thread.start(Thread.java:717) ~[?:1.8.0_211]
        at org.apache.http.impl.client.IdleConnectionEvictor.start(IdleConnectionEvictor.java:96) ~[httpclient-4.5.3.jar:4.5.3]
        at org.apache.http.impl.client.HttpClientBuilder.build(HttpClientBuilder.java:1219) ~[httpclient-4.5.3.jar:4.5.3]
        at org.apache.solr.client.solrj.impl.HttpClientUtil.createClient(HttpClientUtil.java:319) ~[solr-solrj-7.6.0.jar:7.6.0 719cde97f84640faa1e3525690d262946571245f - nknize - 2018-12-07 14:47:53]
        at org.apache.solr.client.solrj.impl.HttpClientUtil.createClient(HttpClientUtil.java:330) ~[solr-solrj-7.6.0.jar:7.6.0 719cde97f84640faa1e3525690d262946571245f - nknize - 2018-12-07 14:47:53]
        at org.apache.solr.client.solrj.impl.HttpClientUtil.createClient(HttpClientUtil.java:268) ~[solr-solrj-7.6.0.jar:7.6.0 719cde97f84640faa1e3525690d262946571245f - nknize - 2018-12-07 14:47:53]
        at org.apache.solr.client.solrj.impl.HttpClientUtil.createClient(HttpClientUtil.java:255) ~[solr-solrj-7.6.0.jar:7.6.0 719cde97f84640faa1e3525690d262946571245f - nknize - 2018-12-07 14:47:53]
        at org.apache.solr.client.solrj.impl.HttpSolrClient.<init>(HttpSolrClient.java:200) ~[solr-solrj-7.6.0.jar:7.6.0 719cde97f84640faa1e3525690d262946571245f - nknize - 2018-12-07 14:47:53]
        at org.apache.solr.client.solrj.impl.HttpSolrClient$Builder.build(HttpSolrClient.java:957) ~[solr-solrj-7.6.0.jar:7.6.0 719cde97f84640faa1e3525690d262946571245f - nknize - 2018-12-07 14:47:53]
        at org.apache.solr.handler.CdcrUpdateLogSynchronizer$UpdateLogSynchronisation.run(CdcrUpdateLogSynchronizer.java:139) [solr-core-7.6.0.jar:7.6.0-SNAPSHOT 34d82ed033cccd8120431b73e93554b85b24a278 - i843100 - 2019-09-30 14:02:46]
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [?:1.8.0_211]
        at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308) [?:1.8.0_211]
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180) [?:1.8.0_211]
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294) [?:1.8.0_211]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_211]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_211]

Thanks,
Raji

On Thu, Apr 30, 2020 at 12:24 AM Mikhail Khludnev <m...@apache.org> wrote:

> Raji, how that "OOM for solr occur in every 5 days." exactly looks like?
> What is the error message? Where it's occurring exactly?
>
> On Thu, Apr 30, 2020 at 1:30 AM Raji N <rajis...@gmail.com> wrote:
>
> > Thanks so much Jan. Will try your suggestions , yes we are also running
> > solr inside docker.
> >
> > Thanks,
> > Raji
> >
> > On Wed, Apr 29, 2020 at 1:46 PM Jan Høydahl <jan....@cominvent.com> wrote:
> >
> > > I have seen the same, but only in Docker.
> > > I think it does not relate to Solr’s off-heap usage for filters and other
> > > data structures, but rather how Docker treats memory-mapped files as
> > > virtual memory.
> > > As you know, when using MMapDirectoryFactory, you actually let Linux
> > > handle the loading and unloading of the index files, and Solr will access
> > > them as if they were in a huge virtual memory pool. Naturally the index
> > > files grow large, and there is something strange going on in the way Docker
> > > handles this, leading to OOM, not for Java heap but for the process.
> > >
> > > I have no definitive answer, but so far my research has found a few
> > > possible settings
> > >
> > > Set env.var MALLOC_ARENA_MAX=2
> > > Try to limit -XX:MaxDirectMemorySize
> > > Lower mem swappiness in Docker (--memory-swappiness 0)
> > > More generic insight into java mem allocation in Docker:
> > > https://dzone.com/articles/native-memory-allocation-in-examples
> > >
> > > Have not yet found a silver bullet, so very interested in this thread.
> > >
> > > Jan
> > >
> > > > On 29 Apr 2020, at 19:26, Raji N <rajis...@gmail.com> wrote:
> > > >
> > > > Thank you for your reply. When OOM happens somehow it doesn't generate
> > > > dump file. So we have hourly heaps running to diagnose this issue. Heap is
> > > > around 700MB and threads around 150. But 29GB of native memory is used up,
> > > > it is consumed by java.io.DirectBufferR (27GB major consumption) and
> > > > java.io.DirectByteBuffer objects .
> > > >
> > > > We use solr 7.6.0 in solrcloud mode and OS is alpine . Java version
> > > >
> > > > java -version
> > > > Picked up JAVA_TOOL_OPTIONS: -Dfile.encoding=UTF8
> > > > java version "1.8.0_211"
> > > > Java(TM) SE Runtime Environment (build 1.8.0_211-b12)
> > > > Java HotSpot(TM) 64-Bit Server VM (build 25.211-b12, mixed mode)
> > > >
> > > > Thanks much for taking a look at it.
> > > >
> > > > Raji
> > > >
> > > > On Wed, Apr 29, 2020 at 10:04 AM Shawn Heisey <apa...@elyograg.org> wrote:
> > > >
> > > >> On 4/29/2020 2:07 AM, Raji N wrote:
> > > >>> Has anyone encountered off-heap OOM. We are thinking of reducing heap
> > > >>> further and increasing the hardcommit interval . Any other suggestions? .
> > > >>> Please share your thoughts.
> > > >>
> > > >> It sounds like it's not heap memory that's running out.
> > > >>
> > > >> When the OutOfMemoryError is logged, it will also contain a message
> > > >> mentioning which resource ran out.
> > > >>
> > > >> A common message that might be logged with the OOME is "Unable to create
> > > >> native thread". This type of error, if that's what's happening,
> > > >> actually has nothing at all to do with memory, OOME is just how Java
> > > >> happens to report it.
> > > >>
> > > >> You will need to know exactly which resource is running out before we
> > > >> can offer any assistance.
> > > >>
> > > >> If the OOME is logged, the message you're looking for will be in the
> > > >> solr log, not the tiny special log that is created when Solr is killed
> > > >> by an OOME. What version of Solr are you running, and what OS is it
> > > >> running on?
> > > >>
> > > >> Thanks,
> > > >> Shawn
> > > >>
> > > >
> > >
> >
>
> --
> Sincerely yours
> Mikhail Khludnev
>
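
Since the thread keeps returning to the question of which resource is actually running out (native threads versus direct or mapped buffers), here is a minimal sketch of how those numbers can be read from a JVM through the standard java.lang.management MXBeans. The class name NativeResourceReport and the key names are hypothetical, chosen only for illustration. As written it reports on the JVM it runs in; to watch Solr itself, the same MXBeans would have to be read from the Solr process, for example over JMX, which is assumed here and not shown.

```java
import java.lang.management.BufferPoolMXBean;
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of Solr: reads thread counts and NIO buffer
// pool usage from the JVM this code is running in.
public class NativeResourceReport {

    /** Returns a snapshot of live/peak thread counts and buffer pool usage. */
    public static Map<String, Long> snapshot() {
        Map<String, Long> stats = new LinkedHashMap<>();

        // Thread counts help with "unable to create new native thread".
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        stats.put("threads.live", (long) threads.getThreadCount());
        stats.put("threads.peak", (long) threads.getPeakThreadCount());

        // On HotSpot the pools are typically named "direct" and "mapped".
        for (BufferPoolMXBean pool
                : ManagementFactory.getPlatformMXBeans(BufferPoolMXBean.class)) {
            stats.put("buffers." + pool.getName() + ".count", pool.getCount());
            stats.put("buffers." + pool.getName() + ".usedBytes", pool.getMemoryUsed());
        }
        return stats;
    }

    public static void main(String[] args) {
        // Print one "name = value" line per metric.
        snapshot().forEach((name, value) -> System.out.println(name + " = " + value));
    }
}
```

Logging such a snapshot periodically would show whether the thread count or the buffer pools grow in the days before the OOM. Note that a memory-mapped index counts against virtual address space and the page cache, so a large "mapped" figure is not by itself evidence of a leak.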