I don't know exactly, but couldn't it be hitting a host-wide thread limit?
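
One rough way to check is to print the JVM's own thread count next to the limits the kernel and the container expose. The sketch below is only an illustration: the class name `ThreadLimitCheck` and its helper methods are hypothetical, and the file paths are the usual Linux and cgroup locations, which I am assuming rather than knowing for this particular Docker host. Files that do not exist are reported as "unavailable".

```java
import java.io.IOException;
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper, not part of Solr: compares this JVM's thread count
// with thread/PID limits visible from inside the container.
class ThreadLimitCheck {

    // Assumed locations; which ones exist depends on the kernel and cgroup version.
    static final String[] LIMIT_FILES = {
        "/proc/sys/kernel/threads-max",  // system-wide thread limit
        "/proc/sys/kernel/pid_max",      // system-wide PID/TID ceiling
        "/sys/fs/cgroup/pids/pids.max",  // cgroup v1 pids controller
        "/sys/fs/cgroup/pids.max"        // cgroup v2 pids controller
    };

    // Returns the trimmed file content, or "unavailable" if it cannot be read.
    static String readOrUnavailable(String file) {
        Path path = Paths.get(file);
        if (!Files.isReadable(path)) {
            return "unavailable";
        }
        try {
            return new String(Files.readAllBytes(path), StandardCharsets.UTF_8).trim();
        } catch (IOException e) {
            return "unavailable";
        }
    }

    // Number of live threads in this JVM only (daemon and non-daemon).
    static int liveThreads() {
        return ManagementFactory.getThreadMXBean().getThreadCount();
    }

    public static void main(String[] args) {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        System.out.println("JVM live threads: " + threads.getThreadCount());
        System.out.println("JVM peak threads: " + threads.getPeakThreadCount());
        for (String file : LIMIT_FILES) {
            System.out.println(file + " = " + readOrUnavailable(file));
        }
    }
}
```

Note that the JVM numbers cover only this one process, while a host-wide limit is shared by every process and container on the Docker host, so 150 to 160 threads in the heap dump would not rule it out. The host-side totals would have to be checked on the host itself.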

On Fri, May 1, 2020 at 11:02 AM Raji N <rajis...@gmail.com> wrote:

> Thanks for your reply . Sure will take a look at the docker host log. But
> even when we got "unable to create new native thread" error , the heap dump
> taken within hour before (we have hourly heap generation) the OOM did not
> have more than 150 to 160 threads. So it doesn't look like it happens due
> to running out of threads. Rather suspecting it happens because there is no
> native memory?.
>
> Thanks,
> Raji
>
> On Fri, May 1, 2020 at 12:13 AM Mikhail Khludnev <m...@apache.org> wrote:
>
> > > java.lang.OutOfMemoryError: unable to create new native thread
> > Usually mean code flaw, but there is a workaround to trigger heap GC.
> > It happens when app creates threads instead of proper pooling, and no GC
> > occurs, so java Thread objects hangs in heap in stopped state, but every of
> > them holds a native thread handler; and system run out of native threads
> > sooner or later. So, in this case reducing heap size, frees native thread
> > and app is able to recycle them. But you are right, it's rather better to
> > disable it.
> > Also, check docker host log, there's a specific error message for java
> > under docker.
> >
> > On Fri, May 1, 2020 at 3:55 AM Raji N <rajis...@gmail.com> wrote:
> >
> > > It used to occur every 3 days ,we reduced heap and it started
> > > occurring every 5 days . From the logs we can't get much. Some times we
> > > see "unable to create new native thread" in the logs and many times no
> > > exceptions .
> > > When it says "unable to create native thread" error , we got below
> > > exceptions as we use cdcr. To eliminate cdcr from this issue , we disabled
> > > CDCR also. But we still get OOM.
> > >
> > > WARN (cdcr-update-log-synchronizer-93-thread-1) [ ] o.a.s.h.CdcrUpdateLogSynchronizer Caught unexpected exception
> > >
> > > java.lang.OutOfMemoryError: unable to create new native thread
> > >     at java.lang.Thread.start0(Native Method) ~[?:1.8.0_211]
> > >     at java.lang.Thread.start(Thread.java:717) ~[?:1.8.0_211]
> > >     at org.apache.http.impl.client.IdleConnectionEvictor.start(IdleConnectionEvictor.java:96) ~[httpclient-4.5.3.jar:4.5.3]
> > >     at org.apache.http.impl.client.HttpClientBuilder.build(HttpClientBuilder.java:1219) ~[httpclient-4.5.3.jar:4.5.3]
> > >     at org.apache.solr.client.solrj.impl.HttpClientUtil.createClient(HttpClientUtil.java:319) ~[solr-solrj-7.6.0.jar:7.6.0 719cde97f84640faa1e3525690d262946571245f - nknize - 2018-12-07 14:47:53]
> > >     at org.apache.solr.client.solrj.impl.HttpClientUtil.createClient(HttpClientUtil.java:330) ~[solr-solrj-7.6.0.jar:7.6.0 719cde97f84640faa1e3525690d262946571245f - nknize - 2018-12-07 14:47:53]
> > >     at org.apache.solr.client.solrj.impl.HttpClientUtil.createClient(HttpClientUtil.java:268) ~[solr-solrj-7.6.0.jar:7.6.0 719cde97f84640faa1e3525690d262946571245f - nknize - 2018-12-07 14:47:53]
> > >     at org.apache.solr.client.solrj.impl.HttpClientUtil.createClient(HttpClientUtil.java:255) ~[solr-solrj-7.6.0.jar:7.6.0 719cde97f84640faa1e3525690d262946571245f - nknize - 2018-12-07 14:47:53]
> > >     at org.apache.solr.client.solrj.impl.HttpSolrClient.<init>(HttpSolrClient.java:200) ~[solr-solrj-7.6.0.jar:7.6.0 719cde97f84640faa1e3525690d262946571245f - nknize - 2018-12-07 14:47:53]
> > >     at org.apache.solr.client.solrj.impl.HttpSolrClient$Builder.build(HttpSolrClient.java:957) ~[solr-solrj-7.6.0.jar:7.6.0 719cde97f84640faa1e3525690d262946571245f - nknize - 2018-12-07 14:47:53]
> > >     at org.apache.solr.handler.CdcrUpdateLogSynchronizer$UpdateLogSynchronisation.run(CdcrUpdateLogSynchronizer.java:139) [solr-core-7.6.0.jar:7.6.0-SNAPSHOT 34d82ed033cccd8120431b73e93554b85b24a278 - i843100 - 2019-09-30 14:02:46]
> > >     at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [?:1.8.0_211]
> > >     at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308) [?:1.8.0_211]
> > >     at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180) [?:1.8.0_211]
> > >     at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294) [?:1.8.0_211]
> > >     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_211]
> > >     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_211]
> > >
> > > Thanks,
> > > Raji
> > >
> > > On Thu, Apr 30, 2020 at 12:24 AM Mikhail Khludnev <m...@apache.org> wrote:
> > >
> > > > Raji, how that "OOM for solr occur in every 5 days." exactly looks like?
> > > > What is the error message? Where it's occurring exactly?
> > > >
> > > > On Thu, Apr 30, 2020 at 1:30 AM Raji N <rajis...@gmail.com> wrote:
> > > >
> > > > > Thanks so much Jan. Will try your suggestions , yes we are also running
> > > > > solr inside docker.
> > > > >
> > > > > Thanks,
> > > > > Raji
> > > > >
> > > > > On Wed, Apr 29, 2020 at 1:46 PM Jan Høydahl <jan....@cominvent.com> wrote:
> > > > >
> > > > > > I have seen the same, but only in Docker.
> > > > > > I think it does not relate to Solr’s off-heap usage for filters and other
> > > > > > data structures, but rather how Docker treats memory-mapped files as
> > > > > > virtual memory.
> > > > > > As you know, when using MMapDirectoryFactory, you actually let Linux
> > > > > > handle the loading and unloading of the index files, and Solr will access
> > > > > > them as if they were in a huge virtual memory pool. Naturally the index
> > > > > > files grow large, and there is something strange going on in the way Docker
> > > > > > handles this, leading to OOM, not for Java heap but for the process.
> > > > > >
> > > > > > I have no definitive answer, but so far my research has found a few
> > > > > > possible settings
> > > > > >
> > > > > > Set env.var MALLOC_ARENA_MAX=2
> > > > > > Try to limit -XX:MaxDirectMemorySize
> > > > > > Lower mem swappiness in Docker (--memory-swappiness 0)
> > > > > > More generic insight into java mem allocation in Docker:
> > > > > > https://dzone.com/articles/native-memory-allocation-in-examples
> > > > > >
> > > > > > Have not yet found a silver bullet, so very interested in this thread.
> > > > > >
> > > > > > Jan
> > > > > >
> > > > > > > On 29 Apr 2020, at 19:26, Raji N <rajis...@gmail.com> wrote:
> > > > > > >
> > > > > > > Thank you for your reply. When OOM happens somehow it doesn't generate
> > > > > > > dump file. So we have hourly heaps running to diagnose this issue. Heap is
> > > > > > > around 700MB and threads around 150. But 29GB of native memory is used up,
> > > > > > > it is consumed by java.io.DirectBufferR (27GB major consumption) and
> > > > > > > java.io.DirectByteBuffer objects .
> > > > > > >
> > > > > > > We use solr 7.6.0 in solrcloud mode and OS is alpine . Java version
> > > > > > >
> > > > > > > java -version
> > > > > > > Picked up JAVA_TOOL_OPTIONS: -Dfile.encoding=UTF8
> > > > > > > java version "1.8.0_211"
> > > > > > > Java(TM) SE Runtime Environment (build 1.8.0_211-b12)
> > > > > > > Java HotSpot(TM) 64-Bit Server VM (build 25.211-b12, mixed mode)
> > > > > > >
> > > > > > > Thanks much for taking a look at it.
> > > > > > >
> > > > > > > Raji
> > > > > > >
> > > > > > > On Wed, Apr 29, 2020 at 10:04 AM Shawn Heisey <apa...@elyograg.org> wrote:
> > > > > > >
> > > > > > >> On 4/29/2020 2:07 AM, Raji N wrote:
> > > > > > >>> Has anyone encountered off-heap OOM. We are thinking of reducing heap
> > > > > > >>> further and increasing the hardcommit interval . Any other suggestions? .
> > > > > > >>> Please share your thoughts.
> > > > > > >>
> > > > > > >> It sounds like it's not heap memory that's running out.
> > > > > > >>
> > > > > > >> When the OutOfMemoryError is logged, it will also contain a message
> > > > > > >> mentioning which resource ran out.
> > > > > > >>
> > > > > > >> A common message that might be logged with the OOME is "Unable to create
> > > > > > >> native thread". This type of error, if that's what's happening,
> > > > > > >> actually has nothing at all to do with memory, OOME is just how Java
> > > > > > >> happens to report it.
> > > > > > >>
> > > > > > >> You will need to know exactly which resource is running out before we
> > > > > > >> can offer any assistance.
> > > > > > >>
> > > > > > >> If the OOME is logged, the message you're looking for will be in the
> > > > > > >> solr log, not the tiny special log that is created when Solr is killed
> > > > > > >> by an OOME. What version of Solr are you running, and what OS is it
> > > > > > >> running on?
> > > > > > >>
> > > > > > >> Thanks,
> > > > > > >> Shawn
> > > > > > >>
> > > >
> > > > --
> > > > Sincerely yours
> > > > Mikhail Khludnev
> >
> > --
> > Sincerely yours
> > Mikhail Khludnev
>

--
Sincerely yours
Mikhail Khludnev