Thanks Erick. I don't see anywhere that CDCR is not recommended for
production use. I took a thread dump and see about 140 CDCR threads:


"cdcr-replicator-219-thread-8" #787 prio=5 os_prio=0 tid=0x00007f7c34009000 nid=0x50a waiting on condition [0x00007f7ec871b000]
   java.lang.Thread.State: WAITING (parking)
        at sun.misc.Unsafe.park(Native Method)
        - parking to wait for  <0x00000001da724ca0> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
        at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2039)
        at java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:442)
        at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1074)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1134)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:748)




"cdcr-update-log-synchronizer-157-thread-1" #240 prio=5 os_prio=0 tid=0x00007f8782543800 nid=0x2e5 waiting on condition [0x00007f82ad99c000]
   java.lang.Thread.State: WAITING (parking)
        at sun.misc.Unsafe.park(Native Method)
        - parking to wait for  <0x00000001d7f9e8e8> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
        at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2039)
        at java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:1081)
        at java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:809)
        at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1074)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1134)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:748)
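
A per-pool count like the one above can be pulled out of a saved dump with a few lines of Python. This is only a rough sketch: it assumes the dump is jstack-style text where each Java thread starts with a quoted name followed by `#<id>`, and the function name `count_threads_by_pool` and the name-normalizing regex are illustrative, not anything that ships with Solr or the JDK.

```python
import re
from collections import Counter

# A Java thread header looks like: "name" #787 prio=5 ...
# The leading quote is optional because it is sometimes lost when pasting.
# JVM-internal threads without a "#<id>" field are deliberately not counted.
THREAD_HEADER = re.compile(r'^"?([^"]+)"\s+#\d+')


def count_threads_by_pool(dump_text):
    """Count Java threads per pool, collapsing numeric ids in thread names."""
    counts = Counter()
    for line in dump_text.splitlines():
        match = THREAD_HEADER.match(line.strip())
        if match:
            # "cdcr-replicator-219-thread-8" -> "cdcr-replicator-N-thread-N"
            pool = re.sub(r"\d+", "N", match.group(1))
            counts[pool] += 1
    return counts
```

Reading the dump file into a string and calling `count_threads_by_pool(text).most_common(10)` lists the largest pools first, which makes it easy to see whether the cdcr pools dominate.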


Thanks,

Raji

On Sun, Mar 29, 2020 at 5:18 AM Erick Erickson <erickerick...@gmail.com>
wrote:

> What that error usually means is that there are a zillion threads running.
>
> Try taking a thread dump. It’s _probable_ that it’s CDCR, but
> take a look at the thread dump to see if you have lots of
> threads that are running. And by “lots” here, I mean 100s of threads
> that reference the same component, in this case that have cdcr in
> the stack trace.
>
> CDCR is not getting active work at this point, so you might want to
> consider another replication strategy if you’re not willing to fix
> the code.
>
> Best,
> Erick
>
> > On Mar 29, 2020, at 4:17 AM, Raji N <rajis...@gmail.com> wrote:
> >
> > Hi All,
> >
> > We are running SolrCloud 7.6 (with the patch
> > https://issues.apache.org/jira/secure/attachment/12969150/SOLR-11724.patch)
> > on production on 7 hosts in containers. The container memory is 48GB and
> > the heap is 24GB.
> > ulimit -v
> >
> > unlimited
> >
> > ulimit -m
> >
> > unlimited
> > We don't have any custom code in Solr. We have set up bidirectional CDCR
> > between the primary and secondary datacenters. Our secondary DC is very
> > unstable, and many instances are often down.
> >
> > We get the exception below quite often. Is this because the CDCR
> > connection is broken?
> >
> > WARN  (cdcr-update-log-synchronizer-80-thread-1) [   ] o.a.s.h.CdcrUpdateLogSynchronizer Caught unexpected exception
> > java.lang.OutOfMemoryError: unable to create new native thread
> >         at java.lang.Thread.start0(Native Method) ~[?:1.8.0_211]
> >         at java.lang.Thread.start(Thread.java:717) ~[?:1.8.0_211]
> >         at org.apache.http.impl.client.IdleConnectionEvictor.start(IdleConnectionEvictor.java:96) ~[httpclient-4.5.3.jar:4.5.3]
> >         at org.apache.http.impl.client.HttpClientBuilder.build(HttpClientBuilder.java:1219) ~[httpclient-4.5.3.jar:4.5.3]
> >         at org.apache.solr.client.solrj.impl.HttpClientUtil.createClient(HttpClientUtil.java:319) ~[solr-solrj-7.6.0.jar:7.6.0 719cde97f84640faa1e3525690d262946571245f - nknize - 2018-12-07 14:47:53]
> >         at org.apache.solr.client.solrj.impl.HttpClientUtil.createClient(HttpClientUtil.java:330) ~[solr-solrj-7.6.0.jar:7.6.0 719cde97f84640faa1e3525690d262946571245f - nknize - 2018-12-07 14:47:53]
> >         at org.apache.solr.client.solrj.impl.HttpClientUtil.createClient(HttpClientUtil.java:268) ~[solr-solrj-7.6.0.jar:7.6.0 719cde97f84640faa1e3525690d262946571245f - nknize - 2018-12-07 14:47:53]
> >         at org.apache.solr.client.solrj.impl.HttpClientUtil.createClient(HttpClientUtil.java:255) ~[solr-solrj-7.6.0.jar:7.6.0 719cde97f84640faa1e3525690d262946571245f - nknize - 2018-12-07 14:47:53]
> >         at org.apache.solr.client.solrj.impl.HttpSolrClient.<init>(HttpSolrClient.java:200) ~[solr-solrj-7.6.0.jar:7.6.0 719cde97f84640faa1e3525690d262946571245f - nknize - 2018-12-07 14:47:53]
> >         at org.apache.solr.client.solrj.impl.HttpSolrClient$Builder.build(HttpSolrClient.java:957) ~[solr-solrj-7.6.0.jar:7.6.0 719cde97f84640faa1e3525690d262946571245f - nknize - 2018-12-07 14:47:53]
> >         at org.apache.solr.handler.CdcrUpdateLogSynchronizer$UpdateLogSynchronisation.run(CdcrUpdateLogSynchronizer.java:139) [solr-core-7.6.0.jar:7.6.0-SNAPSHOT 34d82ed033cccd8120431b73e93554b85b24a278 - i843100 - 2019-09-30 14:02:46]
> >         at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [?:1.8.0_211]
> >         at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308) [?:1.8.0_211]
> >         at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180) [?:1.8.0_211]
> >         at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294) [?:1.8.0_211]
> >         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_211]
> >         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_211]
> >         at java.lang.Thread.run(Thread.java:748) [?:1.8.0_211]
> >
> > Thanks,
> > Raji
>
>
