Why do you think Solr is hung? There are a bunch of reasons a Solr instance may be down. The fact that it's not accepting requests or not recovering may simply indicate that it never started correctly in the first place.
I'd first suggest you look at the Solr logs on the machines in question; they may give you some useful information, especially if you have a problem with your config files (schema.xml etc.).

Beyond that, though, I'd suggest you contact Cloudera's user list, as those folks are likely to be more familiar with how to troubleshoot this using the tools provided by CDH.

Best,
Erick

On Sun, Mar 20, 2016 at 6:47 PM, 王淇霖 <wan...@xib.com.cn> wrote:
>
> Hi all,
> We have deployed solrcloud with CDH5.4. there are 4 solr server in this cluster.
> Now 2 solr server is down, and can’t start two, solr thread is hang.
> Please help to check out. Thanks.
>
>
> "Signal Dispatcher" #5 daemon prio=9 os_prio=0 tid=0x00007fc9b81e3800 nid=0x7c35 runnable [0x0000000000000000]
>    java.lang.Thread.State: RUNNABLE
>
> "Surrogate Locker Thread (Concurrent GC)" #4 daemon prio=9 os_prio=0 tid=0x00007fc9b81e2800 nid=0x7c34 waiting on condition [0x0000000000000000]
>    java.lang.Thread.State: RUNNABLE
>
> "Finalizer" #3 daemon prio=8 os_prio=0 tid=0x00007fc9b81aa800 nid=0x7c33 in Object.wait() [0x00007fc1204f3000]
>    java.lang.Thread.State: WAITING (on object monitor)
>         at java.lang.Object.wait(Native Method)
>         at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:143)
>         - locked <0x00007fc1c39b6608> (a java.lang.ref.ReferenceQueue$Lock)
>         at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:164)
>         at java.lang.ref.Finalizer$FinalizerThread.run(Finalizer.java:209)
>
> "Reference Handler" #2 daemon prio=10 os_prio=0 tid=0x00007fc9b81a8800 nid=0x7c32 in Object.wait() [0x00007fc1205f4000]
>    java.lang.Thread.State: WAITING (on object monitor)
>         at java.lang.Object.wait(Native Method)
>         at java.lang.Object.wait(Object.java:502)
>         at java.lang.ref.Reference$ReferenceHandler.run(Reference.java:157)
>         - locked <0x00007fc1c3a068a0> (a java.lang.ref.Reference$Lock)
>
> "main" #1 prio=5 os_prio=0 tid=0x00007fc9b800a800 nid=0x7c18 waiting on condition [0x00007fc9beed8000]
>    java.lang.Thread.State: WAITING (parking)
>         at sun.misc.Unsafe.park(Native Method)
>         - parking to wait for <0x00007fc1c39eff60> (a java.util.concurrent.FutureTask)
>         at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
>         at java.util.concurrent.FutureTask.awaitDone(FutureTask.java:429)
>         at java.util.concurrent.FutureTask.get(FutureTask.java:191)
>         at java.util.concurrent.AbstractExecutorService.invokeAll(AbstractExecutorService.java:244)
>         at org.apache.solr.core.CoreContainer.load(CoreContainer.java:266)
>         at org.apache.solr.servlet.SolrDispatchFilter.createCoreContainer(SolrDispatchFilter.java:193)
>         at org.apache.solr.servlet.SolrDispatchFilter.init(SolrDispatchFilter.java:140)
>         at org.apache.catalina.core.ApplicationFilterConfig.getFilter(ApplicationFilterConfig.java:298)
>         at org.apache.catalina.core.ApplicationFilterConfig.<init>(ApplicationFilterConfig.java:119)
>         at org.apache.catalina.core.StandardContext.filterStart(StandardContext.java:4076)
>         - locked <0x00007fc1c39f12a8> (a java.util.HashMap)
>         at org.apache.catalina.core.StandardContext.start(StandardContext.java:4730)
>         - locked <0x00007fc1c39b0830> (a org.apache.catalina.core.StandardContext)
>         at org.apache.catalina.core.ContainerBase.addChildInternal(ContainerBase.java:802)
>         at org.apache.catalina.core.ContainerBase.addChild(ContainerBase.java:779)
>         at org.apache.catalina.core.StandardHost.addChild(StandardHost.java:583)
>         at org.apache.catalina.startup.HostConfig.deployDirectory(HostConfig.java:1080)
>         at org.apache.catalina.startup.HostConfig.deployDirectories(HostConfig.java:1003)
>         at org.apache.catalina.startup.HostConfig.deployApps(HostConfig.java:507)
>         at org.apache.catalina.startup.HostConfig.start(HostConfig.java:1322)
>         at org.apache.catalina.startup.HostConfig.lifecycleEvent(HostConfig.java:325)
>         at org.apache.catalina.util.LifecycleSupport.fireLifecycleEvent(LifecycleSupport.java:142)
>         at org.apache.catalina.core.ContainerBase.start(ContainerBase.java:1068)
>         - locked <0x00007fc1c3a30200> (a org.apache.catalina.core.StandardHost)
>         at org.apache.catalina.core.StandardHost.start(StandardHost.java:822)
>         - locked <0x00007fc1c3a30200> (a org.apache.catalina.core.StandardHost)
>         at org.apache.catalina.core.ContainerBase.start(ContainerBase.java:1060)
>         - locked <0x00007fc1c3a33a80> (a org.apache.catalina.core.StandardEngine)
>         at org.apache.catalina.core.StandardEngine.start(StandardEngine.java:463)
>         at org.apache.catalina.core.StandardService.start(StandardService.java:525)
>         - locked <0x00007fc1c3a33a80> (a org.apache.catalina.core.StandardEngine)
>         at org.apache.catalina.core.StandardServer.start(StandardServer.java:759)
>         - locked <0x00007fc1c39ebd28> (a [Lorg.apache.catalina.Service;)
>         at org.apache.catalina.startup.Catalina.start(Catalina.java:595)
>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>         at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>         at java.lang.reflect.Method.invoke(Method.java:497)
>         at org.apache.catalina.startup.Bootstrap.start(Bootstrap.java:289)
>         at org.apache.catalina.startup.Bootstrap.main(Bootstrap.java:414)
>
> "VM Thread" os_prio=0 tid=0x00007fc9b81a3000 nid=0x7c31 runnable
>
> "Gang worker#0 (Parallel GC Threads)" os_prio=0 tid=0x00007fc9b801b800 nid=0x7c19 runnable
>
> "Gang worker#1 (Parallel GC Threads)" os_prio=0 tid=0x00007fc9b801d800 nid=0x7c1a runnable
>
> "Gang worker#2 (Parallel GC Threads)" os_prio=0 tid=0x00007fc9b801f000 nid=0x7c1b runnable
>
> "Gang worker#3 (Parallel GC Threads)" os_prio=0 tid=0x00007fc9b8021000 nid=0x7c1c runnable
>
> "Gang worker#4 (Parallel GC Threads)" os_prio=0 tid=0x00007fc9b8022800 nid=0x7c1d runnable
>
> "Gang worker#5 (Parallel GC Threads)" os_prio=0 tid=0x00007fc9b8024800 nid=0x7c1e runnable
>
> "Gang worker#6 (Parallel GC Threads)" os_prio=0 tid=0x00007fc9b8026000 nid=0x7c1f runnable
>
> "Gang worker#7 (Parallel GC Threads)" os_prio=0 tid=0x00007fc9b8028000 nid=0x7c20 runnable
>
> "Gang worker#8 (Parallel GC Threads)" os_prio=0 tid=0x00007fc9b8029800 nid=0x7c21 runnable
>
> "Gang worker#9 (Parallel GC Threads)" os_prio=0 tid=0x00007fc9b802b800 nid=0x7c22 runnable
>
> "Gang worker#10 (Parallel GC Threads)" os_prio=0 tid=0x00007fc9b802d000 nid=0x7c23 runnable
>
> "Gang worker#11 (Parallel GC Threads)" os_prio=0 tid=0x00007fc9b802f000 nid=0x7c24 runnable
>
> "Gang worker#12 (Parallel GC Threads)" os_prio=0 tid=0x00007fc9b8030800 nid=0x7c25 runnable
>
> "Gang worker#13 (Parallel GC Threads)" os_prio=0 tid=0x00007fc9b8032800 nid=0x7c26 runnable
>
> "Gang worker#14 (Parallel GC Threads)" os_prio=0 tid=0x00007fc9b8034000 nid=0x7c27 runnable
>
> "Gang worker#15 (Parallel GC Threads)" os_prio=0 tid=0x00007fc9b8036000 nid=0x7c28 runnable
>
> "Gang worker#16 (Parallel GC Threads)" os_prio=0 tid=0x00007fc9b8037800 nid=0x7c29 runnable
>
> "Gang worker#17 (Parallel GC Threads)" os_prio=0 tid=0x00007fc9b8039800 nid=0x7c2a runnable
>
> "Gang worker#18 (Parallel GC Threads)" os_prio=0 tid=0x00007fc9b803b000 nid=0x7c2b runnable
>
> "Gang worker#19 (Parallel GC Threads)" os_prio=0 tid=0x00007fc9b803d000 nid=0x7c2c runnable
>
> "Gang worker#20 (Parallel GC Threads)" os_prio=0 tid=0x00007fc9b803e800 nid=0x7c2d runnable
>
> "Gang worker#21 (Parallel GC Threads)" os_prio=0 tid=0x00007fc9b8040800 nid=0x7c2e runnable
>
> "Gang worker#22 (Parallel GC Threads)" os_prio=0 tid=0x00007fc9b8042000 nid=0x7c2f runnable
>
> "Concurrent Mark-Sweep GC Thread" os_prio=0 tid=0x00007fc9b814f800 nid=0x7c30 runnable
>
> "VM Periodic Task Thread" os_prio=0 tid=0x00007fc9b821b000 nid=0x7c46 waiting on condition
>
> JNI global references: 258
>
> regards,
> wangql
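For what it's worth, a dump like the one quoted above is easier to read if you filter it by stack frame. In the quoted dump, the "main" thread is parked in FutureTask.get underneath CoreContainer.load, so it appears to be waiting on tasks started during core loading rather than serving requests; the logs should say more about what those tasks are doing. Below is a minimal Python sketch for that kind of filtering. The names parse_thread_dump, threads_in, and SAMPLE are made up for illustration and are not part of Solr, CDH, or the JDK. It assumes each header and each "at ..." frame sits on a single line, so a dump that a mail client has re-wrapped needs to be unwrapped first.

```python
import re
from typing import Dict, List

# A thread header starts with the quoted thread name, e.g.  "main" #1 prio=5 ...
HEADER = re.compile(r'^"(?P<name>[^"]+)"')
# The state line, e.g.  java.lang.Thread.State: WAITING (parking)
STATE = re.compile(r'java\.lang\.Thread\.State:\s+(?P<state>[A-Z_]+)')
# A stack frame, e.g.  at java.lang.Object.wait(Native Method)
FRAME = re.compile(r'^at\s+(?P<frame>.+)$')


def parse_thread_dump(text: str) -> List[Dict]:
    """Split jstack-style text into one record per thread."""
    threads: List[Dict] = []
    current = None
    for raw in text.splitlines():
        # Strip leading whitespace and email quote markers (">").
        line = re.sub(r'^[>\s]+', '', raw).rstrip()
        header = HEADER.match(line)
        if header:
            current = {"name": header.group("name"), "state": None, "frames": []}
            threads.append(current)
            continue
        if current is None:
            continue  # text before the first thread header
        state = STATE.search(line)
        if state:
            current["state"] = state.group("state")
            continue
        frame = FRAME.match(line)
        if frame:
            current["frames"].append(frame.group("frame"))
    return threads


def threads_in(threads: List[Dict], needle: str) -> List[Dict]:
    """Return the threads whose stack contains a frame mentioning `needle`."""
    return [t for t in threads if any(needle in f for f in t["frames"])]


# Excerpt of the quoted dump, with wrapped lines joined (illustrative only).
SAMPLE = '''\
> "Finalizer" #3 daemon prio=8 os_prio=0 tid=0x00007fc9b81aa800 nid=0x7c33 in Object.wait() [0x00007fc1204f3000]
>    java.lang.Thread.State: WAITING (on object monitor)
>         at java.lang.Object.wait(Native Method)
>         at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:143)
>
> "main" #1 prio=5 os_prio=0 tid=0x00007fc9b800a800 nid=0x7c18 waiting on condition [0x00007fc9beed8000]
>    java.lang.Thread.State: WAITING (parking)
>         at sun.misc.Unsafe.park(Native Method)
>         at java.util.concurrent.FutureTask.get(FutureTask.java:191)
>         at java.util.concurrent.AbstractExecutorService.invokeAll(AbstractExecutorService.java:244)
>         at org.apache.solr.core.CoreContainer.load(CoreContainer.java:266)
'''

if __name__ == "__main__":
    for t in threads_in(parse_thread_dump(SAMPLE), "CoreContainer.load"):
        print(t["name"], t["state"])
```

Running the same filter against a full dump with other needles (for example a Solr or ZooKeeper package name) may show which threads, if any, the startup is actually waiting on.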