ankitsultana opened a new issue, #9019: URL: https://github.com/apache/pinot/issues/9019
Around a month ago, we had seen an issue where Pinot brokers for one of our High QPS use-cases had what looked like a deadlock/livelock related issue. The brokers were serving traffic around 300-400 QPS and during the issue the number of the threads in the brokers kept on increasing (to up to 30k/40k). When we took a thread-dump, we saw around 30k threads with the following stack-trace. We were able to remediate the issue by increasing the number of brokers in the tenant to lower the average QPS per broker-instance. We didn't have a chance to dive-deep into the issue but thought of sharing it here in case someone here has seen this issue as well. ``` "jersey-server-managed-async-executor-154433" #173980 prio=5 os_prio=0 tid=0x00007f5de8468000 nid=0xa1b8 waiting on condition [0x00007f5a0a868000] java.lang.Thread.State: WAITING (parking) at sun.misc.Unsafe.park(Native Method) - parking to wait for <0x00007f6182902770> (a java.util.concurrent.locks.ReentrantReadWriteLock$NonfairSync) at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175) at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836) at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:870) at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1199) at java.util.concurrent.locks.ReentrantReadWriteLock$WriteLock.lock(ReentrantReadWriteLock.java:943) at org.glassfish.hk2.utilities.general.Hk2ThreadLocal.get(Hk2ThreadLocal.java:108) at org.jvnet.hk2.internal.PerLocatorUtilities.getAutoAnalyzerName(PerLocatorUtilities.java:166) at org.jvnet.hk2.internal.ConstantActiveDescriptor.<init>(ConstantActiveDescriptor.java:84) at org.jvnet.hk2.internal.ServiceLocatorImpl.internalGetInjecteeDescriptor(ServiceLocatorImpl.java:562) at org.jvnet.hk2.internal.ServiceLocatorImpl.getInjecteeDescriptor(ServiceLocatorImpl.java:587) at org.jvnet.hk2.internal.ThreeThirtyResolver.resolve(ThreeThirtyResolver.java:70) at org.jvnet.hk2.internal.ClazzCreator.resolve(ClazzCreator.java:212) at org.jvnet.hk2.internal.ClazzCreator.resolveAllDependencies(ClazzCreator.java:229) at org.jvnet.hk2.internal.ClazzCreator.create(ClazzCreator.java:358) at org.jvnet.hk2.internal.SystemDescriptor.create(SystemDescriptor.java:487) at org.jvnet.hk2.internal.PerLookupContext.findOrCreate(PerLookupContext.java:70) at org.jvnet.hk2.internal.Utilities.createService(Utilities.java:2022) at org.jvnet.hk2.internal.ServiceLocatorImpl.internalGetService(ServiceLocatorImpl.java:774) at org.jvnet.hk2.internal.ServiceLocatorImpl.internalGetService(ServiceLocatorImpl.java:737) at org.jvnet.hk2.internal.ServiceLocatorImpl.getService(ServiceLocatorImpl.java:733) at org.glassfish.jersey.inject.hk2.SupplierFactoryBridge.provide(SupplierFactoryBridge.java:74) at org.jvnet.hk2.internal.FactoryCreator.create(FactoryCreator.java:153) at org.jvnet.hk2.internal.SystemDescriptor.create(SystemDescriptor.java:487) at org.glassfish.jersey.inject.hk2.RequestContext.findOrCreate(RequestContext.java:59) at org.jvnet.hk2.internal.Utilities.createService(Utilities.java:2022) at org.jvnet.hk2.internal.ServiceHandleImpl.getService(ServiceHandleImpl.java:114) - locked <0x00007f7e5d003c88> (a java.lang.Object) at org.jvnet.hk2.internal.ServiceHandleImpl.getService(ServiceHandleImpl.java:88) at org.glassfish.jersey.inject.hk2.ContextInjectionResolverImpl.resolve(ContextInjectionResolverImpl.java:103) at org.glassfish.jersey.inject.hk2.ContextInjectionResolverImpl.resolve(ContextInjectionResolverImpl.java:121) at org.glassfish.jersey.server.internal.inject.DelegatedInjectionValueParamProvider.lambda$getValueProvider$0(DelegatedInjectionValueParamProvider.java:67) at org.glassfish.jersey.server.internal.inject.DelegatedInjectionValueParamProvider$$Lambda$199/1475889071.apply(Unknown Source) at org.glassfish.jersey.server.spi.internal.ParamValueFactoryWithSource.apply(ParamValueFactoryWithSource.java:50) at org.glassfish.jersey.server.spi.internal.ParameterValueHelper.getParameterValues(ParameterValueHelper.java:64) at org.glassfish.jersey.server.model.internal.JavaResourceMethodDispatcherProvider$AbstractMethodParamInvoker.getParamValues(JavaResourceMethodDispatcherProvider.java:109) at org.glassfish.jersey.server.model.internal.JavaResourceMethodDispatcherProvider$VoidOutInvoker.doDispatch(JavaResourceMethodDispatcherProvider.java:159) at org.glassfish.jersey.server.model.internal.AbstractJavaResourceMethodDispatcher.dispatch(AbstractJavaResourceMethodDispatcher.java:79) at org.glassfish.jersey.server.model.ResourceMethodInvoker.invoke(ResourceMethodInvoker.java:469) at org.glassfish.jersey.server.model.ResourceMethodInvoker.lambda$apply$0(ResourceMethodInvoker.java:381) at org.glassfish.jersey.server.model.ResourceMethodInvoker$$Lambda$233/665077335.call(Unknown Source) at org.glassfish.jersey.server.ServerRuntime$AsyncResponder$2$1.run(ServerRuntime.java:819) at org.glassfish.jersey.internal.Errors$1.call(Errors.java:248) at org.glassfish.jersey.internal.Errors$1.call(Errors.java:244) at org.glassfish.jersey.internal.Errors.process(Errors.java:292) at org.glassfish.jersey.internal.Errors.process(Errors.java:274) at org.glassfish.jersey.internal.Errors.process(Errors.java:244) at org.glassfish.jersey.process.internal.RequestScope.runInScope(RequestScope.java:265) at org.glassfish.jersey.server.ServerRuntime$AsyncResponder$2.run(ServerRuntime.java:814) at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) at java.util.concurrent.FutureTask.run(FutureTask.java:266) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) at java.lang.Thread.run(Thread.java:748) ``` Here's the paste for the possible object which could be part of a deadlock or livelock. ``` ┌ /private/tmp 130 ↵ ❯❯❯ cat ~/Desktop/a.thdump| grep "0x00007f6182902770" | wc -l 33893 ``` -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org --------------------------------------------------------------------- To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org For additional commands, e-mail: commits-h...@pinot.apache.org