ankitsultana opened a new issue, #9019:
URL: https://github.com/apache/pinot/issues/9019

   About a month ago, the Pinot brokers for one of our high-QPS use cases hit 
what looked like a deadlock or livelock. The brokers were serving around 
300-400 QPS, and during the incident the number of threads in the brokers kept 
increasing (up to 30k-40k). When we took a thread dump, we saw around 30k 
threads with the following stack trace.
   
   We remediated the issue by adding brokers to the tenant, which lowered the 
average QPS per broker instance. We did not get a chance to dig deeper into the 
root cause, but we are sharing it here in case someone else has seen the same 
issue.
   
   ```
   "jersey-server-managed-async-executor-154433" #173980 prio=5 os_prio=0 tid=0x00007f5de8468000 nid=0xa1b8 waiting on condition [0x00007f5a0a868000]
      java.lang.Thread.State: WAITING (parking)
           at sun.misc.Unsafe.park(Native Method)
           - parking to wait for  <0x00007f6182902770> (a java.util.concurrent.locks.ReentrantReadWriteLock$NonfairSync)
           at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
           at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
           at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:870)
           at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1199)
           at java.util.concurrent.locks.ReentrantReadWriteLock$WriteLock.lock(ReentrantReadWriteLock.java:943)
           at org.glassfish.hk2.utilities.general.Hk2ThreadLocal.get(Hk2ThreadLocal.java:108)
           at org.jvnet.hk2.internal.PerLocatorUtilities.getAutoAnalyzerName(PerLocatorUtilities.java:166)
           at org.jvnet.hk2.internal.ConstantActiveDescriptor.<init>(ConstantActiveDescriptor.java:84)
           at org.jvnet.hk2.internal.ServiceLocatorImpl.internalGetInjecteeDescriptor(ServiceLocatorImpl.java:562)
           at org.jvnet.hk2.internal.ServiceLocatorImpl.getInjecteeDescriptor(ServiceLocatorImpl.java:587)
           at org.jvnet.hk2.internal.ThreeThirtyResolver.resolve(ThreeThirtyResolver.java:70)
           at org.jvnet.hk2.internal.ClazzCreator.resolve(ClazzCreator.java:212)
           at org.jvnet.hk2.internal.ClazzCreator.resolveAllDependencies(ClazzCreator.java:229)
           at org.jvnet.hk2.internal.ClazzCreator.create(ClazzCreator.java:358)
           at org.jvnet.hk2.internal.SystemDescriptor.create(SystemDescriptor.java:487)
           at org.jvnet.hk2.internal.PerLookupContext.findOrCreate(PerLookupContext.java:70)
           at org.jvnet.hk2.internal.Utilities.createService(Utilities.java:2022)
           at org.jvnet.hk2.internal.ServiceLocatorImpl.internalGetService(ServiceLocatorImpl.java:774)
           at org.jvnet.hk2.internal.ServiceLocatorImpl.internalGetService(ServiceLocatorImpl.java:737)
           at org.jvnet.hk2.internal.ServiceLocatorImpl.getService(ServiceLocatorImpl.java:733)
           at org.glassfish.jersey.inject.hk2.SupplierFactoryBridge.provide(SupplierFactoryBridge.java:74)
           at org.jvnet.hk2.internal.FactoryCreator.create(FactoryCreator.java:153)
           at org.jvnet.hk2.internal.SystemDescriptor.create(SystemDescriptor.java:487)
           at org.glassfish.jersey.inject.hk2.RequestContext.findOrCreate(RequestContext.java:59)
           at org.jvnet.hk2.internal.Utilities.createService(Utilities.java:2022)
           at org.jvnet.hk2.internal.ServiceHandleImpl.getService(ServiceHandleImpl.java:114)
           - locked <0x00007f7e5d003c88> (a java.lang.Object)
           at org.jvnet.hk2.internal.ServiceHandleImpl.getService(ServiceHandleImpl.java:88)
           at org.glassfish.jersey.inject.hk2.ContextInjectionResolverImpl.resolve(ContextInjectionResolverImpl.java:103)
           at org.glassfish.jersey.inject.hk2.ContextInjectionResolverImpl.resolve(ContextInjectionResolverImpl.java:121)
           at org.glassfish.jersey.server.internal.inject.DelegatedInjectionValueParamProvider.lambda$getValueProvider$0(DelegatedInjectionValueParamProvider.java:67)
           at org.glassfish.jersey.server.internal.inject.DelegatedInjectionValueParamProvider$$Lambda$199/1475889071.apply(Unknown Source)
           at org.glassfish.jersey.server.spi.internal.ParamValueFactoryWithSource.apply(ParamValueFactoryWithSource.java:50)
           at org.glassfish.jersey.server.spi.internal.ParameterValueHelper.getParameterValues(ParameterValueHelper.java:64)
           at org.glassfish.jersey.server.model.internal.JavaResourceMethodDispatcherProvider$AbstractMethodParamInvoker.getParamValues(JavaResourceMethodDispatcherProvider.java:109)
           at org.glassfish.jersey.server.model.internal.JavaResourceMethodDispatcherProvider$VoidOutInvoker.doDispatch(JavaResourceMethodDispatcherProvider.java:159)
           at org.glassfish.jersey.server.model.internal.AbstractJavaResourceMethodDispatcher.dispatch(AbstractJavaResourceMethodDispatcher.java:79)
           at org.glassfish.jersey.server.model.ResourceMethodInvoker.invoke(ResourceMethodInvoker.java:469)
           at org.glassfish.jersey.server.model.ResourceMethodInvoker.lambda$apply$0(ResourceMethodInvoker.java:381)
           at org.glassfish.jersey.server.model.ResourceMethodInvoker$$Lambda$233/665077335.call(Unknown Source)
           at org.glassfish.jersey.server.ServerRuntime$AsyncResponder$2$1.run(ServerRuntime.java:819)
           at org.glassfish.jersey.internal.Errors$1.call(Errors.java:248)
           at org.glassfish.jersey.internal.Errors$1.call(Errors.java:244)
           at org.glassfish.jersey.internal.Errors.process(Errors.java:292)
           at org.glassfish.jersey.internal.Errors.process(Errors.java:274)
           at org.glassfish.jersey.internal.Errors.process(Errors.java:244)
           at org.glassfish.jersey.process.internal.RequestScope.runInScope(RequestScope.java:265)
           at org.glassfish.jersey.server.ServerRuntime$AsyncResponder$2.run(ServerRuntime.java:814)
           at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
           at java.util.concurrent.FutureTask.run(FutureTask.java:266)
           at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
           at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
           at java.lang.Thread.run(Thread.java:748)
   ```
   
   The lock object that these threads are parked on may be the one involved in 
the deadlock or livelock. Here is how many lines of the thread dump reference 
its address:
   
   ```
   ┌  /private/tmp 130 ↵
   ❯❯❯ cat ~/Desktop/a.thdump| grep "0x00007f6182902770" | wc -l
      33893
   ```
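
   For anyone who wants to repeat this check on their own dump, here is a 
rough Python sketch of the same idea. It is not part of Pinot or of any JVM 
tooling; the function names `count_parked_by_lock` and 
`count_lines_mentioning` are made up for illustration, and the regex assumes 
the HotSpot `- parking to wait for  <0x...>` line format shown in the trace 
above. Grouping by lock address makes it easy to see whether a single lock 
dominates, without knowing the address in advance.

```python
import re
from collections import Counter

# Assumed format (taken from the trace in this report):
#   - parking to wait for  <0x00007f6182902770> (a ...)
PARKING_RE = re.compile(r"parking to wait for\s+<(0x[0-9a-fA-F]+)>")


def count_parked_by_lock(dump_text: str) -> Counter:
    """Count 'parking to wait for' entries per lock address."""
    return Counter(PARKING_RE.findall(dump_text))


def count_lines_mentioning(dump_text: str, address: str) -> int:
    """Rough equivalent of `grep ADDRESS | wc -l` on the dump text."""
    return sum(1 for line in dump_text.splitlines() if address in line)
```

   To use it, read the dump file into a string and call 
`count_parked_by_lock(text).most_common(5)` to list the most contended lock 
addresses. Note that the grep-style count includes every line that mentions 
the address, so it can be slightly higher than the number of parked threads.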

