jmint-stripe opened a new issue #8352:
URL: https://github.com/apache/pinot/issues/8352


   We observed a situation with our Pinot cluster where a host got stuck 
processing Helix messages and the Helix task queue got backed up and kept 
increasing in size until we restarted the server and at that point it 
recovered. We eventually tracked down what we believe is the root cause which 
were some Zookeeper connection disconnects.
   
   We would expect that once the ZK connection is re-established it would be 
able to process the queued messages.
   
   Below are some logs from our investigation that shows how the message for 
adding a new consuming segment did not get processed on a server.
   
   ----------------
   
   From inspecting our logs we see the controller assigns new segment to two 
streaming servers:
   ```
   2022/02/10 17:29:30.901 INFO [PinotLLCRealtimeSegmentManager] 
[grizzly-http-server-13] Adding new CONSUMING segment: SEGMENT_NAME_REDACTED to 
instances: [ServerA, ServerB]
   2022/02/10 17:29:41.293 INFO [MessageDispatchStage] 
[HelixController-pipeline-default-PinotClusterRedacted-(f382b6fe_DEFAULT)] 
Event f382b6fe_DEFAULT : Sending Message 06e151ca-965b-47d6-9dcc-bbd4a0d02463 
to ServerA transit point_transaction_REALTIME.SEGMENT_NAME_REDACTED|[] 
from:OFFLINE to:CONSUMING, relayMessages: 0
   2022/02/10 17:29:41.293 INFO [MessageDispatchStage] 
[HelixController-pipeline-default-PinotClusterRedacted-(f382b6fe_DEFAULT)] 
Event f382b6fe_DEFAULT : Sending Message 7074fc66-0986-4199-bc3d-24825eaaf79b 
to ServerB transit point_transaction_REALTIME.SEGMENT_NAME_REDACTED|[] 
from:OFFLINE to:CONSUMING, relayMessages: 0
   ```
   
   Each server processes the message
   
   ServerA - we see it actually handle the message
   ```
   2022/02/10 17:29:41.311 INFO [HelixTaskExecutor] 
[ZkClient-EventThread-49-pinot-zookeeper-PinotClusterRedacted.service.consul:2181]
 Scheduling message 06e151ca-965b-47d6-9dcc-bbd4a0d02463: 
point_transaction_REALTIME:SEGMENT_NAME_REDACTED, OFFLINE->CONSUMING
   2022/02/10 17:29:41.316 INFO [HelixStateTransitionHandler] 
[HelixTaskExecutor-message_handle_STATE_TRANSITION] handling message: 
06e151ca-965b-47d6-9dcc-bbd4a0d02463 transit 
point_transaction_REALTIME.SEGMENT_NAME_REDACTED|[] from:OFFLINE to:CONSUMING, 
relayedFrom: null
   ```
   
   ServerB - we do not see any further log messages for this message
   ```
   2022/02/10 17:29:41.306 INFO [HelixTaskExecutor] 
[ZkClient-EventThread-49-pinot-zookeeper-PinotClusterRedacted.service.consul:2181]
 Scheduling message 7074fc66-0986-4199-bc3d-24825eaaf79b: 
point_transaction_REALTIME:SEGMENT_NAME_REDACTED, OFFLINE->CONSUMING
   ```
   
   if we look up the message UUID 7074fc66-0986-4199-bc3d-24825eaaf79b, we see 
that it’s scheduled, but the number queued tasks on that threadpool is 1133
   ```
   2022/02/10 17:29:41.306 INFO [HelixTaskExecutor] 
[ZkClient-EventThread-49-pinot-zookeeper-PinotClusterRedacted.service.consul:2181]
 Message: 7074fc66-0986-4199-bc3d-24825eaaf79b handling task scheduled
   2022/02/10 17:29:41.306 INFO [HelixTaskExecutor] 
[ZkClient-EventThread-49-pinot-zookeeper-PinotClusterRedacted.service.consul:2181]
 Submit task: 7074fc66-0986-4199-bc3d-24825eaaf79b to pool: 
java.util.concurrent.ThreadPoolExecutor@92d8506[Running, pool size = 40, active 
threads = 40, queued tasks = 1133, completed tasks = 1147]
   2022/02/10 17:29:41.306 INFO [HelixTaskExecutor] 
[ZkClient-EventThread-49-pinot-zookeeper-PinotClusterRedacted.service.consul:2181]
 Scheduling message 7074fc66-0986-4199-bc3d-24825eaaf79b: 
point_transaction_REALTIME:SEGMENT_NAME_REDACTED, OFFLINE->CONSUMING
   2022/02/10 17:29:41.303 INFO [MessageLatencyMonitor] 
[ZkClient-EventThread-49-pinot-zookeeper-PinotClusterRedacted.service.consul:2181]
 The latency of message 7074fc66-0986-4199-bc3d-24825eaaf79b is 289 ms
   2022/02/10 17:29:41.293 INFO [MessageDispatchStage] 
[HelixController-pipeline-default-PinotClusterRedacted-(f382b6fe_DEFAULT)] 
Event f382b6fe_DEFAULT : Sending Message 7074fc66-0986-4199-bc3d-24825eaaf79b 
to ServerB transit point_transaction_REALTIME.SEGMENT_NAME_REDACTED|[] 
from:OFFLINE to:CONSUMING, relayMessages: 0
   ```
   
   in comparison, the threadpool on the other streaming host which only has 
zero queued tasks so it processes it immediately
   ```
   2022/02/10 17:29:41.311 INFO [HelixTaskExecutor] 
[ZkClient-EventThread-49-pinot-zookeeper-PinotClusterRedacted.service.consul:2181]
 Submit task: dd34ddd4-425b-41e5-a0f0-2863442daea6 to pool: 
java.util.concurrent.ThreadPoolExecutor@97a16b1[Running, pool size = 40, active 
threads = 0, queued tasks = 0, completed tasks = 864]
   ```


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org

Reply via email to