dongxiaoman opened a new issue #7779:
URL: https://github.com/apache/pinot/issues/7779


   NOTE: This is not an urgent bug, but it would be quite annoying if we can 
confirm it is the root cause.
   
   We see a clear correlation: queries fail (currently any offline segment 
can cause a query to fail) seconds after a realtime segment is moved to 
another storage tier.
   
   We have a query error for a missing segment at timestamp 
`"timestamp":"2021-11-16T22:13:18.256Z"`
   and 5 seconds earlier we see logs indicating that the segment was dropped 
from the realtime server. A few minutes later we see the same segment show up 
on the servers of its new tier.
   
   The log for the drop on the realtime (streaming) server is:
   ```
   2021/11/16 22:13:15.345 INFO [HelixStateTransitionHandler] 
[HelixTaskExecutor-message_handle_STATE_TRANSITION] Instance 
Server_st-fw-81.service.consul_8098, partition 
point_entry__34__576__20211116T0930Z received state transition from OFFLINE to 
DROPPED on session 30043ed1a3604f2, message id: 
d9310a75-1742-4758-981e-32c0b193f7eb
   ```
   
   My mental model of what could be happening:
   1. The segment is scheduled to move to another tier because of its TTL.
   2. The segment is dropped from the realtime server, but the new tier has 
not yet completed the "ONLINE" transition for that segment.
   3. The segment appears offline to the Pinot controller. A query comes in, 
and the brokers (or the servers?) complain that the segment is missing from 
the realtime server.
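
To make the suspected window concrete, here is a minimal toy model of the three steps above. It is only a sketch of the hypothesis, not Pinot or Helix code: the dict standing in for the external view, the helper names, and the server names `realtime_server` and `tier_server` are all hypothetical.

```python
def online_replicas(external_view, segment):
    """Return the servers that currently report the segment as ONLINE."""
    replicas = external_view.get(segment, {})
    return sorted(server for server, state in replicas.items() if state == "ONLINE")


def apply_transition(external_view, segment, server, state):
    """Apply one state change; DROPPED removes the replica entry entirely."""
    replicas = external_view.setdefault(segment, {})
    if state == "DROPPED":
        replicas.pop(server, None)
    else:
        replicas[server] = state


SEGMENT = "point_entry__34__576__20211116T0930Z"

# Hypothetical starting point: one replica, served by the realtime server.
view = {SEGMENT: {"realtime_server": "ONLINE"}}

# Suspected ordering: the old replica goes away before the new one is ONLINE.
events = [
    ("realtime_server", "OFFLINE"),
    ("realtime_server", "DROPPED"),
    ("tier_server", "ONLINE"),
]

# Record which servers could answer a query after each transition.
availability = []
for server, state in events:
    apply_transition(view, SEGMENT, server, state)
    availability.append(online_replicas(view, SEGMENT))
```

In this model no server can answer for the segment after the first two transitions, so any query routed during that window would hit a missing segment. That matches the symptom, but it does not prove this is what Pinot actually does.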
   
   Step 3 is still a bit strange: did the broker not receive the segment's 
external view change event within 5 seconds? The segment is about to show up 
on the new tier's servers.
   
   If we think of moving a segment between storage tiers as a "rebalance", we 
should have the option of a "no-downtime" move into another tier: keep one 
replica in place, ensure the new replica shows up, and then move the other?
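
A rough sketch of that ordering, compared with dropping every replica up front. This is an illustration of the proposal only, not an existing Pinot API: the step names `DROP` and `WAIT_ONLINE`, the helper functions, and the server names are hypothetical, and the sketch assumes one new-tier server per existing replica.

```python
def plan_rolling_move(old_replicas, new_replicas):
    """Move one replica at a time: the replacement must be ONLINE
    before the next old replica is touched."""
    if len(old_replicas) != len(new_replicas):
        raise ValueError("sketch assumes one new-tier server per existing replica")
    steps = []
    for old, new in zip(old_replicas, new_replicas):
        steps.append(("DROP", old))
        steps.append(("WAIT_ONLINE", new))
    return steps


def plan_all_at_once(old_replicas, new_replicas):
    """Drop every old replica first, then bring up the new ones."""
    drops = [("DROP", old) for old in old_replicas]
    adds = [("WAIT_ONLINE", new) for new in new_replicas]
    return drops + adds


def min_online(old_replicas, steps):
    """Lowest number of ONLINE replicas observed while executing the steps."""
    online = set(old_replicas)
    lowest = len(online)
    for action, server in steps:
        if action == "DROP":
            online.discard(server)
        elif action == "WAIT_ONLINE":
            online.add(server)
        lowest = min(lowest, len(online))
    return lowest


old = ["realtime_1", "realtime_2"]
new = ["tier_1", "tier_2"]

rolling_min = min_online(old, plan_rolling_move(old, new))
bulk_min = min_online(old, plan_all_at_once(old, new))
```

With two replicas the rolling plan never goes below one ONLINE replica, while the all-at-once plan drops to zero. With a single replica this ordering still leaves a gap, so that case would need the new replica brought ONLINE before the old one is dropped.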
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


