priyen opened a new issue, #11448:
URL: https://github.com/apache/pinot/issues/11448

   The scenario is as follows, assuming the table has 3 replicas:
   - Increase the instances/replica config by some number.
   - Kick off a rebalance with reAssignInstances=True, minReplicas=2, 
includeConsuming=False, and downtime=False.
   - This should cause all sealed segments to move appropriately.
   - Finally, once a consuming segment seals (consuming to online), we notice 
that the ingestion delay tracker metric (pinot.server.realtime_ingestion_delay) 
continues to rise from 0. Tracing the code, we determined that this happens when 
the segment is sealed but is no longer owned by that instance, and so it is 
also dropped. In the code, the partition is marked for "verification" as part 
of the transition message handling. At some point, the background thread 
realizes the partition is inactive and stops tracking the lag. This takes ~10 
minutes, so we see an increasing lag from the moment the transition 
happens.
   
   Relevant code - 
https://github.com/apache/pinot/blob/399f033ec3917df2bc478b5904406a95e0bc7258/pinot-core/src/main/java/org/apache/pinot/core/data/manager/realtime/IngestionDelayTracker.java#L91
   
   Desired behaviour - lag tracking stops the moment the partition is 
transitioned away or dropped from said instance. Right now, that function call 
simply marks the partition for verification.
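
   To make the difference concrete, here is a minimal sketch of the two 
behaviours. It is not the actual IngestionDelayTracker code: the class 
DelayTrackerSketch and all of its methods are hypothetical names, and reporting 
a delay of 0 for an untracked partition is an assumption made only for 
illustration.

```java
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical, simplified stand-in for a per-partition ingestion delay tracker.
// None of these names are taken from the Pinot code base.
class DelayTrackerSketch {
  // Last ingestion timestamp (epoch millis) for each tracked partition.
  private final Map<Integer, Long> lastIngestionTimeMs = new ConcurrentHashMap<>();
  // Partitions flagged for a later check by a periodic background task.
  private final Set<Integer> pendingVerification = ConcurrentHashMap.newKeySet();

  void recordIngestion(int partitionId, long ingestionTimeMs) {
    lastIngestionTimeMs.put(partitionId, ingestionTimeMs);
  }

  // Behaviour described in this issue: the partition is only flagged, so the
  // reported delay keeps growing until the background task removes it.
  void markForVerification(int partitionId) {
    pendingVerification.add(partitionId);
  }

  // Desired behaviour: stop tracking as soon as the partition leaves this instance.
  void stopTracking(int partitionId) {
    pendingVerification.remove(partitionId);
    lastIngestionTimeMs.remove(partitionId);
  }

  boolean isTracked(int partitionId) {
    return lastIngestionTimeMs.containsKey(partitionId);
  }

  // Reported delay; assumed to be 0 once the partition is no longer tracked.
  long delayMs(int partitionId, long nowMs) {
    Long last = lastIngestionTimeMs.get(partitionId);
    return last == null ? 0L : Math.max(0L, nowMs - last);
  }
}
```

   In this sketch, calling markForVerification(0) leaves the reported delay for 
partition 0 growing with wall-clock time, whereas stopTracking(0) removes it 
immediately, which is the behaviour requested above.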
   
   cc @jugomezv cc @jadami-stripe 
   
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

