jadami10 commented on issue #10460:
URL: https://github.com/apache/pinot/issues/10460#issuecomment-1482056208

   > @jadami10 were you able to validate that forcecommit was the issue?
   
   Yes. After looking through the logs of the winners, I found that the lagging 
server was chosen as the winner for 5 partitions.
   
   > I think one of the use cases that forcecommit was meant to solve is the case 
where ingestion seems stuck. In such cases, force committing based on 
the first responsive replica rather than the winning replica seems appropriate.
   
   Assuming my previous comment/reading of the code above is correct, when your 
ingestion is stuck and you call this, you will wait at most 3.3 more 
seconds for a winner. In practice, it has likely been minutes before you 
noticed that ingestion was stuck.
   
   But in the case where 1 server is significantly behind (in our case 2 hours 
behind), accidentally picking it has pretty big ramifications. Now every server 
is 2 hours behind for those partitions.
   
   There are also very few cases where the wait period is skipped (only row limit 
and end of partition reached). Even for time-based or size-based segment 
completion, the protocol still waits, and that doesn't seem to be a problem. 
I'll wait to hear from @sajjad-moradi, but I would be strongly in favor of 
undoing this part of the force commit feature.
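
To make the trade-off concrete, here is a minimal sketch of the two selection policies as I understand them. It is not Pinot's actual code: `ReplicaReport`, `pick_winner`, the server names, and the offsets are all hypothetical, and the 3300 ms window is just the 3.3 seconds mentioned above. With no wait, the first replica to respond wins even if it is far behind. With the wait, the replica with the highest offset among those that reported in the window wins.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReplicaReport:
    """Hypothetical record of one replica reporting to the controller."""
    server: str
    offset: int      # last stream offset the replica has consumed
    arrival_ms: int  # report arrival time, relative to the first report


def pick_winner(reports, wait_ms):
    """Pick the committer among replicas that report within wait_ms of the first one.

    wait_ms=0 models skipping the wait period (first responsive replica wins).
    A positive wait_ms models waiting for more replicas and taking the one
    with the highest offset.
    """
    if not reports:
        raise ValueError("no replica reports")
    first_arrival = min(r.arrival_ms for r in reports)
    eligible = [r for r in reports if r.arrival_ms - first_arrival <= wait_ms]
    return max(eligible, key=lambda r: r.offset)


# Illustrative numbers only: the lagging server happens to respond first.
reports = [
    ReplicaReport("server-lagging", offset=1_000, arrival_ms=0),
    ReplicaReport("server-a", offset=9_000, arrival_ms=150),
    ReplicaReport("server-b", offset=8_990, arrival_ms=400),
]

no_wait_winner = pick_winner(reports, wait_ms=0)
waited_winner = pick_winner(reports, wait_ms=3300)
print(no_wait_winner.server, waited_winner.server)
```

Under these assumptions, skipping the wait selects the lagging server, while waiting the extra few seconds selects the most caught-up one.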


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

