sajjad-moradi opened a new issue, #15897:
URL: https://github.com/apache/pinot/issues/15897

   If a Pinot Server encounters multiple consumption errors, it instructs the 
Controller to mark its replica as OFFLINE in the Ideal State (IS). Currently, 
the RealtimeSegmentValidationManager (RSVM) periodic job attempts to create a 
new consuming segment [only if all replicas are in the OFFLINE 
state](https://github.com/apache/pinot/blob/master/pinot-controller/src/main/java/org/apache/pinot/controller/helix/core/realtime/PinotLLCRealtimeSegmentManager.java#L1586-L1587)
 in the IS.
   
   While this is a useful automation, we have observed many cases where some, 
but not all, replicas are marked OFFLINE due to transient stream issues. In 
such cases, all queries are routed to the remaining healthy replicas, which 
concentrates query load on them and reduces redundancy for that partition.
   
   It would be beneficial if the RSVM job could automatically mitigate these 
scenarios.
   
   One proposed solution is to issue a force commit when this condition is 
detected. The force commit should apply only to the affected partition, and only 
if enough events have been consumed, for example at least half of the desired 
numRows specified in the segment ZK metadata.
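
   To make the proposed condition concrete, here is a minimal sketch of the 
check. It is not Pinot code: the class `PartialOfflineCheck`, the method 
`shouldForceCommit`, and its parameters are hypothetical names for 
illustration. It assumes the caller already has the per-replica states for 
one consuming segment from the IS, the number of rows consumed so far, and 
the desired numRows from the segment ZK metadata. How RSVM would obtain the 
consumed row count is left open.

```java
import java.util.Map;

// Illustrative sketch only; none of these names exist in Pinot.
public class PartialOfflineCheck {

    /**
     * Returns true when some, but not all, replicas of a consuming segment
     * are OFFLINE and at least half of the desired rows have been consumed.
     *
     * @param replicaStates server instance -> state from the Ideal State
     * @param rowsConsumed  rows consumed so far (assumed to be available)
     * @param targetNumRows desired numRows from the segment ZK metadata
     */
    public static boolean shouldForceCommit(Map<String, String> replicaStates,
                                            long rowsConsumed,
                                            long targetNumRows) {
        long offline = replicaStates.values().stream()
                .filter("OFFLINE"::equals)
                .count();

        // All replicas OFFLINE is the case RSVM already handles today,
        // so only the partial case qualifies here.
        boolean someButNotAll = offline > 0 && offline < replicaStates.size();

        // "At least half" of the desired row count.
        boolean enoughRows = targetNumRows > 0 && rowsConsumed * 2 >= targetNumRows;

        return someButNotAll && enoughRows;
    }

    public static void main(String[] args) {
        Map<String, String> states = Map.of("server1", "OFFLINE", "server2", "CONSUMING");
        System.out.println(shouldForceCommit(states, 60_000, 100_000));
    }
}
```

   The half-of-numRows threshold is the example value from the proposal 
above; a real implementation would likely make it configurable.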


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

