npawar commented on issue #6308:
URL: 
https://github.com/apache/incubator-pinot/issues/6308#issuecomment-737632162


   For example 1, your suggestion (`We may want to extend it (in case of 1 
replica, maybe) to transition the segment to OFFLINE state if segment build 
fails`) would work, and letting the ValidationManager fix it should be fine. 
   However, I think we should transition the segment to OFFLINE state any time 
the while loop in the consumer thread exits with `ERROR` state, not just 
for a failed segment build or a single replica.
   ```
   try {
       while (!_state.isFinal()) {
           ...
       }
       // Route a loop exit in ERROR state through the same handling as an exception.
       if (_state == State.ERROR) {
           throw new IllegalStateException("Exited with ERROR state");
       }
   } catch (Exception e) {
       segmentLogger.error("Exception while in work", e);
       postStopConsumedMsg(e.getClass().getName());
       _state = State.ERROR;
       _serverMetrics.setValueOfTableGauge(_metricKeyName, ServerGauge.LLC_PARTITION_CONSUMING, 0);
       return;
   }
   ```
   As long as every error condition reaches the `postStopConsumedMsg` call, I 
think we should be fine.
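
To make the idea concrete, here is a rough, self-contained sketch of the pattern. It is not the actual consumer code: `ConsumerLoopSketch`, its `State` enum, and the `stopMessages` list (standing in for `postStopConsumedMsg`) are all made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;

// Hypothetical stand-in for the consumer thread; none of these names are real Pinot APIs.
class ConsumerLoopSketch {
    enum State {
        CONSUMING, COMMITTED, ERROR;

        boolean isFinal() {
            return this == COMMITTED || this == ERROR;
        }
    }

    State state = State.CONSUMING;

    // Records each "stop consuming" notification instead of sending it anywhere.
    final List<String> stopMessages = new ArrayList<>();

    // Runs steps until a final state is reached. Both a thrown exception and a
    // quiet exit in ERROR state end up in the same catch block.
    void run(Supplier<State> step) {
        try {
            while (!state.isFinal()) {
                state = step.get();
            }
            if (state == State.ERROR) {
                throw new IllegalStateException("Exited with ERROR state");
            }
        } catch (Exception e) {
            stopMessages.add(e.getClass().getName());
            state = State.ERROR;
        }
    }
}
```

With this shape, a step that returns `State.ERROR` and a step that throws both produce exactly one stop message, while a normal `COMMITTED` exit produces none.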
   
   
   
-------------------------------------------------------------------------------
   
   I'm assuming this is for example 2: `Validation manager can be extended to 
handle ERROR state in all replicas`? So would we start looking at the External 
View in the Validation Manager?
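
If so, I imagine the check would look roughly like the sketch below. It assumes the External View can be read as a map of segment name to per-instance state strings; `ExternalViewCheckSketch` and its method are hypothetical names, not existing Validation Manager code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper; the External View is modeled as segment -> (instance -> state).
class ExternalViewCheckSketch {
    // Returns, in sorted order, the segments whose replicas are all in ERROR state.
    static List<String> segmentsWithAllReplicasInError(Map<String, Map<String, String>> externalView) {
        List<String> result = new ArrayList<>();
        for (Map.Entry<String, Map<String, String>> entry : new TreeMap<>(externalView).entrySet()) {
            Map<String, String> replicaStates = entry.getValue();
            // A segment with no replicas listed is not counted as "all ERROR".
            boolean allError = !replicaStates.isEmpty()
                && replicaStates.values().stream().allMatch("ERROR"::equals);
            if (allError) {
                result.add(entry.getKey());
            }
        }
        return result;
    }
}
```

A segment with at least one replica still consuming would be left alone, which matches the "ERROR state in all replicas" wording.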
   
   
-------------------------------------------------------------------------------
   
   Regarding `We do have a mechanism by which servers can report problems to 
the controller. We use that currently for problems that happen during 
consumption.`: where is this mechanism? Can we use it to create a Controller 
API that reports the status of all consumers? 
   Such an API would help in two ways:
   1. If users notice a lag, they can call this API and check consumer health. 
Right now they see that the segment metadata is IN_PROGRESS and the segment is 
CONSUMING in the ideal state, which leaves them confused about the cause of the 
lag. 
   2. We could also use the API in the Validation Manager to restart 
consumption if any consumers are dead.
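
As a strawman for what such an API could be backed by, here is a minimal sketch. Everything in it is an assumption: `ConsumerStatusSketch`, the `report` call servers would make, and the plain state strings are invented for illustration and do not correspond to any existing controller code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical controller-side registry of consumer status reported by servers.
class ConsumerStatusSketch {
    // segment name -> (server name -> last reported consumer state, e.g. "CONSUMING" or "ERROR")
    private final Map<String, Map<String, String>> reports = new TreeMap<>();

    // Called when a server reports the state of its consumer for a segment.
    void report(String segment, String server, String consumerState) {
        reports.computeIfAbsent(segment, s -> new TreeMap<>()).put(server, consumerState);
    }

    // Use case 1: what a user would see when checking consumer health for a segment.
    Map<String, String> statusOf(String segment) {
        return reports.getOrDefault(segment, new TreeMap<>());
    }

    // Use case 2: the consumers a periodic validation task might try to restart.
    List<String> deadConsumers() {
        List<String> dead = new ArrayList<>();
        reports.forEach((segment, byServer) -> byServer.forEach((server, consumerState) -> {
            if (!"CONSUMING".equals(consumerState)) {
                dead.add(segment + "@" + server);
            }
        }));
        return dead;
    }
}
```

The same registry serves both use cases: `statusOf` answers the "why is there lag" question, and `deadConsumers` gives the Validation Manager a list to act on.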
   

