npawar commented on a change in pull request #6890:
URL: https://github.com/apache/incubator-pinot/pull/6890#discussion_r634914658



##########
File path: pinot-controller/src/main/java/org/apache/pinot/controller/util/ConsumingSegmentInfoReader.java
##########
@@ -131,6 +134,51 @@ private String generateServerURL(String tableNameWithType, String endpoint) {
     return String.format("%s/tables/%s/consumingSegmentsInfo", endpoint, tableNameWithType);
   }
 
+  /**
+   * Utility method to derive ingestion status from consuming segment Info. Status is HEALTHY if
+   * consuming segment info specifies CONSUMING state for all active segments across all servers
+   * including replicas.
+   */
+  public TableStatus.IngestionStatus getIngestionStatus(String tableNameWithType, int timeoutMs) {
+    try {
+      ConsumingSegmentsInfoMap consumingSegmentsInfoMap = getConsumingSegmentsInfo(tableNameWithType, timeoutMs);
+      for (Map.Entry<String, List<ConsumingSegmentInfo>> consumingSegmentInfoEntry : consumingSegmentsInfoMap._segmentToConsumingInfoMap
+          .entrySet()) {
+        String segmentName = consumingSegmentInfoEntry.getKey();
+        List<ConsumingSegmentInfo> consumingSegmentInfoList = consumingSegmentInfoEntry.getValue();
+        if (consumingSegmentInfoList == null || consumingSegmentInfoList.isEmpty()) {
+          String errorMessage = "Did not get any response from servers for segment: " + segmentName;
+          return TableStatus.IngestionStatus.newIngestionStatus(TableStatus.IngestionState.UNHEALTHY, errorMessage);
+        }
+
+        // Check if any responses are missing
+        Set<String> serversForSegment = _pinotHelixResourceManager.getServersForSegment(tableNameWithType, segmentName);
+        if (serversForSegment.size() != consumingSegmentInfoList.size()) {
+          Set<String> serversResponded =
+              consumingSegmentInfoList.stream().map(c -> c._serverName).collect(Collectors.toSet());
+          serversForSegment.removeAll(serversResponded);
+          String errorMessage =
+              "Not all servers responded for segment: " + segmentName + " Missing servers : " + serversForSegment;
+          return TableStatus.IngestionStatus.newIngestionStatus(TableStatus.IngestionState.UNHEALTHY, errorMessage);
+        }
+
+        for (ConsumingSegmentInfo consumingSegmentInfo : consumingSegmentInfoList) {
+          if (consumingSegmentInfo._consumerState
+              .equals(ConsumerState.NOT_CONSUMING.toString())) {

Review comment:
       +1 to Chinmay's observation. When a transient error makes a segment
OFFLINE in the IdealState (IS), the segment will be in ERROR state in the
ExternalView (EV) and will therefore report NOT_CONSUMING, which results in
UNHEALTHY.
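
To make the chain concrete, here is a minimal sketch of the check as I read it. The class and method names are hypothetical, consumer states are modeled as plain strings, and none of this is Pinot's actual API:

```java
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for the check under review; not Pinot's real classes.
final class IngestionHealthSketch {
  static final String NOT_CONSUMING = "NOT_CONSUMING";

  // Input: segment name -> consumer state reported by each replica.
  // Returns "UNHEALTHY" as soon as any segment has no responses or any
  // replica reports NOT_CONSUMING; otherwise returns "HEALTHY".
  static String deriveState(Map<String, List<String>> consumerStatesBySegment) {
    for (Map.Entry<String, List<String>> entry : consumerStatesBySegment.entrySet()) {
      List<String> states = entry.getValue();
      if (states == null || states.isEmpty()) {
        return "UNHEALTHY";
      }
      for (String state : states) {
        if (NOT_CONSUMING.equals(state)) {
          return "UNHEALTHY";
        }
      }
    }
    return "HEALTHY";
  }
}
```

Under these assumptions, a single replica that is NOT_CONSUMING because of a transient error is enough to mark the whole table UNHEALTHY, even if every other replica is CONSUMING.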




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



