npawar commented on a change in pull request #6667:
URL: https://github.com/apache/incubator-pinot/pull/6667#discussion_r606531505



##########
File path: 
pinot-controller/src/main/java/org/apache/pinot/controller/helix/core/PinotTableIdealStateBuilder.java
##########
@@ -115,14 +117,45 @@ public static void 
buildLowLevelRealtimeIdealStateFor(PinotLLCRealtimeSegmentMan
     pinotLLCRealtimeSegmentManager.setUpNewTable(realtimeTableConfig, 
idealState);
   }
 
-  public static int getPartitionCount(StreamConfig streamConfig) {
-    PartitionCountFetcher partitionCountFetcher = new 
PartitionCountFetcher(streamConfig);
+  /**
+   * Fetches the list of {@link PartitionGroupInfo} for the new partition 
groups for the stream,
+   * with the help of the {@link PartitionGroupMetadata} of the current 
partitionGroups.
+   *
+   * Reasons why <code>currentPartitionGroupMetadata</code> is needed:
+   *
+   * The current partition group metadata is used to determine the offsets 
that have been consumed for a partition group.
+   * An example of where the offsets would be used:
+   * e.g. If partition group 1 contains shardId 1, with status DONE and 
endOffset 150. There's 2 possibilities:
+   * 1) the stream indicates that shardId's last offset is 200.
+   * This tells Pinot that partition group 1 still has messages which haven't 
been consumed, and must be included in the response.
+   * 2) the stream indicates that shardId's last offset is 150,
+   * This tells Pinot that all messages of partition group 1 have been 
consumed, and it need not be included in the response.
+   * Thus, this call will skip a partition group when it has reached end of 
life and all messages from that partition group have been consumed.
+   *
+   * The current partition group metadata is also used to know about existing 
groupings of partitions,
+   * and accordingly make the new partition groups.
+   * e.g. Assume that partition group 1 has status IN_PROGRESS and contains 
shards 0,1,2
+   * and partition group 2 has status DONE and contains shards 3,4.
+   * In the above example, the currentPartitionGroupMetadataList indicates that
+   * the collection of shards in partition group 1, should remain unchanged in 
the response,
+   * whereas shards 3,4 can be added to new partition groups if needed.
+   *
+   * @param streamConfig the streamConfig from the tableConfig
+   * @param currentPartitionGroupMetadataList List of {@link 
PartitionGroupMetadata} for the current partition groups.
+   *                                          The size of this list is equal 
to the number of partition groups,
+   *                                          and is created using the latest 
segment zk metadata.
+   */
+  public static List<PartitionGroupInfo> 
getPartitionGroupInfoList(StreamConfig streamConfig,
+      List<PartitionGroupMetadata> currentPartitionGroupMetadataList) {
+    PartitionGroupInfoFetcher partitionGroupInfoFetcher =
+        new PartitionGroupInfoFetcher(streamConfig, 
currentPartitionGroupMetadataList);
     try {
-      RetryPolicies.noDelayRetryPolicy(3).attempt(partitionCountFetcher);
-      return partitionCountFetcher.getPartitionCount();
+      RetryPolicies.noDelayRetryPolicy(3).attempt(partitionGroupInfoFetcher);

Review comment:
       Added randomDelayRetryPolicy




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org

Reply via email to