somandal commented on code in PR #15617:
URL: https://github.com/apache/pinot/pull/15617#discussion_r2055075416


##########
pinot-controller/src/main/java/org/apache/pinot/controller/helix/core/rebalance/TableRebalancer.java:
##########
@@ -479,6 +482,16 @@ private RebalanceResult doRebalance(TableConfig tableConfig, RebalanceConfig reb
         externalViewStabilizationTimeoutInMs);
     int expectedVersion = currentIdealState.getRecord().getVersion();
 
+    // Cache segment partition id to avoid ZK reads. Similar behavior as cache used in StrictReplicaGroupAssignment
+    // NOTE:
+    // 1. This cache is used for table rebalance only, but not segment assignment. During rebalance, rebalanceTable()
+    //    can be invoked multiple times when the ideal state changes during the rebalance process.
+    // 2. The cache won't be refreshed when an existing segment is replaced with a segment from a different partition.
+    //    Replacing a segment with a segment from a different partition should not be allowed for upsert table because
+    //    it will cause the segment being served by the wrong servers. If this happens during the table rebalance,
+    //    another rebalance might be needed to fix the assignment.
+    Object2IntOpenHashMap<String> segmentPartitionIdMap = new Object2IntOpenHashMap<>();

Review Comment:
   This is actually picked up from `StrictRealtimeSegmentAssignment`, which has this optimization to reduce the overhead of computing the partitionId when it has to be fetched from SegmentZkMetadata. I decided to add it here as well, since otherwise that optimization would essentially be undone. I thought of exposing it from `StrictRealtimeSegmentAssignment` instead, but that would require an interface change, so I decided against it 😅
   
   Reference code:
   
https://github.com/apache/pinot/blob/9f030bb0f20de69595156c18188946b2d2876e1c/pinot-controller/src/main/java/org/apache/pinot/controller/helix/core/assignment/segment/StrictRealtimeSegmentAssignment.java#L69
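
   For readers who have not seen the referenced code, here is a minimal sketch of the caching idea, not the actual Pinot implementation. It assumes a plain `java.util.HashMap` in place of fastutil's `Object2IntOpenHashMap` so that it runs without extra dependencies. The class `SegmentPartitionIdCache`, the method `getPartitionId`, and the `loader` function (standing in for the SegmentZkMetadata read) are all hypothetical names.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.ToIntFunction;

// Hypothetical helper: memoizes segment name -> partition id so the expensive
// lookup (a ZK metadata read in the real code) happens at most once per segment.
final class SegmentPartitionIdCache {
  private final Map<String, Integer> _cache = new HashMap<>();
  private final ToIntFunction<String> _loader; // stand-in for the ZK read (assumption)
  private int _loadCount = 0;

  SegmentPartitionIdCache(ToIntFunction<String> loader) {
    _loader = loader;
  }

  int getPartitionId(String segmentName) {
    Integer cached = _cache.get(segmentName);
    if (cached != null) {
      return cached; // cache hit: no call to the loader
    }
    int partitionId = _loader.applyAsInt(segmentName);
    _loadCount++;
    _cache.put(segmentName, partitionId);
    return partitionId;
  }

  // Number of times the loader was actually invoked.
  int getLoadCount() {
    return _loadCount;
  }

  public static void main(String[] args) {
    SegmentPartitionIdCache cache = new SegmentPartitionIdCache(name -> name.length() % 4);
    cache.getPartitionId("seg_a");
    cache.getPartitionId("seg_a"); // served from the cache
    System.out.println(cache.getLoadCount()); // prints 1
  }
}
```

   Because an entry is never refreshed once stored, a segment that is later replaced by one from a different partition keeps its old partition id in this sketch, which mirrors the limitation described in point 2 of the code comment above.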
   
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

