somandal commented on code in PR #16341:
URL: https://github.com/apache/pinot/pull/16341#discussion_r2210683768


##########
pinot-controller/src/main/java/org/apache/pinot/controller/helix/core/rebalance/TableRebalancer.java:
##########
@@ -1813,9 +1865,63 @@ public int fetch(String segmentName) {
     }
   }
 
+  @VisibleForTesting
+  @FunctionalInterface
+  interface DataLossRiskAssessor {
+    boolean hasDataLossRisk(String segmentName);
+  }
+
+  private static class DataLossRiskAssessorImpl implements 
DataLossRiskAssessor {
+    private final String _tableNameWithType;
+    private final HelixManager _helixManager;
+    private final boolean _isPeerDownloadEnabled;
+    private final boolean _isUpsertOrDedupTable;
+    private final boolean _isPauselessEnabled;
+
+    private DataLossRiskAssessorImpl(String tableNameWithType, TableConfig 
tableConfig, HelixManager helixManager) {
+      _tableNameWithType = tableNameWithType;
+      _helixManager = helixManager;
+      _isPeerDownloadEnabled = 
!StringUtils.isEmpty(tableConfig.getValidationConfig().getPeerSegmentDownloadScheme());
+      _isUpsertOrDedupTable = tableConfig.isUpsertEnabled() || 
tableConfig.isDedupEnabled();
+      _isPauselessEnabled = 
PauselessConsumptionUtils.isPauselessEnabled(tableConfig);
+    }
+
+    @Override
+    public boolean hasDataLossRisk(String segmentName) {
+      // If peer-download is disabled, no data loss risk exists
+      if (!_isPeerDownloadEnabled) {
+        return false;
+      }
+
+      SegmentZKMetadata segmentZKMetadata = ZKMetadataProvider
+          .getSegmentZKMetadata(_helixManager.getHelixPropertyStore(), 
_tableNameWithType, segmentName);
+      if (segmentZKMetadata == null) {
+        return false;
+      }
+
+      // If the segment state is COMPLETED and the peer download URL is empty, 
there is a data loss risk
+      if (!segmentZKMetadata.getStatus().isCompleted()) {

Review Comment:
   Should we blindly check for downloadURL if state is in `COMMITTING` state? 
(it will always be empty in this case, correct?)
   
   I thought `COMMITTING` is only problematic for upsert / dedup tables? That's 
why I had added the check after this which checks if upsert or dedup and if 
pauseless. Can you see if the comments I added for the next block makes sense?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org

Reply via email to