somandal commented on code in PR #16341: URL: https://github.com/apache/pinot/pull/16341#discussion_r2210685640
########## pinot-controller/src/main/java/org/apache/pinot/controller/helix/core/rebalance/TableRebalancer.java: ########## @@ -1813,9 +1865,63 @@ public int fetch(String segmentName) { } } + @VisibleForTesting + @FunctionalInterface + interface DataLossRiskAssessor { + boolean hasDataLossRisk(String segmentName); + } + + private static class DataLossRiskAssessorImpl implements DataLossRiskAssessor { + private final String _tableNameWithType; + private final HelixManager _helixManager; + private final boolean _isPeerDownloadEnabled; + private final boolean _isUpsertOrDedupTable; + private final boolean _isPauselessEnabled; + + private DataLossRiskAssessorImpl(String tableNameWithType, TableConfig tableConfig, HelixManager helixManager) { + _tableNameWithType = tableNameWithType; + _helixManager = helixManager; + _isPeerDownloadEnabled = !StringUtils.isEmpty(tableConfig.getValidationConfig().getPeerSegmentDownloadScheme()); + _isUpsertOrDedupTable = tableConfig.isUpsertEnabled() || tableConfig.isDedupEnabled(); + _isPauselessEnabled = PauselessConsumptionUtils.isPauselessEnabled(tableConfig); + } + + @Override + public boolean hasDataLossRisk(String segmentName) { + // If peer-download is disabled, no data loss risk exists + if (!_isPeerDownloadEnabled) { + return false; + } + + SegmentZKMetadata segmentZKMetadata = ZKMetadataProvider + .getSegmentZKMetadata(_helixManager.getHelixPropertyStore(), _tableNameWithType, segmentName); + if (segmentZKMetadata == null) { + return false; + } + + // If the segment state is COMPLETED and the peer download URL is empty, there is a data loss risk + if (!segmentZKMetadata.getStatus().isCompleted()) { + return CommonConstants.Segment.METADATA_URI_FOR_PEER_DOWNLOAD.equals(segmentZKMetadata.getDownloadUrl()); + } + + // If the segment is not yet completed, then the following scenarios are possible: + // - Non-upsert / non-dedup table: data loss scenarios are not possible. Either the segment will restart + // consumption or the RealtimeSegmentValidationManager will kick in to fix up the segment if pauseless is + // enabled + // - Upsert / dedup table: For non-pauseless tables, it is safe to move the segment without data loss concerns. + // For pauseless tables, if the segment is still in CONSUMING state, moving it is safe, but if it is in + // COMMITTING state then there is a risk of data loss on segment build failures as well since the + // RealtimeSegmentValidationManager does not automatically try to fix up these segments. To be safe it is best + // to return that there is a risk of data loss in case of race conditions for pauseless enabled tables + // (rebalance updates IS at the same time as segment commit protocol starts and moves it to COMMITTING) + return _isUpsertOrDedupTable && _isPauselessEnabled; Review Comment: Sure, I can reuse that, I should just pass the `repairErrorSegmentsForPartialUpsertOrDedup` as false, right? Also, I see that method only checks for partial upserts, does that mean the problem doesn't exist for full upserts? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org --------------------------------------------------------------------- To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org For additional commands, e-mail: commits-h...@pinot.apache.org