Jackie-Jiang commented on code in PR #15257: URL: https://github.com/apache/pinot/pull/15257#discussion_r2002064702
########## pinot-controller/src/main/java/org/apache/pinot/controller/helix/core/retention/RetentionManager.java: ########## @@ -165,6 +187,95 @@ private void manageRetentionForRealtimeTable(String realtimeTableName, Retention } } + @VisibleForTesting + void manageRetentionForHybridTable(TableConfig realtimeTableConfig) { + LOGGER.info("Managing retention for hybrid table: {}", realtimeTableConfig.getTableName()); + List<String> segmentsToDelete = new ArrayList<>(); + String realtimeTableName = realtimeTableConfig.getTableName(); + String rawTableName = TableNameBuilder.extractRawTableName(realtimeTableName); + String offlineTableName = TableNameBuilder.OFFLINE.tableNameWithType(rawTableName); + try { + TableConfig offlineTableConfig = _pinotHelixResourceManager.getOfflineTableConfig(offlineTableName); Review Comment: Let's pass in both realtime and offline table config to reduce a ZK access ########## pinot-controller/src/main/java/org/apache/pinot/controller/helix/core/retention/RetentionManager.java: ########## @@ -119,10 +129,22 @@ private void manageRetentionForTable(TableConfig tableConfig) { if (TableNameBuilder.isOfflineTableResource(tableNameWithType)) { manageRetentionForOfflineTable(tableNameWithType, retentionStrategy); } else { - manageRetentionForRealtimeTable(tableNameWithType, retentionStrategy); + // hybrid table check should be performed before the realtime table check. + if (_isHybridTableRetentionStrategyEnabled && isHybridTable(tableConfig)) { + manageRetentionForHybridTable(tableConfig); + } else { + manageRetentionForRealtimeTable(tableNameWithType, retentionStrategy); + } } } + private boolean isHybridTable(TableConfig tableConfig) { + String tableName = TableNameBuilder.extractRawTableName(tableConfig.getTableName()); + TableConfig offlineTableConfig = _pinotHelixResourceManager.getOfflineTableConfig(tableName); + TableConfig realtimeTableConfig = _pinotHelixResourceManager.getRealtimeTableConfig(tableName); Review Comment: (minor) No need to check read realtime config again ########## pinot-controller/src/main/java/org/apache/pinot/controller/helix/core/retention/RetentionManager.java: ########## @@ -165,6 +187,95 @@ private void manageRetentionForRealtimeTable(String realtimeTableName, Retention } } + @VisibleForTesting + void manageRetentionForHybridTable(TableConfig realtimeTableConfig) { + LOGGER.info("Managing retention for hybrid table: {}", realtimeTableConfig.getTableName()); + List<String> segmentsToDelete = new ArrayList<>(); + String realtimeTableName = realtimeTableConfig.getTableName(); + String rawTableName = TableNameBuilder.extractRawTableName(realtimeTableName); + String offlineTableName = TableNameBuilder.OFFLINE.tableNameWithType(rawTableName); + try { + TableConfig offlineTableConfig = _pinotHelixResourceManager.getOfflineTableConfig(offlineTableName); + ZkHelixPropertyStore<ZNRecord> propertyStore = _pinotHelixResourceManager.getPropertyStore(); + Schema schema = ZKMetadataProvider.getTableSchema(propertyStore, offlineTableName); + if (schema == null) { + throw new RuntimeException("Failed to get schema for table: " + offlineTableName); + } + String timeColumn = null; + SegmentsValidationAndRetentionConfig validationConfig = offlineTableConfig.getValidationConfig(); + if (validationConfig != null) { + timeColumn = validationConfig.getTimeColumnName(); + } + if (StringUtils.isEmpty(timeColumn)) { + throw new RuntimeException("TimeColumn is null or empty for table: " + offlineTableName); + } + DateTimeFieldSpec dateTimeSpec = schema.getSpecForTimeColumn(timeColumn); + if (dateTimeSpec == null) { + throw new RuntimeException(String.format("Failed to get DateTimeFieldSpec for time column: %s of table: %s", + timeColumn, offlineTableName)); + } + DateTimeFormatSpec timeFormatSpec = dateTimeSpec.getFormatSpec(); + if (timeFormatSpec.getColumnUnit() == null) { + throw new RuntimeException(String.format("Time unit must be configured in the field spec for time column: %s of" + + " table: %s", timeColumn, offlineTableName)); + } + List<SegmentZKMetadata> offlineSegmentsZKMetadata = Review Comment: Reading all ZK metadata for OFFLINE table can be very expensive. Time boundary info can be read from broker API. See `PinotBrokerDebug.getTimeBoundary()` -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org --------------------------------------------------------------------- To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org For additional commands, e-mail: commits-h...@pinot.apache.org