snleee commented on code in PR #9825: URL: https://github.com/apache/pinot/pull/9825#discussion_r1026040267
########## pinot-plugins/pinot-minion-tasks/pinot-minion-builtin-tasks/src/main/java/org/apache/pinot/plugin/minion/tasks/BaseMultipleSegmentsConversionExecutor.java: ########## @@ -253,9 +271,10 @@ public List<SegmentConversionResult> executeTask(PinotTaskConfig pinotTaskConfig TableNameBuilder.extractRawTableName(tableNameWithType)); List<NameValuePair> parameters = Arrays.asList(enableParallelPushProtectionParameter, tableNameParameter); - SegmentConversionUtils - .uploadSegment(configs, httpHeaders, parameters, tableNameWithType, resultSegmentName, uploadURL, - convertedTarredSegmentFile); + pushSegment(tableNameWithType, pinotTaskConfig.getConfigs(), outputSegmentTarURI, httpHeaders, parameters); +// SegmentConversionUtils Review Comment: Can you clean up this? ########## pinot-plugins/pinot-minion-tasks/pinot-minion-builtin-tasks/src/main/java/org/apache/pinot/plugin/minion/tasks/BaseMultipleSegmentsConversionExecutor.java: ########## @@ -242,6 +256,10 @@ public List<SegmentConversionResult> executeTask(PinotTaskConfig pinotTaskConfig new BasicHeader(FileUploadDownloadClient.CustomHeaders.SEGMENT_ZK_METADATA_CUSTOM_MAP_MODIFIER, segmentZKMetadataCustomMapModifier.toJsonString()); + URI outputSegmentTarURI = moveSegmentToOutputPinotFS(pinotTaskConfig.getConfigs(), convertedTarredSegmentFile); + LOGGER.info("Moved generated segment from [{}] to location: [{}]", Review Comment: What happens if the Minion somehow got killed (e.g. restart) after finish the segment move to outputPinotFS but before finishing the metadata push to controller? I think that this can potentially leave some files in the deep storage without any segment metadata. Do we have any design/solution to address this? ########## pinot-plugins/pinot-minion-tasks/pinot-minion-builtin-tasks/src/main/java/org/apache/pinot/plugin/minion/tasks/BaseMultipleSegmentsConversionExecutor.java: ########## @@ -253,9 +271,10 @@ public List<SegmentConversionResult> executeTask(PinotTaskConfig pinotTaskConfig TableNameBuilder.extractRawTableName(tableNameWithType)); List<NameValuePair> parameters = Arrays.asList(enableParallelPushProtectionParameter, tableNameParameter); - SegmentConversionUtils - .uploadSegment(configs, httpHeaders, parameters, tableNameWithType, resultSegmentName, uploadURL, - convertedTarredSegmentFile); + pushSegment(tableNameWithType, pinotTaskConfig.getConfigs(), outputSegmentTarURI, httpHeaders, parameters); Review Comment: We can use `configs` instead of `pinotTaskConfig.getConfigs()` ########## pinot-plugins/pinot-minion-tasks/pinot-minion-builtin-tasks/src/main/java/org/apache/pinot/plugin/minion/tasks/BaseMultipleSegmentsConversionExecutor.java: ########## @@ -242,6 +255,10 @@ public List<SegmentConversionResult> executeTask(PinotTaskConfig pinotTaskConfig new BasicHeader(FileUploadDownloadClient.CustomHeaders.SEGMENT_ZK_METADATA_CUSTOM_MAP_MODIFIER, segmentZKMetadataCustomMapModifier.toJsonString()); + URI outputSegmentTarURI = moveSegmentToOutputPinotFS(pinotTaskConfig.getConfigs(), convertedTarredSegmentFile); Review Comment: +1 As @zhtaoxiang mentioned, we should NOT move the segments to deep storage when we do the segment tar file push. In that case, the controller will move the segment to the deep storage during the upload phase. ########## pinot-plugins/pinot-minion-tasks/pinot-minion-builtin-tasks/src/main/java/org/apache/pinot/plugin/minion/tasks/BaseMultipleSegmentsConversionExecutor.java: ########## @@ -242,6 +256,10 @@ public List<SegmentConversionResult> executeTask(PinotTaskConfig pinotTaskConfig new BasicHeader(FileUploadDownloadClient.CustomHeaders.SEGMENT_ZK_METADATA_CUSTOM_MAP_MODIFIER, segmentZKMetadataCustomMapModifier.toJsonString()); + URI outputSegmentTarURI = moveSegmentToOutputPinotFS(pinotTaskConfig.getConfigs(), convertedTarredSegmentFile); Review Comment: We can use `configs ` instead of `pinotTaskConfig.getConfigs()` -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org --------------------------------------------------------------------- To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org For additional commands, e-mail: commits-h...@pinot.apache.org