abhishekbafna commented on code in PR #16249:
URL: https://github.com/apache/pinot/pull/16249#discussion_r2185134422


##########
pinot-plugins/pinot-minion-tasks/pinot-minion-builtin-tasks/src/main/java/org/apache/pinot/plugin/minion/tasks/BaseMultipleSegmentsConversionExecutor.java:
##########
@@ -352,6 +348,67 @@ public List<SegmentConversionResult> executeTask(PinotTaskConfig pinotTaskConfig
     }
   }
 
+  private void parallelDownloadAndUntarSegments(int nThreads, String tableNameWithType, String taskType,
+      String[] segmentNames, String[] downloadURLs, File tempDataDir, AtomicInteger recordCounter,
+      List<File> inputSegmentDirs)
+      throws Exception {
+
+    ExecutorService executorService = null;
+    int length = downloadURLs.length;
+    try {
+      executorService = Executors.newFixedThreadPool(nThreads);
+      List<Future<Void>> futures = new ArrayList<>(length);
+      for (int i = 0; i < length; i++) {
+        int index = i;
+        futures.add(executorService.submit(() -> {
+          downloadAndUntarSegment(tableNameWithType, taskType, segmentNames[index], downloadURLs[index],
+              tempDataDir, index, recordCounter, inputSegmentDirs);
+          return null;
+        }));
+
+        // Wait for all downloads to complete and cancel other tasks if any download fails
+        for (Future<Void> future : futures) {
+          try {
+            future.get();
+          } catch (Exception e) {
+            // Cancel all other download tasks
+            for (Future<Void> f : futures) {
+              f.cancel(true);
+            }
+            throw e;
+          }
+        }
+      }
+    } finally {
+      if (executorService != null) {
+        executorService.shutdown();
+      }
+    }
+  }
+
+  private void downloadAndUntarSegment(String tableNameWithType, String taskType,
+      String segmentName, String downloadURL, File tempDataDir, int index, AtomicInteger recordCounter,
+      List<File> inputSegmentDirs)
+      throws Exception {
+    try {
+      _eventObserver.notifyProgress(_pinotTaskConfig, "Downloading and decompressing segment from: " + downloadURL
+          + " (" + (index + 1) + " out of " + inputSegmentDirs.size() + ")");
+      // Download and decompress the segment file
+      File indexDir = downloadSegmentToLocalAndUntar(tableNameWithType, segmentName, downloadURL, taskType,
+          tempDataDir, "_" + index);
+      reportSegmentDownloadMetrics(indexDir, tableNameWithType, taskType);
+      SegmentMetadata segmentMetadata = new SegmentMetadataImpl(indexDir);
+      // Ensure segment directory placement is at same index as the segment name in the inputSegmentNames
+      inputSegmentDirs.set(index, indexDir);
+      recordCounter.addAndGet(segmentMetadata.getTotalDocs());
+    } catch (Exception e) {

Review Comment:
   I think this is better. It keeps the logic together and ensures that
`inputSegmentDirs` and `recordCounter` are updated only when everything above
succeeds.
   `new SegmentMetadataImpl(indexDir)` can also throw `IOException` and other
exceptions, so it belongs inside the same `try` block.
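
   To illustrate the ordering argued for above, here is a minimal sketch rather than the PR's actual code. The class `CommitAfterSuccessSketch` and the helpers `loadDocCount` and `processSegment` are hypothetical stand-ins for the metadata load and `downloadAndUntarSegment`. Every step that can throw runs first, and the shared list and counter are written only afterwards.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

public class CommitAfterSuccessSketch {

  // Hypothetical stand-in for loading segment metadata; fails for an empty name.
  static int loadDocCount(String segmentName) throws IOException {
    if (segmentName.isEmpty()) {
      throw new IOException("cannot read metadata for unnamed segment");
    }
    return segmentName.length() * 10;
  }

  // Hypothetical stand-in for the per-segment work in the diff.
  static void processSegment(int index, String segmentName, List<String> dirs,
      AtomicInteger recordCounter) throws IOException {
    // Steps that may throw come first.
    String dir = "dir_" + index;              // stands in for download + untar
    int docs = loadDocCount(segmentName);     // may throw IOException

    // Shared state is touched only after everything above succeeded.
    dirs.set(index, dir);
    recordCounter.addAndGet(docs);
  }

  public static void main(String[] args) {
    String[] names = {"seg", ""};
    List<String> dirs = new ArrayList<>(Collections.nCopies(names.length, null));
    AtomicInteger recordCounter = new AtomicInteger();

    for (int i = 0; i < names.length; i++) {
      try {
        processSegment(i, names[i], dirs, recordCounter);
      } catch (IOException e) {
        System.out.println("segment " + i + " failed: " + e.getMessage());
      }
    }
    // The failed segment left neither the list slot nor the counter modified.
    System.out.println(dirs + " " + recordCounter.get());
  }
}
```

   With this ordering, a failure in the metadata step leaves the slot for that index as `null` and the counter unchanged, so the caller never sees a directory without its matching record count.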



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

