stevenzwu commented on code in PR #10691:
URL: https://github.com/apache/iceberg/pull/10691#discussion_r1676050363


##########
core/src/main/java/org/apache/iceberg/util/ParallelIterable.java:
##########
@@ -88,7 +91,18 @@ private ParallelIterator(
     @Override
     public void close() {
       // close first, avoid new task submit
-      this.closed = true;
+      this.closed.set(true);
+
+      for (Task<T> task : yieldedTasks) {
+        try {
+          task.close();
+        } catch (Exception e) {
+          throw new RuntimeException("Close failed", e);

Review Comment:
   We may want to finish the close loop even if one task fails partway through;
   as written, the first exception leaves the remaining tasks unclosed.
   Should we use the `Tasks` util here?
   
   Here is an example from `CatalogUtil::deleteFile`
   ```
   Tasks.foreach(files)
           .executeWith(ThreadPools.getWorkerPool())
           .noRetry()
           .suppressFailureWhenFinished()
           .onFailure((file, exc) -> LOG.warn("Failed to delete {} file {}", type, file, exc))
           .run(io::deleteFile);
   ```
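
If pulling in `Tasks` feels heavy for this, the same "finish the loop" behavior could be sketched in plain Java. This is only an illustration, not the `Tasks` API: the class `CloseAll` and the method `closeAll` are hypothetical names. Unlike `suppressFailureWhenFinished()`, it rethrows the first failure once every resource has been attempted.

```java
class CloseAll {
  // Attempts to close every resource, even if some of them fail.
  // The first failure is wrapped and rethrown after the loop completes;
  // later failures are attached to it as suppressed exceptions.
  static void closeAll(Iterable<? extends AutoCloseable> resources) {
    RuntimeException failure = null;
    for (AutoCloseable resource : resources) {
      try {
        resource.close();
      } catch (Exception e) {
        if (failure == null) {
          failure = new RuntimeException("Close failed", e);
        } else {
          failure.addSuppressed(e);
        }
      }
    }

    if (failure != null) {
      throw failure;
    }
  }
}
```

In `close()` this would be called as `CloseAll.closeAll(yieldedTasks)`, assuming `yieldedTasks` is iterable as in the diff above.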



##########
core/src/main/java/org/apache/iceberg/util/ParallelIterable.java:
##########
@@ -192,4 +209,65 @@ public synchronized T next() {
       return queue.poll();
     }
   }
+
+  private static class Task<T> implements Callable<Optional<Task<T>>>, AutoCloseable {
+    private final Iterable<T> input;
+    private final ConcurrentLinkedQueue<T> queue;
+    private final AtomicBoolean closed;
+    private final int approximateMaxQueueSize;
+
+    private Iterator<T> iterator;
+
+    Task(
+        Iterable<T> input,
+        ConcurrentLinkedQueue<T> queue,
+        AtomicBoolean closed,
+        int approximateMaxQueueSize) {
+      this.input = Preconditions.checkNotNull(input, "input cannot be null");
+      this.queue = Preconditions.checkNotNull(queue, "queue cannot be null");
+      this.closed = Preconditions.checkNotNull(closed, "closed cannot be null");
+      this.approximateMaxQueueSize = approximateMaxQueueSize;
+    }
+
+    @Override
+    public Optional<Task<T>> call() throws Exception {
+      try {
+        if (iterator == null) {
+          iterator = input.iterator();
+        }
+        while (iterator.hasNext()) {
+          if (queue.size() >= approximateMaxQueueSize) {
+            // yield
+            return Optional.of(this);
+          }
+          T next = iterator.next();

Review Comment:
   Shouldn't item retrieval happen after the `closed` check? Otherwise we may
   lose the item if the loop breaks after `iterator.next()` has already
   consumed it.
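
A rough sketch of the ordering I have in mind, assuming the `closed` check currently sits after the `iterator.next()` call in the part of the hunk not quoted here. `DrainSketch` and `drain` are hypothetical names used only for illustration; the real logic would stay in `Task.call()`.

```java
import java.util.Iterator;
import java.util.Queue;
import java.util.concurrent.atomic.AtomicBoolean;

class DrainSketch {
  // Moves items from the iterator into the queue.
  // Returns true when the iterator is exhausted, false when it stops early
  // (queue full or closed). Both checks run before next(), so an early stop
  // never consumes an item that is then dropped.
  static <T> boolean drain(
      Iterator<T> iterator, Queue<T> queue, AtomicBoolean closed, int approximateMaxQueueSize) {
    while (iterator.hasNext()) {
      if (queue.size() >= approximateMaxQueueSize) {
        return false; // yield; the caller can resume later
      }
      if (closed.get()) {
        return false; // stop before retrieving the next item
      }
      queue.add(iterator.next());
    }
    return true;
  }
}
```

With this ordering, an item is only taken from the iterator once we know it will be added to the queue.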



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

