findinpath commented on code in PR #10691: URL: https://github.com/apache/iceberg/pull/10691#discussion_r1677791198
##########
core/src/main/java/org/apache/iceberg/util/ParallelIterable.java:
##########
@@ -192,4 +209,65 @@ public synchronized T next() {
       return queue.poll();
     }
   }
+
+  private static class Task<T> implements Callable<Optional<Task<T>>>, Closeable {
+    private final Iterable<T> input;
+    private final ConcurrentLinkedQueue<T> queue;
+    private final AtomicBoolean closed;
+    private final int approximateMaxQueueSize;
+
+    private Iterator<T> iterator;
+
+    Task(
+        Iterable<T> input,
+        ConcurrentLinkedQueue<T> queue,
+        AtomicBoolean closed,
+        int approximateMaxQueueSize) {
+      this.input = Preconditions.checkNotNull(input, "input cannot be null");
+      this.queue = Preconditions.checkNotNull(queue, "queue cannot be null");
+      this.closed = Preconditions.checkNotNull(closed, "closed cannot be null");
+      this.approximateMaxQueueSize = approximateMaxQueueSize;
+    }
+
+    @Override
+    public Optional<Task<T>> call() throws Exception {
+      try {
+        if (iterator == null) {
+          iterator = input.iterator();
+        }
+        while (iterator.hasNext()) {
+          if (queue.size() >= approximateMaxQueueSize) {
+            // yield
+            return Optional.of(this);

Review Comment:
   > I am not sure I follow. Why do we potentially keep on re-submitting the task?

   I'm saying: why not set `taskFutures[i] = continuation.get()`?

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org
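
For readers following the thread, here is a minimal sketch of the yield-and-resubmit pattern the hunk and the question refer to. It is not the PR's code. `YieldingTaskSketch`, `YieldingTask`, and `drain` are hypothetical names, the single-threaded executor is chosen only to keep the example deterministic, and it assumes (as the name suggests, though the hunk does not show it) that `taskFutures` holds future objects rather than tasks.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Optional;
import java.util.concurrent.Callable;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class YieldingTaskSketch {

  // Hypothetical stand-in for the Task<T> in the hunk: copies items into a shared
  // queue and yields (returns itself) once the queue looks full.
  static class YieldingTask<T> implements Callable<Optional<YieldingTask<T>>> {
    private final Iterator<T> iterator;
    private final ConcurrentLinkedQueue<T> queue;
    private final int approximateMaxQueueSize;

    YieldingTask(Iterator<T> iterator, ConcurrentLinkedQueue<T> queue, int approximateMaxQueueSize) {
      this.iterator = iterator;
      this.queue = queue;
      this.approximateMaxQueueSize = approximateMaxQueueSize;
    }

    @Override
    public Optional<YieldingTask<T>> call() {
      while (iterator.hasNext()) {
        if (queue.size() >= approximateMaxQueueSize) {
          // yield: hand the continuation back so the caller decides when to resume
          return Optional.of(this);
        }
        queue.add(iterator.next());
      }
      // input exhausted, nothing left to resume
      return Optional.empty();
    }
  }

  // Drains one input through a bounded-ish queue, resubmitting after every yield.
  static <T> List<T> drain(Iterable<T> input, int maxQueueSize) throws Exception {
    if (maxQueueSize < 1) {
      throw new IllegalArgumentException("maxQueueSize must be at least 1");
    }
    ExecutorService pool = Executors.newSingleThreadExecutor();
    ConcurrentLinkedQueue<T> queue = new ConcurrentLinkedQueue<>();
    List<T> out = new ArrayList<>();
    try {
      Future<Optional<YieldingTask<T>>> future =
          pool.submit(new YieldingTask<T>(input.iterator(), queue, maxQueueSize));
      while (future != null) {
        // blocks until the task either yields or finishes
        Optional<YieldingTask<T>> continuation = future.get();
        // consumer side: make room in the queue
        T item;
        while ((item = queue.poll()) != null) {
          out.add(item);
        }
        // continuation.get() is a task, not a future, so in this sketch it has to be
        // submitted again to obtain something that can be stored as the next future
        future = continuation.isPresent() ? pool.submit(continuation.get()) : null;
      }
    } finally {
      pool.shutdown();
    }
    return out;
  }
}
```

Calling `YieldingTaskSketch.drain(Arrays.asList(1, 2, 3, 4, 5), 2)` makes the task yield twice (after 2 and after 4 items) before it finishes, and the items come back in input order. Whether the real code should resubmit or store the continuation directly depends on the declared type of `taskFutures`, which this hunk does not show.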