findinpath commented on code in PR #10691: URL: https://github.com/apache/iceberg/pull/10691#discussion_r1677572633
##########
core/src/main/java/org/apache/iceberg/util/ParallelIterable.java:
##########
@@ -192,4 +209,65 @@ public synchronized T next() {
       return queue.poll();
     }
   }
+
+  private static class Task<T> implements Callable<Optional<Task<T>>>, Closeable {
+    private final Iterable<T> input;
+    private final ConcurrentLinkedQueue<T> queue;
+    private final AtomicBoolean closed;
+    private final int approximateMaxQueueSize;
+
+    private Iterator<T> iterator;
+
+    Task(
+        Iterable<T> input,
+        ConcurrentLinkedQueue<T> queue,
+        AtomicBoolean closed,
+        int approximateMaxQueueSize) {
+      this.input = Preconditions.checkNotNull(input, "input cannot be null");
+      this.queue = Preconditions.checkNotNull(queue, "queue cannot be null");
+      this.closed = Preconditions.checkNotNull(closed, "closed cannot be null");
+      this.approximateMaxQueueSize = approximateMaxQueueSize;
+    }
+
+    @Override
+    public Optional<Task<T>> call() throws Exception {
+      try {
+        if (iterator == null) {
+          iterator = input.iterator();
+        }
+        while (iterator.hasNext()) {
+          if (queue.size() >= approximateMaxQueueSize) {
+            // yield
+            return Optional.of(this);

Review Comment:
   If processing of the items from the queue is slower, this may mean that in `checkTasks()` we're processing the same item over and over again.

   From `checkTasks()`

   ```
   if (taskFutures[i] != null) {
     Optional<Task<T>> continuation;
     // check for task failure and re-throw any exception
     try {
       continuation = taskFutures[i].get();
     } catch (...) ...
     }
     continuation.ifPresent(yieldedTasks::addLast);
   }
   taskFutures[i] = submitNextTask();
   ```

   If we know that we're dealing with a "hold your horses" task, why do we re-submit it again?
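
   To make the concern concrete, here is a minimal, self-contained sketch. It is not the code from this PR: `YieldDemo`, `YieldingTask`, `drive`, `consumeEvery` and `waitForRoom` are hypothetical names, and the driver is a single-threaded simplification that only loosely mirrors `checkTasks()`. It counts how often a yielded task is invoked when it is re-submitted on every pass, compared with re-submitting it only once the queue has room again.

   ```java
   import java.util.ArrayDeque;
   import java.util.Arrays;
   import java.util.Deque;
   import java.util.Iterator;
   import java.util.List;
   import java.util.Optional;
   import java.util.concurrent.Callable;
   import java.util.concurrent.ConcurrentLinkedQueue;

   final class YieldDemo {

     // Hypothetical stand-in for the Task in the diff: fills the queue until the
     // approximate limit is reached, then yields by returning itself.
     static final class YieldingTask<T> implements Callable<Optional<YieldingTask<T>>> {
       private final Iterable<T> input;
       private final ConcurrentLinkedQueue<T> queue;
       private final int approximateMaxQueueSize;
       private Iterator<T> iterator;
       int calls = 0; // number of times call() was invoked

       YieldingTask(Iterable<T> input, ConcurrentLinkedQueue<T> queue, int approximateMaxQueueSize) {
         this.input = input;
         this.queue = queue;
         this.approximateMaxQueueSize = approximateMaxQueueSize;
       }

       @Override
       public Optional<YieldingTask<T>> call() {
         calls++;
         if (iterator == null) {
           iterator = input.iterator();
         }
         while (iterator.hasNext()) {
           if (queue.size() >= approximateMaxQueueSize) {
             return Optional.of(this); // yield: the queue is full, hand back a continuation
           }
           queue.add(iterator.next());
         }
         return Optional.empty(); // input exhausted, nothing left to resume
       }
     }

     // Simplified driver loop. The consumer is slow: it polls one item only on
     // every consumeEvery-th pass. With waitForRoom == false a yielded task is
     // re-invoked on every pass; with waitForRoom == true it is re-invoked only
     // when the queue is below its limit. Returns the number of call() invocations.
     static int drive(List<Integer> input, int maxQueueSize, int consumeEvery, boolean waitForRoom) {
       ConcurrentLinkedQueue<Integer> queue = new ConcurrentLinkedQueue<>();
       Deque<YieldingTask<Integer>> yieldedTasks = new ArrayDeque<>();
       YieldingTask<Integer> task = new YieldingTask<>(input, queue, maxQueueSize);
       yieldedTasks.addLast(task);

       int pass = 0;
       int consumed = 0;
       while (consumed < input.size()) {
         pass++;
         boolean hasRoom = queue.size() < maxQueueSize;
         if (!yieldedTasks.isEmpty() && (!waitForRoom || hasRoom)) {
           yieldedTasks.removeFirst().call().ifPresent(yieldedTasks::addLast);
         }
         if (pass % consumeEvery == 0 && queue.poll() != null) {
           consumed++;
         }
       }
       return task.calls;
     }

     public static void main(String[] args) {
       List<Integer> input = Arrays.asList(1, 2, 3, 4, 5, 6);
       System.out.println("re-submit on every pass: " + drive(input, 2, 3, false) + " calls");
       System.out.println("re-submit only with room: " + drive(input, 2, 3, true) + " calls");
     }
   }
   ```

   In this sketch, the eager variant invokes `call()` more often than the gated one, and every extra invocation returns straight away from the `// yield` branch without adding anything to the queue. Whether that extra work matters in `ParallelIterable` depends on how `checkTasks()` is actually scheduled, which the sketch does not model.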