findepi commented on code in PR #10691:
URL: https://github.com/apache/iceberg/pull/10691#discussion_r1677648977
##########
core/src/main/java/org/apache/iceberg/util/ParallelIterable.java:
##########
@@ -20,65 +20,69 @@
 import java.io.Closeable;
 import java.io.IOException;
+import java.util.ArrayDeque;
+import java.util.Deque;
 import java.util.Iterator;
 import java.util.NoSuchElementException;
+import java.util.Optional;
+import java.util.concurrent.Callable;
 import java.util.concurrent.ConcurrentLinkedQueue;
 import java.util.concurrent.ExecutionException;
 import java.util.concurrent.ExecutorService;
 import java.util.concurrent.Future;
-import org.apache.iceberg.exceptions.RuntimeIOException;
+import java.util.concurrent.atomic.AtomicBoolean;
 import org.apache.iceberg.io.CloseableGroup;
 import org.apache.iceberg.io.CloseableIterable;
 import org.apache.iceberg.io.CloseableIterator;
 import org.apache.iceberg.relocated.com.google.common.base.Preconditions;
 import org.apache.iceberg.relocated.com.google.common.collect.Iterables;
+import org.apache.iceberg.relocated.com.google.common.io.Closer;
 
 public class ParallelIterable<T> extends CloseableGroup implements CloseableIterable<T> {
+
+  private static final int DEFAULT_MAX_QUEUE_SIZE = 10_000;

Review Comment:
Good call. Admittedly, this value was not tuned. What value would be best here?

Also, per https://github.com/apache/iceberg/pull/10691#issuecomment-2225641596, if yielding actually occurs, it means the ParallelIterator consumer isn't able to keep up with the incoming items, so the yielding itself doesn't introduce much cost.
However, resuming is not instantaneous, so we pay the resume cost every 10k elements.
This can be eliminated with a low water mark: resume before we exhaust the queue.
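To make the low water mark idea concrete, here is a rough, self-contained sketch. It is not the code in this PR: `WatermarkBuffer`, `highWaterMark`, `lowWaterMark`, and `resumeCount` are hypothetical names used only for illustration, and the real resume step would presumably resubmit the yielded task to the executor rather than flip a flag.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Hypothetical sketch of high/low water mark back-pressure; not the ParallelIterable code. */
final class WatermarkBuffer<T> {
  private final Deque<T> queue = new ArrayDeque<>();
  private final int highWaterMark; // the producer yields once the queue reaches this size
  private final int lowWaterMark; // the producer is resumed once the queue drains to this size
  private boolean producerPaused = false;
  private int resumeCount = 0;

  WatermarkBuffer(int highWaterMark, int lowWaterMark) {
    if (lowWaterMark < 0 || lowWaterMark >= highWaterMark) {
      throw new IllegalArgumentException("need 0 <= lowWaterMark < highWaterMark");
    }
    this.highWaterMark = highWaterMark;
    this.lowWaterMark = lowWaterMark;
  }

  /** Producer side: returns false when the producer should yield instead of enqueueing. */
  synchronized boolean offer(T item) {
    if (producerPaused || queue.size() >= highWaterMark) {
      producerPaused = true;
      return false;
    }
    queue.addLast(item);
    return true;
  }

  /** Consumer side: returns the next item, or null if the queue is currently empty. */
  synchronized T poll() {
    T item = queue.pollFirst();
    if (producerPaused && queue.size() <= lowWaterMark) {
      // Resume while lowWaterMark items are still buffered, so the consumer
      // keeps working during the time it takes the producer to start again.
      producerPaused = false;
      resumeCount++;
    }
    return item;
  }

  synchronized boolean isProducerPaused() {
    return producerPaused;
  }

  synchronized int size() {
    return queue.size();
  }

  synchronized int resumeCount() {
    return resumeCount;
  }
}
```

With `lowWaterMark` set to 0 this degenerates to resuming only once the queue is empty, which is the case where the consumer waits out the resume latency. Any value above 0 trades a little memory for hiding that latency.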
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For queries about this service, please contact Infrastructure at: us...@infra.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org