findepi commented on code in PR #10691:
URL: https://github.com/apache/iceberg/pull/10691#discussion_r1677648977


##########
core/src/main/java/org/apache/iceberg/util/ParallelIterable.java:
##########
@@ -20,65 +20,69 @@
 
 import java.io.Closeable;
 import java.io.IOException;
+import java.util.ArrayDeque;
+import java.util.Deque;
 import java.util.Iterator;
 import java.util.NoSuchElementException;
+import java.util.Optional;
+import java.util.concurrent.Callable;
 import java.util.concurrent.ConcurrentLinkedQueue;
 import java.util.concurrent.ExecutionException;
 import java.util.concurrent.ExecutorService;
 import java.util.concurrent.Future;
-import org.apache.iceberg.exceptions.RuntimeIOException;
+import java.util.concurrent.atomic.AtomicBoolean;
 import org.apache.iceberg.io.CloseableGroup;
 import org.apache.iceberg.io.CloseableIterable;
 import org.apache.iceberg.io.CloseableIterator;
 import org.apache.iceberg.relocated.com.google.common.base.Preconditions;
 import org.apache.iceberg.relocated.com.google.common.collect.Iterables;
+import org.apache.iceberg.relocated.com.google.common.io.Closer;
 
 public class ParallelIterable<T> extends CloseableGroup implements CloseableIterable<T> {
+
+  private static final int DEFAULT_MAX_QUEUE_SIZE = 10_000;

Review Comment:
   Good call. Admittedly, this value was not tuned. What value would be best 
here?
   
   Also, per 
https://github.com/apache/iceberg/pull/10691#issuecomment-2225641596, if 
yielding actually occurs, it means the ParallelIterator consumer isn't able 
to keep up with the incoming items, so yielding itself doesn't add much 
cost. However, resuming is not instantaneous, so we pay the resume cost 
every 10k elements. A low-water mark would eliminate this: resume producing 
before the queue is fully drained.
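
   To make the low-water-mark idea concrete, here is a minimal, 
single-threaded sketch. It is not the code in this PR: `WatermarkQueue`, 
`highWaterMark`, and `lowWaterMark` are hypothetical names chosen for 
illustration, and a real version inside ParallelIterable would need to be 
thread-safe and would resubmit the yielded tasks rather than just flip a flag.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/**
 * Hypothetical illustration only, not part of Iceberg.
 * The producer pauses once the queue reaches highWaterMark and is allowed
 * to resume as soon as the consumer drains it down to lowWaterMark,
 * i.e. before the queue is empty.
 * Not thread-safe: a real implementation would need synchronization.
 */
public class WatermarkQueue<T> {
  private final Deque<T> queue = new ArrayDeque<>();
  private final int highWaterMark;
  private final int lowWaterMark;
  private boolean producerPaused = false;

  public WatermarkQueue(int highWaterMark, int lowWaterMark) {
    if (lowWaterMark < 0 || lowWaterMark >= highWaterMark) {
      throw new IllegalArgumentException("require 0 <= lowWaterMark < highWaterMark");
    }
    this.highWaterMark = highWaterMark;
    this.lowWaterMark = lowWaterMark;
  }

  /** Returns false when the producer should yield instead of enqueueing. */
  public boolean offer(T item) {
    if (producerPaused) {
      return false;
    }
    queue.addLast(item);
    if (queue.size() >= highWaterMark) {
      producerPaused = true; // queue is full: yield
    }
    return true;
  }

  /** Removes the next item (or returns null) and resumes the producer at the low-water mark. */
  public T poll() {
    T item = queue.pollFirst();
    if (producerPaused && queue.size() <= lowWaterMark) {
      producerPaused = false; // resume early, while items remain for the consumer
    }
    return item;
  }

  public boolean isProducerPaused() {
    return producerPaused;
  }

  public int size() {
    return queue.size();
  }
}
```

   With a high-water mark of 10k and a low-water mark of, say, half that 
(an untuned guess), the consumer would still have a few thousand queued 
items to work through while the producers are being resumed.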



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

