aokolnychyi commented on code in PR #7731:
URL: https://github.com/apache/iceberg/pull/7731#discussion_r1217404778


##########
core/src/main/java/org/apache/iceberg/util/TableScanUtil.java:
##########
@@ -35,16 +38,22 @@
 import org.apache.iceberg.ScanTaskGroup;
 import org.apache.iceberg.SplittableScanTask;
 import org.apache.iceberg.StructLike;
+import org.apache.iceberg.io.CloseableGroup;
 import org.apache.iceberg.io.CloseableIterable;
+import org.apache.iceberg.io.CloseableIterator;
 import org.apache.iceberg.relocated.com.google.common.base.Preconditions;
 import org.apache.iceberg.relocated.com.google.common.collect.FluentIterable;
 import org.apache.iceberg.relocated.com.google.common.collect.ImmutableList;
 import org.apache.iceberg.relocated.com.google.common.collect.Iterables;
 import org.apache.iceberg.relocated.com.google.common.collect.Lists;
 import org.apache.iceberg.relocated.com.google.common.collect.Maps;
 import org.apache.iceberg.types.Types;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
 
 public class TableScanUtil {
+  private static final Logger LOG = LoggerFactory.getLogger(TableScanUtil.class);
+  private static final long MIN_SPLIT_SIZE = 16 * 1024 * 1024; // 16 MB

Review Comment:
   I would say 8 MB is already small enough that such tasks should complete 
fairly quickly; I would be a bit concerned about going smaller than that. The 
cost of opening files is non-trivial, and splits that are too small may 
overload the underlying storage with a large number of requests. I even find 
16 MB reasonable, to be honest. I don't want our queries to fail with rate 
limits.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

