Re: [PR] Spark: Add session-level split size override [iceberg]

via GitHub Thu, 04 Jun 2026 20:58:21 -0700


wombatu-kun commented on code in PR #16154:
URL: https://github.com/apache/iceberg/pull/16154#discussion_r3360259547



##########
spark/v4.1/spark/src/main/java/org/apache/iceberg/spark/SparkReadConf.java:
##########
@@ -156,13 +160,18 @@ public int orcBatchSize() {
   }
 
   public Long splitSizeOption() {
-    return confParser.longConf().option(SparkReadOptions.SPLIT_SIZE).parseOptional();
+    return confParser
+        .longConf()
+        .option(SparkReadOptions.SPLIT_SIZE)
+        .sessionConf(SparkSQLProperties.SPLIT_SIZE)

Review Comment:
   `splitSizeOption()` is the gate term in `SparkScan.java:370` 
(`splitSizeOption() == null && adaptiveSplitSizeEnabled()`), so adding 
`SPLIT_SIZE` here makes setting the session conf disable adaptive split sizing, 
which defaults to on. The table-property path leaves `splitSizeOption()` null 
and keeps adaptive sizing enabled, so the same split-size value behaves 
differently depending on whether it comes from this session conf or 
`read.split.target-size`. If that is intended (honor an explicit session split 
size exactly, as the read option does), document it in the precedence javadoc 
and add a test for the interaction; otherwise the session conf should not 
suppress adaptive sizing.
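
To make the interaction concrete, here is a minimal sketch of the gate in isolation. It is a simplified model, not the actual Iceberg code: the class `SplitSizeGateSketch` and its method names are hypothetical, and it assumes the read option takes precedence over the session conf, as the builder order in the diff suggests. The table property is deliberately absent from `splitSizeOption`, which is why setting only the table property leaves adaptive sizing on.

```java
// Simplified, hypothetical model of the gate; these are not the real Iceberg classes.
final class SplitSizeGateSketch {

  // Models the changed splitSizeOption(): read option first, then session conf
  // (precedence order is an assumption). The table property is not consulted here.
  static Long splitSizeOption(Long readOption, Long sessionConf) {
    return readOption != null ? readOption : sessionConf;
  }

  // Models the condition quoted from SparkScan: adaptive sizing applies only
  // when no explicit split size was resolved and the adaptive flag is on.
  static boolean adaptiveSizingApplies(
      Long readOption, Long sessionConf, boolean adaptiveEnabled) {
    return splitSizeOption(readOption, sessionConf) == null && adaptiveEnabled;
  }

  public static void main(String[] args) {
    Long size = 64L * 1024 * 1024; // arbitrary example value

    // Size supplied only through the table property: nothing reaches the gate.
    System.out.println("table property only: " + adaptiveSizingApplies(null, null, true));

    // Same size supplied through the session conf: the gate now sees a value.
    System.out.println("session conf set:    " + adaptiveSizingApplies(null, size, true));
  }
}
```

A test for the interaction could assert the same two cases against a real scan: the configured size is identical, but adaptive sizing is applied in one case and skipped in the other.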



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


