Re: [PR] infra: use spark base image for docker [iceberg-python]

via GitHub Sun, 28 Sep 2025 10:43:58 -0700


kevinjqliu commented on code in PR #2540:
URL: https://github.com/apache/iceberg-python/pull/2540#discussion_r2386232833



##########
dev/provision.py:
##########
@@ -23,35 +22,27 @@
 from pyiceberg.schema import Schema
 from pyiceberg.types import FixedType, NestedField, UUIDType
 
-# The configuration is important, otherwise we get many small
-# parquet files with a single row. When a positional delete
-# hits the Parquet file with one row, the parquet file gets
-# dropped instead of having a merge-on-read delete file.
-spark = (
-    SparkSession
-        .builder
-        .config("spark.sql.shuffle.partitions", "1")
-        .config("spark.default.parallelism", "1")

Review Comment:
   Yeah, I remember these. I can add them to spark-defaults, but the tests are 
passing now without them 🤷 
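
   For reference, a minimal sketch of what carrying the two removed settings 
   over would look like, assuming the image reads a standard 
   `spark-defaults.conf` (the file's location in this PR is not shown in this 
   hunk, so treat the placement as an assumption):

   ```
   # Assumed spark-defaults.conf entries mirroring the removed builder config.
   # A single partition avoids many small Parquet files with one row each,
   # which would be dropped on a positional delete instead of getting a
   # merge-on-read delete file.
   spark.sql.shuffle.partitions   1
   spark.default.parallelism      1
   ```

   Since the tests pass without them, this may only matter if the small-file 
   behaviour described in the removed comment shows up again.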



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


