Re: [I] [bug]OversizedAllocationException when query data with Spark [iceberg]

via GitHub Fri, 20 Jun 2025 03:28:04 -0700


firasomrane commented on issue #9820:
URL: https://github.com/apache/iceberg/issues/9820#issuecomment-2990837278


   To avoid hurting read performance for every consumer of the table, you can set `spark.sql.execution.arrow.useLargeVarTypes` ([Spark config](https://spark.apache.org/docs/latest/configuration.html#:~:text=spark.sql.execution.arrow.useLargeVarTypes)) to `true` only for the Spark jobs that may fail, provided you are using Spark >= 3.5.0.
   If read performance matters to you, setting `read.parquet.vectorization.enabled=false` will have a negative impact, since it disables batch reads of rows from Parquet files.
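
   As a rough illustration of scoping the setting to individual jobs, the sketch below builds a `spark-submit` command that passes the config through `--conf` only for jobs that opt in. The helper name `build_submit_command` and the script paths are hypothetical. Only the config key and the Spark >= 3.5.0 requirement come from the comment above.

```python
import shlex

# Config key taken from the comment above; per that comment it needs Spark >= 3.5.0.
LARGE_VAR_TYPES_CONF = "spark.sql.execution.arrow.useLargeVarTypes"


def build_submit_command(app_path, use_large_var_types=False):
    """Build a spark-submit argument list for one job (hypothetical helper)."""
    cmd = ["spark-submit"]
    if use_large_var_types:
        # Only jobs that opt in get the setting; table properties stay untouched,
        # so other consumers keep their current read behavior.
        cmd += ["--conf", f"{LARGE_VAR_TYPES_CONF}=true"]
    cmd.append(app_path)
    return cmd


# Placeholder script names: one job that hits the exception, one that does not.
risky = build_submit_command("risky_job.py", use_large_var_types=True)
normal = build_submit_command("normal_job.py")

print(shlex.join(risky))
print(shlex.join(normal))
```

   Passing the flag per job, rather than changing `read.parquet.vectorization.enabled` on the table, keeps the change local to the jobs that need it.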


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


