Re: [I] The ColumnarToRow Spark optimization is not applied when using nested fields from an Iceberg table [iceberg]

via GitHub Sat, 03 Aug 2024 22:45:53 -0700


amogh-jahagirdar commented on issue #10828:
URL: https://github.com/apache/iceberg/issues/10828#issuecomment-2267352623


   Thanks for the repro steps. I did some debugging with them, and this seems 
to come down to vectorized reads not being supported for nested fields. 
https://github.com/apache/iceberg/blob/main/spark/v3.5/spark/src/main/java/org/apache/iceberg/spark/source/SparkBatch.java#L154
 If you trace from here, you'll see that if there's a nested type, we don't 
perform a vectorized read, and we end up surfacing to Spark that columnar 
execution is not supported.
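
   To make the behavior concrete, here is a rough sketch of the kind of check 
described above. It is not the actual SparkBatch code: the class 
`VectorizedReadCheck`, the `Column` type, and `useBatchReads` are hypothetical 
names used only for illustration, and the real logic may consider more 
conditions.

```java
import java.util.List;

public class VectorizedReadCheck {
    // Hypothetical stand-in for a projected column; not an Iceberg class.
    static final class Column {
        final String name;
        final boolean primitive; // false for struct, list, or map columns

        Column(String name, boolean primitive) {
            this.name = name;
            this.primitive = primitive;
        }
    }

    // Assumption: batch (columnar) reads are chosen only when vectorization
    // is enabled and every projected column is a primitive type.
    static boolean useBatchReads(boolean vectorizationEnabled, List<Column> projection) {
        return vectorizationEnabled
            && projection.stream().allMatch(c -> c.primitive);
    }

    public static void main(String[] args) {
        List<Column> flat = List.of(new Column("id", true), new Column("name", true));
        List<Column> nested = List.of(new Column("id", true), new Column("address", false));

        // A flat projection keeps the columnar path.
        System.out.println("flat projection:   " + useBatchReads(true, flat));
        // One nested column is enough to fall back to row-based reads.
        System.out.println("nested projection: " + useBatchReads(true, nested));
    }
}
```

   Under this assumption, a single nested column in the projection is enough to 
switch the whole scan to row-based reads, which would match what the repro 
shows.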
   
   It certainly seems like this is an area for improvement, although I don't 
recall the context as to why it's not supported today.
   



