Fokko commented on code in PR #902:
URL: https://github.com/apache/iceberg-python/pull/902#discussion_r1671798339

##########
pyiceberg/table/__init__.py:
##########
@@ -1884,8 +1884,9 @@ def to_arrow_batch_reader(self) -> pa.RecordBatchReader:
         from pyiceberg.io.pyarrow import project_batches, schema_to_pyarrow
+        target_schema = schema_to_pyarrow(self.projection())

Review Comment:
   My preference would be to let Arrow decide. Polars is different because it is also the query engine. Casting the types will recompute the buffers, consuming additional memory/CPU, which I would rather avoid. For the table, we first materialize all the batches in memory, so if one of them is large, it will automatically upcast; otherwise, it will keep the small types.
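
To illustrate why the cast is not free: in the Arrow columnar format, a `string` column stores 32-bit offsets into its data buffer, while a `large_string` column stores 64-bit offsets. Moving from the small type to the large one therefore means writing a new offsets buffer. The sketch below is a toy model in plain Python using `struct`, not pyarrow's actual implementation; the helper names `encode_strings` and `upcast_offsets` are made up for this example.

```python
import struct


def encode_strings(values, offset_format):
    """Model a variable-length string column: one contiguous UTF-8 data
    buffer plus len(values) + 1 offsets ("i" = int32, "q" = int64)."""
    encoded = [v.encode("utf-8") for v in values]
    data = b"".join(encoded)
    offsets = [0]
    for chunk in encoded:
        offsets.append(offsets[-1] + len(chunk))
    offsets_buffer = struct.pack(f"<{len(offsets)}{offset_format}", *offsets)
    return offsets_buffer, data


def upcast_offsets(offsets_buffer):
    """Rewrite a buffer of int32 offsets as int64 offsets.

    Every offset is read and written again, which is the extra CPU and
    memory cost; in this model the data buffer is left untouched.
    """
    count = len(offsets_buffer) // 4
    offsets = struct.unpack(f"<{count}i", offsets_buffer)
    return struct.pack(f"<{count}q", *offsets)


values = ["a", "bb", "ccc"]
small_offsets, data = encode_strings(values, "i")
large_offsets = upcast_offsets(small_offsets)

# The offsets buffer doubles in size; the string bytes are unchanged.
print(len(small_offsets), len(large_offsets), data)
```

In this model the cost grows with the number of rows rather than the number of bytes in the strings, which is why always casting every batch to the large type pays a price even when no batch needs 64-bit offsets.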