Fokko commented on code in PR #986:
URL: https://github.com/apache/iceberg-python/pull/986#discussion_r1706662170


##########
pyiceberg/io/pyarrow.py:
##########
@@ -1303,6 +1345,8 @@ def project_table(
             # When FsSpec is not installed
             raise ValueError(f"Expected PyArrowFileIO or FsspecFileIO, got: {io}") from e
 
+    use_large_types = property_as_bool(io.properties, PYARROW_USE_LARGE_TYPES_ON_READ, True)

Review Comment:
   This is the only part I'm not fond of: we now force the table to use either 
large or normal types. When we read record batches I agree that we need to 
force the schema, but for the table we have to read all the footers anyway.
   
   Once https://github.com/apache/iceberg-python/pull/929 goes in, I think we 
still need to change that, but let's defer that question for now.
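
For readers following along, here is a minimal sketch of how a boolean read property like this could be resolved from a string-valued properties map. `read_bool_property` and `arrow_string_type_name` are hypothetical stand-ins written for illustration; they are not PyIceberg's `property_as_bool` or its actual schema handling. The accepted spellings of true/false are also an assumption. Only the property key and the default of `True` come from the diff.

```python
from typing import Mapping, Optional

# Property key and default (True) are taken from the diff under review.
PYARROW_USE_LARGE_TYPES_ON_READ = "pyarrow.use-large-types-on-read"


def read_bool_property(properties: Mapping[str, str], name: str, default: bool) -> bool:
    """Hypothetical helper: parse a string property as a bool, falling back to a default."""
    raw: Optional[str] = properties.get(name)
    if raw is None:
        # Property not set: use the caller-supplied default.
        return default
    value = raw.strip().lower()
    if value in ("true", "1", "yes"):
        return True
    if value in ("false", "0", "no"):
        return False
    raise ValueError(f"Could not parse {name}={raw!r} as a boolean")


def arrow_string_type_name(use_large_types: bool) -> str:
    """Hypothetical helper: name of the Arrow string type the flag would select."""
    return "large_string" if use_large_types else "string"


# With no property set, the default applies and large types are selected.
default_flag = read_bool_property({}, PYARROW_USE_LARGE_TYPES_ON_READ, True)

# Setting the property to "false" opts out of large types.
opted_out = read_bool_property(
    {PYARROW_USE_LARGE_TYPES_ON_READ: "false"}, PYARROW_USE_LARGE_TYPES_ON_READ, True
)
```

The point of the sketch is that the flag is resolved once, up front, and then applied to the whole read, which is the forcing behaviour the comment above questions for the table path.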



##########
pyiceberg/io/__init__.py:
##########
@@ -80,6 +80,7 @@
 GCS_ENDPOINT = "gcs.endpoint"
 GCS_DEFAULT_LOCATION = "gcs.default-bucket-location"
 GCS_VERSION_AWARE = "gcs.version-aware"
+PYARROW_USE_LARGE_TYPES_ON_READ = "pyarrow.use-large-types-on-read"

Review Comment:
   I think it would also make more sense to move this into the Arrow file.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

