Re: [PR] Forward Compatible large_* type support: read as large, write as small [iceberg-python]

via GitHub Fri, 05 Jul 2024 06:51:27 -0700


Fokko commented on code in PR #890:
URL: https://github.com/apache/iceberg-python/pull/890#discussion_r1666835301



##########
pyiceberg/table/__init__.py:
##########
@@ -1866,7 +1866,7 @@ def plan_files(self) -> Iterable[FileScanTask]:
             for data_entry in data_entries
         ]
 
-    def to_arrow(self) -> pa.Table:
+    def to_arrow(self, with_large_types: bool = True) -> pa.Table:

Review Comment:
   > I felt like we needed a way for the user to control which type they would be using for their arrow table or RecordBatchReader
   
   I don't think we should expose this in the public API. Do people want to control this? In an ideal world:
   
   - When writing you want to take the type that's being handed to PyIceberg from the user
   - When reading you want to take this information from what comes out of the Parquet files
   
   My first assumption was to go with the large types, since that seemed to be what most libraries use. Unfortunately, that doesn't seem to be the case.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

