Re: [PR] Forward Compatible large_* type support: read as large, write as small [iceberg-python]

via GitHub Thu, 04 Jul 2024 12:42:34 -0700


Fokko commented on code in PR #890:
URL: https://github.com/apache/iceberg-python/pull/890#discussion_r1666027762



##########
pyiceberg/table/__init__.py:
##########
@@ -1866,7 +1866,7 @@ def plan_files(self) -> Iterable[FileScanTask]:
             for data_entry in data_entries
         ]
 
-    def to_arrow(self) -> pa.Table:
+    def to_arrow(self, with_large_types: bool = True) -> pa.Table:

Review Comment:
   Hey @syun64, thanks again for jumping on this issue. It is a very nasty one, 
so thanks for doing the hard work here.
   
   Can I suggest one more direction? My first thought is that we should not 
bother the user with having to set this kind of flag. Instead, I think we can 
solve it when we concatenate the batches into a table:
   
   
![image](https://github.com/apache/iceberg-python/assets/1134248/abd46459-f1be-4f29-a34f-48b3645acc1e)
   
   When we do `to_requested_schema`, we can allow both a normal and a large 
string when we request a string type. When doing the concatenation of the 
batches into a table, we let Arrow coerce to a common type. WDYT?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


