syun64 commented on code in PR #890:
URL: https://github.com/apache/iceberg-python/pull/890#discussion_r1666094679


##########
pyiceberg/table/__init__.py:
##########
@@ -1866,7 +1866,7 @@ def plan_files(self) -> Iterable[FileScanTask]:
             for data_entry in data_entries
         ]
 
-    def to_arrow(self) -> pa.Table:
+    def to_arrow(self, with_large_types: bool = True) -> pa.Table:

Review Comment:
   Thank you very much for taking the time to review, @Fokko.
   
   It’s great you brought this up, because I didn’t feel great about
   introducing a flag either. Still, I felt we needed a way for the user to
   control which type is used for their Arrow table or RecordBatchReader.
   
   Do you have a preference for which type (large or small) should be the
   common type for the schema? I introduced a flag here because we still need
   to choose which type to use in the PyArrow schema that we infer from the
   Iceberg table schema. As we discussed in this
   [issue](https://github.com/apache/iceberg-python/issues/791), I thought
   that being intentional about which type represents our table or
   RecordBatchReader would make the behavior feel more consistent and less
   error prone for the end user than the alternative of returning whichever
   type PyArrow infers from the Parquet file.
   
   If this does not sound like a great candidate for an API argument, would a
   configuration option to control this behavior be better? I think that idea
   came up in a [previous
   discussion](https://github.com/apache/iceberg-python/pull/807#pullrequestreview-2119199017).
   Please let me know!
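
   As a rough illustration of the configuration alternative, the sketch below resolves the setting from an environment variable when no explicit argument is given. The variable name `PYICEBERG_EXAMPLE_LARGE_TYPES` and the function `resolve_large_types` are assumptions made up for this example and do not correspond to an existing PyIceberg setting.

```python
import os
from typing import Mapping, Optional

# Hypothetical variable name, chosen for illustration only.
ENV_KEY = "PYICEBERG_EXAMPLE_LARGE_TYPES"


def resolve_large_types(
    explicit: Optional[bool] = None,
    env: Mapping[str, str] = os.environ,
    default: bool = True,
) -> bool:
    """Pick the large/small setting: explicit argument, then environment, then default."""
    if explicit is not None:
        return explicit
    raw = env.get(ENV_KEY)
    if raw is None:
        return default
    # Treat a few common spellings as true and everything else as false.
    return raw.strip().lower() in {"1", "true", "yes"}
```

   With this shape, a per-call argument (if one is kept) overrides the configured value, and users who never set anything get the default.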



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

