syun64 commented on code in PR #890:
URL: https://github.com/apache/iceberg-python/pull/890#discussion_r1666094679
##########
pyiceberg/table/__init__.py:
##########
@@ -1866,7 +1866,7 @@ def plan_files(self) -> Iterable[FileScanTask]:
             for data_entry in data_entries
         ]
 
-    def to_arrow(self) -> pa.Table:
+    def to_arrow(self, with_large_types: bool = True) -> pa.Table:

Review Comment:
   Thank you very much for taking the time to review, @Fokko. It’s great you brought this up because I didn’t feel great about introducing a flag either… but I felt like we needed a way for the user to control which type they would be using for their Arrow table or RecordBatchReader.

   Do you have a preference for which type (large or small) should be the common type for the schema? The reason I’ve introduced a flag here is that we would still need to choose which type to use in the PyArrow schema we infer from the Iceberg table schema.

   As we’ve discussed in this [issue](https://github.com/apache/iceberg-python/issues/791), I thought that being intentional about which type we choose to represent our table or RecordBatchReader would make the behavior feel more consistent and less error prone for the end user than the alternative of rendering whatever type PyArrow infers from the Parquet file.

   If this does not sound like a great candidate for an API argument, would having a configuration to control this behavior be a better option? I think that idea came up in a [previous discussion here](https://github.com/apache/iceberg-python/pull/807#pullrequestreview-2119199017).

   Please let me know!

-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
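
For illustration only, here is a minimal sketch of the two options raised in the comment above: an explicit API argument, and a configuration entry used as a fallback. Every name in it (`resolve_large_types`, `arrow_type_name`, the `use-large-types` key) is hypothetical and is not PyIceberg's actual API. The only details taken from the diff are the flag's meaning and its default of `True`. Type names are plain strings so the sketch runs without PyArrow installed.

```python
from typing import Mapping, Optional

# Hypothetical mapping from "small" Arrow type names to their "large" variants.
# Types without a large variant (e.g. int64) are simply not listed.
_SMALL_TO_LARGE = {
    "string": "large_string",
    "binary": "large_binary",
    "list": "large_list",
}


def resolve_large_types(
    arg: Optional[bool],
    config: Mapping[str, str],
    default: bool = True,
) -> bool:
    """Decide whether to use large types.

    Precedence (an assumption of this sketch): explicit argument,
    then the config entry, then the default.
    """
    if arg is not None:
        return arg
    raw = config.get("use-large-types")  # hypothetical config key
    if raw is not None:
        return raw.strip().lower() in ("true", "1", "yes")
    return default


def arrow_type_name(small_name: str, use_large: bool) -> str:
    """Return the type name to put in the inferred schema for one field."""
    if use_large:
        # Fall back to the original name when there is no large variant.
        return _SMALL_TO_LARGE.get(small_name, small_name)
    return small_name
```

With this shape, a caller who passes nothing gets one consistent type per field regardless of what the underlying Parquet files contain, while the argument and the config entry both remain available as overrides.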
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org