andormarkus commented on PR #1742: URL: https://github.com/apache/iceberg-python/pull/1742#issuecomment-2693512105
Hi @Fokko,

We ended up building a dynamic custom serialize / deserialize function that supports gzip and zlib compression to deal with the size issue. Once the `DataFile` is compressed, it is small enough that we can easily pass it through the queue. We had to build it dynamically so that it copes with changes to the `DataFile` class. If we could add serialize / deserialize methods to `DataFile`, the logic could be much simpler.

The manifest file is implemented in `PyIceberg` as well; however, from what I read in the source code, its read and write methods are deeply nested and not as easy to access as `append_data_file`.

I'm fine with contributing the serialize / deserialize functions if the maintainers agree with this approach.
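To make the idea concrete, here is a minimal sketch of a field-driven serialize / deserialize pair with selectable gzip or zlib compression. It is not the code referred to in the comment and it does not use PyIceberg: `StubDataFile` is a hypothetical stand-in for `DataFile`, and the names `serialize`, `deserialize`, and `_CODECS` are assumptions made for illustration only. It also assumes every field value is JSON-friendly, which is not true of the real class.

```python
import gzip
import json
import zlib
from dataclasses import dataclass, field, fields


# Hypothetical stand-in for PyIceberg's DataFile. The real class has many
# more fields and is not constructed like this.
@dataclass
class StubDataFile:
    file_path: str
    file_format: str
    record_count: int
    file_size_in_bytes: int
    partition: dict = field(default_factory=dict)


# Supported codecs, each mapped to its (compress, decompress) pair.
_CODECS = {
    "gzip": (gzip.compress, gzip.decompress),
    "zlib": (zlib.compress, zlib.decompress),
}


def serialize(obj, codec: str = "gzip") -> bytes:
    """Dump every dataclass field of obj to JSON and compress the result."""
    compress, _ = _CODECS[codec]
    # Field names are read at call time, so fields added to or removed from
    # the class are picked up without changing this function.
    payload = {f.name: getattr(obj, f.name) for f in fields(obj)}
    return compress(json.dumps(payload).encode("utf-8"))


def deserialize(blob: bytes, cls, codec: str = "gzip"):
    """Decompress blob and rebuild an instance of cls from the JSON payload."""
    _, decompress = _CODECS[codec]
    payload = json.loads(decompress(blob).decode("utf-8"))
    # Drop keys that the current class definition no longer has.
    known = {f.name for f in fields(cls)}
    return cls(**{k: v for k, v in payload.items() if k in known})
```

The resulting `bytes` value can be put on a queue and turned back into an object on the consumer side with `deserialize(blob, StubDataFile)`. JSON is used here only to keep the sketch short; it would turn integer dictionary keys into strings, so a real implementation would need a format that preserves the types used by `DataFile`.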