mths1 commented on PR #1232: URL: https://github.com/apache/iceberg-python/pull/1232#issuecomment-2422796878
Hi all, I was trying out 'target-file-size-bytes' lately, and to my understanding, in the PyIceberg version we were using, it somewhat violates the principle of least surprise. As far as I understand, in PyIceberg the value is not the file size on disk but the size _in memory_. A target-file-size-bytes of 512MB resulted for us in files of about 20MB on disk. This caused us a lot of trouble: first, in understanding what was going on, and second, because other tools now pick up the wrong value from the table metadata. If I am not mistaken about this, it would be great to document that behaviour, as it is not quite intuitive.
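To put numbers on the gap reported above, here is a minimal sketch of the arithmetic. It is not PyIceberg code: the helper names `observed_ratio` and `in_memory_target_for` are hypothetical, "MB" is read as MiB, and the scaling step assumes the in-memory to on-disk ratio stays roughly constant, which in practice depends on the data, the encoding, and the compression codec.

```python
MIB = 1024 * 1024


def observed_ratio(in_memory_bytes: int, on_disk_bytes: int) -> float:
    """Ratio of the in-memory size to the on-disk size of one written file."""
    if on_disk_bytes <= 0:
        raise ValueError("on_disk_bytes must be positive")
    return in_memory_bytes / on_disk_bytes


def in_memory_target_for(desired_on_disk_bytes: int, ratio: float) -> int:
    """In-memory target that would yield the desired on-disk size.

    Assumption: the ratio measured on earlier files also holds for new ones.
    """
    return round(desired_on_disk_bytes * ratio)


# Figures from the comment: a 512 MB target produced roughly 20 MB files.
ratio = observed_ratio(512 * MIB, 20 * MIB)

# Hypothetical: the in-memory target needed for files of ~512 MB on disk.
adjusted = in_memory_target_for(512 * MIB, ratio)

print(f"in-memory / on-disk ratio: {ratio:.1f}x")
print(f"in-memory target for ~512 MiB on disk: {adjusted // MIB} MiB")
```

Whether raising the property this way is advisable is a separate question, since other tools read the same value from the table metadata and may interpret it as an on-disk size.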