Re: [I] S3 compression Issue with Iceberg [iceberg]

2024-09-21 Thread via GitHub
github-actions[bot] commented on issue #8713: URL: https://github.com/apache/iceberg/issues/8713#issuecomment-2365374210 This issue has been automatically marked as stale because it has been open for 180 days with no activity. It will be closed in the next 14 days if no further activity occurs.

Re: [I] S3 compression Issue with Iceberg [iceberg]

2023-10-09 Thread via GitHub
swat1234 commented on issue #8713: URL: https://github.com/apache/iceberg/issues/8713#issuecomment-1752720100 Hi @jhchee, thanks for your response. We are mainly looking to compress with Snappy, but Snappy is increasing the file size.

Re: [I] S3 compression Issue with Iceberg [iceberg]

2023-10-06 Thread via GitHub
jhchee commented on issue #8713: URL: https://github.com/apache/iceberg/issues/8713#issuecomment-1750689951 @swat1234 Can you try something like: The result with UNCOMPRESSED codec looks unusual (and it shouldn't be smaller than snappy). Are you sure that you are using this config in your

Re: [I] S3 compression Issue with Iceberg [iceberg]

2023-10-06 Thread via GitHub
swat1234 commented on issue #8713: URL: https://github.com/apache/iceberg/issues/8713#issuecomment-1750556745 Can someone please advise? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the speci

Re: [I] S3 compression Issue with Iceberg [iceberg]

2023-10-05 Thread via GitHub
swat1234 commented on issue #8713: URL: https://github.com/apache/iceberg/issues/8713#issuecomment-1748235072 I have tried with a much larger dataset. The outcomes are below.
1. File with UNCOMPRESSED codec: 1.8 GB
2. File with gzip codec: 1.8 GB
3. File with snappy codec: 2.8 GB

Re: [I] S3 compression Issue with Iceberg [iceberg]

2023-10-04 Thread via GitHub
amogh-jahagirdar commented on issue #8713: URL: https://github.com/apache/iceberg/issues/8713#issuecomment-1746920616 +1 to @RussellSpitzer's point. These files seem far too small for compression to make a meaningful difference. Compression is most noticeable on significant amoun

Re: [I] S3 compression Issue with Iceberg [iceberg]

2023-10-04 Thread via GitHub
RussellSpitzer commented on issue #8713: URL: https://github.com/apache/iceberg/issues/8713#issuecomment-1746708682 If you are only trying with sub-kilobyte files, the results will be bad. You have some amortized costs there, and most of the file (footers) will not be compressed. Try with lar
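A rough illustration of the fixed-overhead point in the comment above. This sketch uses Python's standard `gzip` module on raw bytes rather than Parquet files, which is an assumption made purely for illustration; Parquet footers and page headers differ in detail. The payloads are made up.

```python
import gzip

# Hypothetical payloads: one tiny record and one large, repetitive one.
small = b"id=1,name=a"
large = b"id=1,name=a\n" * 100_000


def ratio(data: bytes) -> float:
    """Return compressed size divided by original size."""
    return len(gzip.compress(data)) / len(data)


small_ratio = ratio(small)  # above 1.0: the gzip header and trailer outweigh any savings
large_ratio = ratio(large)  # far below 1.0: the fixed costs are amortized

print(f"small: {small_ratio:.2f}, large: {large_ratio:.4f}")
```

The same reasoning suggests judging a codec on files of realistic size, not on sub-kilobyte samples.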

Re: [I] S3 compression Issue with Iceberg [iceberg]

2023-10-04 Thread via GitHub
swat1234 commented on issue #8713: URL: https://github.com/apache/iceberg/issues/8713#issuecomment-1746699625 We tried with only the `write.parquet.compression-codec` parameter set, first to snappy and then to gzip, but it is not working. Instead of shrinking, the file size increases.

Re: [I] S3 compression Issue with Iceberg [iceberg]

2023-10-04 Thread via GitHub
nastra commented on issue #8713: URL: https://github.com/apache/iceberg/issues/8713#issuecomment-1746404632 I would probably start first by reducing the amount of random table properties being set. As I mentioned earlier, the one that matters in your case is `write.parquet.compression-c

Re: [I] S3 compression Issue with Iceberg [iceberg]

2023-10-04 Thread via GitHub
swat1234 commented on issue #8713: URL: https://github.com/apache/iceberg/issues/8713#issuecomment-1746360361 I am trying to reduce the storage space of the files by applying Snappy or Gzip compression. I can see that the metadata is being compressed with gzip, but not the data files. Could you

Re: [I] S3 compression Issue with Iceberg [iceberg]

2023-10-04 Thread via GitHub
nastra commented on issue #8713: URL: https://github.com/apache/iceberg/issues/8713#issuecomment-1746347077 I see that you configured `"write.metadata.compression-codec": "gzip"` but this is for table metadata files being compressed, not individual data files. Also any particular reason to
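A minimal sketch of the distinction drawn in the comment above: one property governs table metadata files, the other governs Parquet data files. It only builds a Spark SQL statement as a string. The table name `db.events` and the helper `set_properties_sql` are hypothetical, and using Spark SQL at all is an assumption, since the thread does not say which engine is in use.

```python
# Property names as quoted in this thread; the codec value is an example.
DATA_CODEC_KEY = "write.parquet.compression-codec"       # Parquet data files
METADATA_CODEC_KEY = "write.metadata.compression-codec"  # table metadata files


def set_properties_sql(table: str, props: dict[str, str]) -> str:
    """Build an ALTER TABLE ... SET TBLPROPERTIES statement (hypothetical helper)."""
    pairs = ", ".join(f"'{key}' = '{value}'" for key, value in props.items())
    return f"ALTER TABLE {table} SET TBLPROPERTIES ({pairs})"


# `db.events` is a hypothetical table name.
sql = set_properties_sql("db.events", {DATA_CODEC_KEY: "snappy"})
print(sql)
```

Setting only the data-file property keeps the experiment focused on the one setting that affects data file size.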