swat1234 opened a new issue, #8713:
URL: https://github.com/apache/iceberg/issues/8713

   Iceberg tables are not compressing Parquet files in S3. When the table 
properties below are used for compression, the file size increases 
compared with the uncompressed file. Can someone please assist with this?
   
   1. File with UNCOMPRESSED codec: 682 bytes
   
   00000-0-0129ba78-17f6-466f-b57b-695c678d64d5-00001.parquet
   
   },
         "properties" : {
           "codec" : "UNCOMPRESSED",
   
   -------------------------------
   2. File with GZIP codec: 733 bytes
   
   00000-0-e6f22c0e-2e16-43aa-8a5f-efabee995876-00001.parquet
   
   "properties" : {
           "codec" : "GZIP",
   
   -------------------------------
   3. File with SNAPPY codec: 686 bytes
   
   00000-0-36fd4aad-8c38-40f5-8241-78ffe4f0a032-00001.parquet
   
    "codec" : "SNAPPY",
           "path" : {
   
   --------------------------------------------------------------
   Table Properties:
   
       "parquet.compression": "SNAPPY"
       "read.parquet.vectorization.batch-size": "5000"
       "read.split.target-size": "134217728"
       "read.parquet.vectorization.enabled": "true"
       "write.parquet.page-size-bytes": "1048576"
       "write.parquet.row-group-size-bytes": "134217728"
       "write_compression": "SNAPPY"
       "write.parquet.compression-codec": "snappy"
       "write.metadata.metrics.max-inferred-column-defaults": "100"
       "write.parquet.compression-level": "4"
       "write.target-file-size-bytes": "536870912"
       "write.delete.target-file-size-bytes": "67108864"
       "write.parquet.page-row-limit": "20000"
       "write.format.default": "parquet"
       "write.metadata.compression-codec": "gzip"
       "write.compression": "SNAPPY"
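
To put the three sizes above side by side, here is a minimal Python sketch. It uses only the byte counts reported in this issue; the names `reported_sizes` and `overhead` are illustrative and are not part of Iceberg or Parquet.

```python
# File sizes in bytes, exactly as reported in this issue.
reported_sizes = {
    "UNCOMPRESSED": 682,
    "GZIP": 733,
    "SNAPPY": 686,
}


def overhead(codec, sizes=reported_sizes, baseline="UNCOMPRESSED"):
    """Return (extra bytes, extra percent) of `codec` relative to the baseline file."""
    delta = sizes[codec] - sizes[baseline]
    percent = 100.0 * delta / sizes[baseline]
    return delta, percent


if __name__ == "__main__":
    for codec in reported_sizes:
        delta, percent = overhead(codec)
        print(f"{codec:<12} {reported_sizes[codec]:>4} bytes  {delta:+d} bytes ({percent:+.2f}%)")
```

Both compressed files come out larger than the uncompressed one: GZIP by 51 bytes and SNAPPY by 4 bytes.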
   
   
   Thanks in advance!!


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

