dbtsai commented on code in PR #8299:
URL: https://github.com/apache/iceberg/pull/8299#discussion_r1292030532


##########
core/src/main/java/org/apache/iceberg/TableProperties.java:
##########
@@ -143,6 +143,7 @@ private TableProperties() {}
   public static final String PARQUET_COMPRESSION = "write.parquet.compression-codec";
   public static final String DELETE_PARQUET_COMPRESSION = "write.delete.parquet.compression-codec";
   public static final String PARQUET_COMPRESSION_DEFAULT = "gzip";
+  public static final String PARQUET_COMPRESSION_NEW_TABLE_DEFAULT = "zstd";

Review Comment:
   With the changes suggested in the comments below, we don't need `PARQUET_COMPRESSION_NEW_TABLE_DEFAULT`.



##########
core/src/main/java/org/apache/iceberg/BaseMetastoreCatalog.java:
##########
@@ -149,6 +149,10 @@ public BaseMetastoreCatalogTableBuilder(TableIdentifier identifier, Schema schem
       this.identifier = identifier;
       this.schema = schema;
       this.tableProperties.putAll(tableDefaultProperties());
+      // Explicitly set ZSTD for new tables

Review Comment:
   Maybe `// Explicitly set Parquet compression codec for new tables`



##########
core/src/main/java/org/apache/iceberg/BaseMetastoreCatalog.java:
##########
@@ -149,6 +149,10 @@ public BaseMetastoreCatalogTableBuilder(TableIdentifier identifier, Schema schem
       this.identifier = identifier;
       this.schema = schema;
       this.tableProperties.putAll(tableDefaultProperties());
+      // Explicitly set ZSTD for new tables
+      this.tableProperties.put(
+          TableProperties.PARQUET_COMPRESSION,
+          TableProperties.PARQUET_COMPRESSION_NEW_TABLE_DEFAULT);

Review Comment:
   ```java
   this.tableProperties.put(
       TableProperties.PARQUET_COMPRESSION,
       TableProperties.PARQUET_COMPRESSION_DEFAULT);
   ```
   so we don't need to introduce a new constant, `PARQUET_COMPRESSION_NEW_TABLE_DEFAULT`.
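
A minimal sketch of what this suggestion could amount to, using plain `java.util` maps rather than Iceberg classes. It assumes `PARQUET_COMPRESSION_DEFAULT` itself is changed to `"zstd"`, which the comment implies but does not state. The class `NewTableDefaults` and the method `newTableProperties` are hypothetical names used only for illustration.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the table builder; this is not Iceberg code.
final class NewTableDefaults {
  static final String PARQUET_COMPRESSION = "write.parquet.compression-codec";
  // Assumption: the existing default constant is changed from "gzip" to "zstd".
  static final String PARQUET_COMPRESSION_DEFAULT = "zstd";

  // Mirrors the builder in the diff: copy the catalog defaults first,
  // then set the Parquet compression codec explicitly for the new table.
  static Map<String, String> newTableProperties(Map<String, String> catalogDefaults) {
    Map<String, String> props = new HashMap<>(catalogDefaults);
    props.put(PARQUET_COMPRESSION, PARQUET_COMPRESSION_DEFAULT);
    return props;
  }

  public static void main(String[] args) {
    System.out.println(newTableProperties(Map.of("write.format.default", "parquet")));
  }
}
```

With a single constant, new tables always record the codec they were created with, so no second "new table" default is needed.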



##########
core/src/main/java/org/apache/iceberg/SnapshotProducer.java:
##########
@@ -381,6 +382,15 @@ public void commit() {
                   update.setBranchSnapshot(newSnapshot, targetBranch);
                 }
 
+                // Explicitly set GZIP for existing tables if unset
+                if (base.properties().get(TableProperties.PARQUET_COMPRESSION) == null) {
+                  Map<String, String> newProperties = Maps.newHashMap(base.properties());
+                  newProperties.put(
+                      TableProperties.PARQUET_COMPRESSION,
+                      TableProperties.PARQUET_COMPRESSION_DEFAULT);

Review Comment:
   ```java
   if (base.properties().get(TableProperties.PARQUET_COMPRESSION) == null) {
     Map<String, String> newProperties = Maps.newHashMap(base.properties());
     newProperties.put(TableProperties.PARQUET_COMPRESSION, "gzip");
   ```
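
The backfill for existing tables can be sketched in isolation as follows. This is an illustration on plain maps, not the `SnapshotProducer` code: the class `ExistingTableBackfill` and the method `withExplicitCodec` are hypothetical, and the literal `"gzip"` is assumed to stand for the previous default as in the suggestion above.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper; this is not Iceberg code.
final class ExistingTableBackfill {
  static final String PARQUET_COMPRESSION = "write.parquet.compression-codec";

  // Returns the properties unchanged when a codec is already set;
  // otherwise returns a copy with the previous default, "gzip", made explicit.
  static Map<String, String> withExplicitCodec(Map<String, String> current) {
    if (current.get(PARQUET_COMPRESSION) != null) {
      return current;
    }
    Map<String, String> updated = new HashMap<>(current);
    updated.put(PARQUET_COMPRESSION, "gzip");
    return updated;
  }

  public static void main(String[] args) {
    // Codec missing: "gzip" is added.
    System.out.println(withExplicitCodec(Map.of()));
    // Codec present: the map is returned as-is.
    System.out.println(withExplicitCodec(Map.of(PARQUET_COMPRESSION, "zstd")));
  }
}
```

Hardcoding `"gzip"` here keeps the behavior of existing tables fixed even if the value of the default constant changes later.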



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

