zhongyujiang commented on code in PR #9384: URL: https://github.com/apache/iceberg/pull/9384#discussion_r1438129707
##########
core/src/main/java/org/apache/iceberg/TableProperties.java:
##########
@@ -334,6 +335,9 @@ private TableProperties() {}
   public static final String MAX_REF_AGE_MS = "history.expire.max-ref-age-ms";
   public static final long MAX_REF_AGE_MS_DEFAULT = Long.MAX_VALUE;
+  public static final String DELETE_GRANULARITY = "write.delete.granularity";

Review Comment:
   Maybe just `write.position-delete.granularity`? I prefer a more precise name that limits the scope of the property.

   A while ago I ran into an issue when adjusting the row-group size of Parquet position delete files. I wanted to change the default row-group size of Parquet position delete files for the tables I manage in order to speed up queries (more details are in issue #9149). However, I found that `write.delete.parquet.row-group-size-bytes`, which controls the row-group size of Parquet position delete files, also controls the row-group size of equality delete files. The row-group sizes suitable for these two types of delete files are obviously not the same. Because we also use equality deletes when the data size is small, I cannot simply set a default value of `write.delete.parquet.row-group-size-bytes` for new tables. I can only tune it per table according to how each table is used, which is inconvenient.

   In fact, I don't think it is appropriate to use one parameter to control the row-group size of both position delete files and equality delete files, so I created #9177 to add a separate parameter for position delete files that only write the `file_path` and `pos` columns.

   Back to this PR: IIUC, if we later add a grouping granularity for equality deletes, position deletes and equality deletes will most likely need different granularities because they have different characteristics. So I think we'd better make the distinction right from the start. What do you think?

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org
For queries about this service, please contact Infrastructure at: us...@infra.apache.org