pvary commented on code in PR #13355:
URL: https://github.com/apache/iceberg/pull/13355#discussion_r2157413509
##########
docs/docs/spark-procedures.md:
##########
@@ -406,7 +406,7 @@ Iceberg can compact data files in parallel using Spark with the `rewriteDataFile
 | `target-file-size-bytes` | 536870912 (512 MB, default value of `write.target-file-size-bytes` from [table properties](configuration.md#write-properties)) | Target output file size |
 | `min-file-size-bytes` | 75% of target file size | Files under this threshold will be considered for rewriting regardless of any other criteria |
 | `max-file-size-bytes` | 180% of target file size | Files with sizes above this threshold will be considered for rewriting regardless of any other criteria |
-| `min-input-files` | 5 | Any file group exceeding this number of files will be rewritten regardless of other criteria |
+| `min-input-files` | 5 | Any file group (with at least two files) having this number of files or more will be rewritten regardless of other criteria |

Review Comment:
Maybe add this as a separate sentence at the end? That would be easier to understand than trying to decipher what `with at least two files` means in this context.
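
To make the two conditions in the proposed `min-input-files` wording easier to compare, here is a minimal sketch of the rule as a standalone predicate. It only restates the sentence from the diff. The function name `qualifies_by_file_count` and its parameters are hypothetical, it is not Iceberg's implementation, and it ignores the other rewrite criteria such as the file-size thresholds.

```python
def qualifies_by_file_count(num_files: int, min_input_files: int = 5) -> bool:
    """Hypothetical helper: does a file group qualify for rewriting on file count alone?

    Mirrors the proposed doc wording: the group needs at least two files
    AND at least `min_input_files` files. The default of 5 is the documented
    default for `min-input-files`.
    """
    has_at_least_two_files = num_files >= 2
    reaches_min_input_files = num_files >= min_input_files
    return has_at_least_two_files and reaches_min_input_files


if __name__ == "__main__":
    # With the default of 5, the two-file floor never matters.
    print(qualifies_by_file_count(4))
    print(qualifies_by_file_count(5))
    # The floor only shows up when min-input-files is set to 1.
    print(qualifies_by_file_count(1, min_input_files=1))
    print(qualifies_by_file_count(2, min_input_files=1))
```

Under this reading, the "at least two files" condition only changes the outcome when `min-input-files` is set to 1, which is one reason a separate sentence may be clearer than a parenthetical.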