dramaticlly commented on code in PR #12278:
URL: https://github.com/apache/iceberg/pull/12278#discussion_r1958833411
##########
spark/v3.5/spark/src/main/java/org/apache/iceberg/spark/actions/DeleteOrphanFilesSparkAction.java:
##########
@@ -84,11 +86,11 @@
  * comparing the actual files in that location with content and metadata files referenced by all
  * valid snapshots. The location must be accessible for listing via the Hadoop {@link FileSystem}.
  *
- * <p>By default, this action cleans up the table location returned by {@link Table#location()} and
- * removes unreachable files that are older than 3 days using {@link Table#io()}. The behavior can
- * be modified by passing a custom location to {@link #location} and a custom timestamp to {@link
- * #olderThan(long)}. For example, someone might point this action to the data folder to clean up
- * only orphan data files.
+ * <p>By default, this action cleans up data and metadata directory under the table location
+ * returned by {@link Table#location()} and removes unreachable files that are older than 3 days
+ * using {@link Table#io()}. The behavior can be modified by passing a custom location to {@link
+ * #location} and a custom timestamp to {@link #olderThan(long)}. For example, someone might point
+ * this action to the data folder to clean up only orphan data files.

Review Comment:
   My $0.02: I think this introduces a behaviour change for removing orphan files, so it would be great to send an email to dev@ to highlight the proposal and the changes. Also, I think we can just mention that orphan removal honors `write.data.path` and `write.metadata.path` but allows an action/procedure-level override if `location` is provided. (The default values of `write.data.path` and `write.metadata.path` can change independently, so we don't need to mention Table#location.)

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
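For readers following the review comment above, here is a minimal sketch of the resolution order it suggests: an explicit `location` wins, otherwise `write.data.path` and `write.metadata.path` are honored. `OrphanScanLocations` and `resolve` are hypothetical names used only for illustration and are not part of Iceberg or of this PR. The fallbacks to `<table location>/data` and `<table location>/metadata` are an assumption about the defaults, not something stated in the comment.

```java
import java.util.List;
import java.util.Map;

// Hypothetical helper for illustration only; not part of Iceberg or PR #12278.
final class OrphanScanLocations {
  // Property names as quoted in the review comment.
  static final String WRITE_DATA_PATH = "write.data.path";
  static final String WRITE_METADATA_PATH = "write.metadata.path";

  // Returns the directories an orphan-file scan would list under this sketch.
  static List<String> resolve(
      String tableLocation, Map<String, String> tableProps, String locationOverride) {
    // An action/procedure-level `location` takes precedence over table properties.
    if (locationOverride != null) {
      return List.of(locationOverride);
    }
    String base = stripTrailingSlash(tableLocation);
    // Assumed defaults: data/ and metadata/ directly under the table location.
    String data = tableProps.getOrDefault(WRITE_DATA_PATH, base + "/data");
    String metadata = tableProps.getOrDefault(WRITE_METADATA_PATH, base + "/metadata");
    return List.of(data, metadata);
  }

  private static String stripTrailingSlash(String path) {
    return path.endsWith("/") ? path.substring(0, path.length() - 1) : path;
  }

  public static void main(String[] args) {
    // Hypothetical table whose data path was moved to a different bucket.
    Map<String, String> props = Map.of(WRITE_DATA_PATH, "s3://other-bucket/tbl-data");
    System.out.println(resolve("s3://warehouse/db/tbl", props, null));
    System.out.println(resolve("s3://warehouse/db/tbl", props, "s3://warehouse/db/tbl/data"));
  }
}
```

Under this sketch, the two properties are resolved independently of each other, which is why the Javadoc would not need to mention Table#location at all.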
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For queries about this service, please contact Infrastructure at: us...@infra.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org