dramaticlly commented on code in PR #12278:
URL: https://github.com/apache/iceberg/pull/12278#discussion_r1958833411


##########
spark/v3.5/spark/src/main/java/org/apache/iceberg/spark/actions/DeleteOrphanFilesSparkAction.java:
##########
@@ -84,11 +86,11 @@
  * comparing the actual files in that location with content and metadata files referenced by all
  * valid snapshots. The location must be accessible for listing via the Hadoop {@link FileSystem}.
  *
- * <p>By default, this action cleans up the table location returned by {@link Table#location()} and
- * removes unreachable files that are older than 3 days using {@link Table#io()}. The behavior can
- * be modified by passing a custom location to {@link #location} and a custom timestamp to {@link
- * #olderThan(long)}. For example, someone might point this action to the data folder to clean up
- * only orphan data files.
+ * <p>By default, this action cleans up data and metadata directory under the table location
+ * returned by {@link Table#location()} and removes unreachable files that are older than 3 days
+ * using {@link Table#io()}. The behavior can be modified by passing a custom location to {@link
+ * #location} and a custom timestamp to {@link #olderThan(long)}. For example, someone might point
+ * this action to the data folder to clean up only orphan data files.

Review Comment:
   My $0.02: I think this introduces a behavior change for removing orphan files, so it would be great to send an email to dev@ to highlight the proposal and changes.
   Also, I think we can just mention that orphan removal honors `write.data.path` and `write.metadata.path` but allows an action/procedure-level override if `location` is provided. (The default values of `write.data.path` and `write.metadata.path` can change independently, so we don't need to mention Table#location.)
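
   To make the suggested wording concrete, here is a rough sketch of the resolution order it describes. This is not code from the PR: the class `OrphanScanLocations` and the method `resolve` are hypothetical names, the table properties are modeled as a plain `Map`, and the fallbacks to `<table location>/data` and `<table location>/metadata` are an assumption about the current property defaults.

```java
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of DeleteOrphanFilesSparkAction.
public class OrphanScanLocations {
  // Table property names mentioned in the review comment.
  static final String WRITE_DATA_PATH = "write.data.path";
  static final String WRITE_METADATA_PATH = "write.metadata.path";

  /**
   * Returns the directories to scan for orphan files.
   *
   * @param tableLocation base table location, used only for the assumed defaults
   * @param props table properties (a plain map in this sketch)
   * @param locationOverride action/procedure-level location, or null if not provided
   */
  static List<String> resolve(
      String tableLocation, Map<String, String> props, String locationOverride) {
    // An explicit location passed to the action wins over the table properties.
    if (locationOverride != null) {
      return List.of(locationOverride);
    }

    // Otherwise honor the write paths; the fallbacks below are assumed defaults.
    String dataPath = props.getOrDefault(WRITE_DATA_PATH, tableLocation + "/data");
    String metadataPath = props.getOrDefault(WRITE_METADATA_PATH, tableLocation + "/metadata");
    return List.of(dataPath, metadataPath);
  }

  public static void main(String[] args) {
    String table = "s3://bucket/tbl";

    // No properties and no override: both assumed default directories are scanned.
    System.out.println(resolve(table, Map.of(), null));

    // A custom data path is honored; the metadata path still falls back to its default.
    System.out.println(resolve(table, Map.of(WRITE_DATA_PATH, "s3://other/data"), null));

    // An explicit location restricts the scan to that directory only.
    System.out.println(resolve(table, Map.of(), table + "/data"));
  }
}
```

   With this shape, the Javadoc would only need to name the two properties and the `location` override, since the defaults are resolved in one place.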



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

