ismailsimsek commented on code in PR #11906:
URL: https://github.com/apache/iceberg/pull/11906#discussion_r1907274886


##########
spark/v3.5/spark/src/main/java/org/apache/iceberg/spark/actions/DeleteOrphanFilesSparkAction.java:
##########
@@ -589,21 +620,42 @@ private FileURI toFileURI(I input) {
   static class PartitionAwareHiddenPathFilter implements PathFilter, Serializable {
 
     private final Set<String> hiddenPathPartitionNames;
+    private final boolean checkParents;
 
-    PartitionAwareHiddenPathFilter(Set<String> hiddenPathPartitionNames) {
+    PartitionAwareHiddenPathFilter(Set<String> hiddenPathPartitionNames, boolean checkParents) {
       this.hiddenPathPartitionNames = hiddenPathPartitionNames;
+      this.checkParents = checkParents;
     }
 
     @Override
     public boolean accept(Path path) {
+      if (!checkParents) {
+        return doAccept(path);
+      }
+
+      // if any of the parent folders is not accepted then return false
+      return doAccept(path) && !hasHiddenPttParentFolder(path);
+    }
+
+    private boolean doAccept(Path path) {
       return isHiddenPartitionPath(path) || HiddenPathFilter.get().accept(path);
     }
 
+    /**
+     * Iterates through the parent folders if any of the parent folders of the given path is a
+     * hidden partition folder.
+     */
+    public boolean hasHiddenPttParentFolder(Path path) {
+      return Stream.iterate(path, Path::getParent)
+          .takeWhile(Objects::nonNull)
+          .anyMatch(parentPath -> !doAccept(parentPath));
+    }

Review Comment:
   Now it checks the parent folders of every file to ensure that none of them 
is a hidden folder that is not a hidden-partition folder. This might be slower 
for large file listings, if performance is a concern.
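
   If the per-file walk does turn out to be a bottleneck, one possible 
mitigation is to remember the verdict for each directory, since files in the 
same partition share all of their parent folders. The sketch below is only an 
illustration and not part of this PR: `CachedParentCheck`, `acceptName` and 
`hasHiddenDir` are hypothetical names, it uses `java.nio.file.Path` instead of 
Hadoop's `Path` so it runs on its own, and the name check is a simplified 
stand-in for `doAccept`.

```java
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

/** Hypothetical sketch: memoizes the parent-folder check per directory. */
final class CachedParentCheck {
  // Assumed to hold prefixes such as "_c1=" for partition fields whose names look hidden.
  private final Set<String> hiddenPathPartitionNames;

  // directory -> true if the directory itself or any of its ancestors is rejected
  private final Map<Path, Boolean> hiddenDirCache = new HashMap<>();

  CachedParentCheck(Set<String> hiddenPathPartitionNames) {
    this.hiddenPathPartitionNames = hiddenPathPartitionNames;
  }

  /** Simplified stand-in for doAccept: looks only at the last path segment. */
  private boolean acceptName(Path path) {
    Path fileName = path.getFileName();
    if (fileName == null) {
      return true; // the root has no name and is never treated as hidden
    }
    String name = fileName.toString();
    boolean looksHidden = name.startsWith("_") || name.startsWith(".");
    boolean isHiddenPartition =
        hiddenPathPartitionNames.stream().anyMatch(name::startsWith);
    return !looksHidden || isHiddenPartition;
  }

  /** Walks up the tree once per distinct directory and caches the result. */
  private boolean hasHiddenDir(Path dir) {
    if (dir == null) {
      return false; // walked past the root without finding a rejected folder
    }
    Boolean cached = hiddenDirCache.get(dir);
    if (cached != null) {
      return cached;
    }
    boolean result = !acceptName(dir) || hasHiddenDir(dir.getParent());
    hiddenDirCache.put(dir, result);
    return result;
  }

  /** Accepts a file only if its own name and all of its parent folders are accepted. */
  boolean accept(Path file) {
    return acceptName(file) && !hasHiddenDir(file.getParent());
  }
}
```

   With this shape, the first file under `/tbl/data/_c1=a/` pays for the walk 
up to the root and every later file in the same folder costs one map lookup. 
The cache is a plain `HashMap`, so it is not thread-safe, and in a 
`Serializable` filter it would probably need to be `transient` and rebuilt on 
each executor.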



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org
