ismailsimsek commented on code in PR #11906:
URL: https://github.com/apache/iceberg/pull/11906#discussion_r1907274886
##########
spark/v3.5/spark/src/main/java/org/apache/iceberg/spark/actions/DeleteOrphanFilesSparkAction.java:
##########
@@ -589,21 +620,42 @@ private FileURI toFileURI(I input) {
static class PartitionAwareHiddenPathFilter implements PathFilter, Serializable {
private final Set<String> hiddenPathPartitionNames;
+ private final boolean checkParents;
- PartitionAwareHiddenPathFilter(Set<String> hiddenPathPartitionNames) {
+ PartitionAwareHiddenPathFilter(Set<String> hiddenPathPartitionNames, boolean checkParents) {
this.hiddenPathPartitionNames = hiddenPathPartitionNames;
+ this.checkParents = checkParents;
}
@Override
public boolean accept(Path path) {
+ if (!checkParents) {
+ return doAccept(path);
+ }
+
+ // if any of the parent folders is not accepted then return false
+ return doAccept(path) && !hasHiddenPttParentFolder(path);
+ }
+
+ private boolean doAccept(Path path) {
return isHiddenPartitionPath(path) || HiddenPathFilter.get().accept(path);
}
+ /**
+ * Iterates through the parent folders if any of the parent folders of the given path is a
+ * hidden partition folder.
+ */
+ public boolean hasHiddenPttParentFolder(Path path) {
+ return Stream.iterate(path, Path::getParent)
+ .takeWhile(Objects::nonNull)
+ .anyMatch(parentPath -> !doAccept(parentPath));
+ }
Review Comment:
With this change, the filter checks the parent folders of every file to
ensure that none of them is a hidden folder (other than a hidden partition
folder). This may be slower for large file listings, if performance is a
concern.
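To make the cost concrete, here is a minimal, self-contained sketch of the
per-file parent walk. It is not the PR's code: it uses java.nio.file.Path
instead of Hadoop's Path, `ParentCheckSketch` and `isHiddenName` are
hypothetical names with a simplified hidden-name rule (no partition
exception), and the per-directory cache in `isUnderHiddenDir` is only one
possible mitigation, assumed here for illustration.

```java
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;
import java.util.stream.Stream;

final class ParentCheckSketch {
  // Hypothetical, simplified rule: a name starting with "_" or "." is hidden.
  static boolean isHiddenName(Path p) {
    Path name = p.getFileName(); // null for a root such as "/"
    if (name == null) {
      return false;
    }
    String s = name.toString();
    return s.startsWith("_") || s.startsWith(".");
  }

  // Same shape as the walk in the diff: starts at the path itself and climbs
  // to the root, so each file costs one check per path component.
  static boolean hasHiddenAncestorOrSelf(Path path) {
    return Stream.iterate(path, Path::getParent)
        .takeWhile(Objects::nonNull)
        .anyMatch(ParentCheckSketch::isHiddenName);
  }

  // Assumed mitigation: remember the answer per directory, so files that share
  // a parent do not repeat the walk. Not thread-safe as written.
  private final Map<Path, Boolean> hiddenDirCache = new HashMap<>();

  boolean isUnderHiddenDir(Path dir) {
    if (dir == null) {
      return false;
    }
    Boolean cached = hiddenDirCache.get(dir);
    if (cached != null) {
      return cached;
    }
    boolean hidden = isHiddenName(dir) || isUnderHiddenDir(dir.getParent());
    hiddenDirCache.put(dir, hidden);
    return hidden;
  }

  public static void main(String[] args) {
    Path visible = Paths.get("/wh/tbl/data/f.parquet");
    Path hidden = Paths.get("/wh/tbl/_tmp/data/f.parquet");
    System.out.println(hasHiddenAncestorOrSelf(visible));
    System.out.println(hasHiddenAncestorOrSelf(hidden));

    ParentCheckSketch sketch = new ParentCheckSketch();
    System.out.println(sketch.isUnderHiddenDir(hidden.getParent()));
  }
}
```

Whether caching like this is worth it depends on how many files share the
same parent folders and on how the filter is serialized and invoked, which
this sketch does not model.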
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]