kingwind94 commented on issue #6694: URL: https://github.com/apache/iceberg/issues/6694#issuecomment-1409652067
> > But Flink's newly added position deletes should only apply to the newly added data files, not to historical data files (those being rewritten), so this should not hinder the rewrite operation if it works correctly.
>
> Unfortunately, 6313 cannot completely solve this problem. It is not enough to use the upper and lower bounds of `file_path` to detect whether there is a conflict. Even if metrics are complete, `DeleteFileIndex.canContainPosDeletesForFile()` still cannot completely filter out unmatched delete files.
>
> There is another PR, #5760, trying to fix this.

I think the `file_path` names of data files are ordered as the Flink checkpoint number grows. I am not sure about that, but the `file_path` is not randomly named, so at least it seems that `DeleteFileIndex.canContainPosDeletesForFile()` can filter out newly added position deletes in Flink-to-Iceberg cases. I will keep looking at it.

Thanks again, I will check PR #5760.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org

For queries about this service, please contact Infrastructure at: us...@infra.apache.org