fuzing commented on PR #12254:
URL: https://github.com/apache/iceberg/pull/12254#issuecomment-2825652819
@RussellSpitzer - We've applied this PR and performed some cursory testing
with a MinIO S3-compatible store.
We scattered a number of random files inside and outside the table's
sub-folders (including both the metadata and data folders/sub-folders) and it
appears to be working correctly.
One anomaly (to be reconfirmed during tomorrow's testing run) is that we
didn't see the expected output: the orphan_file_output array was empty,
which made the dry_run uninformative.
We used the following call:
```
CALL system.remove_orphan_files(
  table => 'a.b.c',
  older_than => TIMESTAMP '2025-04-22T00:00:00.000Z',
  dry_run => true);
```
We then repeated the call with dry_run => false to perform the removal. As
expected, extraneous files within s3://bucket/a/b were removed. Those outside
this path were left untouched.
Tomorrow we plan to confirm the existence/non-existence of the output files,
and to inject some orphaned files that pre-date/post-date the older_than
timestamp to make sure that the cutoff is behaving as expected.
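For the follow-up run, a small oracle that predicts which injected files should be reported may help when comparing against the procedure's output. The following is only a rough sketch under our own assumptions, not the PR's implementation: `StoredFile`, `expected_orphans`, and all file names are hypothetical, the table location is assumed to be s3://bucket/a/b, and we assume a file is a candidate only if it sits under that location, is not referenced by table metadata, and has a last-modified time strictly before older_than.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class StoredFile:
    """Hypothetical record for one object found in the bucket listing."""
    path: str
    last_modified: datetime


def expected_orphans(files, table_location, referenced, older_than):
    """Predict which files a run should report as orphans (our assumption).

    A file qualifies when it is under the table location, is not in the
    set of paths referenced by table metadata, and was last modified
    strictly before the older_than cutoff.
    """
    # The trailing slash keeps sibling prefixes such as ".../b-other/" out.
    prefix = table_location.rstrip("/") + "/"
    return sorted(
        f.path
        for f in files
        if f.path.startswith(prefix)
        and f.path not in referenced
        and f.last_modified < older_than
    )


# Same cutoff as in the call above; all paths below are made-up test fixtures.
cutoff = datetime(2025, 4, 22, tzinfo=timezone.utc)
before = datetime(2025, 4, 20, tzinfo=timezone.utc)
after = datetime(2025, 4, 23, tzinfo=timezone.utc)

files = [
    StoredFile("s3://bucket/a/b/data/live.parquet", before),       # referenced
    StoredFile("s3://bucket/a/b/data/stray-old.parquet", before),  # orphan
    StoredFile("s3://bucket/a/b/metadata/stray-new.json", after),  # too new
    StoredFile("s3://bucket/a/outside-old.txt", before),           # outside table
    StoredFile("s3://bucket/a/b-other/stray-old.txt", before),     # sibling prefix
]
referenced = {"s3://bucket/a/b/data/live.parquet"}

print(expected_orphans(files, "s3://bucket/a/b", referenced, cutoff))
# prints ['s3://bucket/a/b/data/stray-old.parquet']
```

Comparing this predicted list against the dry_run output should show quickly whether the empty array was a reporting problem or whether nothing actually qualified.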
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]