fuzing commented on PR #12254: URL: https://github.com/apache/iceberg/pull/12254#issuecomment-2830301428
> > @RussellSpitzer - we ran further tests today and are pleased to report that for S3 this PR appears to work correctly with the following caveat:
> >
> > * The files deleted (or proposed deletes in dry_run mode) don't seem to be populated in the procedure response payload.
> >
> > Our tests involved injecting random files within and outside the Iceberg table S3 root prefix. We injected files with various timestamps and noted that these faux-orphaned files under the root table prefix older than our "older_than" setting were removed correctly, while those younger were left intact. No files outside our root table prefix were removed.
> >
> > We're pretty sure there's a bug in file enumeration for the purposes of reporting (i.e. dry_run and actual removal reporting), in that they are not returned as part of the response payload through (in our case) the ODBC driver (whereas other query results, such as SQL SELECT results are).
> >
> > If the above is deemed to be a bug, then this should be fixed, at which point this P/R should be ready to merge. We're happy to rerun tests if needed.
> >
> > Note that we did not test the fallback path (we aren't configured for such a setup unfortunately - we're Iceberg => S3 only).
>
> Thanks a lot for running those tests and spotting the issue. I'll dive into the file enumeration problem and get it resolved. Really appreciate your help with this!

No worries at all - thanks for contributing to this great project.

Don't forget that the lack of response files "might" have some connection to the ODBC driver we're using (most likely not, but just keep that in mind in the event that you're sure that the response is being sent back).

Cheers, and thanks again.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For queries about this service, please contact Infrastructure at: us...@infra.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org
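
For anyone wanting to rerun the test described in the quoted report, here is a minimal sketch of the selection rule that was observed: a file is a removal candidate only if it sits under the table root prefix, is not referenced by the table, and is older than the "older_than" cutoff. This is a plain-Python model of the expected outcome, not Iceberg's implementation; `StoredFile`, `find_orphan_candidates`, the bucket paths, and the dates are all hypothetical names and values chosen for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class StoredFile:
    """Hypothetical stand-in for one object in the S3 listing."""
    path: str
    last_modified: datetime


def find_orphan_candidates(files, table_root, referenced_paths, older_than):
    """Return the paths the observed behaviour would treat as removable.

    A path qualifies only if it is under the table root prefix, is not
    referenced by the table, and was last modified before the cutoff.
    """
    # Normalise to a trailing slash so ".../tbl" does not also match ".../tbl2".
    root = table_root.rstrip("/") + "/"
    return sorted(
        f.path
        for f in files
        if f.path.startswith(root)
        and f.path not in referenced_paths
        and f.last_modified < older_than
    )


# Illustrative data only: one live file, one old orphan, one recent orphan,
# and one old stray file outside the table root.
now = datetime(2025, 1, 10, tzinfo=timezone.utc)
older_than = now - timedelta(days=3)
table_root = "s3://bucket/warehouse/db/tbl"

files = [
    StoredFile(f"{table_root}/data/live.parquet", now - timedelta(days=10)),
    StoredFile(f"{table_root}/data/old-orphan.parquet", now - timedelta(days=10)),
    StoredFile(f"{table_root}/data/new-orphan.parquet", now - timedelta(hours=1)),
    StoredFile("s3://bucket/elsewhere/old-stray.parquet", now - timedelta(days=10)),
]
referenced = {f"{table_root}/data/live.parquet"}

# Equivalent to a dry run: compute the candidates without deleting anything.
candidates = find_orphan_candidates(files, table_root, referenced, older_than)
print(candidates)
```

Comparing a list like `candidates` against whatever the procedure returns (in dry_run mode or after a real run) is one way to check the reporting path independently of the client driver.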