fuzing commented on PR #12254:
URL: https://github.com/apache/iceberg/pull/12254#issuecomment-2829035039

   @RussellSpitzer - we ran further tests today and are pleased to report that,
for S3, this PR appears to work correctly, with the following caveat:
   - The deleted files (or the proposed deletions in dry_run mode) do not
appear to be populated in the procedure response payload.
   
   Our tests involved injecting random files both inside and outside the
Iceberg table's S3 root prefix.  We gave the injected files various timestamps
and observed that the faux-orphaned files under the table root prefix that were
older than our "older_than" setting were removed correctly, while newer ones
were left intact.  No files outside the table root prefix were removed.
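
   The behaviour we checked for can be sketched roughly as follows. This is
only an illustration of the expected selection rule, not the PR's
implementation: `StoredFile`, `select_orphans`, the bucket paths and the dates
are all hypothetical names and values made up for the example.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class StoredFile:
    # Hypothetical record for one object returned by a storage listing.
    path: str
    last_modified: datetime


def select_orphans(files, table_root, referenced_paths, older_than):
    """Return the paths we would expect an orphan cleanup to remove.

    A file qualifies only if it is under the table root prefix, is not
    referenced by table metadata, and was last modified before the cutoff.
    """
    # Trailing slash so that ".../tbl_backup/" is not treated as under ".../tbl".
    root = table_root.rstrip("/") + "/"
    return sorted(
        f.path
        for f in files
        if f.path.startswith(root)
        and f.path not in referenced_paths
        and f.last_modified < older_than
    )


# Illustrative data only (assumed, not taken from our actual test run).
now = datetime(2025, 1, 10, tzinfo=timezone.utc)
cutoff = now - timedelta(days=3)
table_root = "s3://bucket/warehouse/db/tbl"
referenced = {"s3://bucket/warehouse/db/tbl/data/live.parquet"}
files = [
    StoredFile("s3://bucket/warehouse/db/tbl/data/old-orphan.parquet", now - timedelta(days=10)),
    StoredFile("s3://bucket/warehouse/db/tbl/data/new-orphan.parquet", now - timedelta(days=1)),
    StoredFile("s3://bucket/warehouse/db/tbl/data/live.parquet", now - timedelta(days=10)),
    StoredFile("s3://bucket/other/old-stray.parquet", now - timedelta(days=10)),
    StoredFile("s3://bucket/warehouse/db/tbl_backup/old.parquet", now - timedelta(days=10)),
]

expected_removed = select_orphans(files, table_root, referenced, cutoff)
print(expected_removed)
# prints ['s3://bucket/warehouse/db/tbl/data/old-orphan.parquet']
```

   Comparing a list like `expected_removed` against what is actually left in
the bucket is how the removals can be verified even when the procedure itself
returns no file list.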
   
   We're fairly sure there is a bug in the file enumeration used for reporting
(both the dry_run output and the report of actual removals): the file list is
not returned in the response payload through, in our case, the ODBC driver,
whereas other query results, such as SQL SELECT results, are.
   
   If the above is deemed to be a bug, it should be fixed, at which point this
PR should be ready to merge.  We're happy to rerun tests if needed.
   
   Note that we did not test the fallback path; unfortunately, we aren't
configured for such a setup (we're Iceberg => S3 only).


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

