slfan1989 opened a new issue, #668:
URL: https://github.com/apache/iceberg-cpp/issues/668

   ## Background
   
   `ExpireSnapshots` may delete many files during cleanup, including data 
files, manifest files, manifest list files, and statistics files.
   
   Currently, `ExpireSnapshots` already uses `FileIO::DeleteFiles` for grouped 
deletion. However, the cleanup path itself still processes deletion groups 
sequentially. For large tables with many expired files, this may make snapshot 
expiration cleanup slower than necessary.
   
   
   ## Proposal
   
   Add parallel file deletion support inside the `ExpireSnapshots` cleanup 
strategy.
   
   The initial implementation can be intentionally narrow in scope and conservative:
   
   - Keep the public `ExpireSnapshots` API unchanged.
   - Use an internal default parallelism value.
   - Apply parallelism only inside the cleanup deletion path.
   - Preserve existing best-effort cleanup semantics.
   - Continue supporting custom `DeleteWith(...)` callbacks.
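
As a rough illustration of the points above, the cleanup path could hand deletion groups to a small, bounded set of worker threads. This is only a sketch under assumptions: `PathGroup`, `DeleteFn`, and `DeleteGroupsInParallel` are hypothetical names, not existing iceberg-cpp APIs, and the callback stands in for either `FileIO::DeleteFiles` or a user-supplied `DeleteWith(...)` function.

```cpp
#include <algorithm>
#include <atomic>
#include <cstddef>
#include <functional>
#include <string>
#include <thread>
#include <vector>

// Hypothetical stand-ins for illustration; not the iceberg-cpp API.
using PathGroup = std::vector<std::string>;
// Returns true if the whole group was deleted, false otherwise.
using DeleteFn = std::function<bool(const PathGroup&)>;

// Deletes each group using at most `parallelism` worker threads.
// Best-effort: a failed (or throwing) group is counted and skipped,
// and never stops the remaining groups from being processed.
// Returns the number of groups whose deletion failed.
std::size_t DeleteGroupsInParallel(const std::vector<PathGroup>& groups,
                                   const DeleteFn& delete_fn,
                                   std::size_t parallelism) {
  std::atomic<std::size_t> next{0};    // index of the next unclaimed group
  std::atomic<std::size_t> failed{0};  // number of groups that failed

  auto worker = [&]() {
    for (;;) {
      const std::size_t i = next.fetch_add(1);
      if (i >= groups.size()) return;  // no work left
      bool ok = false;
      try {
        ok = delete_fn(groups[i]);
      } catch (...) {
        ok = false;  // swallow the error to keep cleanup best-effort
      }
      if (!ok) failed.fetch_add(1);
    }
  };

  // Use at least one thread, but never more threads than groups.
  const std::size_t n =
      std::min(std::max<std::size_t>(parallelism, 1), groups.size());
  std::vector<std::thread> threads;
  threads.reserve(n);
  for (std::size_t t = 0; t < n; ++t) threads.emplace_back(worker);
  for (auto& th : threads) th.join();
  return failed.load();
}
```

One consequence of this shape is that the callback is invoked from several threads at once, so a custom `DeleteWith(...)` function would need to be thread-safe, or the implementation would need to fall back to sequential execution for it.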
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

