slfan1989 opened a new issue, #668: URL: https://github.com/apache/iceberg-cpp/issues/668
## Background

`ExpireSnapshots` may delete many files during cleanup, including data files, manifest files, manifest list files, and statistics files.

Currently, `ExpireSnapshots` already uses `FileIO::DeleteFiles` for grouped deletion. However, the cleanup path itself still processes deletion groups sequentially. For large tables with many expired files, this may make snapshot expiration cleanup slower than necessary.

## Proposal

Add parallel file deletion support inside the `ExpireSnapshots` cleanup strategy.

The initial implementation can be intentionally scoped and conservative:

- Keep the public `ExpireSnapshots` API unchanged.
- Use an internal default parallelism value.
- Apply parallelism only inside the cleanup deletion path.
- Preserve existing best-effort cleanup semantics.
- Continue supporting custom `DeleteWith(...)` callbacks.
