slfan1989 commented on code in PR #669:
URL: https://github.com/apache/iceberg-cpp/pull/669#discussion_r3286959157
##########
src/iceberg/update/expire_snapshots.cc:
##########
@@ -100,13 +103,39 @@ class FileCleanupStrategy {
/// \brief Delete files at the given locations.
void DeleteFiles(const std::unordered_set<std::string>& paths) {
+ if (paths.empty()) {
+ return;
+ }
+
+ std::vector<std::string> path_list(paths.begin(), paths.end());
+ auto parallelism = std::min(kDefaultDeleteParallelism, path_list.size());
+ if (parallelism <= 1) {
+ DeleteFileRange(path_list, 0, path_list.size());
+ return;
+ }
+
+ std::vector<std::thread> workers;
Review Comment:
Thanks for the review and guidance. I agree that spawning `std::thread`
workers directly inside `ExpireSnapshots` is too localized a solution, and
that building on the planned general-purpose thread pool is the better direction.
I'll pause this PR for now and revisit the ExpireSnapshots parallel cleanup
after the thread pool infrastructure lands.
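
For reference, a rough sketch of how the chunked delete could be decoupled from thread creation once an executor is available. Everything here is an assumption for illustration: `Task`, `Executor`, `RunOnThreads`, `DeleteFilesInChunks`, and `CountDeleted` are hypothetical names, not the iceberg-cpp API, and `RunOnThreads` is only a stand-in for the planned pool (it spawns one thread per task rather than reusing workers). The partitioning mirrors the idea in the hunk above: copy the set into a vector, cap the chunk count by the number of paths, and give each worker a contiguous range.

```cpp
#include <algorithm>
#include <atomic>
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <thread>
#include <unordered_set>
#include <utility>
#include <vector>

// Hypothetical executor interface: runs every task and returns only after
// all of them have finished. The real thread pool API may look different.
using Task = std::function<void()>;
using Executor = std::function<void(std::vector<Task>)>;

// Stand-in executor for this sketch: one thread per task, joined before
// returning. A real pool would reuse a fixed set of worker threads.
inline void RunOnThreads(std::vector<Task> tasks) {
  std::vector<std::thread> workers;
  workers.reserve(tasks.size());
  for (auto& task : tasks) {
    workers.emplace_back(std::move(task));
  }
  for (auto& worker : workers) {
    worker.join();
  }
}

// Splits `paths` into at most `parallelism` contiguous chunks and submits one
// task per chunk. `delete_file` must be safe to call from several threads.
// The tasks capture locals by reference, so `executor` must be synchronous.
inline void DeleteFilesInChunks(
    const std::unordered_set<std::string>& paths, std::size_t parallelism,
    const std::function<void(const std::string&)>& delete_file,
    const Executor& executor) {
  assert(parallelism > 0);
  if (paths.empty()) {
    return;
  }

  std::vector<std::string> path_list(paths.begin(), paths.end());
  const std::size_t chunks = std::min(parallelism, path_list.size());
  // Ceiling division so that every path lands in exactly one chunk.
  const std::size_t chunk_size = (path_list.size() + chunks - 1) / chunks;

  std::vector<Task> tasks;
  for (std::size_t begin = 0; begin < path_list.size(); begin += chunk_size) {
    const std::size_t end = std::min(begin + chunk_size, path_list.size());
    tasks.push_back([&path_list, &delete_file, begin, end] {
      for (std::size_t i = begin; i < end; ++i) {
        delete_file(path_list[i]);
      }
    });
  }
  executor(std::move(tasks));
}

// Usage sketch: counts "deletions" instead of touching a file system.
inline std::size_t CountDeleted(const std::unordered_set<std::string>& paths) {
  std::atomic<std::size_t> deleted{0};
  DeleteFilesInChunks(
      paths, 4, [&deleted](const std::string&) { ++deleted; }, RunOnThreads);
  return deleted.load();
}
```

With this shape, `ExpireSnapshots` would only decide how to split the work, and swapping `RunOnThreads` for the shared pool would not touch the cleanup logic.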
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]