aokolnychyi commented on code in PR #9117: URL: https://github.com/apache/iceberg/pull/9117#discussion_r1399860140
##########
data/src/main/java/org/apache/iceberg/data/DeleteFilter.java:
##########
@@ -245,18 +242,9 @@ private CloseableIterable<T> applyPosDeletes(CloseableIterable<T> records) {
     List<CloseableIterable<Record>> deletes = Lists.transform(posDeletes, this::openPosDeletes);

-    // if there are fewer deletes than a reasonable number to keep in memory, use a set

Review Comment:
   When this logic was added a few years ago, we added position deletes into a set. We have been using bitmaps for a while now. In fact, vectorized reads always build bitmaps and have no threshold on the number of deletes. This has proven to work really well. Position deletes represented as bitmaps should always fit in memory. Position deletes compress really well both on disk and in memory. We have seen this 100K threshold causing degradation in jobs without any good reason.
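
To illustrate the point about bitmaps versus sets, here is a minimal, self-contained sketch. It is not the code in `DeleteFilter` or Iceberg's delete index: `PosDeleteSketch`, `toBitmap`, `toSet`, and `livePositions` are hypothetical names, and `java.util.BitSet` stands in for the compressed bitmap used in practice. `BitSet` is uncompressed and only accepts `int` indexes, so the sketch assumes positions below `Integer.MAX_VALUE`.

```java
import java.util.Arrays;
import java.util.BitSet;
import java.util.HashSet;
import java.util.Set;
import java.util.stream.LongStream;

// Hypothetical illustration only; not part of Iceberg.
final class PosDeleteSketch {

  // Marks each deleted row position as one bit.
  // Assumption: positions fit in an int, because BitSet is int-indexed.
  static BitSet toBitmap(long[] deletedPositions) {
    BitSet bitmap = new BitSet();
    for (long pos : deletedPositions) {
      bitmap.set(Math.toIntExact(pos));
    }
    return bitmap;
  }

  // The older style: every position becomes a boxed Long in a hash set.
  static Set<Long> toSet(long[] deletedPositions) {
    Set<Long> set = new HashSet<>();
    for (long pos : deletedPositions) {
      set.add(pos);
    }
    return set;
  }

  // Returns the positions in [0, rowCount) that are not marked as deleted.
  static long[] livePositions(long rowCount, BitSet deleted) {
    return LongStream.range(0, rowCount)
        .filter(pos -> !deleted.get(Math.toIntExact(pos)))
        .toArray();
  }

  public static void main(String[] args) {
    long[] deletes = {1, 3, 4};
    BitSet bitmap = toBitmap(deletes);
    System.out.println(Arrays.toString(livePositions(6, bitmap))); // prints [0, 2, 5]
  }
}
```

In this sketch the bitmap costs about one bit per position up to the highest deleted position, while the set allocates an object per entry. A compressed bitmap additionally handles sparse, very large positions, which plain `BitSet` does not do well.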