aokolnychyi commented on code in PR #9117:
URL: https://github.com/apache/iceberg/pull/9117#discussion_r1399860140


##########
data/src/main/java/org/apache/iceberg/data/DeleteFilter.java:
##########
@@ -245,18 +242,9 @@ private CloseableIterable<T> applyPosDeletes(CloseableIterable<T> records) {
 
     List<CloseableIterable<Record>> deletes = Lists.transform(posDeletes, this::openPosDeletes);
 
-    // if there are fewer deletes than a reasonable number to keep in memory, use a set

Review Comment:
   When this logic was added a few years ago, we added position deletes into a 
set. We have been using bitmaps for a while now. In fact, vectorized reads 
always build bitmaps and have no threshold on the number of deletes. This has 
proven to work really well. Position deletes represented as bitmaps should 
always fit in memory.
   
   Position deletes compress really well both on disk and in memory. We have 
seen this 100K threshold degrade jobs for no good reason.
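
To make the bitmap idea concrete, here is a minimal sketch of filtering rows by position against a bitmap of deleted positions. It is not the code in `DeleteFilter`: the class and method names are hypothetical, and `java.util.BitSet` is used only as a stand-in because it ships with the JDK. Unlike the compressed bitmaps discussed above, `BitSet` is uncompressed and int-indexed.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;

// Illustrative only: hypothetical names, and BitSet stands in for a compressed bitmap.
public class PosDeleteBitmapSketch {

  // Collect deleted row positions into a bitmap, one bit per position.
  static BitSet toBitmap(long[] deletedPositions) {
    BitSet bitmap = new BitSet();
    for (long pos : deletedPositions) {
      // BitSet is int-indexed, so this sketch rejects positions above Integer.MAX_VALUE.
      bitmap.set(Math.toIntExact(pos));
    }
    return bitmap;
  }

  // Return the positions in [0, rowCount) that are not marked as deleted.
  static List<Long> liveRows(long rowCount, BitSet deleted) {
    List<Long> live = new ArrayList<>();
    for (long pos = 0; pos < rowCount; pos++) {
      if (!deleted.get(Math.toIntExact(pos))) {
        live.add(pos);
      }
    }
    return live;
  }

  public static void main(String[] args) {
    BitSet deleted = toBitmap(new long[] {1, 3, 4});
    System.out.println(liveRows(6, deleted)); // prints [0, 2, 5]
  }
}
```

The membership check is a single bit lookup per row, and the structure needs no threshold on the number of deletes to decide between two code paths, which is the simplification argued for above.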



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

