jpountz commented on code in PR #14529: URL: https://github.com/apache/lucene/pull/14529#discussion_r2057662695
########## lucene/core/src/java/org/apache/lucene/codecs/lucene90/IndexedDISI.java: ########## @@ -491,6 +492,14 @@ public int advance(int target) throws IOException { return doc; } + @Override + public void intoBitSet(int upTo, FixedBitSet bitSet, int offset) throws IOException { + assert doc >= offset; + while (method.intoBitsetWithinBlock(this, upTo, bitSet, offset) == false) { + readBlockHeader(); Review Comment: I wonder if we should call something like below after reading the block header (similar to what `advance(target)` does): ``` boolean found = method.advanceWithinBlock(this, block); assert found; ``` This would guarantee that `disi.doc` is always a doc ID of the block when `intoBitsetWithinBlock` is called, which could help simplify some implementations of `intoBitsetWithinBlock`? (I'm looking at `DENSE` in particular) ########## lucene/core/src/java/org/apache/lucene/codecs/lucene90/IndexedDISI.java: ########## @@ -625,6 +634,29 @@ boolean advanceExactWithinBlock(IndexedDISI disi, int target) throws IOException disi.exists = false; return false; } + + @Override + boolean intoBitsetWithinBlock(IndexedDISI disi, int upTo, FixedBitSet bitSet, int offset) + throws IOException { + if (disi.doc >= upTo) { + return true; + } + bitSet.set(disi.doc - offset); + + for (int i = disi.index + 1, to = disi.nextBlockIndex; i <= to; i++) { Review Comment: I'm still a bit confused, why is `disi.nextBlockIndex` included in the loop when it's excluded in `advanceWithinBlock`? ########## lucene/core/src/java/org/apache/lucene/codecs/lucene90/IndexedDISI.java: ########## @@ -720,6 +805,15 @@ boolean advanceExactWithinBlock(IndexedDISI disi, int target) { * return whether this document exists. */ abstract boolean advanceExactWithinBlock(IndexedDISI disi, int target) throws IOException; + + /** + * Similar to {@link DocIdSetIterator#intoBitSet}, load docs in this block into a bitset. This + * mehtod return true if there are remaining docs (gte upTo) in the block, otherwise false. When + * false return, fp of disi#slice is at blockEnd and disi#index is correct but other status vars + * are undefined. Caller should decode the header of next block by {@link #readBlockHeader()}. + */ + abstract boolean intoBitsetWithinBlock( Review Comment: Nit: for consistency with how it's called on `DocIdSetIterator`. ```suggestion abstract boolean intoBitSetWithinBlock( ``` -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org --------------------------------------------------------------------- To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org