jpountz commented on code in PR #14529:
URL: https://github.com/apache/lucene/pull/14529#discussion_r2057729121


##########
lucene/core/src/java/org/apache/lucene/codecs/lucene90/IndexedDISI.java:
##########
@@ -693,6 +727,44 @@ boolean advanceExactWithinBlock(IndexedDISI disi, int 
target) throws IOException
         disi.index = disi.numberOfOnes - Long.bitCount(leftBits);
         return (leftBits & 1L) != 0;
       }
+
+      @Override
+      boolean intoBitSetWithinBlock(IndexedDISI disi, int upTo, FixedBitSet 
bitSet, int offset)
+          throws IOException {
+        if (disi.doc >= upTo) {
+          advanceWithinBlock(disi, disi.doc);

Review Comment:
   Do we really need to call `advanceWithinBlock`? If `disi.doc` is already `>= 
upTo`, there shouldn't be anything to do?



##########
lucene/core/src/test/org/apache/lucene/codecs/lucene90/TestIndexedDISI.java:
##########
@@ -555,6 +565,47 @@ private void assertAdvanceExactRandomized(
     }
   }
 
+  private void assertIntoBitsetRandomized(
+      IndexedDISI disi, BitSetIterator disi2, int disi2length, int step) 
throws IOException {
+    int index = -1;
+    FixedBitSet set1 = new FixedBitSet(step);
+    FixedBitSet set2 = new FixedBitSet(step);
+
+    for (int upTo = 0; upTo < disi2length; ) {
+      int lastUpTo = upTo;
+      upTo += TestUtil.nextInt(random(), 0, step);
+      int offset = TestUtil.nextInt(random(), lastUpTo, upTo);
+
+      if (disi.docID() < offset) {
+        disi.advance(offset);
+      }
+      int doc = disi2.docID();
+      while (doc < offset) {
+        index++;
+        doc = disi2.nextDoc();
+      }
+      while (doc < upTo) {
+        set2.set(doc - offset);
+        index++;
+        doc = disi2.nextDoc();
+      }
+
+      disi.intoBitSet(upTo, set1, offset);
+      assertEquals(index, disi.index());
+      BitSetIterator expected = new BitSetIterator(set2, set2.cardinality());
+      BitSetIterator actual = new BitSetIterator(set1, set1.cardinality());
+      for (int expectedDoc = expected.nextDoc();
+          expectedDoc != DocIdSetIterator.NO_MORE_DOCS;
+          expectedDoc = expected.nextDoc()) {
+        int actualDoc = actual.nextDoc();
+        assertEquals(expectedDoc + offset, actualDoc + offset); // plus offset 
for better message.
+      }
+      assertEquals(DocIdSetIterator.NO_MORE_DOCS, actual.nextDoc());
+      set1.clear();
+      set2.clear();

Review Comment:
   Can you also check that nextDoc() returns the expected value after 
`intoBitSet` returns? To make sure that all required state is set correctly?



##########
lucene/core/src/java/org/apache/lucene/codecs/lucene90/IndexedDISI.java:
##########
@@ -693,6 +727,44 @@ boolean advanceExactWithinBlock(IndexedDISI disi, int 
target) throws IOException
         disi.index = disi.numberOfOnes - Long.bitCount(leftBits);
         return (leftBits & 1L) != 0;
       }
+
+      @Override
+      boolean intoBitSetWithinBlock(IndexedDISI disi, int upTo, FixedBitSet 
bitSet, int offset)
+          throws IOException {
+        if (disi.doc >= upTo) {
+          advanceWithinBlock(disi, disi.doc);
+          return true;
+        }
+
+        int sourceFrom = disi.doc & 0xFFFF;
+        int sourceStartWord = sourceFrom >>> 6;
+        int sourceTo = Math.min(upTo - disi.block, BLOCK_SIZE);
+        int numWords = FixedBitSet.bits2words(sourceTo) - sourceStartWord;
+
+        if (disi.bitSet == null || disi.bitSet.getBits().length < numWords) {

Review Comment:
   Should we always create a bit set of size `BLOCK_SIZE` when it's null for 
simplicity? This is what users will get in the worst-case scenario anyway, and 
it's not crazy (65k bits = 8kB)?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org

Reply via email to