jpountz commented on code in PR #14269: URL: https://github.com/apache/lucene/pull/14269#discussion_r1965299369
########## lucene/core/src/java/org/apache/lucene/search/DocIdSetIterator.java: ########## @@ -224,7 +248,7 @@ protected final int slowAdvance(int target) throws IOException { * </pre> * * <p><b>Note</b>: {@code offset} must be less than or equal to the {@link #docID() current doc - * ID}. + * ID}. Behaviour is undefined if this iterator is unpositioned or exhausted. Review Comment: I agree that it should never be called if the iterator is unpositioned, but exhausted should be fine? Should we assert this in AssertingScorer? (or is it already checked?) ########## lucene/core/src/test/org/apache/lucene/search/TestDocIdSetIterator.java: ########## @@ -64,4 +66,61 @@ public void testAdvance() throws Exception { assertEquals(19, disi.nextDoc()); assertEquals(NO_MORE_DOCS, disi.nextDoc()); } + + public void testIntoBitset() throws Exception { + for (int i = 0; i < 10; i++) { + int max = 1 + random().nextInt(500); + DocIdSetIterator expectedDisi; + DocIdSetIterator actualDisi; + if ((i & 1) == 0) { + int min = random().nextInt(max); + expectedDisi = DocIdSetIterator.range(min, max); + actualDisi = DocIdSetIterator.range(min, max); + } else { + expectedDisi = DocIdSetIterator.all(max); + actualDisi = DocIdSetIterator.all(max); + } + FixedBitSet expected = new FixedBitSet(max * 2); + FixedBitSet actual = new FixedBitSet(max * 2); + int doc = -1; + expectedDisi.nextDoc(); + actualDisi.nextDoc(); + while (doc != NO_MORE_DOCS) { + int r = random().nextInt(3); + switch (r) { + case 0 -> { + expectedDisi.nextDoc(); + actualDisi.nextDoc(); + } + case 1 -> { + int jump = expectedDisi.docID() + random().nextInt(5); + expectedDisi.advance(jump); + actualDisi.advance(jump); + } + case 2 -> { + expected.clear(); + actual.clear(); + int upTo = + random().nextBoolean() + ? expectedDisi.docID() - 1 + : expectedDisi.docID() + random().nextInt(5); + int offset = expectedDisi.docID() - random().nextInt(max); + defaultIntoBitset(expectedDisi, upTo, expected, offset); Review Comment: You may as well do something like that to get the default impl: ```java // use the default impl of intoBitSet new FilterDocIdSetIterator(expectedDisi).intoBitSet(upTo, expected, offset); ``` ########## lucene/core/src/java/org/apache/lucene/search/DocIdSetIterator.java: ########## @@ -87,6 +87,18 @@ public int advance(int target) throws IOException { public long cost() { return maxDoc; } + + @Override + public void intoBitSet(int upTo, FixedBitSet bitSet, int offset) { + assert offset <= doc; + if (upTo >= maxDoc) { + bitSet.set(doc - offset, maxDoc - offset); + doc = NO_MORE_DOCS; + } else if (upTo > doc) { + bitSet.set(doc - offset, upTo - offset); + doc = upTo; + } Review Comment: On other impls, I liked to avoid two branches by doing something like below ```java upTo = Math.min(upTo, maxDoc); if (upTo > doc) { bitSet.set(doc - offset, upTo - offset); advance(upTo); } ``` -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org --------------------------------------------------------------------- To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org