epotyom commented on code in PR #14273:
URL: https://github.com/apache/lucene/pull/14273#discussion_r1967691866
##########
lucene/core/src/java/org/apache/lucene/search/DocIdStream.java:
##########
@@ -34,12 +33,35 @@ protected DocIdStream() {}
    * Iterate over doc IDs contained in this stream in order, calling the given {@link
    * CheckedIntConsumer} on them. This is a terminal operation.
    */
-  public abstract void forEach(CheckedIntConsumer<IOException> consumer) throws IOException;
+  public void forEach(CheckedIntConsumer<IOException> consumer) throws IOException {
+    forEach(DocIdSetIterator.NO_MORE_DOCS, consumer);
+  }
+
+  /**
+   * Iterate over doc IDs contained in this doc ID stream up to the given {@code upTo} exclusive,
+   * calling the given {@link CheckedIntConsumer} on them. It is not possible to iterate these doc
+   * IDs again later on.
+   */
+  public abstract void forEach(int upTo, CheckedIntConsumer<IOException> consumer)
+      throws IOException;
 
   /** Count the number of entries in this stream. This is a terminal operation. */
   public int count() throws IOException {
     int[] count = new int[1];
     forEach(doc -> count[0]++);
     return count[0];
   }
+
+  /**
+   * Count the number of doc IDs in this stream that are below the given {@code upTo}. These doc IDs
+   * may not be consumed again later.
+   */
+  public int count(int upTo) throws IOException {
+    int[] count = new int[1];
+    forEach(upTo, doc -> count[0]++);
+    return count[0];
+  }
+
+  /** Return {@code true} if this stream may have remaining doc IDs. */

Review Comment:
Maybe I'm nitpicking, but is it worth mentioning that it must eventually return `false`? Otherwise an implementation that always does `return true` may sound correct.
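
For illustration, here is a minimal sketch of the contract this comment asks the javadoc to spell out. It is not Lucene code: `ArrayDocIdStream` is a hypothetical array-backed stand-in, it uses `java.util.function.IntConsumer` instead of `CheckedIntConsumer<IOException>`, and `NO_MORE_DOCS` is redeclared locally with the same value as `DocIdSetIterator.NO_MORE_DOCS` (`Integer.MAX_VALUE`). The point is that `mayHaveRemaining()` must turn `false` once the stream has been consumed up to `NO_MORE_DOCS`.

```java
import java.util.function.IntConsumer;

public class BoundedStreamSketch {
  /** Local stand-in for DocIdSetIterator.NO_MORE_DOCS (Integer.MAX_VALUE in Lucene). */
  static final int NO_MORE_DOCS = Integer.MAX_VALUE;

  /** Hypothetical array-backed stream; an assumption for illustration, not Lucene's DocIdStream. */
  static final class ArrayDocIdStream {
    private final int[] docs; // doc IDs in strictly increasing order
    private int pos = 0;      // index of the next doc ID that has not been consumed yet

    ArrayDocIdStream(int... docs) {
      this.docs = docs;
    }

    /** Consume doc IDs strictly below upTo; consumed doc IDs cannot be iterated again. */
    void forEach(int upTo, IntConsumer consumer) {
      while (pos < docs.length && docs[pos] < upTo) {
        consumer.accept(docs[pos++]);
      }
    }

    /** Count doc IDs strictly below upTo by delegating to the bounded forEach. */
    int count(int upTo) {
      int[] count = new int[1];
      forEach(upTo, doc -> count[0]++);
      return count[0];
    }

    /** Must eventually return false: here, as soon as every doc ID has been consumed. */
    boolean mayHaveRemaining() {
      return pos < docs.length;
    }
  }

  public static void main(String[] args) {
    ArrayDocIdStream stream = new ArrayDocIdStream(1, 4, 7, 12);
    System.out.println(stream.count(5));            // 2 (doc IDs 1 and 4)
    System.out.println(stream.mayHaveRemaining());  // true (7 and 12 are left)
    System.out.println(stream.count(NO_MORE_DOCS)); // 2 (doc IDs 7 and 12)
    System.out.println(stream.mayHaveRemaining());  // false
  }
}
```

An implementation whose `mayHaveRemaining()` always returned `true` would satisfy the current wording of the javadoc, yet a caller looping on `while (stream.mayHaveRemaining())` would never terminate.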
##########
lucene/test-framework/src/java/org/apache/lucene/tests/search/AssertingLeafCollector.java:
##########
@@ -119,15 +119,54 @@ public void forEach(CheckedIntConsumer<IOException> consumer) throws IOException
               consumer.accept(doc);
               lastCollected = doc;
             });
-      consumed = true;
+      fullyConsumed = true;
+      assert stream.mayHaveRemaining() == false;
+    }
+
+    @Override
+    public void forEach(int upTo, CheckedIntConsumer<IOException> consumer) throws IOException {
+      assert fullyConsumed == false : "A terminal operation has already been called";
+      stream.forEach(
+          doc -> {
+            assert doc > lastCollected : "Out of order : " + lastCollected + " " + doc;
+            assert doc >= min : "Out of range: " + doc + " < " + min;
+            assert doc < max : "Out of range: " + doc + " >= " + max;
+            consumer.accept(doc);
+            lastCollected = doc;
+          });
+      fullyConsumed = upTo == DocIdSetIterator.NO_MORE_DOCS;
+      if (fullyConsumed) {
+        assert stream.mayHaveRemaining() == false;
+      }
     }
 
     @Override
     public int count() throws IOException {
-      assert consumed == false : "A terminal operation has already been called";
+      assert fullyConsumed == false : "A terminal operation has already been called";
       int count = stream.count();
-      consumed = true;
+      fullyConsumed = true;
+      assert stream.mayHaveRemaining() == false;
       return count;
     }
+
+    @Override
+    public int count(int upTo) throws IOException {
+      assert fullyConsumed == false : "A terminal operation has already been called";
+      int count = stream.count();

Review Comment:
Should it be `count(upTo)`?

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For queries about this service, please contact Infrastructure at: us...@infra.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org