jpountz commented on code in PR #14273: URL: https://github.com/apache/lucene/pull/14273#discussion_r2014552652
##########
lucene/core/src/java/org/apache/lucene/search/DocIdStream.java:
##########
@@ -34,12 +33,35 @@ protected DocIdStream() {}
    * Iterate over doc IDs contained in this stream in order, calling the given {@link
    * CheckedIntConsumer} on them. This is a terminal operation.
    */
-  public abstract void forEach(CheckedIntConsumer<IOException> consumer) throws IOException;
+  public void forEach(CheckedIntConsumer<IOException> consumer) throws IOException {
+    forEach(DocIdSetIterator.NO_MORE_DOCS, consumer);
+  }
+
+  /**
+   * Iterate over doc IDs contained in this doc ID stream up to the given {@code upTo} exclusive,
+   * calling the given {@link CheckedIntConsumer} on them. It is not possible to iterate these doc
+   * IDs again later on.
+   */
+  public abstract void forEach(int upTo, CheckedIntConsumer<IOException> consumer)
+      throws IOException;
 
   /** Count the number of entries in this stream. This is a terminal operation. */
   public int count() throws IOException {
     int[] count = new int[1];
     forEach(doc -> count[0]++);
     return count[0];
   }
+
+  /**
+   * Count the number of doc IDs in this stream that are below the given {@code upTo}. These doc IDs
+   * may not be consumed again later.
+   */
+  public int count(int upTo) throws IOException {

Review Comment:
   > Are you thinking of peeking into these bit sets to provide cardinality up to the specific doc? (Or maybe I'm missing something?)

   Yes exactly. I have something locally already, I need to beef up testing a bit.

   The bitset-based `DocIdStream` is one interesting implementation, the other interesting implementation is the one that is backed by a range of doc IDs that all match. It is internally used by queries that fully match a segment (e.g. `PointRangeQuery` when all the segment's values are contained in the query range, or `MatchAllDocsQuery`) or queries on fields that are part of (or correlate with) the index sort fields. See #14312 for reference.
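To illustrate why the range-backed case mentioned in the comment is interesting, here is a minimal sketch of how `count(int upTo)` could be answered in constant time when every doc ID in a range matches. This is not the code from the PR or from #14312: the class name `RangeDocIdStreamSketch`, its fields, and the use of `java.util.function.IntConsumer` in place of `CheckedIntConsumer<IOException>` are assumptions made so the example runs without Lucene on the classpath. The local `NO_MORE_DOCS` constant mirrors `DocIdSetIterator.NO_MORE_DOCS`, which is `Integer.MAX_VALUE`.

```java
import java.util.function.IntConsumer;

/**
 * Hypothetical, simplified stand-in for a doc ID stream backed by a range of
 * doc IDs that all match. Not Lucene's actual implementation.
 */
final class RangeDocIdStreamSketch {
  /** Mirrors DocIdSetIterator.NO_MORE_DOCS. */
  static final int NO_MORE_DOCS = Integer.MAX_VALUE;

  private int next; // next doc ID that has not been consumed yet
  private final int max; // exclusive upper bound of the matching range

  RangeDocIdStreamSketch(int min, int max) {
    if (min > max) {
      throw new IllegalArgumentException("min must be <= max");
    }
    this.next = min;
    this.max = max;
  }

  /** Visits remaining doc IDs below upTo (exclusive); they cannot be visited again. */
  void forEach(int upTo, IntConsumer consumer) {
    int end = Math.min(upTo, max);
    while (next < end) {
      int doc = next;
      next++; // advance first so a doc is never handed out twice
      consumer.accept(doc);
    }
  }

  /** Counts remaining doc IDs below upTo (exclusive) in O(1) and consumes them. */
  int count(int upTo) {
    int end = Math.min(upTo, max);
    if (end <= next) {
      return 0;
    }
    int count = end - next; // plain subtraction, no per-doc iteration
    next = end;
    return count;
  }

  public static void main(String[] args) {
    RangeDocIdStreamSketch stream = new RangeDocIdStreamSketch(10, 20);
    System.out.println(stream.count(15)); // docs 10..14
    stream.forEach(18, doc -> System.out.println("doc " + doc)); // docs 15..17
    System.out.println(stream.count(NO_MORE_DOCS)); // docs 18..19
  }
}
```

A bitset-backed stream could presumably do something similar by counting set bits below `upTo` instead of subtracting bounds, but that variant is left out here since its details depend on the bit set API being used.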