benwtrent commented on code in PR #13184: URL: https://github.com/apache/lucene/pull/13184#discussion_r1525341872
##########
lucene/join/src/java/org/apache/lucene/search/join/DiversifyingChildrenFloatKnnVectorQuery.java:
##########
@@ -98,7 +98,8 @@ protected TopDocs exactSearch(LeafReaderContext context, DocIdSetIterator accept
             parentBitSet, query, fi.getVectorSimilarityFunction());
-    HitQueue queue = new HitQueue(k, true);
+    final int queueSize = Math.min(k, Math.toIntExact(acceptIterator.cost()));

Review Comment:
   I don't know a better way, but since this diversifies over parent doc ids, it's possible that the number of hits the queue actually needs is still much smaller than `acceptIterator.cost()`, because `acceptIterator` iterates over CHILD docs (e.g. passage vector docs), not parents. I think any further optimization (e.g. counting the number of relevant parents) would add undue overhead.

-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org
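
To make the point in the review comment concrete, here is a minimal standalone sketch in plain Java (no Lucene dependency). The class name, helper names, and the `parentOfChild` array are hypothetical and stand in for the accepted child docs and their parent ids. Only the `Math.min(k, Math.toIntExact(...))` expression is taken from the diff. It shows that the child count bounds the queue size, while the number of diversified hits (one per parent) can be smaller still.

```java
import java.util.HashSet;
import java.util.Set;

class QueueSizeSketch {

    // Same capping expression as in the diff: at most k, at most the child-doc cost.
    static int cappedQueueSize(int k, long childCost) {
        return Math.min(k, Math.toIntExact(childCost));
    }

    // Hypothetical helper: parentOfChild[i] is the parent id of accepted child i.
    // Diversifying over parents keeps at most one hit per distinct parent.
    static int distinctParents(int[] parentOfChild) {
        Set<Integer> parents = new HashSet<>();
        for (int parent : parentOfChild) {
            parents.add(parent);
        }
        return parents.size();
    }

    public static void main(String[] args) {
        int k = 10;
        // Six accepted child docs (e.g. passages) that belong to only two parents.
        int[] parentOfChild = {3, 3, 3, 7, 7, 7};

        int queueSize = cappedQueueSize(k, parentOfChild.length);
        int diversifiedHits = distinctParents(parentOfChild);

        // prints "queueSize=6 diversifiedHits=2"
        System.out.println("queueSize=" + queueSize + " diversifiedHits=" + diversifiedHits);
    }
}
```

Computing the tighter bound requires a pass over the children to count parents, which is the extra overhead the comment argues is not worth paying.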