[ 
https://issues.apache.org/jira/browse/LUCENE-10606?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17551976#comment-17551976
 ] 

Kaival Parikh commented on LUCENE-10606:
----------------------------------------

Instead of collecting hit-by-hit using a LeafCollector, we can break down the 
search by instantiating a Weight, creating Scorers, and checking the underlying 
iterator. If it is backed by a BitSet, we can reuse that BitSet directly by 
reference (as we won't be modifying it). Otherwise, we can create a new BitSet 
from the iterator using BitSet.of.

This avoids the per-hit collection cost, and it should pay off often: 
LRUQueryCache internally uses a BitSet, so such iterators will be common. Sample 
[code|https://github.com/apache/lucene/compare/main...kaivalnp:alternate_collection]
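
To make the idea concrete, here is a minimal stand-alone sketch of the 
"reuse if BitSet-backed, otherwise copy" step. It is not the linked code and 
does not use Lucene: DocIterator, BitSetBackedIterator, and toBitSet are 
hypothetical stand-ins for DocIdSetIterator, BitSetIterator, and BitSet.of, 
and java.util.BitSet stands in for Lucene's BitSet. It also ignores deleted 
documents, which a real implementation would presumably have to handle.

```java
import java.util.BitSet;

// Simplified model only; all names here are hypothetical, not Lucene APIs.
final class PrefilterCollection {
  static final int NO_MORE_DOCS = Integer.MAX_VALUE;

  /** Minimal doc-id iterator: returns increasing doc ids, then NO_MORE_DOCS. */
  interface DocIterator {
    int nextDoc();
  }

  /** Iterator that walks the set bits of a BitSet and exposes that BitSet. */
  static final class BitSetBackedIterator implements DocIterator {
    private final BitSet bits;
    private int doc = -1;

    BitSetBackedIterator(BitSet bits) {
      this.bits = bits;
    }

    BitSet getBitSet() {
      return bits;
    }

    @Override
    public int nextDoc() {
      if (doc == NO_MORE_DOCS) {
        return doc; // already exhausted
      }
      int next = bits.nextSetBit(doc + 1);
      doc = (next < 0) ? NO_MORE_DOCS : next;
      return doc;
    }
  }

  /** Returns the matching docs as a BitSet, reusing the backing set when possible. */
  static BitSet toBitSet(DocIterator it, int maxDoc) {
    if (it instanceof BitSetBackedIterator) {
      // Fast path: no iteration and no copy. Safe only because the caller
      // treats the result as read-only.
      return ((BitSetBackedIterator) it).getBitSet();
    }
    // Slow path: materialize the iterator doc by doc.
    BitSet copy = new BitSet(maxDoc);
    for (int doc = it.nextDoc(); doc != NO_MORE_DOCS; doc = it.nextDoc()) {
      copy.set(doc);
    }
    return copy;
  }
}
```

The fast path returns the very same object that backs the iterator, so its 
cost does not depend on the number of hits; only iterators that are not 
BitSet-backed pay the per-document cost.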

> Optimize hit collection of prefilter in KnnVectorQuery for BitSet backed 
> queries
> --------------------------------------------------------------------------------
>
>                 Key: LUCENE-10606
>                 URL: https://issues.apache.org/jira/browse/LUCENE-10606
>             Project: Lucene - Core
>          Issue Type: Improvement
>          Components: core/search
>            Reporter: Kaival Parikh
>            Priority: Minor
>              Labels: performance
>
> While working on this [PR|https://github.com/apache/lucene/pull/932] to add 
> prefilter testing support, we saw that hit collection took a long time for 
> BitSetIterator-backed scorers, because it iterates over the entire underlying 
> BitSet and copies it into an internal one (see the second table in these 
> [numbers|https://github.com/apache/lucene/pull/932#discussion_r888896850]).
> These BitSetIterators can be frequent (as they are used in LRUQueryCache), 
> and bulk collection can be optimized with more knowledge of the underlying 
> iterator.


