[ https://issues.apache.org/jira/browse/LUCENE-9614?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17236727#comment-17236727 ]

Michael Sokolov commented on LUCENE-9614:
-----------------------------------------

OK, I thought about this a bit, and I think I see the point better now. This 
query is odd because if, say, we were to add some new vectors to the index, a 
vector that previously matched might suddenly no longer match. I have been 
thinking of a Query as a convenience for plugging into the usual scoring and 
execution framework provided by IndexSearcher. Let me sketch out the use case 
I have in mind, because I'm not sure how we would handle it in the non-Query 
implementation(s).

We'd like to be able to blend matches derived from postings (full-text search) 
with matches derived from vectors, using some kind of scoring function that 
balances vector scores against text relevance scores. Both kinds of matches 
also need to satisfy other constraints, embodied in a Query. If we present KNN 
matches as a Query, I think this can all be done by the Collectors in the usual 
way, but if we have a different API, say something on IndexSearcher, or a 
static method on a KNN class, then that blending will require its own custom 
implementation - I think?
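
To make the use case concrete, here is a minimal sketch of the blending in 
plain Java. It deliberately does not use Lucene's API: the class BlendSketch, 
the Hit type, the blend method, and the vectorWeight parameter are all 
hypothetical names, and the linear weighting is only one possible scoring 
function. It also assumes a document qualifies if it matches either the text 
side or the vector side, as long as it satisfies the filter.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class BlendSketch {

    /** A (doc id, blended score) pair. Hypothetical; not a Lucene class. */
    static final class Hit {
        final int doc;
        final float score;

        Hit(int doc, float score) {
            this.doc = doc;
            this.score = score;
        }
    }

    /**
     * Blends text and vector scores for documents that pass the filter.
     * A document missing from one side contributes 0 for that side.
     * The linear weighting is an assumption made for illustration.
     */
    static List<Hit> blend(Map<Integer, Float> textScores,
                           Map<Integer, Float> vectorScores,
                           Set<Integer> allowed,
                           float vectorWeight) {
        // Candidates: anything matched by text or by vectors...
        Set<Integer> candidates = new HashSet<>(textScores.keySet());
        candidates.addAll(vectorScores.keySet());
        // ...restricted to documents satisfying the other constraints.
        candidates.retainAll(allowed);

        List<Hit> hits = new ArrayList<>();
        for (int doc : candidates) {
            float text = textScores.getOrDefault(doc, 0f);
            float vec = vectorScores.getOrDefault(doc, 0f);
            hits.add(new Hit(doc, (1 - vectorWeight) * text + vectorWeight * vec));
        }
        // Highest blended score first; ties broken by ascending doc id.
        hits.sort(Comparator.comparingDouble((Hit h) -> h.score)
                .reversed()
                .thenComparingInt(h -> h.doc));
        return hits;
    }

    public static void main(String[] args) {
        Map<Integer, Float> text = Map.of(1, 2.0f, 2, 1.0f, 3, 0.5f);
        Map<Integer, Float> vectors = Map.of(2, 0.9f, 3, 0.8f, 4, 0.7f);
        Set<Integer> allowed = Set.of(1, 2, 3); // doc 4 fails the filter

        for (Hit h : blend(text, vectors, allowed, 0.5f)) {
            System.out.println(h.doc + " " + h.score);
        }
    }
}
```

With KNN matches exposed as a Query, this merging step would be handled by the 
existing scoring and collection machinery; with a separate API, something like 
the blend method above would have to be written and maintained by hand.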

> Implement KNN Query
> -------------------
>
>                 Key: LUCENE-9614
>                 URL: https://issues.apache.org/jira/browse/LUCENE-9614
>             Project: Lucene - Core
>          Issue Type: New Feature
>            Reporter: Michael Sokolov
>            Priority: Major
>
> Now we have a vector index format, and one vector indexing/KNN search 
> implementation, but the interface is low-level: you can search across a 
> single segment only. We would like to expose a Query implementation. 
> Initially, we want to support a usage where the KnnVectorQuery selects the 
> k-nearest neighbors without regard to any other constraints, and these can 
> then be filtered as part of an enclosing Boolean or other query.
> Later we will want to explore some kind of filtering *while* performing 
> vector search, or a re-entrant search process that can yield further results. 
> Because of the nature of knn search (all documents having any vector value 
> match), it is more like a ranking than a filtering operation, and it doesn't 
> really make sense to provide an iterator interface that can be merged in the 
> usual way, in docid order, skipping ahead. It's not yet clear how to satisfy 
> a query that is "k nearest neighbors satisfying some arbitrary Query", at 
> least not without realizing a complete bitset for the Query. But this is for 
> a later issue; *this* issue is just about performing the knn search in 
> isolation, computing a set of (some given) K nearest neighbors, and providing 
> an iterator over those.


