[ https://issues.apache.org/jira/browse/LUCENE-9614?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17434996#comment-17434996 ]
ASF subversion and git services commented on LUCENE-9614:
---------------------------------------------------------
Commit abd5ec4ff0b56b1abfc2883e47e75871e60d3cad in lucene's branch
refs/heads/main from Julie Tibshirani
[ https://gitbox.apache.org/repos/asf?p=lucene.git;h=abd5ec4 ]
LUCENE-9614: Fix KnnVectorQuery failure when numDocs is 0 (#413)
When the reader has no live docs, `KnnVectorQuery` can error out. This happens
because `IndexReader#numDocs` is 0, and we end up passing an illegal value of
`k = 0` to the search method.
This commit removes the problematic optimization in `KnnVectorQuery` and
replaces it with a lower-level one based on the total number of vectors in the
segment.
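
As a rough illustration of the idea (not the actual Lucene change), the sketch
below caps the requested `k` at the number of vectors in a segment and returns
early when the segment holds none, so a value of `k = 0` never reaches the
search itself. The class `KnnClampSketch`, the methods `effectiveK` and
`searchSegment`, and the brute-force distance scan are all hypothetical and
exist only for this example.

```java
import java.util.Arrays;
import java.util.Comparator;

public class KnnClampSketch {

    /** Hypothetical helper: cap the requested k at the segment's vector count. */
    public static int effectiveK(int requestedK, int vectorCount) {
        if (requestedK < 1) {
            throw new IllegalArgumentException("k must be at least 1, got " + requestedK);
        }
        return Math.min(requestedK, vectorCount);
    }

    /** Squared Euclidean distance between two vectors of equal length. */
    public static double squaredDistance(float[] a, float[] b) {
        double sum = 0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return sum;
    }

    /**
     * Brute-force stand-in for a per-segment search (an assumption, not
     * Lucene's graph search). Returns the ordinals of the nearest vectors,
     * closest first.
     */
    public static int[] searchSegment(float[][] vectors, float[] target, int requestedK) {
        int k = effectiveK(requestedK, vectors.length);
        if (k == 0) {
            // Empty segment: skip the search instead of passing k = 0 down.
            return new int[0];
        }
        Integer[] ords = new Integer[vectors.length];
        for (int i = 0; i < ords.length; i++) {
            ords[i] = i;
        }
        Arrays.sort(ords, Comparator.comparingDouble(i -> squaredDistance(vectors[i], target)));
        int[] top = new int[k];
        for (int i = 0; i < k; i++) {
            top[i] = ords[i];
        }
        return top;
    }
}
```

Clamping at the segment level means the caller's `k` is still validated once,
while a segment with no vectors simply contributes no hits.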
> Implement KNN Query
> -------------------
>
> Key: LUCENE-9614
> URL: https://issues.apache.org/jira/browse/LUCENE-9614
> Project: Lucene - Core
> Issue Type: New Feature
> Reporter: Michael Sokolov
> Priority: Major
> Time Spent: 5h 10m
> Remaining Estimate: 0h
>
> Now we have a vector index format, and one vector indexing/KNN search
> implementation, but the interface is low-level: you can search across a
> single segment only. We would like to expose a Query implementation.
> Initially, we want to support a usage where the KnnVectorQuery selects the
> k-nearest neighbors without regard to any other constraints, and these can
> then be filtered as part of an enclosing Boolean or other query.
> Later we will want to explore some kind of filtering *while* performing
> vector search, or a re-entrant search process that can yield further results.
> Because of the nature of knn search (all documents having any vector value
> match), it is more like a ranking than a filtering operation, and it doesn't
> really make sense to provide an iterator interface that can be merged in the
> usual way, in docid order, skipping ahead. It's not yet clear how to satisfy
> a query that is "k nearest neighbors satisfying some arbitrary Query", at
> least not without realizing a complete bitset for the Query. But this is for
> a later issue; *this* issue is just about performing the knn search in
> isolation, computing a set of (some given) K nearest neighbors, and providing
> an iterator over those.
--
This message was sent by Atlassian Jira
(v8.3.4#803005)