slow-J opened a new pull request, #16271: URL: https://github.com/apache/lucene/pull/16271
Part of https://github.com/apache/lucene/issues/16267 (item 3 there).

DocAndScoreQuery.explain() previously returned a bare "not in top N docs" for any document that a KNN vector query did not collect, which gave no insight into why the document was missing.

Previous message:
```
0.0 = no match on optional clause (track(knn=DocAndScoreQuery[1404635,...][0.7655482,...],0.7655482))
0.0 = not in top 1 docs
```

Add detail to KNN no-match explanations. explain() now recomputes the explained document's own score, so "not in top N docs" says why the document was not collected, e.g. below cutoff, excluded by filter, no vector value, or a tie-break/recall miss.

AbstractKnnVectorQuery supplies a small optional hook to DocAndScoreQuery (which it rewrites into); RescoreTopNQuery, the other producer of DocAndScoreQuery, passes none and keeps the generic message. The zero-result case now rewrites to a MatchNoDocsQuery with a reason instead of the blank-reason singleton.

---

Some examples:
- Not in top 3 doc(s): score 0.41 < minTopKScore 0.76
- Not in top 3 doc(s): excluded by filter
- Not in top 3 doc(s): no vector value in field "embedding"
- Not in top 3 doc(s): similarity 0.81 >= cutoff 0.76 (tie-break or approximate-search miss)
- MatchNoDocsQuery("No documents matched the nearest-neighbor search")

---

TODO before "ready for review":
- Check that all changed messages are covered by unit tests.
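
For readers who want to see how the example messages could be derived, here is a minimal standalone sketch. It is not the code in this PR and it does not use any Lucene API: the class name `KnnNoMatchReason`, the method `describe`, its parameters, the order of the checks, and the two-decimal formatting are all assumptions made for illustration.

```java
import java.util.Locale;

/** Hypothetical helper, not part of Lucene: picks a reason for a KNN no-match. */
final class KnnNoMatchReason {
  private KnnNoMatchReason() {}

  /**
   * Builds a "Not in top k doc(s)" message for a document that was not collected.
   *
   * @param k number of hits the KNN query was asked for
   * @param field name of the vector field
   * @param passesFilter whether the document is accepted by the query's filter
   * @param hasVector whether the document has a vector value in {@code field}
   * @param score the document's own recomputed score (ignored if it has no vector)
   * @param minTopKScore lowest score among the collected hits
   */
  static String describe(
      int k,
      String field,
      boolean passesFilter,
      boolean hasVector,
      float score,
      float minTopKScore) {
    String prefix = "Not in top " + k + " doc(s): ";
    // Assumed order: filter first, then missing vector, then score comparison.
    if (!passesFilter) {
      return prefix + "excluded by filter";
    }
    if (!hasVector) {
      return prefix + "no vector value in field \"" + field + "\"";
    }
    if (score < minTopKScore) {
      return prefix + "score " + fmt(score) + " < minTopKScore " + fmt(minTopKScore);
    }
    // The score would have qualified, so the miss is a tie-break or recall issue.
    return prefix
        + "similarity " + fmt(score)
        + " >= cutoff " + fmt(minTopKScore)
        + " (tie-break or approximate-search miss)";
  }

  // Two decimals with a fixed locale so the output does not depend on the JVM default.
  private static String fmt(float value) {
    return String.format(Locale.ROOT, "%.2f", value);
  }
}
```

Calling `KnnNoMatchReason.describe(3, "embedding", true, true, 0.41f, 0.76f)` yields the first example message above. In the real change the inputs would presumably come from the hook that AbstractKnnVectorQuery supplies, which this sketch does not model.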
