rmuir commented on code in PR #11946:
URL: https://github.com/apache/lucene/pull/11946#discussion_r1051525868
##
lucene/core/src/java/org/apache/lucene/search/KnnVectorQuery.java:
##
@@ -76,12 +91,29 @@ public KnnVectorQuery(String field, float[] target, int k) {
* @throws IllegalArgumentException if k is less than 1
*/
public KnnVectorQuery(String field, float[] target, int k, Query filter) {
+this(field, target, k, Float.NEGATIVE_INFINITY, filter);
+ }
+
+ /**
+ * Find the k nearest documents to the target vector according to the vectors in the
+ * given field. target vector.
+ *
+ * @param field a field that has been indexed as a {@link KnnVectorField}.
+ * @param target the target of the search
+ * @param k the number of documents to find (the upper bound)
+ * @param similarityThreshold the minimum acceptable value of similarity
Review Comment:
Still don't have any explanation here as to why we'd do this for a vector
search query. We avoided any such thresholds or normalization in any of
Lucene's scoring for decades: if we hadn't, we would never have been able
to implement block-max WAND or other algorithms, because they'd be
incompatible.
Please see:
* https://cwiki.apache.org/confluence/display/LUCENE/LuceneFAQ#LuceneFAQ-CanIfilterbyscore?
* https://cwiki.apache.org/confluence/display/LUCENE/ScoresAsPercentages
I don't mind being the bad guy blocking this change, because it seems like
it has not been thought through.
You must convince me.
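
For readers without the full diff, here is a minimal sketch of what the
proposed parameter appears to mean, going only by the Javadoc in the hunk
above: return at most k hits, dropping any whose similarity is below the
threshold, with Float.NEGATIVE_INFINITY acting as "no threshold". The class
ThresholdSketch, the Hit type, and the topK methods are hypothetical names
for illustration. This is not Lucene's implementation, and the hunk does
not show whether the PR applies the threshold during graph traversal or
afterwards.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical illustration only; none of these names exist in Lucene.
final class ThresholdSketch {

  /** A scored hit: a document id plus its similarity to the target. */
  static final class Hit {
    final int doc;
    final float score;

    Hit(int doc, float score) {
      this.doc = doc;
      this.score = score;
    }
  }

  /** Keep at most k hits whose score is >= threshold, best first. */
  static List<Hit> topK(List<Hit> hits, int k, float threshold) {
    if (k < 1) {
      throw new IllegalArgumentException("k must be at least 1");
    }
    List<Hit> kept = new ArrayList<>();
    for (Hit h : hits) {
      if (h.score >= threshold) {
        kept.add(h);
      }
    }
    // Sort by descending score so the best hits come first.
    kept.sort(Comparator.comparingDouble((Hit h) -> h.score).reversed());
    return new ArrayList<>(kept.subList(0, Math.min(k, kept.size())));
  }

  /** Mirrors the delegating constructor in the diff: no threshold. */
  static List<Hit> topK(List<Hit> hits, int k) {
    return topK(hits, k, Float.NEGATIVE_INFINITY);
  }
}
```

With the default, every hit passes the comparison and only k limits the
result. With a finite threshold, the result can be shorter than k or empty,
and the cutoff depends on the absolute value of the score rather than on
the ranking.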
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org