mikemccand commented on a change in pull request #2022:
URL: https://github.com/apache/lucene-solr/pull/2022#discussion_r518797861
##########
File path: lucene/core/src/java/org/apache/lucene/index/VectorValues.java
##########
@@ -74,6 +74,18 @@ public BytesRef binaryValue() throws IOException {
throw new UnsupportedOperationException();
}
+ /**
+ * Return the k nearest neighbor documents as determined by comparison of their vector values
+ * for this field, to the given vector, by the field's search strategy. If the search strategy is
+ * reversed, lower values indicate nearer vectors, otherwise higher scores indicate nearer
+ * vectors. Unlike relevance scores, vector scores may be negative.
+ * @param target the vector-valued query
+ * @param k the number of docs to return
+ * @param fanout control the accuracy/speed tradeoff - larger values give better recall at higher cost
Review comment:
> Yeah, I think there needs to be a follow-on exposing the index-time controls, which indeed are much more potent than this search-time fanout, which has only a small impact on recall and latency. In this patch they are globals in HnswGraphBuilder, but there is no API for setting them.

OK, makes sense.

> I am thinking the index-time hyperparameters would be specified in IndexWriterConfig?

Hmm, maybe these could be codec-level controls? Or maybe `FieldInfo`? They would be per-vector-field configuration, right?
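
To make the per-vector-field idea concrete, here is a rough sketch in plain Java of what such configuration could look like. Everything in it is hypothetical: `HnswFieldParams`, `PerFieldVectorConfig`, and the parameter names `maxConn` and `beamWidth` are illustrative assumptions, not existing Lucene APIs and not part of this patch. It only shows the shape of "defaults plus per-field overrides", wherever that ends up living (codec, `FieldInfo`, or `IndexWriterConfig`).

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch only: none of these types exist in Lucene.
// Holds the index-time graph hyperparameters for one vector field.
final class HnswFieldParams {
  final int maxConn;   // assumed name: max neighbors kept per graph node
  final int beamWidth; // assumed name: candidate queue size during graph construction

  HnswFieldParams(int maxConn, int beamWidth) {
    if (maxConn <= 0 || beamWidth <= 0) {
      throw new IllegalArgumentException("maxConn and beamWidth must be positive");
    }
    this.maxConn = maxConn;
    this.beamWidth = beamWidth;
  }
}

// Hypothetical sketch only: resolves hyperparameters per field name,
// falling back to shared defaults when a field has no override.
final class PerFieldVectorConfig {
  private final HnswFieldParams defaults;
  private final Map<String, HnswFieldParams> overrides = new HashMap<>();

  PerFieldVectorConfig(HnswFieldParams defaults) {
    this.defaults = defaults;
  }

  // Registers an override for one vector field; returns this for chaining.
  PerFieldVectorConfig set(String field, HnswFieldParams params) {
    overrides.put(field, params);
    return this;
  }

  // Returns the override for the field if present, else the defaults.
  HnswFieldParams paramsFor(String field) {
    return overrides.getOrDefault(field, defaults);
  }
}
```

With something like this, a graph builder could look up `paramsFor(fieldName)` when it starts building a field's graph instead of reading globals, and fields without an override would keep the defaults.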