benwtrent commented on code in PR #13181:
URL: https://github.com/apache/lucene/pull/13181#discussion_r1593056612

##########
lucene/core/src/java/org/apache/lucene/index/ByteVectorValues.java:
##########
@@ -75,4 +76,14 @@ public static void checkField(LeafReader in, String field) {
               + ")");
     }
   }
+
+  /**
+   * Return a {@link VectorScorer} for the given query vector. The iterator for the scorer is not
+   * the same instance as the iterator for this {@link ByteVectorValues}. It is a copy, and
+   * iteration over the scorer will not affect the iteration of this {@link ByteVectorValues}.
+   *
+   * @param query the query vector
+   * @return a {@link VectorScorer} instance
+   */
+  public abstract VectorScorer scorer(byte[] query) throws IOException;

Review Comment:
   > Is the similarity available in the context where the scorer is created? Could it be passed here to avoid tightly coupling the values and similarity (in this interface). That way the same vector values source could be used for different similarities.

   🤔 I will have to think on this. We could maybe allow it, while defaulting to whatever similarity is stored in the codec.

   One concern would be quantized similarities. For example, if the user initially indexed quantized vectors saying they would use `euclidean`, we don't calculate the error correction; it's just `1`. If they then supply `DOT_PRODUCT`, we don't have the error corrections calculated. Maybe in this case we just throw an error?
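
To make the trade-off concrete, here is a rough sketch of what "allow an override, default to the codec's similarity, and error for quantized vectors" might look like. Everything in it is a hypothetical stand-in written for illustration: `ToyByteVectorValues`, `Similarity`, `Scorer`, and the `quantized` flag are not Lucene types, the dot-product score is left unnormalized, and nothing here is what the PR actually implements.

```java
import java.util.Objects;

/** Illustrative only: every type below is a hypothetical stand-in, not Lucene API. */
final class ScorerSimilaritySketch {

  /** Toy stand-in for a similarity function; only two cases are modeled. */
  enum Similarity {
    EUCLIDEAN {
      @Override
      float compare(byte[] a, byte[] b) {
        int squaredDistance = 0;
        for (int i = 0; i < a.length; i++) {
          int d = a[i] - b[i];
          squaredDistance += d * d;
        }
        // Map distance to a score in (0, 1]; identical vectors score 1.
        return 1f / (1f + squaredDistance);
      }
    },
    DOT_PRODUCT {
      @Override
      float compare(byte[] a, byte[] b) {
        int dot = 0;
        for (int i = 0; i < a.length; i++) {
          dot += a[i] * b[i];
        }
        // Raw dot product; a real implementation would normalize this.
        return dot;
      }
    };

    abstract float compare(byte[] a, byte[] b);
  }

  /** Toy stand-in for a scorer: scores the query against the vector at an ordinal. */
  interface Scorer {
    float score(int ord);
  }

  /** Toy stand-in for a vector values source that remembers how it was indexed. */
  static final class ToyByteVectorValues {
    private final byte[][] vectors;
    private final Similarity indexedSimilarity;
    private final boolean quantized;

    ToyByteVectorValues(byte[][] vectors, Similarity indexedSimilarity, boolean quantized) {
      this.vectors = vectors;
      this.indexedSimilarity = indexedSimilarity;
      this.quantized = quantized;
    }

    /** Default: score with whatever similarity was recorded at index time. */
    Scorer scorer(byte[] query) {
      return scorer(query, indexedSimilarity);
    }

    /** Override: caller supplies the similarity; rejected if quantized data cannot support it. */
    Scorer scorer(byte[] query, Similarity similarity) {
      Objects.requireNonNull(similarity, "similarity");
      if (quantized && similarity != indexedSimilarity) {
        // Assumption: corrective terms exist only for the indexed similarity.
        throw new IllegalArgumentException(
            "quantized vectors were indexed with " + indexedSimilarity
                + " and cannot be scored with " + similarity);
      }
      return ord -> similarity.compare(query, vectors[ord]);
    }
  }
}
```

With this shape, unquantized values can be scored under either similarity, while a quantized source indexed as `EUCLIDEAN` rejects a `DOT_PRODUCT` request up front instead of silently returning uncorrected scores.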