kaivalnp commented on PR #12679: URL: https://github.com/apache/lucene/pull/12679#issuecomment-1769299867
Here is the gist of my benchmark: https://gist.github.com/kaivalnp/79808017ed7666214540213d1e2a21cf

I'm calculating the baseline and the individual results as the "count of vectors above the threshold". Note that we do not need the actual vectors, because any vector with a score >= `resultSimilarity` is implicitly in the baseline. This simplifies the benchmark to just maintaining counts of vectors (as opposed to the actual vector IDs), and recall is calculated as the "ratio of total count of vectors found by KNN or RNN / total count of vectors in the baseline".

I had some other helper functions, mainly for calling these and formatting output, but kept the important functions in the gist (how I'm calculating the baseline, the KNN / RNN results, and the time taken).

-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org
For queries about this service, please contact Infrastructure at: us...@infra.apache.org
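
For readers who do not want to open the gist, the count-based recall described in the comment can be sketched roughly as follows. This is a minimal illustration and not the code from the gist: the class `CountBasedRecall`, the helpers `countAtOrAbove` and `recall`, and the sample scores are all hypothetical, and the handling of an empty baseline is an assumption.

```java
public class CountBasedRecall {

    // Hypothetical helper: count how many scores are >= the threshold.
    // Only the count is kept, never the vector IDs.
    static int countAtOrAbove(float[] scores, float resultSimilarity) {
        int count = 0;
        for (float score : scores) {
            if (score >= resultSimilarity) {
                count++;
            }
        }
        return count;
    }

    // Recall = (count found by the approximate search) / (count in the baseline).
    // Assumption: an empty baseline is treated as perfect recall.
    static double recall(int foundCount, int baselineCount) {
        if (baselineCount == 0) {
            return 1.0;
        }
        return (double) foundCount / baselineCount;
    }

    public static void main(String[] args) {
        // Made-up exact scores of every indexed vector against one query.
        float[] allScores = {0.95f, 0.91f, 0.88f, 0.80f, 0.72f, 0.60f};
        float resultSimilarity = 0.80f;

        // Baseline: every vector scoring at or above the threshold.
        int baseline = countAtOrAbove(allScores, resultSimilarity);

        // Made-up scores returned by an approximate (KNN or RNN) search.
        float[] searchScores = {0.95f, 0.91f, 0.88f};
        int found = countAtOrAbove(searchScores, resultSimilarity);

        System.out.println("recall = " + recall(found, baseline));
    }
}
```

Because every hit returned by the search with a score >= `resultSimilarity` must also be in the baseline, comparing counts gives the same ratio as comparing ID sets. Over many queries, the found counts and baseline counts would be summed before dividing.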