kaivalnp commented on PR #12679:
URL: https://github.com/apache/lucene/pull/12679#issuecomment-1769299867

   Here is the gist of my benchmark: 
https://gist.github.com/kaivalnp/79808017ed7666214540213d1e2a21cf
   
   I compute both the baseline and each individual result as the "count of 
vectors at or above the threshold".
   
   Note that we do not need the actual vectors, because any vector with a score 
>= `resultSimilarity` is implicitly in the baseline. This reduces the benchmark 
to maintaining counts of vectors (as opposed to the actual vector IDs), and 
recall is calculated as the ratio "total count of vectors found by KNN or RNN / 
total count of vectors in the baseline".
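
   A minimal sketch of this count-based recall, for illustration only. It is 
not the code from the gist: the class name, method names and sample scores are 
all hypothetical, and returning 1.0 for an empty baseline is an assumption.

```java
public class RecallSketch {

    /** Counts how many scores are at or above the similarity threshold. */
    public static long countAtOrAbove(float[] scores, float resultSimilarity) {
        long count = 0;
        for (float score : scores) {
            if (score >= resultSimilarity) {
                count++;
            }
        }
        return count;
    }

    /** Recall = vectors found by the search / vectors in the baseline. */
    public static double recall(long foundCount, long baselineCount) {
        if (baselineCount == 0) {
            return 1.0; // assumption: nothing to find counts as perfect recall
        }
        return (double) foundCount / baselineCount;
    }

    public static void main(String[] args) {
        // Hypothetical scores of every indexed vector against one query.
        float[] allScores = {0.95f, 0.90f, 0.85f, 0.70f, 0.40f};
        // Hypothetical scores returned by a KNN or RNN search for the same query.
        float[] searchScores = {0.95f, 0.90f, 0.60f};
        float resultSimilarity = 0.8f;

        long baseline = countAtOrAbove(allScores, resultSimilarity);
        long found = countAtOrAbove(searchScores, resultSimilarity);
        System.out.println("recall = " + recall(found, baseline));
    }
}
```

   Only the two counts are compared, so no vector IDs need to be stored or 
intersected.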
   
   I also had some other helper functions, mainly for calling these and 
formatting output, but kept only the important functions in the gist (how I'm 
calculating the baseline, the KNN / RNN results, and the time taken).


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

