lpld commented on PR #14078: URL: https://github.com/apache/lucene/pull/14078#issuecomment-2698569925
Hi @benwtrent Thanks again for your previous comment. I was able to modify luceneutil and run some benchmarks. I am quite new to Lucene, so I would appreciate some help in understanding the results I'm getting.

First, I tried running a quantized and a non-quantized benchmark on the Cohere 768 dataset on my local machine. Here are the results for the quantized benchmark (with `Lucene102HnswBinaryQuantizedVectorsFormat`):

```
recall  latency (ms)  nDoc      topK  fanout  maxConn  beamWidth  quantized  index s  index docs/s  force merge s  num segments  index size (MB)  vec disk (MB)  vec RAM (MB)
0.452   11.655        10000000  100   50      16       100        1 bits     1914.86  5222.32       10112.13       1             30934.82         30212.402      915.527
```

Unfortunately I didn't save the non-quantized results, but the recall was around 0.73.

Then I ran the same tests on a dedicated server with more CPU and RAM, and the results were strange: they were much faster, but the recall was now very low.

Non-quantized:

```
recall  latency (ms)  nDoc      topK  fanout  maxConn  beamWidth  quantized  index s  index docs/s  force merge s  num segments  index size (MB)  vec disk (MB)  vec RAM (MB)
0.203   7.143         10000000  100   50      16       100        no         1403.40  7125.53       769.29         1             29470.29         29296.875      29296.875
```

Quantized:

```
recall  latency (ms)  nDoc      topK  fanout  maxConn  beamWidth  quantized  index s  index docs/s  force merge s  num segments  index size (MB)  vec disk (MB)  vec RAM (MB)
0.191   7.721         10000000  100   50      16       100        1 bits     511.40   19554.09      1116.80        1             30597.43         30212.402      915.527
```

So, my questions are:

1. What exactly do the numbers in the description of this pull request mean? When you say that the recall for Cohere 768 is 0.938, is that the absolute recall value you got from the benchmark, or some kind of ratio between the quantized and non-quantized recalls?
2. Do you have any ideas about what could cause such a huge difference in recall between benchmark runs on different environments?
3. I was also trying to do some benchmarking with other public datasets (without luceneutil), and I got a little confused about how to correctly calculate recall. I understand that recall is the ratio of correct responses to the total number of responses. The total number of responses is straightforward, but the number of correct ones is a bit confusing to me. (I've put a short sketch of my recall computation at the end of this comment.) `luceneutil` computes the ground-truth neighbors roughly as follows (not the exact code, but my variation):

   ```java
   var queryVector = new ConstKnnByteVectorValueSource(queryEmb);
   var docVectors = new ByteKnnVectorFieldSource("vector");
   var exactQuery = new BooleanQuery.Builder()
       .add(new FunctionQuery(new ByteVectorSimilarityFunction(similarity, queryVector, docVectors)), BooleanClause.Occur.SHOULD)
       .add(new MatchAllDocsQuery(), BooleanClause.Occur.FILTER)
       .build();
   ```

   However, the `lucene` unit tests use a different query to get the correct neighbors from the index:

   ```java
   var exactQuery = new KnnByteVectorQuery("vector", queryEmb, size, new MatchAllDocsQuery());
   ```

   I would appreciate it if you could give some insight into which of these queries is the correct one, because they return different results.

Thanks for your time!
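P.S. For context on question 3, here is a minimal sketch of how I'm currently computing recall for a single query, assuming the doc IDs from the exact (brute-force) search are the ground truth. The class, method, and variable names (`RecallSketch`, `recall`, `exactTopK`, `approxTopK`) are just my own illustration, not anything from luceneutil; I'm also assuming the recall column in the tables above is this value averaged over all query vectors.

```java
import java.util.Arrays;
import java.util.Set;
import java.util.stream.Collectors;

class RecallSketch {
  // Recall for one query: |approx top-K ∩ exact top-K| / K.
  static double recall(int[] exactTopK, int[] approxTopK) {
    // doc IDs returned by the exact (ground-truth) search
    Set<Integer> truth = Arrays.stream(exactTopK).boxed().collect(Collectors.toSet());
    // how many of the approximate (HNSW) results are true nearest neighbors
    long hits = Arrays.stream(approxTopK).filter(truth::contains).count();
    return (double) hits / exactTopK.length;
  }
}
```

So my uncertainty is really about which of the two queries above should produce `exactTopK`, since they give me different doc IDs to use as ground truth.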