Re: [PR] [Draft] Support Multi-Vector HNSW Search via Flat Vector Storage [lucene]

via GitHub Tue, 28 Jan 2025 12:42:43 -0800


vigyasharma commented on PR #14173:
URL: https://github.com/apache/lucene/pull/14173#issuecomment-2620016142


   Ran some early benchmarks to compare this flat-storage-based multi-vector 
approach with the existing parent-join approach. I would appreciate any 
feedback on the approach, the benchmark setup, or any mistakes you spot along 
the way.
   
   **Observations:**
   1. Latency and recall are better with multiVectors when both the parentJoin 
and multiVector benchmarks are run on my branch. However, the parentJoin 
benchmark has significantly better latency and recall when run on the `main` 
branch. Some key differences between the runs on my branch and on `main` are:
       1. My branch always creates and loads the metadata needed for 
multiVector, even in the single-vector (parentJoin) case. I went with a 
simplistic approach here, so my guess is that this is the source of the extra 
latency.
       2. I disabled merging for both benchmarks on my branch, because I 
haven't implemented the merging changes yet.
       3. I've added run results below, but I wouldn't put too much faith in 
them until we have narrowed down the cause of the latency.
   2. For the parentJoin benchmark run on `main`, there is a visible drop in 
recall when I disable merges (compared to a `main` branch run with merges 
enabled). Is this expected?
   
   ...
   
   _Note that `nDoc` in the parentJoin runs equals `nVectors + nDoc` from the 
multiVector runs. The extra documents are the parent documents created in 
addition to the child vector documents._
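
The document-count relationship in the note can be checked directly against the benchmark tables in this comment. The following is only a minimal sketch: the numbers are copied from the `nVectors` and `nDoc` columns, and the variable names are illustrative, not part of the benchmark tooling.

```python
# Each tuple: (nVectors, nDoc in the multiVector run, nDoc in the parentJoin run),
# copied from the benchmark tables in this comment.
runs = [
    (10_000, 103, 10_103),
    (100_000, 1_323, 101_323),
    (200_000, 2_939, 202_939),
    (500_000, 8_773, 508_773),
]

# A run "mismatches" if parentJoin nDoc is not nVectors + multiVector nDoc.
mismatches = [r for r in runs if r[0] + r[1] != r[2]]

for n_vectors, mv_docs, pj_docs in runs:
    status = "ok" if n_vectors + mv_docs == pj_docs else "MISMATCH"
    print(f"{n_vectors:>7} + {mv_docs:>5} = {n_vectors + mv_docs:>7} (parentJoin nDoc {pj_docs:>7}) {status}")
```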
   
   #### ParentJoin v/s MultiVector (on multi-vector branch)
   ```ruby
   # multivector
   recall  latency (ms)  nVectors  nDoc  topK  fanout  maxConn  beamWidth  quantized  index s  index docs/s  num segments  index size (MB)  vec disk (MB)  vec RAM (MB)
    0.673         3.548     10000   103   100      50       32        100         no     1.62         63.58             1            29.40         29.297        29.297
    0.431         4.857    100000  1323   100      50       32        100         no    11.42        115.84             3           294.08        292.969       292.969
    0.461         8.034    200000  2939   100      50       32        100         no    22.62        129.92             6           588.27        585.938       585.938
    0.496        16.040    500000  8773   100      50       32        100         no    53.50        163.98            14          1470.72       1464.844      1464.844
   
   
   # parentJoin on multi-vector branch 
   # (merges disabled, creates and loads multivector config)
   recall  latency (ms)  nVectors    nDoc  topK  fanout  maxConn  beamWidth  quantized  index s  index docs/s  num segments  index size (MB)  vec disk (MB)  vec RAM (MB)
    0.610         4.644     10000   10103   100      50       32        100         no     1.70       5946.44             1            29.51         29.297        29.297
    0.242         5.189    100000  101323   100      50       32        100         no    11.34       8935.80             3           295.17        292.969       292.969
    0.275         8.988    200000  202939   100      50       32        100         no    22.54       9005.50             6           590.51        585.938       585.938
    0.290        16.605    500000  508773   100      50       32        100         no    52.70       9654.32            14          1476.26       1464.844      1464.844
   ```
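
To make the first observation concrete, the per-row differences between the two runs on the multi-vector branch can be computed as below. This is a rough sketch using the recall and latency values copied from the tables above; the dictionary and variable names are made up for illustration.

```python
# Values copied from the two tables above (both runs on the multi-vector branch).
n_vectors = [10_000, 100_000, 200_000, 500_000]
multivector = {
    "recall": [0.673, 0.431, 0.461, 0.496],
    "latency_ms": [3.548, 4.857, 8.034, 16.040],
}
parent_join = {
    "recall": [0.610, 0.242, 0.275, 0.290],
    "latency_ms": [4.644, 5.189, 8.988, 16.605],
}

# Positive values mean multiVector did better than parentJoin in that row.
recall_gain = [
    round(m - p, 3) for m, p in zip(multivector["recall"], parent_join["recall"])
]
latency_saved_ms = [
    round(p - m, 3)
    for m, p in zip(multivector["latency_ms"], parent_join["latency_ms"])
]

for n, gain, saved in zip(n_vectors, recall_gain, latency_saved_ms):
    print(f"nVectors={n:>6}  recall gain={gain:+.3f}  latency saved={saved:+.3f} ms")
```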
   ...
   
   #### ParentJoin (on main) v/s MultiVector (on multivector branch)
   ```ruby
   # parentJoin (on main)
   recall  latency (ms)    nDoc  topK  fanout  maxConn  beamWidth  quantized  index s  index docs/s  force merge s  num segments  index size (MB)  vec disk (MB)  vec RAM (MB)
    0.958         1.160   10000   100      50       32        100         no     1.49       6706.91           1.85             1            29.67         29.297        29.297
    0.925         2.392  100000   100      50       32        100         no    34.98       2858.86           7.86             1           297.91        292.969       292.969
    0.914         2.972  200000   100      50       32        100         no    63.80       3134.94          43.48             1           596.14        585.938       585.938
    0.904         4.292  500000   100      50       32        100         no   151.49       3300.57         147.08             1          1491.81       1464.844      1464.844
   
   # multivector
   recall  latency (ms)  nVectors  nDoc  topK  fanout  maxConn  beamWidth  quantized  index s  index docs/s  num segments  index size (MB)  vec disk (MB)  vec RAM (MB)
    0.673         3.793     10000   103   100      50       32        100         no     1.59         64.78             1            29.40         29.297        29.297
    0.431         4.572    100000  1323   100      50       32        100         no    11.22        117.87             3           294.08        292.969       292.969
    0.461         7.681    200000  2939   100      50       32        100         no    22.38        131.32             6           588.27        585.938       585.938
    0.496        16.292    500000  8773   100      50       32        100         no    54.10        162.17            14          1470.72       1464.844      1464.844
   ```
   ...
   
   #### ParentJoin with merges v/s ParentJoin with merges disabled (both on main)
   ```ruby
   # parentJoin (on main)
   recall  latency (ms)    nDoc  topK  fanout  maxConn  beamWidth  quantized  index s  index docs/s  force merge s  num segments  index size (MB)  vec disk (MB)  vec RAM (MB)
    0.958         1.160   10000   100      50       32        100         no     1.49       6706.91           1.85             1            29.67         29.297        29.297
    0.925         2.392  100000   100      50       32        100         no    34.98       2858.86           7.86             1           297.91        292.969       292.969
    0.914         2.972  200000   100      50       32        100         no    63.80       3134.94          43.48             1           596.14        585.938       585.938
    0.904         4.292  500000   100      50       32        100         no   151.49       3300.57         147.08             1          1491.81       1464.844      1464.844
   
   # parentJoin on main (merges disabled)
   recall  latency (ms)    nDoc  topK  fanout  maxConn  beamWidth  quantized  index s  index docs/s  force merge s  num segments  index size (MB)  vec disk (MB)  vec RAM (MB)
    0.440         1.297   10000   100      50       32        100         no     1.76       5694.76           2.03             1            29.67         29.297        29.297
    0.692         2.596  100000   100      50       32        100         no    11.35       8807.47          29.76             1           297.86        292.969       292.969
    0.530         3.173  200000   100      50       32        100         no    22.03       9077.71          67.91             1           596.24        585.938       585.938
    0.598         4.368  500000   100      50       32        100         no    53.20       9398.50         204.29             1          1493.26       1464.844      1464.844
   
   ```
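
The size of the recall drop asked about in the second observation can be tabulated from the two `main` branch runs above. This is an illustrative sketch only: the recall values are copied from the tables, and the names are hypothetical.

```python
# Recall on main, keyed by nDoc, copied from the two tables above.
recall_merges_enabled = {10_000: 0.958, 100_000: 0.925, 200_000: 0.914, 500_000: 0.904}
recall_merges_disabled = {10_000: 0.440, 100_000: 0.692, 200_000: 0.530, 500_000: 0.598}

# Absolute recall lost when merges are disabled, per index size.
recall_drop = {
    n: round(recall_merges_enabled[n] - recall_merges_disabled[n], 3)
    for n in recall_merges_enabled
}

for n, drop in recall_drop.items():
    print(f"nDoc={n:>6}  recall drop with merges disabled: {drop:.3f}")
```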


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


