mayya-sharipova commented on PR #14331:
URL: https://github.com/apache/lucene/pull/14331#issuecomment-2737598747

   I've run additional benchmarks with the new Optimized Scalar Quantization 
(OSQ) format that quantizes each 32-bit float dimension down to a single bit, 
a 32x reduction (Lucene102HnswBinaryQuantizedVectorsFormat). The improvements 
here are smaller, but still present:
   
   ### Experiment 3: new OSQ format
   The average speedups from baseline to candidate are:
   
   <mark>Index Time Speedup: **1.33x**</mark>
   <mark>Force Merge Speedup: **1.34x**</mark>
   
   Evaluation is done with Luceneutil on these datasets:
   
   1. **quora-E5-small**; 522931 docs; 384 dims; 7 bits quantized; cosine metric
   
      - baseline: index time: **70.71s**,  force merge: **59.38s**
   
      - candidate: index time: **58.25s**, force merge: **40.15s**
   
   2. **cohere-wikipedia-v2**; 1M docs; 768 dims; 7 bits quantized; cosine 
metric
   
      - baseline: index time: **203.08s**, force merge: **107.27s**
   
      - candidate: index time: **142.27s**, force merge: **85.68s**
   
   3. **gist**; 1M docs; 960 dims; 7 bits quantized; euclidean metric
   
      - baseline: index time: **110.35s**, force merge: **323.66s**
   
      - candidate: index time: **105.52s**, force merge: **202.20s**
   
   4. **cohere-wikipedia-v3**; 1M docs; 1024 dims; 7 bits quantized; 
dot_product metric
   
      - baseline: index time: **313.43s**, force merge: **165.98s**
   
      - candidate: index time: **190.63s**, force merge: **159.95s**
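
The reported averages can be cross-checked from the per-dataset timings above. The sketch below assumes the average is the arithmetic mean of the per-dataset baseline/candidate ratios; the comment does not state the averaging method, but this one reproduces the reported 1.33x and 1.34x. The `results` dictionary and `mean_speedup` helper are illustrative names, not part of Luceneutil.

```python
# (baseline_seconds, candidate_seconds) copied from the results listed above.
results = {
    "quora-E5-small":      {"index": (70.71, 58.25),   "merge": (59.38, 40.15)},
    "cohere-wikipedia-v2": {"index": (203.08, 142.27), "merge": (107.27, 85.68)},
    "gist":                {"index": (110.35, 105.52), "merge": (323.66, 202.20)},
    "cohere-wikipedia-v3": {"index": (313.43, 190.63), "merge": (165.98, 159.95)},
}


def mean_speedup(phase):
    """Arithmetic mean of baseline/candidate ratios for one phase.

    Assumption: the averaging method is not stated in the comment; this is
    the simplest one that matches the reported numbers.
    """
    ratios = [baseline / candidate
              for baseline, candidate in (r[phase] for r in results.values())]
    return sum(ratios) / len(ratios)


print(f"Index Time Speedup:  {mean_speedup('index'):.2f}x")
print(f"Force Merge Speedup: {mean_speedup('merge'):.2f}x")
```

Note that a ratio of summed times would weight the larger datasets more heavily and give a different figure, so the two methods should not be mixed when comparing experiments.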
   
   
   
![10_multiple](https://github.com/user-attachments/assets/8478a24c-6bf7-4601-a3d3-1f927b17f409)
   
   
![10_single](https://github.com/user-attachments/assets/756b694e-4877-4400-8178-36e2755d91b1)
   
   
![100_multiple](https://github.com/user-attachments/assets/9dd06463-d9e5-406e-b78b-516213d9e5cc)
   
   
![100_single](https://github.com/user-attachments/assets/6bb02c94-024f-40a9-b88d-36728c390928)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

