[I] Can we store only quantized vectors to reduce disk footprint? [lucene]

via GitHub Thu, 21 Nov 2024 00:04:03 -0800


Rassyan opened a new issue, #14007:
URL: https://github.com/apache/lucene/issues/14007


   ### Description
   
   To reduce disk usage for KNN vector search in Lucene, I propose a new 
KnnVectorsFormat implementation that stores only quantized vectors and does not 
keep the original float32 vectors. This could significantly reduce disk usage. 
For example, int8 quantization stores one byte per dimension instead of four, 
so vector data could shrink to about 25% of its float32 size, similar to the 
memory savings already seen with int8 quantization. This figure is 
illustrative; actual savings will vary with the quantization method and 
storage configuration.
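
As a rough illustration of the arithmetic behind the 25% figure, here is a minimal Python sketch. It is not Lucene code and does not reflect how any existing KnnVectorsFormat lays out its files. The function names, the `per_vector_overhead_bytes` parameter, the corpus size, and the simple min/max linear quantizer are all assumptions made for the example.

```python
def footprint_bytes(num_vectors, dims, bytes_per_component,
                    per_vector_overhead_bytes=0):
    """Raw vector storage size, ignoring graph and metadata files."""
    return num_vectors * (dims * bytes_per_component + per_vector_overhead_bytes)


def quantize(vector, lo, hi):
    """Linearly map floats in [lo, hi] to integer codes in [0, 255]."""
    scale = (hi - lo) / 255
    return [max(0, min(255, round((x - lo) / scale))) for x in vector]


def dequantize(codes, lo, hi):
    """Approximate the original floats from their one-byte codes."""
    scale = (hi - lo) / 255
    return [lo + c * scale for c in codes]


# Hypothetical corpus: 1M vectors with 768 dimensions.
float32_size = footprint_bytes(1_000_000, 768, 4)
int8_size = footprint_bytes(1_000_000, 768, 1)
# Assumed 4 bytes of per-vector metadata (e.g. a correction term).
int8_with_overhead = footprint_bytes(1_000_000, 768, 1,
                                     per_vector_overhead_bytes=4)

print(int8_size / float32_size)           # exactly one quarter
print(int8_with_overhead / float32_size)  # slightly above one quarter

# Without the float32 originals, only the dequantized approximation remains.
original = [-1.0, -0.5, 0.0, 0.25, 1.0]
codes = quantize(original, lo=-1.0, hi=1.0)
restored = dequantize(codes, lo=-1.0, hi=1.0)
max_error = max(abs(a - b) for a, b in zip(original, restored))
print(max_error)
```

The round trip at the end shows the trade-off in question: once the originals are dropped, anything that needs the vectors again can only work from the approximation, with a per-component error of up to half a quantization step in this sketch.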
   
   I seek community feedback on:
   
   - The technical feasibility of this new storage model.
   - Potential impacts on search accuracy and performance.
   
   Your insights will help determine the viability of this approach for 
enhancing Lucene's vector search capabilities.
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


