Rassyan commented on issue #14007: URL: https://github.com/apache/lucene/issues/14007#issuecomment-2491209549
Excuse my ignorance, but I was wondering about this point:

> quantization methodologies can easily "re-hydrate" vectors so that iterating floats is still possible

Could you elaborate on the computational cost of this re-hydration? If users never need to retrieve the original floats, is it feasible to skip the re-hydration step entirely and use the quantized vectors directly for distance calculations?

> higher fidelity quantization methods

Would int7 be considered a higher-fidelity quantization method? Based on your experience and insights, how would you rank the fidelity of int7, int4, and binary quantization? Where does each stand in maintaining accuracy while optimizing storage efficiency?

Has the Lucene community already planned or discussed a dedicated `KnnVectorsFormat` that stores only quantized vectors? Is there a quick path for users who are willing to trade some accuracy for significant disk-space savings and do not need the original vectors?
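For concreteness, here is a minimal sketch of the trade-off I am asking about. The class and method names below are made up for illustration and are not Lucene's actual API; it assumes a simple affine scalar quantization `f ≈ q * scale + offset`. The point is that a distance can stay in the integer domain, folding the scale/offset correction into one final step, instead of converting every component back to float first.

```java
// Hypothetical sketch, NOT Lucene's API: dot product over int8-quantized
// vectors, computed two ways.
public class QuantizedDotProduct {

    // Re-hydration path: convert each quantized byte back to an
    // approximate float, then run a float dot product. This costs an
    // extra multiply-add (the de-quantization) per dimension per vector.
    static float rehydratedDot(byte[] a, byte[] b, float scale, float offset) {
        float sum = 0f;
        for (int i = 0; i < a.length; i++) {
            float fa = a[i] * scale + offset;
            float fb = b[i] * scale + offset;
            sum += fa * fb;
        }
        return sum;
    }

    // Quantized path: integer multiply-adds only, with the correction
    // applied once at the end. Expanding (a*s+o)·(b*s+o) gives
    //   s^2 * (a·b) + s*o * (Σa + Σb) + n * o^2
    static float quantizedDot(byte[] a, byte[] b, float scale, float offset) {
        int dot = 0, sumA = 0, sumB = 0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            sumA += a[i];
            sumB += b[i];
        }
        return scale * scale * dot
                + scale * offset * (sumA + sumB)
                + a.length * offset * offset;
    }

    public static void main(String[] args) {
        byte[] a = {10, 20, 30};
        byte[] b = {5, 15, 25};
        // Both paths should agree up to float rounding.
        System.out.println(rehydratedDot(a, b, 0.1f, 0f));
        System.out.println(quantizedDot(a, b, 0.1f, 0f));
    }
}
```

If something like this is what the int8 formats already do at search time, then my question reduces to: why must the original float vectors also be kept on disk for users who will never iterate them?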