mikemccand commented on issue #13519:
URL: https://github.com/apache/lucene/issues/13519#issuecomment-2206113117

   > My concern for 8-bit quantization is the algebraic expansion of 
dot-product and the corrective terms.
   > 
   > For scalar quantization, the score corrections for dotProduct are 
derivable via some simple algebra, but I am not immediately aware of a way to 
handle the sign switch. I didn't bother digging deeper there as int7 provides 
basically the exact same recall. I am eager to see if 8-bit can be applied while 
keeping the score corrections.
   > 
   > In case you need it, here is valuable background:
   > 
   > https://www.elastic.co/search-labs/blog/scalar-quantization-101
   > 
   > For some background on the small additional correction provided for int4 
(or any scalar quantization where confidence_interval is set to `0`):
   > 
   > https://www.elastic.co/search-labs/blog/vector-db-optimized-scalar-quantization
   
   Thanks @benwtrent -- this is very helpful.  I think I understand better :)
   
   I see that we are forced to conflate quantization and distance metric 
because we want to compute the distance metric directly in quantized space 
using efficient SIMD instructions.
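
   A rough sketch of why the two end up tied together, in notation that is assumed here rather than taken from the Lucene code: suppose every component is quantized with one shared minimum $m$ and step size $\alpha$, so that $x_i \approx m + \alpha q_i$ and $y_i \approx m + \alpha r_i$ for integer codes $q_i, r_i$ and dimension $d$. Expanding the dot product then gives:

```latex
\begin{aligned}
x \cdot y
  &\approx \sum_{i=1}^{d} (m + \alpha q_i)(m + \alpha r_i) \\
  &= \alpha^2 \sum_{i=1}^{d} q_i r_i
   \;+\; \alpha m \sum_{i=1}^{d} q_i
   \;+\; \alpha m \sum_{i=1}^{d} r_i
   \;+\; d\, m^2
\end{aligned}
```

   Only the first sum mixes the two vectors, and it is a pure integer dot product, which is the part that SIMD can handle. The remaining terms depend on a single vector (or on constants), so under these assumptions they could be precomputed per vector as corrective terms. The expansion is specific to dot product, which is why the quantization scheme and the metric cannot be treated separately here.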
   
   I.e. one might ideally consider quantization mathematically as being its own 
problem regardless of distance metrics: you map `float32` down to `int8` or 
`int4` etc., and it's bidirectional.  You can re-map a quantized vector back to 
`float32` (introducing quantization noise of course).  But that would be too 
inefficient (de-quantizing then computing distance metrics in `float32` space) 
... so we must conflate.
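
   To make the two options concrete, here is a minimal Python sketch, assuming a single global `lo`/`hi` range and 7-bit codes. The function names (`quantize`, `dequantize`, `corrected_dot`) are hypothetical and chosen for illustration; this is not the Lucene implementation. It compares "de-quantize, then compute the dot product in float space" against "integer dot product plus corrective terms":

```python
import random


def quantize(vec, lo, hi, bits=7):
    """Map floats in [lo, hi] to integer codes in [0, 2**bits - 1]."""
    levels = (1 << bits) - 1
    alpha = (hi - lo) / levels  # width of one quantization step
    codes = [round((min(max(v, lo), hi) - lo) / alpha) for v in vec]
    return codes, alpha


def dequantize(codes, lo, alpha):
    """Map integer codes back to floats (lossy: adds quantization noise)."""
    return [lo + alpha * c for c in codes]


def corrected_dot(qx, qy, lo, alpha):
    """Dot product from integer codes plus per-vector corrective terms."""
    int_dot = sum(a * b for a, b in zip(qx, qy))  # the SIMD-friendly part
    return (alpha * alpha * int_dot
            + alpha * lo * (sum(qx) + sum(qy))
            + len(qx) * lo * lo)


# Illustrative data only: two random 128-dimensional vectors.
rng = random.Random(0)
x = [rng.gauss(0.0, 1.0) for _ in range(128)]
y = [rng.gauss(0.0, 1.0) for _ in range(128)]

# Assumption: one global range shared by both vectors and all dimensions.
lo, hi = min(x + y), max(x + y)

qx, alpha = quantize(x, lo, hi)
qy, _ = quantize(y, lo, hi)

# Option 1: de-quantize and compute the metric in float space.
slow = sum(a * b for a, b in zip(dequantize(qx, lo, alpha),
                                 dequantize(qy, lo, alpha)))

# Option 2: stay in quantized space and apply the corrections.
fast = corrected_dot(qx, qy, lo, alpha)

exact = sum(a * b for a, b in zip(x, y))
print(f"exact={exact:.4f} dequantized={slow:.4f} corrected={fast:.4f}")
```

   The two quantized results agree up to floating-point rounding, and both differ from the exact float dot product only by quantization noise. The second option never materializes float vectors, which is the efficiency argument above.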
   
   Does the Lucene implementation have per-dimension calculation of 
quantiles/min/max?  Or is it really a global min/max computed across all 
dimensions and all vectors' values in those dimensions?  
[Scalar-quantization-101](https://www.elastic.co/search-labs/blog/scalar-quantization-101)
 seems to imply it's the latter.  Do ML vectors "typically" have similar 
distributions of values within each dimension?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

