mikemccand commented on issue #13519: URL: https://github.com/apache/lucene/issues/13519#issuecomment-2206113117
> My concern for 8 bit quantization is the algebraic expansion of dot-product and the corrective terms.
>
> For scalar quantization, the score corrections for dotProduct are derivable via some simple algebra, but I am not immediately aware of a way to handle the sign switch. I didn't bother digging deeper there as int7 provides basically the exact same recall. I am eager to see if 8bit can be applied while keeping the score corrections.
>
> In case you need it, here is valuable background:
>
> https://www.elastic.co/search-labs/blog/scalar-quantization-101
>
> For some background on the small additional correction provided for int4 (or any scalar quantization where confidence_interval is set to `0`):
>
> https://www.elastic.co/search-labs/blog/vector-db-optimized-scalar-quantization

Thanks @benwtrent -- this is very helpful. I think I understand better :)

I see that we are forced to conflate quantization and distance metric because we want to compute the distance metric directly in quantized space using efficient SIMD instructions.

I.e. one might ideally consider quantization mathematically as being its own problem regardless of distance metrics: you map `float32` down to `int8` or `int4` etc., and it's bidirectional. You can re-map a quantized vector back to `float32` (introducing quantization noise of course). But that would be too inefficient (de-quantizing then computing distance metrics in `float32` space) ... so we must conflate.

Does the Lucene implementation have per-dimension calculation of quantiles/min/max? Or is it really a global min/max computed across all dimensions and all vectors' values in those dimensions? [Scalar-quantization-101](https://www.elastic.co/search-labs/blog/scalar-quantization-101) seems to imply it's the latter. Do ML vectors "typically" have similar distributions of values within each dimension?
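To make the "conflation" concrete, here is a minimal Python sketch of the algebra as I understand it from the blog post. It is not Lucene's actual code. It assumes a single global min/max over all values (plain min/max rather than quantiles), a 7-bit range of 0..127, and hypothetical helper names (`global_min_max`, `quantize`, `dequantize`, `corrected_dot`). The point is that expanding the dot product of two de-quantized vectors gives an integer dot product plus corrective terms, so the score can be computed without ever going back to `float32` vectors.

```python
import random


def global_min_max(vectors):
    """One (min, max) pair over every value in every dimension (assumed scheme)."""
    values = [x for vec in vectors for x in vec]
    return min(values), max(values)


def quantize(vec, lo, hi, bits=7):
    """Map floats in [lo, hi] to integers in [0, 2**bits - 1]; returns (ints, alpha)."""
    if hi <= lo:
        raise ValueError("need hi > lo to define a quantization step")
    levels = (1 << bits) - 1
    alpha = (hi - lo) / levels  # width of one quantization step
    q = [min(levels, max(0, round((x - lo) / alpha))) for x in vec]
    return q, alpha


def dequantize(q, lo, alpha):
    """Approximate inverse of quantize: x ~= alpha * q + lo."""
    return [qi * alpha + lo for qi in q]


def float_dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def corrected_dot(qa, qb, lo, alpha):
    """Dot product of the de-quantized vectors, computed from integers only.

    sum((alpha*qa_i + lo) * (alpha*qb_i + lo))
      = alpha^2 * sum(qa_i*qb_i) + alpha*lo*sum(qa) + alpha*lo*sum(qb) + dim*lo^2
    """
    int_dot = sum(x * y for x, y in zip(qa, qb))  # the part SIMD would accelerate
    dim = len(qa)
    return (alpha * alpha * int_dot
            + alpha * lo * sum(qa)
            + alpha * lo * sum(qb)
            + dim * lo * lo)


random.seed(0)
vectors = [[random.uniform(-1.0, 1.0) for _ in range(8)] for _ in range(4)]
lo, hi = global_min_max(vectors)

qa, alpha = quantize(vectors[0], lo, hi)
qb, _ = quantize(vectors[1], lo, hi)

exact = float_dot(vectors[0], vectors[1])
via_float = float_dot(dequantize(qa, lo, alpha), dequantize(qb, lo, alpha))
approx = corrected_dot(qa, qb, lo, alpha)

print("exact float32 dot:      ", exact)
print("de-quantize then dot:   ", via_float)
print("integer dot + correction:", approx)
```

The last two numbers agree up to floating-point rounding, and both differ from the exact value only by quantization noise. In this sketch the `sum(qa)` and `dim * lo * lo` terms depend on one vector or on constants only, so presumably they can be precomputed per vector; a per-dimension min/max would break this expansion, because `alpha` and `lo` could no longer be factored out of the sum.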