dungba88 commented on issue #14984:
URL: https://github.com/apache/lucene/issues/14984#issuecomment-3111827170

   A bit related, but I think for the re-scoring phase, keeping the query vector at 
32-bit and computing its dot product against the 1-bit/4-bit/7-bit vectors may yield 
a better latency/recall trade-off than if we have to quantize it. From the benchmark in 
https://github.com/apache/lucene/pull/14009, using dot_product(32bit, 32bit) 
only added a very small latency, but the added latency is much higher for 
(7bit, 7bit) due to the quantization cost (which is more prominent when the 
number of vectors to score is small).
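
To make the idea concrete, here is a minimal sketch of an asymmetric dot product, where the query stays at float32 and only the stored vector is 7-bit. This is not Lucene's API or its quantization scheme: the class `AsymmetricDotProduct`, both method names, and the simple min/max linear quantizer are assumptions made up for illustration.

```java
// Hypothetical illustration only; not Lucene code.
final class AsymmetricDotProduct {

  // Assumed quantizer: linearly maps [min, max] onto the integers 0..127 (7 bits).
  static byte[] quantize7Bit(float[] v, float min, float max) {
    float scale = 127f / (max - min);
    byte[] q = new byte[v.length];
    for (int i = 0; i < v.length; i++) {
      float clamped = Math.max(min, Math.min(max, v[i]));
      q[i] = (byte) Math.round((clamped - min) * scale);
    }
    return q;
  }

  // float32 query against a 7-bit stored vector: the stored value is
  // dequantized on the fly, so the query itself is never quantized.
  static float dotFloatVsQuantized(float[] query, byte[] stored, float min, float max) {
    float step = (max - min) / 127f;
    float sum = 0f;
    for (int i = 0; i < query.length; i++) {
      sum += query[i] * (min + step * stored[i]);
    }
    return sum;
  }

  public static void main(String[] args) {
    float[] doc = {0.5f, -0.25f, 1.0f, -1.0f};
    float[] query = {0.1f, 0.2f, -0.3f, 0.4f};
    byte[] quantizedDoc = quantize7Bit(doc, -1f, 1f); // done once, at index time
    System.out.println(dotFloatVsQuantized(query, quantizedDoc, -1f, 1f));
  }
}
```

In this sketch the per-query quantization step disappears entirely, which is the cost that matters most when only a handful of vectors are re-scored.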


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

