jmazanec15 opened a new issue, #13564:
URL: https://github.com/apache/lucene/issues/13564

   ### Description
   
   As quantization techniques compress in-memory vectors more and more 
aggressively, recall drops because of the information that is lost. However, 
the current quantization support already stores the full-precision float 
vector values. We could use them to build a two-phase search process that 
oversamples from the quantized index (i.e. fetches r\*k results instead of k) 
and then lazily loads the full-precision vectors for those r\*k results from 
disk and re-scores them. This has been discussed directly or indirectly in a 
couple of different places, but I figured it'd make sense to create a separate 
issue for it:
   * #13251 
   * #13468
   * #12615
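
The two-phase flow described above can be sketched roughly as follows. This is a minimal illustration, not Lucene code: it assumes a toy 1-bit sign quantizer, a linear scan in place of a real ANN index, and a hypothetical `load_full_vector` callback standing in for the lazy read from disk. All names are made up for the example.

```python
import heapq
import random


def quantize(vec):
    """Toy 1-bit sign quantizer (an assumption): one bit per dimension, packed into an int."""
    bits = 0
    for i, x in enumerate(vec):
        if x > 0:
            bits |= 1 << i
    return bits


def hamming(a, b):
    """Number of differing bits between two packed codes."""
    return bin(a ^ b).count("1")


def dot(a, b):
    """Exact full-precision similarity used for re-scoring."""
    return sum(x * y for x, y in zip(a, b))


def two_phase_search(query, codes, load_full_vector, k, r):
    """Return the top-k (score, doc_id) pairs, best first."""
    q_code = quantize(query)
    # Phase 1: cheap approximate pass over the in-memory codes, keeping r*k candidates.
    # A real index would traverse a graph or inverted lists instead of scanning.
    candidates = heapq.nsmallest(
        r * k, range(len(codes)), key=lambda doc: hamming(q_code, codes[doc])
    )
    # Phase 2: load only the candidates' full-precision vectors and re-score exactly.
    rescored = [(dot(query, load_full_vector(doc)), doc) for doc in candidates]
    return heapq.nlargest(k, rescored)


random.seed(0)
dim, n = 32, 500
vectors = [[random.gauss(0, 1) for _ in range(dim)] for _ in range(n)]  # stands in for disk
codes = [quantize(v) for v in vectors]                                  # stays in memory
query = [random.gauss(0, 1) for _ in range(dim)]

top = two_phase_search(query, codes, lambda doc: vectors[doc], k=10, r=5)
```

Only phase 2 touches the full-precision vectors, so the number of disk reads per query is bounded by r\*k regardless of index size. When r\*k covers the whole collection, the result is identical to an exact brute-force search.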
   
   This approach has been shown to work with binary compression techniques 
and with some product quantization (PQ) setups in a few different places 
(https://medium.com/qdrant/hyperfast-vector-search-with-binary-quantization-865052c5c259,
 https://huggingface.co/blog/embedding-quantization#binary-rescoring). We have 
also run some experiments in OpenSearch with IVFPQ and re-scoring, and it has 
shown pretty strong promise (assuming the disk IOPS/throughput are high 
enough; see 
https://github.com/opensearch-project/k-NN/issues/1779#user-content-appendix-b-baseline-rescore-experiments).
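
To make the IOPS caveat concrete, here is a rough back-of-the-envelope sketch. It assumes a cold page cache and roughly one random read per re-scored candidate, which is a simplification. The function names and the example numbers (k, r, dimension, device IOPS) are hypothetical and are not taken from the linked experiments.

```python
def rescore_io_per_query(k, r, dim, bytes_per_component=4):
    """Random reads and bytes fetched to re-score r*k float32 vectors (cold cache assumed)."""
    reads = r * k
    return reads, reads * dim * bytes_per_component


def max_qps(device_iops, k, r):
    """Upper bound on queries/second if every candidate costs one random read."""
    return device_iops / (r * k)


# Hypothetical workload: top-100 search, 3x oversampling, 768-dimensional vectors.
reads, nbytes = rescore_io_per_query(k=100, r=3, dim=768)
qps_bound = max_qps(device_iops=100_000, k=100, r=3)
print(f"{reads} reads, {nbytes} bytes per query, at most {qps_bound:.0f} QPS")
```

Under these assumptions the re-scoring phase, not the quantized search, sets the throughput ceiling once the oversampling factor grows, which is why disk IOPS matter so much here.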
   
   With that, this approach can provide benefits similar to those of the 
DiskANN approach.
   
   One challenge I foresee is that the approach requires the quantized ANN 
index to be resident in memory. However, I am unsure whether loading the 
full-precision vectors from disk will cause the page cache to evict the 
quantized ANN index or whether the page cache will naturally adapt to the 
access pattern. If it is the former, we would probably need some way to pin 
the quantized ANN index and its vectors in memory.
   
   In the future, it could probably be fine-tuned quite a bit with access 
pattern optimizations and/or some kind of hierarchical quantization and 
re-scoring.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

