jmazanec15 opened a new issue, #13564: URL: https://github.com/apache/lucene/issues/13564
### Description With quantization techniques that are compressing vectors in memory further and further, because of how much information is lost, recall is going to drop. However, with the current quantization support, we already store the full precision float vector values. With this, we could create a two phased search process that oversamples vectors from the quantized index (i.e. r\*k results), and then lazily loads vectors from disk for the r\*k results, re-scoring the vectors. This has been discussed directly or indirectly in a couple different places, but I figured itd make sense to create a separate issue for it: * #13251 * #13468 * #12615 It has been shown to work with binary compression techniques and some PQ in a few different places (https://medium.com/qdrant/hyperfast-vector-search-with-binary-quantization-865052c5c259, https://huggingface.co/blog/embedding-quantization#binary-rescoring). We also have done some experiments in OpenSearch with IVFPQ and re-scoring and its shown pretty strong promise (assuming that the disk IOPS/throughput are strong enough - see https://github.com/opensearch-project/k-NN/issues/1779#user-content-appendix-b-baseline-rescore-experiments). That being said, it can provide similar benefits to the DiskANN approach. One challenge I foresee is the approach requires the quantized ANN index to be resident in memory. However, I am unsure if loading the full precision vectors from disk will cause the page cache to evict the quantized ANN index or if the page cache will naturally adapt to the access pattern. If it is the former, we would probably need some way to pin the quantized ANN index and its vectors in memory. In the future, it could probably be fine tuned quite a bit with access pattern optimizations and/or some kind of hierarchical quantization and re-scoring. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org --------------------------------------------------------------------- To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org