mikemccand commented on issue #13565:
URL: https://github.com/apache/lucene/issues/13565#issuecomment-2244785292

   > This is what got me to thinking of BP for HNSW search: intuitively, it could help a lot when the dataset size exceeds the size of the page cache?
   
   I think the gains might be astounding?
   
   Similar vectors would be stored near each other in the `.vec` / `.veq` file, 
so paging in larger blocks / OS readahead could be very effective (though we 
may have to turn off `MADV_RANDOM` to see whether it helps).  It should also mean 
less broad exploration of the graph: once you find your "neighborhood" of 
similar-ish vectors, you spend some effort there and get to the top K more 
quickly.
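
A rough way to picture the locality argument above is to count how many OS pages a search touches when the vectors it visits sit next to each other versus when they are spread across the file. The sketch below assumes a hypothetical flat file of fixed-width vectors with no header (128 dimensions, 1 byte per value, 4 KiB pages); this is an illustration only, not Lucene's actual `.vec` / `.veq` layout, and `pages_touched` is a made-up helper.

```python
def pages_touched(ordinals, dims=128, bytes_per_value=1, page_size=4096):
    """Count distinct pages covering the given vector ordinals.

    Assumes a hypothetical flat layout: vector i starts at byte
    i * dims * bytes_per_value, with no file header.
    """
    vector_bytes = dims * bytes_per_value
    pages = set()
    for ordinal in ordinals:
        start = ordinal * vector_bytes
        end = start + vector_bytes - 1  # last byte of this vector
        # A vector may straddle a page boundary, so add every page it covers.
        pages.update(range(start // page_size, end // page_size + 1))
    return len(pages)


# 100 visited vectors stored contiguously (similar vectors near each other)
clustered = pages_touched(range(100))

# 100 visited vectors spread 1000 ordinals apart (no locality)
scattered = pages_touched(i * 1000 for i in range(100))

print(clustered, scattered)
```

Under these assumptions the clustered visit fits in a handful of pages while the scattered one needs roughly one page per vector, which is the case where larger reads or readahead would pay off.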


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

