viliam-durina opened a new issue, #14348:
URL: https://github.com/apache/lucene/issues/14348

   ### Description
   
   Vector similarity search using HNSW reads the vector data (the `vec` or
`veq` files) very heavily during a search, even more heavily than the HNSW
graph itself (the `vex` file). If the vector files don't fit into the page
cache, performance drops very significantly (around 100x in our particular
case). Users typically configure their search servers with enough RAM to hold
these files.
   
   Lucene currently uses `ReadAdvice.RANDOM` when opening these files. I think 
it would be better to use `RANDOM_PRELOAD`.
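
   To illustrate what preloading means at the OS level, here is a minimal sketch using only the JDK's memory-mapping API. It is not Lucene's implementation, and the class and method names (`PreloadSketch`, `mapAndPreload`) are hypothetical. It assumes the file is smaller than 2 GB, which is the limit of a single `MappedByteBuffer`.

```java
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical helper for illustration only; not part of Lucene.
final class PreloadSketch {

  /**
   * Maps a file read-only and asks the OS to bring its pages into physical
   * memory right away, instead of faulting them in one by one on first access.
   * Assumes the file is smaller than 2 GB (single MappedByteBuffer limit).
   */
  static MappedByteBuffer mapAndPreload(Path file) throws IOException {
    try (FileChannel ch = FileChannel.open(file, StandardOpenOption.READ)) {
      MappedByteBuffer buf = ch.map(FileChannel.MapMode.READ_ONLY, 0, ch.size());
      // Best effort: touches the mapped pages so they are resident in memory.
      // The OS may still evict them later under memory pressure.
      buf.load();
      // The mapping stays valid after the channel is closed.
      return buf;
    }
  }
}
```

   With preloading, the cost of reading the file from disk is paid once when it is opened, rather than as scattered page faults during the first searches. Without it, a randomly accessed file is only cached page by page as queries happen to touch it.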
   
   If you agree, I can provide a PR.
   
   ### Version and environment details
   
   _No response_


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


