[I] Opening of vector files with ReadAdvice.RANDOM_PRELOAD [lucene]

via GitHub Wed, 12 Mar 2025 03:30:50 -0700


viliam-durina opened a new issue, #14348:
URL: https://github.com/apache/lucene/issues/14348


   ### Description
   
   Vector similarity search using HNSW reads the vector data (the `vec` or 
`veq` files) very heavily during a search, even more heavily than the HNSW 
graph itself (the `vex` file). If the vector files don't fit into the page 
cache, performance drops sharply (around 100x in our particular case). Users 
typically provision their search servers with enough RAM to hold these files.
   
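   Lucene currently uses `ReadAdvice.RANDOM` when opening these files. I think 
it would be better to use `ReadAdvice.RANDOM_PRELOAD`, so that the files are 
loaded into memory when they are opened rather than being paged in on demand 
during the first searches.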
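
   To illustrate what preloading a memory-mapped file means at the OS level, 
here is a minimal sketch that uses only the JDK. It is not Lucene's code and 
does not use `ReadAdvice`; the class name `PreloadSketch` and the method 
`mapAndPreload` are made up for this example, and it assumes the file is small 
enough for a single mapping (under 2 GiB).

```java
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical illustration only: not part of Lucene.
public class PreloadSketch {

    /** Maps the file read-only and asks the OS to bring its pages into memory. */
    static MappedByteBuffer mapAndPreload(Path file) throws IOException {
        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.READ)) {
            // The mapping stays valid after the channel is closed.
            // FileChannel.map rejects sizes above Integer.MAX_VALUE.
            MappedByteBuffer buf = ch.map(FileChannel.MapMode.READ_ONLY, 0, ch.size());
            // Best effort: load() touches the mapped pages so that later
            // random reads are more likely to be served from RAM.
            buf.load();
            return buf;
        }
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for a vector data file; the contents are arbitrary.
        Path tmp = Files.createTempFile("vectors", ".vec");
        tmp.toFile().deleteOnExit();
        Files.write(tmp, new byte[] {1, 2, 3, 4});

        MappedByteBuffer buf = mapAndPreload(tmp);
        System.out.println("mapped bytes: " + buf.capacity());
    }
}
```

   Without the `load()` call, each page is faulted in on first access, which is 
the cost that shows up in early searches when the access pattern is random.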
   
   If you agree, I can provide a PR.
   
   ### Version and environment details
   
   _No response_



