viliam-durina opened a new issue, #14348: URL: https://github.com/apache/lucene/issues/14348
### Description

Vector similarity search using HNSW reads the vector files (`vec` or `veq`) very heavily during a search, even more heavily than the HNSW graph itself (the `vex` file). If the vector files do not fit into the page cache, performance drops very significantly (around 100x in our particular case). Users typically configure their search servers with enough RAM to hold these files.

Lucene currently uses `ReadAdvice.RANDOM` when opening these files. I think it would be better to use `RANDOM_PRELOAD`. If you agree, I can provide a PR.

### Version and environment details

_No response_
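
To illustrate what preloading a randomly accessed file means in practice, here is a minimal sketch that uses only the JDK, not Lucene's API. It memory-maps a file and calls `MappedByteBuffer.load()`, which asks the OS on a best-effort basis to bring the mapped pages into physical memory before the first random read. The class name `PreloadSketch`, the method `mapAndPreload`, and the temporary file are hypothetical and exist only for this example. How Lucene implements `RANDOM_PRELOAD` internally is not shown here.

```java
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class PreloadSketch {

    /**
     * Maps the file read-only and asks the OS to page its contents in.
     * Hypothetical helper for illustration; a single mapping is limited
     * to Integer.MAX_VALUE bytes, so large files would need chunking.
     */
    static MappedByteBuffer mapAndPreload(Path file) throws IOException {
        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.READ)) {
            MappedByteBuffer buf = ch.map(FileChannel.MapMode.READ_ONLY, 0, ch.size());
            // Best effort: the OS may still evict pages later under memory pressure.
            buf.load();
            // The mapping stays valid after the channel is closed.
            return buf;
        }
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for a vector data file; the content is arbitrary.
        Path tmp = Files.createTempFile("vectors", ".vec");
        tmp.toFile().deleteOnExit();
        Files.write(tmp, new byte[] {10, 20, 30, 40});

        MappedByteBuffer vectors = mapAndPreload(tmp);
        // Random access by absolute offset, as a graph search would do.
        System.out.println("byte at offset 2: " + vectors.get(2));
    }
}
```

The trade-off is the one described in the issue: preloading costs I/O up front when the file is opened, but subsequent random reads are more likely to hit memory, provided the files actually fit in RAM.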