jpountz commented on PR #13219: URL: https://github.com/apache/lucene/pull/13219#issuecomment-2020818027
> P.S.: Are we using RANDOM at the moment?

Not yet, we'd need to start using it where it makes sense, like we do for (PRE)LOAD.

> I also found https://github.com/elastic/elasticsearch/issues/27748, this person suggests to pass RANDOM for everything.

Yeah, Wikimedia also did testing and they [report](https://phabricator.wikimedia.org/T169498) getting best performance with a mmap readahead of 16kB instead of the default of 128kB (it's shared on the same thread). It feels a bit like a bug to me that mmap has a much higher readahead than regular read operations. I wonder if we should recommend lowering this default readahead in our wiki / javadocs instead of trying to work around it by passing RANDOM everywhere.

My preference would be to not index too much on how the various hints perform in practice, and instead try to provide what seems to be the correct read advice based on what we know of the access patterns. E.g. postings and doc values data should probably use NORMAL; stored fields, term vectors and vectors data should probably use RANDOM; etc.
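To make the last suggestion concrete, here is a rough sketch of choosing read advice from the kind of data a file holds. It is only an illustration, not what the PR implements: the `ReadAdviceSketch` class, its `Advice` enum and the `adviceFor` helper are hypothetical names, and keying the decision on the file extension (`doc`/`pos`/`pay` for postings, `dvd` for doc values, `fdt` for stored fields, `tvd` for term vectors, `vec` for vectors) is an assumption made for the example.

```java
import java.util.Locale;
import java.util.Map;

// Hypothetical sketch: pick a read advice per file based on expected access pattern.
public class ReadAdviceSketch {

  // Stand-in for the real hint type; only the two values discussed above.
  public enum Advice { NORMAL, RANDOM }

  // Assumed mapping from data-file extension to advice.
  private static final Map<String, Advice> BY_EXTENSION = Map.ofEntries(
      Map.entry("doc", Advice.NORMAL),  // postings: doc IDs and frequencies
      Map.entry("pos", Advice.NORMAL),  // postings: positions
      Map.entry("pay", Advice.NORMAL),  // postings: payloads and offsets
      Map.entry("dvd", Advice.NORMAL),  // doc values data
      Map.entry("fdt", Advice.RANDOM),  // stored fields data
      Map.entry("tvd", Advice.RANDOM),  // term vectors data
      Map.entry("vec", Advice.RANDOM)); // vectors data

  // Returns the advice for a file name, falling back to NORMAL for anything unknown.
  public static Advice adviceFor(String fileName) {
    int dot = fileName.lastIndexOf('.');
    if (dot < 0) {
      return Advice.NORMAL;
    }
    String ext = fileName.substring(dot + 1).toLowerCase(Locale.ROOT);
    return BY_EXTENSION.getOrDefault(ext, Advice.NORMAL);
  }

  public static void main(String[] args) {
    for (String name : new String[] {"_0.doc", "_0.dvd", "_0.fdt", "_0.tvd", "_0.vec"}) {
      System.out.println(name + " -> " + adviceFor(name));
    }
  }
}
```

Defaulting to NORMAL for unknown files keeps the sketch conservative: RANDOM is only requested where we have a reason to expect scattered reads.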