jpountz commented on code in PR #14510: URL: https://github.com/apache/lucene/pull/14510#discussion_r2072468622
########## lucene/core/src/java/org/apache/lucene/store/MMapDirectory.java: ########## @@ -277,6 +277,32 @@ public final long getMaxChunkSize() { return 1L << chunkSizePower; } + public static ReadAdvice toReadAdvice(IOContext context) { + if (context.context() == IOContext.Context.MERGE + || context.context() == IOContext.Context.FLUSH) { + return ReadAdvice.SEQUENTIAL; + } + + if (context.hints().contains(DataAccessHint.RANDOM)) { + return ReadAdvice.RANDOM; + } + if (context.hints().contains(DataAccessHint.SEQUENTIAL)) { + return ReadAdvice.SEQUENTIAL; + } + + if (context.hints().contains(FileTypeHint.DATA) + || context.hints().contains(FileTypeHint.INDEX)) { + return ReadAdvice.NORMAL; + } + // Postings have a forward-only access pattern, so pass ReadAdvice.NORMAL to perform + // readahead. + if (context.hints().contains(FileDataHint.POSTINGS)) { + return ReadAdvice.NORMAL; + } Review Comment: I think it's natural for `DataAccessHint` to be closely related to `ReadAdvice`. My suggestion would be to do the following: - DataAccessHint has a single enum constant: RANDOM - the lack of a DataAccessHint means that the access pattern is Lucene's standard access pattern: a mix of forward seeks and reads of data in the order of 1kB-1MB (what terms, postings, points and doc values typically do) - Vectors pass a DataAccessHint.RANDOM at open time, they are the only ones to pass a DataAccessHint to the IOContext, because they're the only one to seek backwards to evaluate a single query - To mimic today's behavior MMapDirectory's default implementation uses `Constants.DEFAULT_READADVICE` on all files that have `FileTypeHint.DATA`. - In the future (different PR), we can discuss whether MMapDirectory's default implementation should force `ReadAdvice.RANDOM` for `DataAccessHint.RANDOM`, and whether `Constants.DEFAULT_READADVICE` should move back to `NORMAL`. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org --------------------------------------------------------------------- To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org