uschindler commented on code in PR #13381: URL: https://github.com/apache/lucene/pull/13381#discussion_r1605815826
########## lucene/core/src/java21/org/apache/lucene/store/MemorySegmentIndexInput.java: ########## @@ -344,7 +354,11 @@ public void prefetch(long offset, long length) throws IOException { } final MemorySegment prefetchSlice = segment.asSlice(offset, length); - nativeAccess.madviseWillNeed(prefetchSlice); + if (nativeAccess.mincore(prefetchSlice) == false) { Review Comment: Like Robert said: `MemorySegment::isLoaded` is doing that. It also works on zero-length / without-address segments (see the "unmapper" check, if there is no unmapper it is not an mmapped segment and it has no address). The actual `SCOPED_MEMORY_ACCESS.isLoaded()` call uses `mincore()`: - https://github.com/openjdk/jdk/blob/b92bd671835c37cff58e2cdcecd0fe4277557d7f/src/java.base/share/classes/jdk/internal/misc/X-ScopedMemoryAccess.java.template#L239-L258 - https://github.com/openjdk/jdk/blob/b92bd671835c37cff58e2cdcecd0fe4277557d7f/src/java.base/share/classes/java/nio/Buffer.java#L903-L906 - https://github.com/openjdk/jdk/blob/b92bd671835c37cff58e2cdcecd0fe4277557d7f/src/java.base/share/classes/java/nio/MappedMemoryUtils.java#L36-L46 - and finally the C code, e.g. for UNIX flavour: https://github.com/openjdk/jdk/blob/b92bd671835c37cff58e2cdcecd0fe4277557d7f/src/java.base/unix/native/libnio/MappedMemoryUtils.c#L43-L80 So it is actually a copy of your code in a mixed native and java code! ########## lucene/core/src/java21/org/apache/lucene/store/MemorySegmentIndexInput.java: ########## @@ -57,6 +58,7 @@ abstract class MemorySegmentIndexInput extends IndexInput implements RandomAcces MemorySegment curSegment; // redundant for speed: segments[curSegmentIndex], also marker if closed! long curPosition; // relative to curSegment, not globally + int consecutivePrefetchHitCount; Review Comment: My problem with this code is that clones of the IndexInput do not inherit that counter. Of course this won't be a problem but could cause additional calls. A shared (mutable) variable is a no-go as clones are normally used in different threads. I think we should maybe copy the counter when creating clones or slices. ########## lucene/core/src/java21/org/apache/lucene/store/NativeAccess.java: ########## @@ -32,6 +32,9 @@ abstract class NativeAccess { */ public abstract void madviseWillNeed(MemorySegment segment) throws IOException; + /** Returns {@code true} if pages from the given {@link MemorySegment} are resident in RAM. */ Review Comment: revert ########## lucene/core/src/java21/org/apache/lucene/store/MemorySegmentIndexInput.java: ########## @@ -318,6 +320,14 @@ public void prefetch(long offset, long length) throws IOException { Objects.checkFromIndexSize(offset, length, length()); + if (BitUtil.isZeroOrPowerOfTwo(consecutivePrefetchHitCount++) == false) { + // We've had enough consecutive hits on the page cache that this number is neither zero nor a + // power of two. There is a good chance that a good chunk of this index input is cached in + // physical memory. Let's skip the overhead of the madvise system call, we'll be trying again + // on the next power of two of the counter. + return; + } + if (NATIVE_ACCESS.isEmpty()) { Review Comment: I would move the native access check to the top. ########## lucene/core/src/java21/org/apache/lucene/store/PosixNativeAccess.java: ########## @@ -17,6 +17,7 @@ package org.apache.lucene.store; import java.io.IOException; +import java.lang.foreign.Arena; Review Comment: revert -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org --------------------------------------------------------------------- To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org