gbbafna opened a new issue, #15054:
URL: https://github.com/apache/lucene/issues/15054

   ### Description
   
   Hi,
   
   OpenSearch had to change the IOContext from `READONCE` to `DEFAULT` for its remote store feature. This is because, with remote store, the thread that closes the IndexInput is different from the one that opens it.
   
   Reference: https://github.com/opensearch-project/OpenSearch/pull/17502
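
   For context, the change boils down to which IOContext the reads pass to `Directory#openInput`. A minimal sketch against Lucene 9.12 (the shard path and file name below are placeholders, not the real ones):

   ```java
   import java.io.IOException;
   import java.nio.file.Path;
   import org.apache.lucene.store.Directory;
   import org.apache.lucene.store.IOContext;
   import org.apache.lucene.store.IndexInput;
   import org.apache.lucene.store.MMapDirectory;

   public class OpenInputContextSketch {
       public static void main(String[] args) throws IOException {
           Path shardPath = Path.of("/tmp/shard0/index");   // placeholder path
           try (Directory dir = new MMapDirectory(shardPath)) {
               // READONCE ties the IndexInput to the opening thread, so it cannot be
               // closed from a different thread -- the reason for switching away from it:
               // IndexInput in = dir.openInput("_6yh.cfs", IOContext.READONCE);

               // DEFAULT allows close() from another thread, which the remote store
               // upload path needs, but the mappings now live in shared arenas.
               try (IndexInput in = dir.openInput("_6yh.cfs", IOContext.DEFAULT)) {
                   System.out.println("length=" + in.length());
               }
           }
       }
   }
   ```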
   
   After switching to `DEFAULT`, we are seeing memory-map exhaustion in our production workloads. So far we have not been able to reproduce the issue outside production.
   
   One related issue/repro is https://github.com/opensearch-project/k-NN/issues/2665#issuecomment-2814269445, which points to a single open IndexInput preventing the whole shared arena from being freed, and ultimately to exhaustion of the map count.
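
   To make that failure mode concrete, here is a minimal sketch of the shared-arena lifecycle using the plain `java.lang.foreign` API (JDK 22+). The directory and file names are made up, and this only illustrates the mechanism, not Lucene's actual code:

   ```java
   import java.io.IOException;
   import java.lang.foreign.Arena;
   import java.lang.foreign.MemorySegment;
   import java.nio.channels.FileChannel;
   import java.nio.file.Path;
   import java.nio.file.StandardOpenOption;

   public class SharedArenaSketch {
       public static void main(String[] args) throws IOException {
           Path dir = Path.of("/tmp/segments");        // hypothetical directory
           Arena arena = Arena.ofShared();             // one arena shared by many files
           try (FileChannel c1 = FileChannel.open(dir.resolve("_1.cfs"), StandardOpenOption.READ);
                FileChannel c2 = FileChannel.open(dir.resolve("_2.cfs"), StandardOpenOption.READ)) {
               // Each map() creates at least one virtual memory area, i.e. one entry
               // in /proc/<pid>/maps counted against vm.max_map_count.
               MemorySegment s1 = c1.map(FileChannel.MapMode.READ_ONLY, 0, c1.size(), arena);
               MemorySegment s2 = c2.map(FileChannel.MapMode.READ_ONLY, 0, c2.size(), arena);
               System.out.println("mapped " + (s1.byteSize() + s2.byteSize()) + " bytes");
           }
           // Nothing mapped into this arena is released until the arena itself is
           // closed; one reader that never closes its IndexInput therefore keeps
           // every mapping in its group alive.
           arena.close();
       }
   }
   ```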
   
   
   ```
   [2025-05-24T07:00:02,721][WARN ][o.o.i.s.RemoteStoreRefreshListener] 
[24b9ade6de8c7c3be1382651135d04a1] [vpc-flowlogs-2025.05.24-000673][0] 
Exception while uploading new segments to the remote segment store
   java.io.IOException: Map failed: 
MemorySegmentIndexInput(path="/hdd1/mnt/env/root/ES-PATH/var/es/data/nodes/0/indices/index-uuid/0/index/_6yh.cfs")
 [this may be caused by lack of enough unfragmented virtual address space or 
too restrictive virtual memory limits enforced by the operating system, 
preventing us to map a chunk of 2714694334 bytes. Please review 'ulimit -v', 
'ulimit -m' (both should return 'unlimited'), and 'sysctl vm.max_map_count'. 
More information: 
https://blog.thetaphi.de/2012/07/use-lucenes-mmapdirectory-on-64bit.html]
       at 
java.base/sun.nio.ch.FileChannelImpl.mapInternal(FileChannelImpl.java:1319)
       at java.base/sun.nio.ch.FileChannelImpl.map(FileChannelImpl.java:1218)
       at 
org.apache.lucene.store.MemorySegmentIndexInputProvider.map(MemorySegmentIndexInputProvider.java:134)
       at 
org.apache.lucene.store.MemorySegmentIndexInputProvider.openInput(MemorySegmentIndexInputProvider.java:76)
       at 
org.apache.lucene.store.MemorySegmentIndexInputProvider.openInput(MemorySegmentIndexInputProvider.java:33)
       at 
org.apache.lucene.store.MMapDirectory.openInput(MMapDirectory.java:394)
       at 
org.opensearch.index.store.FsDirectoryFactory$HybridDirectory.openInput(FsDirectoryFactory.java:181)
       at 
org.apache.lucene.store.FilterDirectory.openInput(FilterDirectory.java:101)
       at 
org.apache.lucene.store.FilterDirectory.openInput(FilterDirectory.java:101)
       at 
org.opensearch.index.store.Store$MetadataSnapshot.checksumFromLuceneFile(Store.java:1213)
       at 
org.opensearch.index.store.Store$MetadataSnapshot.loadMetadata(Store.java:1183)
       at org.opensearch.index.store.Store.getSegmentMetadataMap(Store.java:392)
       at 
org.opensearch.index.shard.IndexShard.computeReplicationCheckpoint(IndexShard.java:1887)
       at 
org.opensearch.index.shard.RemoteStoreRefreshListener.syncSegments(RemoteStoreRefreshListener.java:251)
       at 
org.opensearch.index.shard.RemoteStoreRefreshListener.performAfterRefreshWithPermit(RemoteStoreRefreshListener.java:159)
       at 
org.opensearch.index.shard.ReleasableRetryableRefreshListener.runAfterRefreshWithPermit(ReleasableRetryableRefreshListener.java:167)
       at 
org.opensearch.index.shard.ReleasableRetryableRefreshListener.lambda$scheduleRetry$2(ReleasableRetryableRefreshListener.java:127)
       at 
org.opensearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:964)
       at 
java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144)
       at 
java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642)
       at java.base/java.lang.Thread.run(Thread.java:1583)
   
   ```
   
   With `IOContext.DEFAULT`, we are seeing exhaustion of maps. System limits are as below:
   
   ```
   ❯ ulimit -m
   unlimited
   
   ❯ ulimit -v
   unlimited
   
   ❯ sysctl vm.max_map_count
   vm.max_map_count = 1048576
   ```
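
   For monitoring, one crude way to see how close a process is getting to `vm.max_map_count` is to count its virtual memory areas under `/proc` (Linux only); a small Java sketch:

   ```java
   import java.io.IOException;
   import java.nio.file.Files;
   import java.nio.file.Path;

   public class MapCountCheck {
       public static void main(String[] args) throws IOException {
           // Each line in /proc/self/maps is one virtual memory area; vm.max_map_count
           // caps how many a single process may hold.
           try (var lines = Files.lines(Path.of("/proc/self/maps"))) {
               System.out.println("live mappings: " + lines.count());
           }
           // For another process (e.g. the OpenSearch JVM), read /proc/<pid>/maps instead.
       }
   }
   ```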
   
   To solve this we tried `-Dorg.apache.lucene.store.MMapDirectory.sharedArenaMaxPermits=1`, but surprisingly we saw the same issue again after some time.
   
   We had to disable MemorySegmentIndexInput with the `org.apache.lucene.store.MMapDirectory.enableMemorySegments=false` system property to get past this. But this fallback option is no longer available in the latest Lucene versions.
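
   One more knob that may be relevant here is the arena grouping function on MMapDirectory, which shipped alongside the sharedArenaMaxPermits property. Assuming `MMapDirectory.NO_GROUPING` and `setGroupingFunction` are available in the Lucene version in use (we have not verified that this mitigates the issue), disabling grouping gives each file its own arena, so one leaked IndexInput would only pin that file's mappings:

   ```java
   import java.io.IOException;
   import java.nio.file.Path;
   import org.apache.lucene.store.MMapDirectory;

   public class NoGroupingSketch {
       public static void main(String[] args) throws IOException {
           Path shardPath = Path.of("/tmp/shard0/index");   // placeholder path
           try (MMapDirectory dir = new MMapDirectory(shardPath)) {
               // Assumption: NO_GROUPING / setGroupingFunction exist in this Lucene version.
               // Without grouping, each file's mappings live in their own arena and are
               // released as soon as that file's inputs are closed, at the cost of more
               // (and more expensive) arena closes overall.
               dir.setGroupingFunction(MMapDirectory.NO_GROUPING);
           }
       }
   }
   ```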
   
   
   
   
   
   ### Version and environment details
   
   OpenSearch Version - 2.19
   Lucene Version - 9.12 
   
   

