uschindler commented on issue #15054:
URL: https://github.com/apache/lucene/issues/15054#issuecomment-3187277453

   bq. [@uschindler](https://github.com/uschindler), what is the reason behind 
using READONCE? Is there some issue with DEFAULT? We have taken heap dumps and 
verified that IndexInput does get closed. If there is a leak, there would be 
a problem even with READONCE as well, right?
   
   The reason behind READONCE is that the Linux kernel optimizes for 
read-ahead. So this is ideal for use cases where you read the file once (but 
with the limitation that it must be read from a single thread). Lucene uses 
this for files that are read exactly once (mostly small files or those loaded 
onto heap). This speeds up reading and at the same time minimizes the use of 
expensive shared arenas.
   
   You can use the DEFAULT IOContext, but you have to live with the limitations 
of MMapDirectory. By default it shares one arena among files from the same 
segment.
   
   If you are using the directory only for replication (no open indexes on 
it), consider one of the following:
   - disable grouping in MMapDirectory (used for remote copy) using 
https://lucene.apache.org/core/9_12_1/core/org/apache/lucene/store/MMapDirectory.html#setGroupingFunction(java.util.function.Function):
 `mmapDir.setGroupingFunction(MMapDirectory.NO_GROUPING)` - keep in mind that 
this will cause a JVM safepoint with deoptimization of all top stack frames for 
all threads on each `IndexInput#close()`. This will cause serious havoc for 
heavy
   - rewrite your code to use READONCE and a single thread (which I think is 
still the better approach, sorry). If you need multiple threads, your network 
is the limitation and you should improve that.
   - use NIOFSDirectory for plain copy operations
   - reimplement your replication to use asynchronous file channels
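
   As a rough illustration of the last option, here is a minimal sketch of a 
file copy built on `java.nio.channels.AsynchronousFileChannel`. It bypasses the 
Lucene `Directory` API entirely and works on plain paths. The class name 
`AsyncCopySketch`, the method `copy` and the `chunkSize` parameter are 
hypothetical names chosen for this example, and a real replication would write 
to a network channel instead of a local file. A single thread drives the copy 
and overlaps the next read with the current write by using two buffers.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.AsynchronousFileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Future;

// Hypothetical helper, not part of Lucene: copies one file with asynchronous channels.
final class AsyncCopySketch {

  // Copies src to dst in chunks of chunkSize bytes and returns the number of bytes copied.
  static long copy(Path src, Path dst, int chunkSize)
      throws IOException, InterruptedException, ExecutionException {
    try (AsynchronousFileChannel in =
            AsynchronousFileChannel.open(src, StandardOpenOption.READ);
        AsynchronousFileChannel out =
            AsynchronousFileChannel.open(
                dst,
                StandardOpenOption.WRITE,
                StandardOpenOption.CREATE,
                StandardOpenOption.TRUNCATE_EXISTING)) {
      ByteBuffer current = ByteBuffer.allocate(chunkSize);
      ByteBuffer next = ByteBuffer.allocate(chunkSize);
      long position = 0;

      // Start the first read; each later read is issued before the previous chunk is written.
      Future<Integer> pending = in.read(current, position);
      while (true) {
        int n = pending.get();
        if (n <= 0) {
          break; // -1 signals end of file
        }
        long writePos = position;
        position += n;

        // Kick off the next read into the spare buffer while we write the current one.
        next.clear();
        pending = in.read(next, position);

        current.flip();
        while (current.hasRemaining()) {
          writePos += out.write(current, writePos).get();
        }

        // Swap buffers: the one being filled becomes the one to write next.
        ByteBuffer tmp = current;
        current = next;
        next = tmp;
      }
      out.force(true);
      return position;
    }
  }
}
```

   Because no memory mapping is involved, closing these channels does not 
involve any arena. Whether the overlap of read and write actually helps depends 
on the storage and network involved.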


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

