uschindler commented on issue #15054: URL: https://github.com/apache/lucene/issues/15054#issuecomment-3187277453
> [@uschindler](https://github.com/uschindler), what is the reason behind using READONCE? Is there some issue with DEFAULT? We have taken heap dumps and verified that IndexInput does get closed. If there is a leak, there would be a problem even with READONCE as well, right?

The reason behind READONCE is that the Linux kernel optimizes for read-ahead. So it is ideal for use cases where you read the file once (with the limitation that this must happen from a single thread). Lucene uses it for files that are read exactly once (mostly small files or those loaded onto heap). This speeds up reading and at the same time minimizes the use of expensive shared arenas.

You can use the DEFAULT IOContext, but then you have to live with the limitations of MMapDirectory. By default it groups files from the same segment into one arena. If you are using the directory only for replication (no open indexes on it), consider one of the following:

- Disable grouping in the MMapDirectory (used for remote copy) using https://lucene.apache.org/core/9_12_1/core/org/apache/lucene/store/MMapDirectory.html#setGroupingFunction(java.util.function.Function): `mmapDir.setGroupingFunction(MMapDirectory.NO_GROUPING)`. Keep in mind that this will cause a JVM safepoint with deoptimization of all top stack frames for all threads on each `IndexInput#close()`. This will cause serious havoc for heavy
- Rewrite your code to use READONCE and a single thread (which I think is still the better thing to do, sorry). If you need multiple threads, your network is the limitation and you should improve that.
- Use NIOFSDirectory for plain copy operations.
- Reimplement your replication to use asynchronous file channels.

-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For queries about this service, please contact Infrastructure at: us...@infra.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org
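
The last option in the comment (asynchronous file channels) could look roughly like the sketch below. It uses only the JDK's `java.nio.channels.AsynchronousFileChannel`, so no memory mapping and no arena is involved. The class name `AsyncCopySketch`, the method `copy`, and the buffer size are hypothetical names chosen for illustration and are not part of Lucene. For brevity the sketch writes to a local file and waits on each `Future`; an actual replication implementation would presumably send the buffer over the network and use `CompletionHandler` callbacks to overlap reads with sends.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.AsynchronousFileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.concurrent.ExecutionException;

// Hypothetical helper, not a Lucene class: copies one file through
// asynchronous file channels without any memory mapping.
class AsyncCopySketch {

    /** Copies source to target and returns the number of bytes copied. */
    static long copy(Path source, Path target, int bufferSize)
            throws IOException, InterruptedException, ExecutionException {
        try (AsynchronousFileChannel in =
                 AsynchronousFileChannel.open(source, StandardOpenOption.READ);
             AsynchronousFileChannel out =
                 AsynchronousFileChannel.open(target,
                     StandardOpenOption.WRITE,
                     StandardOpenOption.CREATE,
                     StandardOpenOption.TRUNCATE_EXISTING)) {

            ByteBuffer buffer = ByteBuffer.allocate(bufferSize);
            long position = 0;

            while (true) {
                buffer.clear();
                // Start the read at an explicit file position and wait for it.
                int read = in.read(buffer, position).get();
                if (read < 0) {
                    break; // end of file reached
                }
                buffer.flip();

                // A single write may be partial, so loop until the buffer is drained.
                long writePosition = position;
                while (buffer.hasRemaining()) {
                    writePosition += out.write(buffer, writePosition).get();
                }
                position += read;
            }
            return position;
        }
    }
}
```

Because every read and write names its file position explicitly, the channel carries no shared cursor, which is what makes this style easier to drive from several threads than a single sequential `IndexInput`.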