uschindler commented on PR #13570: URL: https://github.com/apache/lucene/pull/13570#issuecomment-2228783249
> To expand a bit on the concern I raised above:
>
> IIUC, in order for this to work properly (guaranteed to not potentially leak virtual memory address space) it depends on segment filename usage patterns and segment lifecycle. I think we may either need to:
>
> 1. provide a way to configure specific MMapDirectory instances to bypass this Arena pooling, or
> 2. disallow use of MMapDirectory for anything other than Lucene index files (because of the potential for collision of filename patterns that could be parsed as "segments", potentially allowing an Arena to live forever, accumulating an unlimited number of associated files).
>
> Am I missing something -- is this not a concern?

We could add a separate ctor option to disable grouping. In addition, I would go for option #2: as with the segment main file, grouping should also be skipped for any other file whose name does not contain a correct segment number (it could validate the filename using the base32 hash starting with "_" or whatever this is). Maybe add a protected method to MMapDirectory that checks whether a filename is a segment file, and pass it as a `Predicate<String>` to the provider interface.
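
A rough sketch of what such a filename check could look like. Everything here is hypothetical: the class name `SegmentFileCheck`, the method names, and the pattern are illustrations, not code from this PR or from Lucene's API. The pattern assumes segment files are named `_<id>[_suffix].<ext>` with `<id>` made of lowercase letters and digits, which may not match the exact rule the PR ends up using.

```java
import java.util.function.Predicate;
import java.util.regex.Pattern;

// Hypothetical helper for illustration only; not part of Lucene.
final class SegmentFileCheck {

  // Assumption: a segment file is "_<id>[_suffix].<ext>", where <id> is
  // one or more lowercase letters or digits.
  private static final Pattern SEGMENT_FILE =
      Pattern.compile("_[a-z0-9]+(_.*)?\\..*");

  // Returns true if the name looks like a per-segment index file.
  static boolean isSegmentFile(String name) {
    return SEGMENT_FILE.matcher(name).matches();
  }

  // Returns the key used to group files of one segment (for example "_a1"),
  // or null if the predicate rejects the name and the file must not be grouped.
  static String groupingKey(String name, Predicate<String> isSegmentFile) {
    if (!isSegmentFile.test(name)) {
      return null; // not a segment file: caller would fall back to an ungrouped mapping
    }
    int dot = name.indexOf('.');
    int end = name.indexOf('_', 1); // second underscore starts an optional suffix
    if (end == -1 || end > dot) {
      end = dot; // no suffix: the id runs up to the extension
    }
    return name.substring(0, end);
  }
}
```

With this sketch, `SegmentFileCheck.groupingKey("_a1_Lucene99_0.dvd", SegmentFileCheck::isSegmentFile)` yields `_a1`, while `segments_5` or `mydata.bin` yield `null` and would not be grouped. A non-index file that happens to follow the same naming shape (say `_temp.dat`) would still pass, so a check like this narrows the collision risk rather than removing it; the separate ctor option to disable grouping would cover that case.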