cbuescher opened a new pull request, #12241: URL: https://github.com/apache/lucene/pull/12241
Today there is no specific ordering of how files are written to a compound file. The current order is determined by iterating over the set of file names in SegmentInfo, whose iteration order is unspecified. This PR proposes to change to an order based on file size. Colocating data from smaller files that are accessed often (typically metadata files like the terms index, field infos, etc.) can help when parts of these files are held in cache. In our particular case, the motivation comes from reading larger compound files from a remote object store through a caching layer that keeps chunks of the file in pages. Keeping small files together can improve the efficiency of the cache, because frequently read data such as metadata then falls into fewer pages.
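As a rough illustration of the proposed ordering, the sketch below sorts file names by ascending size before they would be appended to a compound file. It is not the code from this PR: the class `CompoundFileOrdering`, the method `orderBySize`, and the map of file names to lengths are hypothetical names chosen for the example, and the tie-break on file name is an added assumption to keep the order deterministic.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of Lucene: orders files so that the
// smallest ones end up next to each other at the start of the output.
final class CompoundFileOrdering {

  private CompoundFileOrdering() {}

  /**
   * Returns the file names sorted by ascending size in bytes.
   * Files of equal size are ordered by name (an assumption made here
   * so that the result does not depend on map iteration order).
   */
  static List<String> orderBySize(Map<String, Long> fileSizes) {
    List<Map.Entry<String, Long>> entries = new ArrayList<>(fileSizes.entrySet());

    Comparator<Map.Entry<String, Long>> bySizeThenName =
        Comparator.comparingLong((Map.Entry<String, Long> e) -> e.getValue())
            .thenComparing(e -> e.getKey());
    entries.sort(bySizeThenName);

    List<String> ordered = new ArrayList<>(entries.size());
    for (Map.Entry<String, Long> entry : entries) {
      ordered.add(entry.getKey());
    }
    return ordered;
  }
}
```

A caller would collect the length of each segment file, pass the resulting map to `orderBySize`, and write the files in the returned order, so that small metadata files are placed together and are more likely to share cache pages.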