mikemccand commented on issue #10025: URL: https://github.com/apache/lucene/issues/10025#issuecomment-2604376572
> > Michael McCandless ([@mikemccand](https://github.com/mikemccand)) ([migrated from JIRA](https://issues.apache.org/jira/browse/LUCENE-8982?focusedCommentId=17223693&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-17223693))
> >
> > Yes please! Feel free to tackle this! I can help w/ benchmarking.
>
> Just curious.. are there any benchmarking results that can be shared with this enabled?

Oh hello, sorry, no I never managed to do any benchmarking here. Did you? I'd still be curious about the results ... direct IO is an interesting low-level optimization (and one that [Linus notably is not a fan of](https://www.theregister.com/2019/06/21/linus_torvalds_rant)! Not sure if his thinking has changed...) and it's not clear (to me) where it's actually helpful in Lucene.

The original theory / use-case for this directory was to ensure that merging segments would write the newly merged segment straight to the storage device, bypassing the OS's write cache and leaving more free RAM to hold hot pages for searching, thereby reducing page faults for searching while heavy merging is going on.

At Amazon Product Search we use Lucene with [near-real-time segment replication](https://blog.mikemccandless.com/2017/09/lucenes-near-real-time-segment-index.html) to efficiently distribute index updates to many replicas (to scale to high QPS) for each shard. Long ago, we tried changing that segment replication copy to use direct IO, on the same theory that copying in many bytes for new segments might evict hot pages used for searching. But what we found is that direct IO caused even more page faults once the replica lit the new segments (switched over to them for searching), as the OS now had to page in the very cold bytes for the newly copied segments on the synchronous query hot path.

We are now wondering if some sort of [bandwidth cap/budget on merge policy or scheduler](https://github.com/apache/lucene/issues/14148) might be a better approach.

This is just an anecdote from our production experience, not a full/clean A/B benchmark, but at least it's one data point :)
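
For readers unfamiliar with the bandwidth cap/budget idea mentioned above, here is a rough, hypothetical sketch of what such a budget could compute. It is not taken from the linked issue and is not Lucene's own rate limiting code; the class name `WriteBudget`, the method `pauseNanos`, and the choice of MB/sec as the unit are all assumptions made for illustration. The idea is only that a merge or segment copy pauses after each chunk so that its average write rate stays at or under a cap.

```java
/**
 * Hypothetical helper (not part of Lucene): caps the average write rate
 * of a background copy or merge by telling the writer how long to pause.
 */
final class WriteBudget {
    private final double bytesPerSec;

    /** @param mbPerSec assumed cap, in MB (1024 * 1024 bytes) per second */
    WriteBudget(double mbPerSec) {
        if (mbPerSec <= 0) {
            throw new IllegalArgumentException("mbPerSec must be > 0");
        }
        this.bytesPerSec = mbPerSec * 1024 * 1024;
    }

    /**
     * Returns how many nanoseconds the writer should sleep so that
     * bytesWritten spread over (elapsedNanos + pause) does not exceed the cap.
     *
     * @param bytesWritten total bytes written since the copy started
     * @param elapsedNanos nanoseconds elapsed since the copy started
     */
    long pauseNanos(long bytesWritten, long elapsedNanos) {
        // Time the write *should* have taken at exactly the capped rate.
        long targetNanos = (long) (bytesWritten / bytesPerSec * 1_000_000_000L);
        // If we are ahead of the budget, sleep off the difference; otherwise do not pause.
        return Math.max(0L, targetNanos - elapsedNanos);
    }
}
```

A writer thread would call `pauseNanos` after each chunk, passing its running byte count and the `System.nanoTime()` delta since it started, and sleep for the returned duration. Whether a cap like this actually protects hot search pages better than direct IO is exactly the open question in the comment above; the sketch only shows the arithmetic.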