[GitHub] [lucene] mikemccand commented on issue #12185: Using DirectIODirectory results in BufferOverflowException

via GitHub Thu, 23 Mar 2023 06:55:45 -0700


mikemccand commented on issue #12185:
URL: https://github.com/apache/lucene/issues/12185#issuecomment-1481241581


   > As an aside, in some standard benchmark tests I run with our product, I 
found that the final optimisation of Lucene indexes, after all the data had been 
indexed, took 36 seconds with NIO but 148 seconds with NIO+DirectIO enabled. 
For mmap, optimisation took 30 seconds, but 100 seconds with DirectIO 
enabled. So it is odd that the use case DirectIO was meant to speed up actually 
seemed to be slower.
   
   That's a great datapoint -- thanks for sharing!
   
   Maybe try increasing the `mergeBufferSize` `DirectIO` is using?  And maybe 
also the `minBytesDirect`?
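   
   For reference, a minimal sketch of passing larger values through the 
three-argument `DirectIODirectory` constructor. It assumes Lucene 9.x with 
`lucene-core` and `lucene-misc` on the classpath; the index path and both sizes 
are illustrative placeholders, not tuned recommendations:

```java
import java.io.IOException;
import java.nio.file.Path;
import java.nio.file.Paths;

import org.apache.lucene.misc.store.DirectIODirectory;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.FSDirectory;

public class DirectIOTuningSketch {
  public static void main(String[] args) throws IOException {
    // Hypothetical index location; pass a real path as the first argument.
    Path indexPath = Paths.get(args.length > 0 ? args[0] : "/tmp/index");

    // Illustrative values only (assumptions, not measured optima).
    // A multiple of the file system block size is chosen to keep writes aligned.
    int mergeBufferSize = 1 << 20;            // 1 MiB buffer for merge I/O
    long minBytesDirect = 64L * 1024 * 1024;  // only merges of at least 64 MiB go direct

    try (Directory dir =
        new DirectIODirectory(FSDirectory.open(indexPath), mergeBufferSize, minBytesDirect)) {
      // Hand `dir` to an IndexWriter as usual; merges at or above
      // minBytesDirect should then bypass the OS page cache.
      System.out.println("Opened " + dir);
    }
  }
}
```

   Raising `minBytesDirect` keeps smaller merges on the normal buffered path, so 
it trades some cache pollution for merge speed; it is worth measuring both 
knobs separately.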
   
   We should not expect `DirectIO` to be faster: it is bypassing the OS's 
(helpful!) write buffer cache and taking caching into its own hands, apparently 
quite a bit less effectively.  What it should be helpful for is applications 
that are also doing concurrent searching on the same hardware.  In that case 
`DirectIO` keeps merge I/O from evicting the hot cached pages that searches rely 
on, hopefully reducing the long-pole query latencies caused by merging.  But a 
3-4X slowdown on merging is horrible :)



