jpountz opened a new pull request, #13337:
URL: https://github.com/apache/lucene/pull/13337

   This adds `IndexInput#prefetch`, an optional operation that instructs the 
`IndexInput` to start fetching bytes from storage in the background. Follow-up 
calls to the `IndexInput#readXXX` methods then pick up these bytes. In the 
future, this will help Lucene move from a maximum of one I/O operation per 
search thread to one I/O operation per search thread per `IndexInput`. For 
example, when running a query on two terms, the I/O into the terms dictionary 
is sequential today. Ideally, these I/Os would run in parallel using this new 
API. Note that this will require API changes to some classes, including 
`TermsEnum`.
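
   To illustrate the idea only, here is a minimal, self-contained sketch that 
does not use Lucene's classes. `SketchInput`, its `prefetch(int length)` 
signature, and `firstBytes` are hypothetical names invented for this example 
and are not taken from the PR. The sketch assumes a background read issued via 
`CompletableFuture` and shows two inputs starting their I/O before either one 
is read.

```java
import java.io.EOFException;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.concurrent.CompletableFuture;

final class PrefetchSketch {

  /** Hypothetical stand-in for an IndexInput; this is NOT Lucene's class. */
  static final class SketchInput implements AutoCloseable {
    private final FileChannel channel;
    private long position;
    private long prefetchStart;
    private CompletableFuture<ByteBuffer> pending; // background read, if any

    SketchInput(Path path) throws IOException {
      channel = FileChannel.open(path, StandardOpenOption.READ);
    }

    void seek(long pos) {
      position = pos;
    }

    /** Assumed signature: start loading `length` bytes at the current position without blocking. */
    void prefetch(int length) {
      final long start = position;
      prefetchStart = start;
      pending = CompletableFuture.supplyAsync(() -> {
        try {
          ByteBuffer buf = ByteBuffer.allocate(length);
          long pos = start;
          while (buf.hasRemaining()) {
            int n = channel.read(buf, pos); // positional read, usable from another thread
            if (n < 0) break; // end of file
            pos += n;
          }
          buf.flip();
          return buf;
        } catch (IOException e) {
          throw new UncheckedIOException(e);
        }
      });
    }

    /** Serves the byte from prefetched data when possible, else reads synchronously. */
    byte readByte() throws IOException {
      if (pending != null) {
        ByteBuffer buf = pending.join(); // waits only if the background read is unfinished
        long offset = position - prefetchStart;
        if (offset >= 0 && offset < buf.limit()) {
          position++;
          return buf.get((int) offset);
        }
      }
      ByteBuffer one = ByteBuffer.allocate(1);
      if (channel.read(one, position) != 1) throw new EOFException();
      position++;
      return one.get(0);
    }

    @Override
    public void close() throws IOException {
      channel.close();
    }
  }

  /** Starts both reads before consuming either, so the two I/Os can overlap. */
  static byte[] firstBytes(Path a, Path b) throws IOException {
    try (SketchInput inA = new SketchInput(a); SketchInput inB = new SketchInput(b)) {
      inA.prefetch(16);
      inB.prefetch(16);
      return new byte[] {inA.readByte(), inB.readByte()};
    }
  }

  public static void main(String[] args) throws IOException {
    Path a = Files.createTempFile("termA", ".bin");
    Path b = Files.createTempFile("termB", ".bin");
    Files.write(a, new byte[] {7, 8});
    Files.write(b, new byte[] {9, 10});
    byte[] out = firstBytes(a, b);
    System.out.println(out[0] + " " + out[1]);
  }
}
```

   The calling code stays synchronous: `prefetch` is only a hint, and a reader 
that never calls it still gets correct bytes through the fallback path.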
   
   I settled on this API because it's simple and wouldn't require making all 
Lucene APIs asynchronous to take advantage of extra I/O concurrency, which I 
worry would make the query evaluation logic too complicated.
   
   Currently, only `NIOFSDirectory` implements this new API. I experimented 
with `MMapDirectory` as well and found an approach that worked better in my 
benchmark, but I'm not sure it makes sense to implement this API on this 
directory, as it requires either adding an explicit buffer to `MMapDirectory` 
or forcing data to be loaded into the page cache even though the OS may have 
decided that caching it is not a good idea due to too few cache hits.
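
   As a rough illustration of the second option, the JDK's 
`MappedByteBuffer#load` makes a best-effort attempt to bring a mapped region 
into physical memory. This is only a sketch of what "forcing data to be 
loaded" can look like with plain JDK calls. It is not `MMapDirectory`'s code 
and not necessarily the approach tried for this PR, and `MmapLoadSketch` and 
`loadAndRead` are hypothetical names. Note that `load` blocks the caller, 
unlike a background prefetch.

```java
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

final class MmapLoadSketch {

  /** Maps the file read-only, asks the OS to load it, then reads one byte. */
  static byte loadAndRead(Path path, int index) throws IOException {
    try (FileChannel channel = FileChannel.open(path, StandardOpenOption.READ)) {
      MappedByteBuffer map = channel.map(FileChannel.MapMode.READ_ONLY, 0, channel.size());
      // Best effort: touches the mapped pages regardless of how useful the OS
      // considers caching them, which is the trade-off described above.
      map.load();
      return map.get(index);
    }
  }

  public static void main(String[] args) throws IOException {
    Path file = Files.createTempFile("mmap", ".bin");
    Files.write(file, new byte[] {1, 2, 3});
    System.out.println(loadAndRead(file, 2));
  }
}
```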
   
   This change will require follow-ups to start using this new API when working 
with terms dictionaries, postings, etc.
   
   Relates #13179


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

