BryceKan3 opened a new issue, #15387:
URL: https://github.com/apache/lucene/issues/15387

   ### Description
   
   Currently, as part of DirectoryReader.open(), Lucene [creates the segment 
readers sequentially, one segment at a 
time](https://github.com/apache/lucene/blob/main/lucene/core/src/java/org/apache/lucene/index/StandardDirectoryReader.java#L91-L95).
 
   
   <img width="467" height="523" alt="Image" 
src="https://github.com/user-attachments/assets/ed50edf1-b7ec-4787-8e90-c9cbef1e0774"
 />
   
   This can be a very slow operation because creating each SegmentReader 
involves I/O. If DirectoryReader.open() accepted an Executor, the segment 
reader creations could be submitted to the thread pool, which reduced 
DirectoryReader.open() times by roughly 80% in my testing. The implementation 
would need to be fully backwards compatible (possibly by adding an additional 
constructor), and users would pass in their own executor so that they control 
the threading. 
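
To make the idea concrete, here is a minimal, self-contained sketch of the pattern, not the proof of concept itself. It does not use any Lucene classes: `FakeSegmentReader`, `openAll`, and the 50 ms sleep that stands in for I/O are all hypothetical names and values chosen for illustration. One task per segment is submitted to a caller-supplied executor, and the results are collected in segment order.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class ConcurrentOpenSketch {

  /** Hypothetical stand-in for a per-segment reader; NOT a Lucene class. */
  public static final class FakeSegmentReader implements AutoCloseable {
    public final String segmentName;

    public FakeSegmentReader(String segmentName) throws InterruptedException {
      Thread.sleep(50); // assumption: simulates the I/O cost of opening a segment
      this.segmentName = segmentName;
    }

    @Override
    public void close() {}
  }

  /** Opens one reader per segment on the caller's executor, keeping segment order. */
  public static List<FakeSegmentReader> openAll(
      List<String> segmentNames, ExecutorService executor)
      throws ExecutionException, InterruptedException {
    // Submit one task per segment; the futures list keeps the original order.
    List<Future<FakeSegmentReader>> futures = new ArrayList<>();
    for (String name : segmentNames) {
      Callable<FakeSegmentReader> task = () -> new FakeSegmentReader(name);
      futures.add(executor.submit(task));
    }
    List<FakeSegmentReader> readers = new ArrayList<>();
    try {
      for (Future<FakeSegmentReader> future : futures) {
        readers.add(future.get()); // blocks until this segment's reader is ready
      }
      return readers;
    } catch (ExecutionException | InterruptedException e) {
      // On failure, cancel outstanding tasks and close the readers collected so far.
      // A fuller implementation would also close readers whose tasks completed
      // but had not yet been collected.
      for (Future<FakeSegmentReader> future : futures) {
        future.cancel(true);
      }
      for (FakeSegmentReader reader : readers) {
        reader.close();
      }
      throw e;
    }
  }

  public static void main(String[] args) throws Exception {
    ExecutorService executor = Executors.newFixedThreadPool(4);
    try {
      List<String> segments = List.of("_0", "_1", "_2", "_3", "_4", "_5", "_6", "_7");
      long start = System.nanoTime();
      List<FakeSegmentReader> readers = openAll(segments, executor);
      long elapsedMs = (System.nanoTime() - start) / 1_000_000;
      System.out.println("Opened " + readers.size() + " readers in " + elapsedMs + " ms");
    } finally {
      executor.shutdown(); // the caller owns the executor's lifecycle
    }
  }
}
```

Because the executor is supplied and shut down by the caller, the caller keeps control over the number of threads, and a variant without an executor argument could keep today's sequential behaviour.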
   
   <img width="968" height="672" alt="Image" 
src="https://github.com/user-attachments/assets/5a856a63-bbb3-4ed0-8409-acdf98fedd70"
 />
   
   
   I have created a proof of concept for this and observed the following 
improvements:
   
   
   
   Optimization | P50 (ms) | P90 (ms) | P99 (ms) | P50 Reduction
   --- | --- | --- | --- | ---
   Baseline | 995 | 1020 | 1041 | N/A
   Concurrent SegmentReader Initialization | 171 | 178 | 188 | 82.81%
   
   The numbers above come from 100 iterations against an index with 16 
segments, using an executor service with 20 threads and a cleared page cache.
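
For reference, the reduction figure is the relative drop from the baseline latency. Applying the same arithmetic to the P90 and P99 columns of the table (values derived here from the table, not measured separately) gives similar numbers:

```latex
\begin{aligned}
\text{P50 reduction} &= \frac{995 - 171}{995} \approx 82.81\% \\
\text{P90 reduction} &= \frac{1020 - 178}{1020} \approx 82.55\% \\
\text{P99 reduction} &= \frac{1041 - 188}{1041} \approx 81.94\%
\end{aligned}
```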
   
   I believe this enhancement will provide significant value to the Lucene 
community and would love feedback on this approach.
   
   Thanks
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

