abiesps opened a new issue, #15197:
URL: https://github.com/apache/lucene/issues/15197

   I see that Lucene has added prefetching
(https://github.com/apache/lucene/issues/13179) to a number of data structures
to decouple search concurrency from I/O concurrency.
   
   In the linked issue, there is an unchecked item for prefetching support for
points. I would like to know whether Lucene plans to add prefetching support to
BKD trees in the near future.
   
   I was also exploring ways to achieve this. At a high level, I was thinking
that for range queries, point lookups, etc., we can traverse the BKD tree the
same way we do today, but instead of visiting the leaf nodes (kdd files) right
away, we call prefetch on the file pointers of the matching leaf nodes. The
iterator can also save the matching leaf nodes during traversal.
   Once the traversal of the BKD tree is complete, we can then visit the doc
IDs or doc values in the saved leaf nodes. Hopefully they will have been
prefetched by that time.
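
   To make the two phases concrete, here is a minimal sketch on a toy 1-D tree.
It does not use Lucene's classes: `PrefetchableInput`, `Node`, `collectLeaves`
and `sampleTree` are hypothetical names made up for illustration, and the
prefetch hook only records a hint. Phase 1 walks the tree and issues a prefetch
for every leaf whose range intersects the query; phase 2 reads the saved leaves
afterwards.

```java
import java.util.ArrayList;
import java.util.List;

class BkdPrefetchSketch {

    /** Hypothetical read-ahead hook; stands in for whatever the real input offers. */
    interface PrefetchableInput {
        void prefetch(long offset, long length);
    }

    /** Toy 1-D tree node. Leaves carry a file pointer into the leaf data file. */
    static final class Node {
        final int min, max;      // value range covered by this node
        final Node left, right;  // null for leaves
        final long leafFp;       // offset of the leaf block, -1 for inner nodes
        final long leafLength;   // length of the leaf block in bytes

        Node(int min, int max, long leafFp, long leafLength) {
            this.min = min; this.max = max;
            this.left = null; this.right = null;
            this.leafFp = leafFp; this.leafLength = leafLength;
        }

        Node(Node left, Node right) {
            this.min = left.min; this.max = right.max;
            this.left = left; this.right = right;
            this.leafFp = -1; this.leafLength = 0;
        }

        boolean isLeaf() { return left == null; }
    }

    /** Phase 1: traverse, prefetch every matching leaf, and remember it. */
    static List<Node> collectLeaves(Node root, int lo, int hi, PrefetchableInput in) {
        List<Node> matching = new ArrayList<>();
        collect(root, lo, hi, in, matching);
        return matching;
    }

    private static void collect(Node node, int lo, int hi,
                                PrefetchableInput in, List<Node> out) {
        if (node.max < lo || node.min > hi) {
            return; // cell is entirely outside the query range
        }
        if (node.isLeaf()) {
            in.prefetch(node.leafFp, node.leafLength); // hint only, no read yet
            out.add(node);
            return;
        }
        collect(node.left, lo, hi, in, out);
        collect(node.right, lo, hi, in, out);
    }

    /** Four leaves of 100 bytes each covering [0,9], [10,19], [20,29], [30,39]. */
    static Node sampleTree() {
        Node l0 = new Node(0, 9, 0, 100);
        Node l1 = new Node(10, 19, 100, 100);
        Node l2 = new Node(20, 29, 200, 100);
        Node l3 = new Node(30, 39, 300, 100);
        return new Node(new Node(l0, l1), new Node(l2, l3));
    }

    public static void main(String[] args) {
        List<Node> leaves = collectLeaves(sampleTree(), 12, 25,
            (offset, length) -> System.out.println("prefetch " + offset + "+" + length));
        // Phase 2: only now read the saved leaves; real code would decode doc IDs here.
        for (Node leaf : leaves) {
            System.out.println("visit leaf at " + leaf.leafFp);
        }
    }
}
```

   In this sketch all prefetch hints are issued before the first leaf is read,
which is the point of splitting the traversal: the reads in phase 2 can overlap
instead of being issued one at a time.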
   
   
   OpenSearch uses a similar variation of BKD tree traversal for range
queries, except that it terminates early once the matching doc ID count becomes
greater than or equal to the size parameter requested in the query.
   To support early termination for such use cases, I was thinking of adding a
vInt to inner nodes (kdi files) that denotes the total number of documents
below that node.
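
   A rough sketch of how such a per-node count could drive early termination
while planning prefetches, again on a toy 1-D tree rather than Lucene's actual
classes. `Node.docCount` models the proposed counter; `Plan`, `plan` and
`sampleTree` are hypothetical names. It assumes that only cells fully inside
the query range can be counted up front, since a leaf that crosses the range
boundary has an unknown number of matches until it is read.

```java
import java.util.ArrayList;
import java.util.List;

class BkdEarlyTerminationSketch {

    /** Toy 1-D tree node with the proposed per-node document count. */
    static final class Node {
        final int min, max;
        final int docCount;      // total documents below this node (the proposed vInt)
        final Node left, right;  // null for leaves
        final long leafFp;       // offset of the leaf block, -1 for inner nodes

        Node(int min, int max, int docCount, long leafFp) {
            this.min = min; this.max = max; this.docCount = docCount;
            this.left = null; this.right = null; this.leafFp = leafFp;
        }

        Node(Node left, Node right) {
            this.min = left.min; this.max = right.max;
            this.docCount = left.docCount + right.docCount;
            this.left = left; this.right = right; this.leafFp = -1;
        }

        boolean isLeaf() { return left == null; }
    }

    /** Leaves to prefetch, plus how many matches are known without reading them. */
    static final class Plan {
        final List<Long> leafFps = new ArrayList<>();
        int guaranteedDocs;
    }

    static Plan plan(Node root, int lo, int hi, int size) {
        Plan plan = new Plan();
        walk(root, lo, hi, size, plan);
        return plan;
    }

    private static void walk(Node node, int lo, int hi, int size, Plan plan) {
        if (plan.guaranteedDocs >= size) {
            return; // early termination: enough matches are already covered
        }
        if (node.max < lo || node.min > hi) {
            return; // cell is entirely outside the query range
        }
        boolean inside = lo <= node.min && node.max <= hi;
        // A fully contained subtree that is needed in its entirety is taken in one step.
        if (inside && (node.isLeaf() || plan.guaranteedDocs + node.docCount <= size)) {
            addLeaves(node, plan.leafFps);
            plan.guaranteedDocs += node.docCount;
            return;
        }
        if (node.isLeaf()) {
            plan.leafFps.add(node.leafFp); // crossing leaf: matches unknown until read
            return;
        }
        walk(node.left, lo, hi, size, plan);
        walk(node.right, lo, hi, size, plan);
    }

    private static void addLeaves(Node node, List<Long> out) {
        if (node.isLeaf()) {
            out.add(node.leafFp);
            return;
        }
        addLeaves(node.left, out);
        addLeaves(node.right, out);
    }

    /** Four leaves of 5 docs each covering [0,9], [10,19], [20,29], [30,39]. */
    static Node sampleTree() {
        Node l0 = new Node(0, 9, 5, 0);
        Node l1 = new Node(10, 19, 5, 100);
        Node l2 = new Node(20, 29, 5, 200);
        Node l3 = new Node(30, 39, 5, 300);
        return new Node(new Node(l0, l1), new Node(l2, l3));
    }

    public static void main(String[] args) {
        Plan p = plan(sampleTree(), 0, 39, 7);
        System.out.println("prefetch leaves at " + p.leafFps
            + ", guaranteed docs: " + p.guaranteedDocs);
    }
}
```

   With the sample tree, a query over the whole value range with size 7 stops
after the first two leaves, so the right half of the tree is never scheduled
for prefetch.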
   
   Let me know if this makes sense, or what the community thinks about it.
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

