rishabhmaurya opened a new issue, #12534:
URL: https://github.com/apache/lucene/issues/12534

   ### Description
   
   In TopFieldCollector, the number of hits to collect is known beforehand. 
Can we use it to collect only the most competitive hits gathered by 
NumericComparator? The competitive iterator in NumericComparator uses the BKD 
tree (when the defined criteria are met) and traverses the points in ascending 
order. If the sort order is also ascending, this works well: the most 
competitive hits are collected first and the rest can be discarded 
quickly. When the query sort order is descending, it can cause priority queue 
churn and read amplification, because doc values are retrieved for 
non-competitive docs while the most competitive docs sit towards the end of 
the BKD tree. 
   If we know how many hits we need to collect, can we move directly to the 
right node of the tree and start collecting competitive docs from there? 
   This could help when the segment is huge and the gap between numHits 
and the number of docs in the segment matching the query is large, since 
it could prune a large set of non-competitive docs. 
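
   To make the descending-sort cost concrete, here is a minimal, self-contained sketch. It does not use any Lucene API: a sorted `long[]` stands in for the ascending point order of a BKD tree, every value is assumed to match the query, and the class and method names (`DescendingTopK`, `ascendingScan`, `rightFirstScan`) are hypothetical. It only compares how often the priority queue is updated when scanning in ascending order versus starting from the right end.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.PriorityQueue;

public class DescendingTopK {
    /** Hits in descending order, plus the number of queue/list insertions performed. */
    public static final class Result {
        public final List<Long> hits;
        public final int inserts;

        Result(List<Long> hits, int inserts) {
            this.hits = hits;
            this.inserts = inserts;
        }
    }

    /** Baseline: visit values in ascending order, keeping the k largest in a min-heap. */
    public static Result ascendingScan(long[] sortedAsc, int k) {
        PriorityQueue<Long> queue = new PriorityQueue<>(); // head is the weakest hit
        int inserts = 0;
        for (long v : sortedAsc) {
            if (queue.size() < k) {
                queue.add(v);
                inserts++;
            } else if (v > queue.peek()) {
                // A later value evicts an earlier one: this is the churn.
                queue.poll();
                queue.add(v);
                inserts++;
            }
        }
        List<Long> hits = new ArrayList<>(queue);
        hits.sort(Collections.reverseOrder());
        return new Result(hits, inserts);
    }

    /** Idea from the issue (simplified): start at the right end and stop after k hits. */
    public static Result rightFirstScan(long[] sortedAsc, int k) {
        List<Long> hits = new ArrayList<>();
        for (int i = sortedAsc.length - 1; i >= 0 && hits.size() < k; i--) {
            hits.add(sortedAsc[i]);
        }
        return new Result(hits, hits.size());
    }

    public static void main(String[] args) {
        long[] values = new long[1000];
        for (int i = 0; i < values.length; i++) {
            values[i] = i * 3L; // strictly increasing, like points in ascending BKD order
        }
        Result a = ascendingScan(values, 10);
        Result b = rightFirstScan(values, 10);
        System.out.println(a.hits.equals(b.hits));        // prints true
        System.out.println(a.inserts + " vs " + b.inserts); // prints 1000 vs 10
    }
}
```

   In a real segment, only docs matching the query count towards numHits, so the starting position cannot be computed from the value order alone; the sketch ignores that and only illustrates the difference in queue updates.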
   
   The BKD and numeric comparator code is new to me and I might be missing a 
few critical cases, but here is my best-effort attempt at an implementation: 
https://github.com/rishabhmaurya/lucene/pull/2
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


