benwtrent commented on issue #15132:
URL: https://github.com/apache/lucene/issues/15132#issuecomment-3237099484

   > +1 to such a heuristic. I assume we'd fall back to fully evaluating the filter if post filtering removes too many docs?
   
   This gets tricky because we cannot consume iterators twice. But yes, we would fall back.
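
A minimal sketch of that fallback, assuming the filter can hand out a fresh single-pass iterator on demand. Everything here (`FilterFallbackSketch`, `Outcome`, the `Supplier`-based filter) is hypothetical and uses plain JDK types rather than Lucene's iterator or collector classes. The post-filter pass consumes one iterator; if fewer than `k` candidates survive, a second, fresh iterator is materialized into a `BitSet` for the fully filtered search.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;
import java.util.PrimitiveIterator;
import java.util.function.Supplier;

// Hypothetical sketch; none of these names come from Lucene.
final class FilterFallbackSketch {

    static final class Outcome {
        final List<Integer> survivors; // candidates that passed the post-filter
        final BitSet fullFilter;       // null unless we had to fall back

        Outcome(List<Integer> survivors, BitSet fullFilter) {
            this.survivors = survivors;
            this.fullFilter = fullFilter;
        }
    }

    // Intersects ascending candidate doc ids with an ascending, single-pass
    // iterator of accepted doc ids. The iterator is advanced by this call.
    static List<Integer> postFilter(int[] sortedCandidates, PrimitiveIterator.OfInt accepted) {
        List<Integer> kept = new ArrayList<>();
        int current = -1; // assumes doc ids are non-negative
        for (int doc : sortedCandidates) {
            while (current < doc && accepted.hasNext()) {
                current = accepted.nextInt();
            }
            if (current == doc) {
                kept.add(doc);
            }
        }
        return kept;
    }

    static Outcome run(int[] sortedCandidates,
                       Supplier<PrimitiveIterator.OfInt> freshFilter,
                       int k) {
        List<Integer> survivors = postFilter(sortedCandidates, freshFilter.get());
        if (survivors.size() >= k) {
            return new Outcome(survivors, null);
        }
        // The first iterator is spent, so the fallback needs a new one.
        BitSet bits = new BitSet();
        PrimitiveIterator.OfInt again = freshFilter.get();
        while (again.hasNext()) {
            bits.set(again.nextInt());
        }
        return new Outcome(survivors, bits);
    }
}
```

The sketch leaves out the filtered search that would consume `fullFilter`; it only shows why the fallback path cannot reuse the iterator from the post-filter pass.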
   
   > I'm not sure about this. If you have a flat format, it should never try the post-filter approach? So this has to belong to the format if we want to make the right decision?
   
   Maybe. The difficult part is that post-filtering requires gathering more than `k` results in the collector, so we need to either:
   
    - update our collector interface to allow more results to be added (I guess we can force this a bit by using push instead of overflowInsert), or
    - use two collectors (one gathers more than `k` results and then feeds them to the passed collector).
   
   More thought is needed here. Maybe just calling "push" until the oversampled `k` is hit makes sense (e.g. via another collector wrapper that delegates to push).
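
To make the two-collector option concrete, here is a rough sketch under stated assumptions. `HitCollector`, `Hit`, `TopKCollector`, and `OversamplingCollector` are all hypothetical stand-ins written against plain JDK types; they are not Lucene's collector API, and `collect` is not the push/overflowInsert pair mentioned above. The wrapper buffers the best `oversampledK` unfiltered hits, then applies the filter and forwards the survivors to the collector that was passed in.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.PriorityQueue;
import java.util.function.IntPredicate;

// Hypothetical minimal collector interface; not Lucene's.
interface HitCollector {
    void collect(int doc, float score);
}

final class Hit {
    final int doc;
    final float score;

    Hit(int doc, float score) {
        this.doc = doc;
        this.score = score;
    }
}

// Keeps the k highest-scoring hits in a min-heap.
final class TopKCollector implements HitCollector {
    private final int k;
    private final PriorityQueue<Hit> heap =
        new PriorityQueue<>((a, b) -> Float.compare(a.score, b.score));

    TopKCollector(int k) {
        if (k <= 0) {
            throw new IllegalArgumentException("k must be positive");
        }
        this.k = k;
    }

    @Override
    public void collect(int doc, float score) {
        if (heap.size() < k) {
            heap.add(new Hit(doc, score));
        } else if (score > heap.peek().score) {
            heap.poll(); // evict the current worst hit
            heap.add(new Hit(doc, score));
        }
    }

    // Returns the retained hits, best score first.
    List<Hit> results() {
        List<Hit> sorted = new ArrayList<>(heap);
        sorted.sort((a, b) -> Float.compare(b.score, a.score));
        return sorted;
    }
}

// Gathers more than k unfiltered hits, then post-filters into the delegate.
final class OversamplingCollector implements HitCollector {
    private final TopKCollector buffer;
    private final HitCollector delegate;
    private final IntPredicate acceptDocs;

    OversamplingCollector(HitCollector delegate, int oversampledK, IntPredicate acceptDocs) {
        this.buffer = new TopKCollector(oversampledK);
        this.delegate = delegate;
        this.acceptDocs = acceptDocs;
    }

    @Override
    public void collect(int doc, float score) {
        buffer.collect(doc, score); // no filtering during the search itself
    }

    // Applies the filter and returns how many hits reached the delegate.
    int finish() {
        int passed = 0;
        for (Hit h : buffer.results()) {
            if (acceptDocs.test(h.doc)) {
                delegate.collect(h.doc, h.score);
                passed++;
            }
        }
        return passed;
    }
}
```

The count returned by `finish()` is what a caller could compare against `k` to decide whether post-filtering removed too many docs and a fallback is needed.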


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

