jpountz commented on PR #12055:
URL: https://github.com/apache/lucene/pull/12055#issuecomment-1438199152

   Thanks Greg for sharing more info about how it helped on Amazon Product 
search. Do your queries terminate early somehow? If so, I'd expect this change 
to help the most, since it can skip evaluating the tail of long postings.
   
   I like the idea of having multiple rewrite methods and possibly an `auto` 
method that tries to guess a sensible rewrite method given index statistics. It 
helps keep things simple without having a single rewrite method that needs to 
be heroic.
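
   To make the `auto` idea concrete, here is a rough sketch of what such a 
chooser could look like. Everything in it is an assumption for illustration: 
the strategy names, the thresholds and the `choose` signature are hypothetical 
and are not Lucene APIs or what this PR implements.

```java
public class AutoRewriteSketch {
    /** Hypothetical strategy names; these are not Lucene rewrite methods. */
    enum Strategy { PER_TERM_DISJUNCTION, EAGER_DOC_ID_SET, LAZY_POSTINGS_MERGE }

    // Hypothetical thresholds, chosen only to make the example readable.
    static final int SMALL_TERM_COUNT = 16;
    static final double DENSE_RATIO = 0.1;

    /**
     * Guess a strategy from coarse index statistics.
     *
     * @param termCount   number of terms the multi-term query expands to
     * @param totalDocFreq sum of document frequencies over those terms
     * @param maxDoc      number of documents in the segment
     */
    static Strategy choose(int termCount, long totalDocFreq, int maxDoc) {
        if (termCount <= SMALL_TERM_COUNT) {
            // Few terms: a plain disjunction over per-term postings is cheap.
            return Strategy.PER_TERM_DISJUNCTION;
        }
        double density = maxDoc == 0 ? 0.0 : (double) totalDocFreq / maxDoc;
        // Sparse matches: collecting everything up front costs little.
        // Dense matches: keep postings lazy so a consumer can skip the tails.
        return density < DENSE_RATIO
            ? Strategy.EAGER_DOC_ID_SET
            : Strategy.LAZY_POSTINGS_MERGE;
    }
}
```

   Each branch stays simple on its own, which is the point: no single rewrite 
method has to handle every index shape.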
   
   Reuse of postings enums looks OK to me. We could improve naming and add more 
comments to make it more obviously OK, but we only create up to 16 postings 
enums from scratch, reuse them otherwise, and make sure to never reuse a 
postings enum that is in the priority queue. The threshold of 16 looks 
conservative to me, so I wouldn't worry about NIOFSDirectory: if we had a 
problem with NIOFSDirectory at this threshold of 16, then many simple boolean 
queries would have problems too, which I don't think is the case in practice. 
The threshold on the minimum document frequency should also help here, e.g. a 
near-PK field would only accumulate hits into a DocIdSetBuilder and not pull 
postings enums?
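
   The reuse invariant described above can be sketched in plain Java. This is 
not the PR's code: `Cursor` is a hypothetical stand-in for a postings enum, a 
`BitSet` stands in for DocIdSetBuilder, and the `MIN_DOC_FREQ` value is made 
up. Note that this sketch keeps one spare cursor outside the queue, so it 
allocates at most 17 cursors; the exact bookkeeping in the PR may differ.

```java
import java.util.BitSet;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

public class PostingsReuseSketch {
    static final int MAX_QUEUED = 16;  // the threshold discussed above
    static final int MIN_DOC_FREQ = 4; // hypothetical value

    /** Hypothetical stand-in for a postings enum: a resettable view over doc IDs. */
    static final class Cursor {
        int[] docs = new int[0];
        Cursor reset(int[] newDocs) { docs = newDocs; return this; }
        int docFreq() { return docs.length; }
        void drainInto(BitSet target) { for (int d : docs) target.set(d); }
    }

    static final class Result {
        final BitSet eagerHits = new BitSet(); // stand-in for DocIdSetBuilder
        // Min-heap on docFreq, so the head is the shortest queued postings list.
        final PriorityQueue<Cursor> queued =
            new PriorityQueue<>(Comparator.comparingInt(Cursor::docFreq));
        int cursorsCreated = 0;
    }

    static Result collect(List<int[]> postingsPerTerm) {
        Result r = new Result();
        Cursor spare = null; // only ever holds a cursor that is NOT in the queue
        for (int[] docs : postingsPerTerm) {
            if (docs.length < MIN_DOC_FREQ) {
                // Rare term (e.g. a near-PK field): record hits, use no cursor.
                for (int d : docs) r.eagerHits.set(d);
                continue;
            }
            Cursor c;
            if (spare != null) {
                c = spare;   // reuse a cursor that already left the queue
                spare = null;
            } else {
                c = new Cursor();
                r.cursorsCreated++;
            }
            r.queued.add(c.reset(docs)); // reset before add: heap key is stable
            if (r.queued.size() > MAX_QUEUED) {
                Cursor shortest = r.queued.poll();
                shortest.drainInto(r.eagerHits);
                spare = shortest; // out of the queue, so safe to reuse
            }
        }
        return r;
    }
}
```

   A cursor is only reset after it has been polled from the queue, so nothing 
in the priority queue is ever mutated underneath it, and the number of 
allocations stays bounded no matter how many terms the query expands to.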


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

