dsmiley commented on PR #14357:
URL: https://github.com/apache/lucene/pull/14357#issuecomment-2729965328

   An aside:  `org.apache.lucene.search.DisjunctionScorer.TwoPhase#matches` 
looks kind of sad, in that each matches() call is going to build a priority 
queue of "unverified matches" (DisiWrapper holding a TwoPhaseIterator).  It seems 
strange to populate one for each visited doc instead of maintaining a fixed, 
pre-sorted array of them, since we know up front which clauses have TPIs.  Each 
DisiWrapper could hold a TPI index (ordered by match cost) into an array of 
TwoPhaseIterators.  The unverified matches selected per matches() call could be 
noted via a bitmask/bitset that is cheap to set, clear, and iterate over.  Or we 
could just use an array of DisiWrapper that is cleared & filled.  Either way, 
there would be no matchCost comparisons or heap manipulation.
   Not sure if I'm over-optimizing here.  The use case bringing me here has only 
one TPI, and its approximation is all docs.
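
   To make the bitmask idea concrete, here is a minimal sketch. It is not 
Lucene code: `Verifier` is a hypothetical stand-in for TwoPhaseIterator, 
`PreSortedVerifiers` and its method names are made up for illustration, and 
the sketch assumes at most 64 two-phase clauses so that a single `long` can 
serve as the bitmask. It also ignores scoring and clauses that need no 
verification.

```java
import java.util.Arrays;
import java.util.Comparator;

/** Hypothetical stand-in for TwoPhaseIterator; only what the sketch needs. */
interface Verifier {
  boolean matches();  // expensive confirmation for the current candidate doc
  float matchCost();  // estimated cost of calling matches()
}

/** Illustrative only: verifiers sorted once by cost, selected per doc via a bitmask. */
final class PreSortedVerifiers {
  private final Verifier[] byCost; // sorted once at construction, cheapest first
  private long pending;            // bit i set => byCost[i] is unverified on this doc

  PreSortedVerifiers(Verifier[] verifiers) {
    if (verifiers.length > Long.SIZE) {
      throw new IllegalArgumentException("sketch supports at most 64 verifiers");
    }
    byCost = verifiers.clone();
    Arrays.sort(byCost, Comparator.comparingDouble(Verifier::matchCost));
  }

  /** Setup-time lookup; a wrapper would store this slot instead of searching per doc. */
  int slotOf(Verifier v) {
    for (int i = 0; i < byCost.length; i++) {
      if (byCost[i] == v) {
        return i;
      }
    }
    throw new IllegalArgumentException("unknown verifier");
  }

  /** Forget the previous doc's selection. */
  void clear() {
    pending = 0L;
  }

  /** Note that the clause in this slot is positioned on the current doc. */
  void markUnverified(int slot) {
    pending |= 1L << slot;
  }

  /** Visits set bits from lowest slot (cheapest) upward; stops at the first match. */
  boolean anyMatches() {
    long bits = pending;
    while (bits != 0) {
      int slot = Long.numberOfTrailingZeros(bits);
      if (byCost[slot].matches()) {
        return true;
      }
      bits &= bits - 1; // clear the lowest set bit
    }
    return false;
  }
}
```

   Per doc, the caller would clear(), call markUnverified(slot) for each 
clause positioned on that doc, then call anyMatches(). The cost ordering is 
paid once in the constructor, so the per-doc work is a few bit operations 
plus the matches() calls themselves.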


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

