[GitHub] [lucene] zacharymorn commented on pull request #12194: [GITHUB-11915] [Discussion Only] Make Lucene smarter about long runs of matches via new API on DISI

via GitHub Fri, 10 Mar 2023 00:23:43 -0800


zacharymorn commented on PR #12194:
URL: https://github.com/apache/lucene/pull/12194#issuecomment-1463441417


   Thanks @gsmiller for your review and suggestions!
   
   > What about updating FixedBitSet#or(disi) to use this? That's used when 
rewriting MultiTermQuery instances, and I would think we'd see a performance 
improvement to prefix3 and wildcard benchmark tasks. I guess the thinking there 
would be to "flip" an entire long to -1L at once if a run of 64 docs is 
included in a dense DISI sent to the or (and then subsequently advance beyond 
the "dense run" in the DISI).
   > 
   > This idea will probably have diminishing returns after 
https://github.com/apache/lucene/pull/12055, since we now prioritize only 
building the more sparse iterators into the bitset upfront, but it could still 
help. If you really want to look for impact, try switching RegexpQuery and 
PrefixQuery to use CONSTANT_SCORE_REWRITE instead of 
CONSTANT_SCORE_BLENDED_REWRITE (in their constructors). That should highlight 
an impact in the benchmark.
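
The "flip an entire long to -1L" idea in the quoted suggestion can be sketched roughly as follows. This is only an illustration over a plain `long[]`, not Lucene's actual `FixedBitSet#or` code; the class `RunOr` and the method `setRun` are hypothetical names, and the sketch assumes the caller already knows the run `[from, to)` of consecutive matching docs.

```java
class RunOr {
    /**
     * Sets bits [from, to) in a long[]-backed bitset (hypothetical helper).
     * Words fully covered by the run are written as -1L in one store;
     * only the first and last words need masking.
     */
    static void setRun(long[] words, int from, int to) {
        if (from >= to) {
            return; // empty run, nothing to set
        }
        int firstWord = from >> 6;     // word holding bit 'from'
        int lastWord = (to - 1) >> 6;  // word holding bit 'to - 1'

        // Java uses only the low 6 bits of a long shift count,
        // so these masks work for any non-negative index.
        long startMask = -1L << from;  // bits (from % 64)..63
        long endMask = -1L >>> -to;    // bits 0..((to - 1) % 64)

        if (firstWord == lastWord) {
            words[firstWord] |= startMask & endMask;
            return;
        }
        words[firstWord] |= startMask;
        for (int i = firstWord + 1; i < lastWord; i++) {
            words[i] = -1L;            // whole 64-doc run in one write
        }
        words[lastWord] |= endMask;
    }
}
```

For a run such as `[60, 130)`, only the two boundary words are masked and the middle word is set in a single store, which is where the saving over per-doc `set` calls would come from.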
   
   I think @mdmarshmallow might be working on this, as per 
https://github.com/apache/lucene/issues/11915#issuecomment-1459502217. As part 
of this PR, I've also added a new `BitSet#nextClearBit` API, just like the one 
in the JDK's `BitSet`, which might be useful here as well.
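
As a rough illustration of why a `nextClearBit`-style method helps with runs, here is a minimal sketch using only the JDK's `java.util.BitSet` (its `nextSetBit` and `nextClearBit` methods), not the Lucene API added in this PR. The class `RunFinder` and the method `runs` are hypothetical names made up for the example.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;

class RunFinder {
    /**
     * Returns every maximal run of set bits as a {start, endExclusive} pair.
     * nextClearBit(start) gives the end of the run beginning at 'start'
     * without visiting each bit individually from the caller's side.
     */
    static List<int[]> runs(BitSet bits) {
        List<int[]> out = new ArrayList<>();
        int start = bits.nextSetBit(0);          // -1 when no bit is set
        while (start >= 0) {
            int end = bits.nextClearBit(start);  // first clear bit after the run
            out.add(new int[] {start, end});
            start = bits.nextSetBit(end);        // jump to the next run, if any
        }
        return out;
    }
}
```

Each `{start, endExclusive}` pair could then be handed to a range-based setter, so a long run costs one call instead of one call per doc.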
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

