zacharymorn commented on PR #12194: URL: https://github.com/apache/lucene/pull/12194#issuecomment-1464741062
Thanks @mdmarshmallow for working on it! Btw, I just pushed a commit (https://github.com/apache/lucene/pull/12194/commits/f78182bae7e92b23136b5975dbb9d3199e5e3065) that fixes some bugs identified by randomized tests. You may want to pull it for your work, especially if you are using the `BitSet#nextClearBit` method.

> I suspect we may not really see any benefit though if the DISI can only expose the next non-matching doc within its current block. I think the real advantage here would come from being able to actually skip blocks in the DISI, which would rely on knowing that there are actually multiple consecutive "dense" blocks in a DISI. But... if our only way to know there are multiple consecutive "dense" blocks involves decoding those blocks anyway, maybe there isn't much gain to be had? Hmm... not sure. Excited to see what we learn though!

@gsmiller Yeah, I'm guessing that as well, especially for postings and sparse / dense blocks, since it would take at least one pass to identify the next candidate. I have also tried caching the result, and would like to see whether that helps and how it performs under benchmark.

--
This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For queries about this service, please contact Infrastructure at: us...@infra.apache.org