jpountz commented on PR #15160: URL: https://github.com/apache/lucene/pull/15160#issuecomment-3258824979
This is due to the fact that this query needs to decode more integers overall. Conceptually, moving to a block size of 256 is similar to keeping a block size of 128 but always decoding two consecutive blocks together. Previously, there were cases where we would decode a 128-doc block but skip the next one because its impacts reported that none of its docs had a competitive score. With blocks of size 256 this skipping can no longer happen: we always decode or skip at least 256 postings, so all queries need to decode more integers.

On the other hand, there are also efficiency gains from working on larger blocks. For instance, all the decision making we do to figure out whether a block should be decoded (such as computing the block's maximum score) gets amortized over more doc IDs. With some queries the efficiency gains outweigh the additional decoding overhead; with other queries it's the opposite, and `FilteredTerm` seems to be in the latter camp.

We could look into evaluating `FilteredTerm` via `BlockMaxConjunctionScorer` instead of a `DefaultBulkScorer` on top of a filtered `ImpactsDISI`; this may help `FilteredTerm` a bit by vectorizing more of its execution logic.
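As a rough illustration of the skip-vs-decode trade-off described above, here is a minimal sketch (hypothetical names, not Lucene's actual `ImpactsDISI` or bulk-scorer code): with 128-doc blocks, a non-competitive second block can be skipped on its own; once two blocks are fused into a 256-doc block, the merged maximum score forces decoding all 256 postings, but the per-block bookkeeping is amortized over twice as many docs.

```java
// Hypothetical sketch of block-max skipping, not Lucene's actual implementation.
final class BlockSkipSketch {

  /** Stand-in for per-block impact metadata: the max score of any doc in the block. */
  record BlockImpacts(float maxScore) {}

  /** A block must be decoded only if it may contain a competitive hit. */
  static boolean shouldDecode(BlockImpacts impacts, float minCompetitiveScore) {
    return impacts.maxScore() >= minCompetitiveScore;
  }

  public static void main(String[] args) {
    float minCompetitive = 3.0f;

    // Two consecutive 128-doc blocks: the first is competitive, the second is not.
    BlockImpacts first128 = new BlockImpacts(4.2f);
    BlockImpacts second128 = new BlockImpacts(1.1f);

    // Block size 128: decode the first block, skip the second entirely.
    System.out.println("128-doc blocks: decode first = " + shouldDecode(first128, minCompetitive)
        + ", decode second = " + shouldDecode(second128, minCompetitive));

    // Block size 256: the two blocks are fused, so the merged max score forces decoding
    // all 256 postings even though half of them could previously have been skipped.
    BlockImpacts fused256 = new BlockImpacts(Math.max(first128.maxScore(), second128.maxScore()));
    System.out.println("256-doc block: decode = " + shouldDecode(fused256, minCompetitive)
        + " (covers all 256 postings)");
  }
}
```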
