jpountz opened a new pull request, #13941: URL: https://github.com/apache/lucene/pull/13941
`MaxScoreBulkScorer` can sometimes compute windows that contain few candidate matches, so that more time is spent evaluating maximum scores per window than evaluating the candidate matches in that window. This PR introduces a heuristic that tries to require at least 32 candidate matches per clause per window, to amortize the per-window overhead. This results in a speedup for the `OrMany` task (13.25 to 18.87 QPS, +42.4%).

```
Task                   QPS baseline   StdDev  QPS my_modified_version   StdDev  Pct diff                   p-value
OrHighLow                    830.99   (2.8%)                   821.55   (2.0%)     -1.1%  ( -5% - 3%)        0.236
CountAndHighMed              149.53   (3.2%)                   148.06   (1.8%)     -1.0%  ( -5% - 4%)        0.335
CountAndHighHigh              49.23   (3.3%)                    48.85   (2.1%)     -0.8%  ( -6% - 4%)        0.483
OrHighRare                   277.29   (5.9%)                   275.20   (5.1%)     -0.8%  ( -11% - 10%)      0.728
LowTerm                     1006.28   (2.7%)                   999.28   (2.7%)     -0.7%  ( -5% - 4%)        0.512
OrHighNotMed                 461.91   (2.0%)                   459.09   (3.1%)     -0.6%  ( -5% - 4%)        0.556
AndHighMed                   205.48   (2.0%)                   204.44   (2.2%)     -0.5%  ( -4% - 3%)        0.547
HighTermTitleBDVSort          20.30   (4.4%)                    20.22   (4.0%)     -0.4%  ( -8% - 8%)        0.798
OrHighNotLow                 483.66   (2.2%)                   481.97   (4.3%)     -0.3%  ( -6% - 6%)        0.794
OrNotHighHigh                283.34   (2.3%)                   282.47   (2.0%)     -0.3%  ( -4% - 4%)        0.714
OrNotHighLow                1058.78   (3.5%)                  1055.94   (2.6%)     -0.3%  ( -6% - 6%)        0.826
AndHighHigh                   78.53   (1.8%)                    78.33   (1.9%)     -0.3%  ( -3% - 3%)        0.721
OrHighHigh                    77.35   (1.6%)                    77.23   (1.6%)     -0.2%  ( -3% - 3%)        0.812
OrNotHighMed                 314.20   (2.9%)                   313.96   (2.7%)     -0.1%  ( -5% - 5%)        0.944
And2Terms2StopWords          155.15   (2.9%)                   155.07   (1.8%)     -0.0%  ( -4% - 4%)        0.961
OrHighNotHigh                285.50   (2.5%)                   285.63   (1.8%)      0.0%  ( -4% - 4%)        0.958
CountOrHighMed               104.73   (1.6%)                   104.95   (1.6%)      0.2%  ( -2% - 3%)        0.744
And3Terms                    167.95   (3.2%)                   168.63   (2.6%)      0.4%  ( -5% - 6%)        0.729
IntNRQ                        90.83   (4.7%)                    91.26  (14.9%)      0.5%  ( -18% - 21%)      0.913
OrHighMed                    200.80   (2.1%)                   201.78   (1.7%)      0.5%  ( -3% - 4%)        0.511
HighTermTitleSort            149.37   (2.5%)                   150.20   (2.0%)      0.6%  ( -3% - 5%)        0.528
CountOrHighHigh               49.93   (1.4%)                    50.24   (1.5%)      0.6%  ( -2% - 3%)        0.270
AndHighLow                  1079.98   (2.6%)                  1086.73   (3.6%)      0.6%  ( -5% - 7%)        0.613
Or2Terms2StopWords           158.09   (4.1%)                   159.09   (2.4%)      0.6%  ( -5% - 7%)        0.630
HighTerm                     515.68   (2.2%)                   519.07   (2.6%)      0.7%  ( -4% - 5%)        0.490
HighTermMonthSort           3222.57   (3.4%)                  3244.84   (2.9%)      0.7%  ( -5% - 7%)        0.576
MedTerm                      582.99   (2.5%)                   587.15   (2.5%)      0.7%  ( -4% - 5%)        0.468
Wildcard                      82.76   (4.3%)                    83.45   (3.8%)      0.8%  ( -6% - 9%)        0.599
AndStopWords                  30.49   (4.7%)                    30.77   (2.4%)      0.9%  ( -5% - 8%)        0.537
HighTermDayOfYearSort        813.54   (3.4%)                   821.97   (2.1%)      1.0%  ( -4% - 6%)        0.355
PKLookup                     272.42   (2.7%)                   275.38   (2.5%)      1.1%  ( -4% - 6%)        0.288
Or3Terms                     166.90   (4.3%)                   168.77   (2.7%)      1.1%  ( -5% - 8%)        0.424
OrStopWords                   33.64   (6.5%)                    34.29   (3.2%)      1.9%  ( -7% - 12%)       0.335
TermDTSort                   344.04   (6.6%)                   351.30   (5.3%)      2.1%  ( -9% - 15%)       0.371
Prefix3                      123.31   (3.5%)                   126.03   (6.6%)      2.2%  ( -7% - 12%)       0.286
CountTerm                   8267.89   (4.4%)                  8628.08   (4.7%)      4.4%  ( -4% - 14%)       0.014
OrMany                        13.25   (3.7%)                    18.87   (3.7%)     42.4%  ( 33% - 51%)       0.000
```
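
To make the "at least 32 candidate matches per clause per window" idea concrete, here is a minimal sketch of one way such a heuristic could work. It is not the code from this PR: the class `WindowSizeSketch`, its methods, the doubling policy, and the `MAX_MIN_WINDOW_SIZE` cap are all assumptions made for illustration. The sketch tracks how many candidates past windows contained and, while the running average is below 32 per clause, enforces a growing minimum window size on top of whatever upper bound the block boundaries propose.

```java
/**
 * Hypothetical sketch, NOT the code from this PR: grows a minimum window size
 * while windows have averaged fewer than 32 candidates per clause so far.
 */
final class WindowSizeSketch {
  /** Target from the PR description: 32 candidate matches per clause per window. */
  static final int TARGET_PER_CLAUSE = 32;
  /** Assumed cap so that doubling cannot overflow an int. */
  static final int MAX_MIN_WINDOW_SIZE = 1 << 30;

  private long numWindows;    // windows scored so far
  private long numCandidates; // candidate matches seen across those windows
  private int minWindowSize = 1;

  /** Records how many candidate matches the window that just finished contained. */
  void onWindowFinished(int candidatesInWindow) {
    numWindows++;
    numCandidates += candidatesInWindow;
  }

  /**
   * Returns the exclusive upper bound to use for the next window, given the bound
   * proposed from block boundaries. The proposed window is never shrunk.
   */
  int adjustWindowMax(int windowMin, int proposedWindowMax, int numClauses) {
    long threshold = numWindows * TARGET_PER_CLAUSE * numClauses;
    if (numCandidates < threshold) {
      // Windows have been too sparse on average: double the minimum size (assumed policy).
      minWindowSize = (int) Math.min(minWindowSize * 2L, MAX_MIN_WINDOW_SIZE);
    }
    long minWindowMax = Math.min((long) windowMin + minWindowSize, Integer.MAX_VALUE);
    return (int) Math.max(proposedWindowMax, minWindowMax);
  }

  public static void main(String[] args) {
    WindowSizeSketch sketch = new WindowSizeSketch();
    int windowMin = 0;
    // Simulate sparse windows over 4 clauses: each proposal spans 2 docs and yields 3 candidates.
    for (int i = 0; i < 5; i++) {
      int windowMax = sketch.adjustWindowMax(windowMin, windowMin + 2, 4);
      System.out.println("window [" + windowMin + ", " + windowMax + ")");
      sketch.onWindowFinished(3);
      windowMin = windowMax;
    }
  }
}
```

In this sketch the threshold scales with both the number of windows and the number of clauses, so a query with many clauses (as in `OrMany`) needs proportionally more candidates per window before the minimum size stops growing.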