zhongshanhao opened a new pull request, #13343: URL: https://github.com/apache/lucene/pull/13343
Sometimes `ImpactsDISI` adds more overhead than the skipping it enables, because it has to decode impacts and compute maximum scores.

Consider the query:

```
+title:a +title:b +title:c +title:d
```

The terms (a, b, c, d) all have a large doc frequency. The result set may be small, so that no minimum competitive score is ever produced, yet `BlockMaxConjunctionBulkScorer` and `BlockMaxConjunctionScorer` still try to get the max score at the beginning of `advance`.

This PR addresses the problem by avoiding the use of `ImpactsDISI` when no minimum competitive score has been set.

Here is the benchmark of this PR on wikimediumall, iter 4:

```
Task                         QPS baseline   StdDev  QPS my_modified_version   StdDev   Pct diff           p-value
AndHighHigh                         13.42   (6.3%)                    12.65   (5.5%)    -5.7% ( -16% -    6%)  0.131
OrHighHigh                          11.19  (12.1%)                    10.59  (10.1%)    -5.4% ( -24% -   19%)  0.441
HighTermMonthSort                 1940.04   (2.8%)                  1884.34   (6.9%)    -2.9% ( -12% -    7%)  0.390
AndHighMed                          72.28   (4.5%)                    70.47   (4.2%)    -2.5% ( -10% -    6%)  0.358
HighSloppyPhrase                     4.89   (4.9%)                     4.77   (4.9%)    -2.5% ( -11% -    7%)  0.428
BrowseMonthSSDVFacets                3.17   (2.3%)                     3.11   (1.7%)    -1.8% (  -5% -    2%)  0.145
OrNotHighHigh                      151.65   (3.2%)                   148.99   (3.2%)    -1.8% (  -7% -    4%)  0.385
OrHighNotMed                       272.56   (6.0%)                   267.90   (4.6%)    -1.7% ( -11% -    9%)  0.613
HighTerm                           283.60   (5.9%)                   278.93   (5.4%)    -1.6% ( -12% -   10%)  0.647
MedTermDayTaxoFacets                 6.97   (7.4%)                     6.85   (6.6%)    -1.6% ( -14% -   13%)  0.713
OrHighNotLow                       277.16   (4.9%)                   272.79   (4.6%)    -1.6% ( -10% -    8%)  0.597
OrHighNotHigh                      172.23   (4.6%)                   169.55   (3.6%)    -1.6% (  -9% -    6%)  0.552
HighIntervalsOrdered                 2.69   (2.3%)                     2.65   (2.2%)    -1.5% (  -5% -    3%)  0.299
OrHighMed                           73.52   (4.8%)                    72.66   (3.4%)    -1.2% (  -8% -    7%)  0.657
OrNotHighMed                       264.98   (3.3%)                   262.00   (4.2%)    -1.1% (  -8% -    6%)  0.639
OrHighLow                          184.70   (5.4%)                   182.84   (5.1%)    -1.0% ( -10% -   10%)  0.762
TermDTSort                         102.46   (2.4%)                   101.69   (3.6%)    -0.8% (  -6% -    5%)  0.700
MedSloppyPhrase                      3.36  (11.1%)                     3.34  (11.0%)    -0.5% ( -20% -   24%)  0.946
MedTerm                            446.96   (6.0%)                   445.14   (5.7%)    -0.4% ( -11% -   11%)  0.912
BrowseRandomLabelSSDVFacets          2.13   (4.9%)                     2.12   (5.0%)    -0.4% (  -9% -   10%)  0.904
Wildcard                            72.04   (1.9%)                    71.89   (1.8%)    -0.2% (  -3% -    3%)  0.859
Respell                             33.84   (0.7%)                    33.78   (0.9%)    -0.2% (  -1% -    1%)  0.730
LowTerm                            348.43   (4.9%)                   348.30   (3.2%)    -0.0% (  -7% -    8%)  0.988
LowSloppyPhrase                     13.10   (2.9%)                    13.10   (3.4%)     0.0% (  -6% -    6%)  0.999
HighTermTitleBDVSort                 4.63   (2.8%)                     4.63   (2.0%)     0.0% (  -4% -    5%)  0.986
AndHighMedDayTaxoFacets             32.99   (0.5%)                    33.00   (1.3%)     0.0% (  -1% -    1%)  0.942
OrNotHighLow                       323.13   (0.7%)                   323.29   (1.7%)     0.1% (  -2% -    2%)  0.951
LowSpanNear                         43.88   (0.7%)                    43.95   (1.4%)     0.2% (  -1% -    2%)  0.823
Prefix3                            263.17   (0.6%)                   263.60   (1.7%)     0.2% (  -2% -    2%)  0.839
AndHighHighDayTaxoFacets             7.51   (1.4%)                     7.53   (1.5%)     0.2% (  -2% -    3%)  0.850
Fuzzy1                              56.47   (1.2%)                    56.58   (1.1%)     0.2% (  -2% -    2%)  0.802
MedIntervalsOrdered                  6.49   (3.0%)                     6.52   (3.1%)     0.3% (  -5% -    6%)  0.865
PKLookup                           122.42   (2.6%)                   123.00   (2.5%)     0.5% (  -4% -    5%)  0.774
LowIntervalsOrdered                 17.71   (3.3%)                    17.79   (3.3%)     0.5% (  -5% -    7%)  0.820
MedPhrase                           73.52   (3.2%)                    74.02   (4.4%)     0.7% (  -6% -    8%)  0.781
OrHighMedDayTaxoFacets               3.17   (4.7%)                     3.19   (5.8%)     0.7% (  -9% -   11%)  0.836
HighSpanNear                         3.00   (0.8%)                     3.02   (2.5%)     0.7% (  -2% -    4%)  0.539
HighTermDayOfYearSort              196.79   (1.1%)                   198.64   (0.9%)     0.9% (  -1% -    2%)  0.132
AndHighLow                         256.98   (3.6%)                   259.46   (2.4%)     1.0% (  -4% -    7%)  0.617
HighPhrase                          24.61   (4.0%)                    24.89   (4.3%)     1.1% (  -6% -    9%)  0.667
MedSpanNear                         11.10   (1.5%)                    11.23   (3.6%)     1.1% (  -3% -    6%)  0.516
BrowseDateSSDVFacets                 0.73   (3.7%)                     0.74   (4.6%)     1.1% (  -6% -    9%)  0.665
Fuzzy2                              55.52   (1.8%)                    56.47   (1.3%)     1.7% (  -1% -    4%)  0.077
HighTermTitleSort                   53.70   (2.8%)                    54.69   (4.1%)     1.9% (  -4% -    8%)  0.402
LowPhrase                            8.64   (5.1%)                     8.89   (5.0%)     2.9% (  -6% -   13%)  0.363
BrowseDayOfYearSSDVFacets            2.77   (3.0%)                     2.91  (11.6%)     5.0% (  -9% -   20%)  0.346
IntNRQ                              39.03   (6.5%)                    41.43   (7.8%)     6.2% (  -7% -   21%)  0.175
BrowseRandomLabelTaxoFacets          2.60   (3.0%)                     3.09  (34.7%)    18.6% ( -18% -   58%)  0.234
BrowseDateTaxoFacets                 3.15   (2.7%)                     3.83  (38.6%)    21.6% ( -19% -   64%)  0.211
BrowseDayOfYearTaxoFacets            3.15   (2.7%)                     3.85  (38.6%)    22.0% ( -18% -   65%)  0.204
BrowseMonthTaxoFacets                3.24   (1.6%)                     4.83  (58.6%)    49.3% ( -10% -  111%)  0.060
```

The benchmark results do not seem to show any improvement. 🤔

Should I add relevant test cases?
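To make the idea concrete, here is a minimal, self-contained sketch of "only consult block maxima once a minimum competitive score exists". It is not the code in this PR and it does not use Lucene's classes: `LazyImpactsSketch`, its fixed `BLOCK_SIZE`, the per-block score array, and the `maxScoreLookups` counter are all hypothetical names invented for illustration, and the sketch assumes every doc ID falls inside one of the provided blocks.

```java
import java.util.Arrays;

/** Hypothetical stand-in for an impacts-aware iterator; not Lucene's ImpactsDISI. */
final class LazyImpactsSketch {
    static final int NO_MORE_DOCS = Integer.MAX_VALUE;
    private static final int BLOCK_SIZE = 4; // doc IDs per block, arbitrary for the sketch

    private final int[] docs;            // sorted matching doc IDs
    private final float[] blockMaxScore; // max score for doc IDs in [b*BLOCK_SIZE, (b+1)*BLOCK_SIZE)
    private float minCompetitiveScore = 0f;
    int maxScoreLookups = 0;             // how many times block maxima were consulted

    LazyImpactsSketch(int[] docs, float[] blockMaxScore) {
        this.docs = docs;
        this.blockMaxScore = blockMaxScore;
    }

    void setMinCompetitiveScore(float score) {
        this.minCompetitiveScore = score;
    }

    /** Returns the first matching doc ID >= target, or NO_MORE_DOCS. */
    int advance(int target) {
        // Only pay for block-max lookups once a competitive threshold exists.
        if (minCompetitiveScore > 0f) {
            target = skipNonCompetitiveBlocks(target);
        }
        return plainAdvance(target);
    }

    /** Moves target past blocks whose max score is below the threshold. */
    private int skipNonCompetitiveBlocks(int target) {
        int block = target / BLOCK_SIZE;
        while (block < blockMaxScore.length) {
            maxScoreLookups++;
            if (blockMaxScore[block] >= minCompetitiveScore) {
                return target;
            }
            block++;
            target = block * BLOCK_SIZE;
        }
        return NO_MORE_DOCS;
    }

    /** Plain advance over the doc ID list, with no score information involved. */
    private int plainAdvance(int target) {
        int idx = Arrays.binarySearch(docs, target);
        if (idx < 0) {
            idx = -idx - 1; // insertion point = first doc ID greater than target
        }
        return idx < docs.length ? docs[idx] : NO_MORE_DOCS;
    }
}
```

With the threshold still at zero, `advance` never touches `blockMaxScore`, which is the overhead this PR tries to avoid for queries that never produce a minimum competitive score. Once `setMinCompetitiveScore` is called with a positive value, whole blocks can be skipped as before.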