zhongshanhao opened a new pull request, #13343:
URL: https://github.com/apache/lucene/pull/13343
Sometimes `ImpactsDISI` adds more overhead than it saves through skipping,
because it has to decode impacts and compute the maximum score.
Consider the query:
```
+title:a +title:b +title:c +title:d
```
Each of these terms (a, b, c, d) has a high document frequency.
Even if the result set is small and no minimum competitive score is ever
produced, `BlockMaxConjunctionBulkScorer` and `BlockMaxConjunctionScorer`
still try to compute the max score at the beginning of `advance`.
This PR addresses that by avoiding `ImpactsDISI` until a minimum
competitive score has been set.
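To illustrate the idea, here is a minimal toy sketch in plain Java. It is not Lucene code and not the actual change in this PR: the class `LazyBlockMaxSketch`, its fixed block size, and the `maxScoreLookups` counter are all hypothetical names made up for the example. It only shows the general pattern of skipping the per-block max-score lookup while no minimum competitive score is set.

```java
/**
 * Toy model (an assumption for illustration, not Lucene's API):
 * doc IDs are grouped into fixed-size blocks, and each block has a
 * precomputed upper bound on the score of any doc inside it.
 */
class LazyBlockMaxSketch {
    static final int BLOCK_SIZE = 4;      // hypothetical block size

    private final float[] blockMaxScores; // hypothetical per-block score bounds
    private final int maxDoc;             // one past the last doc ID
    private float minCompetitiveScore = 0f; // 0 means "no threshold set yet"
    int maxScoreLookups = 0;              // counts how often block bounds are consulted

    LazyBlockMaxSketch(float[] blockMaxScores) {
        this.blockMaxScores = blockMaxScores;
        this.maxDoc = blockMaxScores.length * BLOCK_SIZE;
    }

    void setMinCompetitiveScore(float minScore) {
        this.minCompetitiveScore = minScore;
    }

    /** Returns the first doc >= target that may be competitive, or maxDoc if none. */
    int advance(int target) {
        if (minCompetitiveScore <= 0f) {
            // No threshold yet: every doc is competitive, so skipping cannot
            // help and the max-score lookup is bypassed entirely.
            return Math.min(target, maxDoc);
        }
        int doc = target;
        while (doc < maxDoc) {
            int block = doc / BLOCK_SIZE;
            maxScoreLookups++; // stands in for decoding impacts and computing a max score
            if (blockMaxScores[block] >= minCompetitiveScore) {
                return doc; // this block may still contain a competitive hit
            }
            doc = (block + 1) * BLOCK_SIZE; // skip the whole non-competitive block
        }
        return maxDoc;
    }
}
```

In this sketch, calling `advance` before `setMinCompetitiveScore` never touches the block bounds, while calls made after a threshold is set pay one lookup per block visited.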
Here are the benchmark results for this PR on wikimediumall.
iter 4:
```
Task                         QPS baseline   StdDev  QPS my_modified_version   StdDev  Pct diff                  p-value
AndHighHigh                         13.42   (6.3%)                    12.65   (5.5%)     -5.7%  ( -16% - 6%)    0.131
OrHighHigh                          11.19  (12.1%)                    10.59  (10.1%)     -5.4%  ( -24% - 19%)   0.441
HighTermMonthSort                 1940.04   (2.8%)                  1884.34   (6.9%)     -2.9%  ( -12% - 7%)    0.390
AndHighMed                          72.28   (4.5%)                    70.47   (4.2%)     -2.5%  ( -10% - 6%)    0.358
HighSloppyPhrase                     4.89   (4.9%)                     4.77   (4.9%)     -2.5%  ( -11% - 7%)    0.428
BrowseMonthSSDVFacets                3.17   (2.3%)                     3.11   (1.7%)     -1.8%  ( -5% - 2%)     0.145
OrNotHighHigh                      151.65   (3.2%)                   148.99   (3.2%)     -1.8%  ( -7% - 4%)     0.385
OrHighNotMed                       272.56   (6.0%)                   267.90   (4.6%)     -1.7%  ( -11% - 9%)    0.613
HighTerm                           283.60   (5.9%)                   278.93   (5.4%)     -1.6%  ( -12% - 10%)   0.647
MedTermDayTaxoFacets                 6.97   (7.4%)                     6.85   (6.6%)     -1.6%  ( -14% - 13%)   0.713
OrHighNotLow                       277.16   (4.9%)                   272.79   (4.6%)     -1.6%  ( -10% - 8%)    0.597
OrHighNotHigh                      172.23   (4.6%)                   169.55   (3.6%)     -1.6%  ( -9% - 6%)     0.552
HighIntervalsOrdered                 2.69   (2.3%)                     2.65   (2.2%)     -1.5%  ( -5% - 3%)     0.299
OrHighMed                           73.52   (4.8%)                    72.66   (3.4%)     -1.2%  ( -8% - 7%)     0.657
OrNotHighMed                       264.98   (3.3%)                   262.00   (4.2%)     -1.1%  ( -8% - 6%)     0.639
OrHighLow                          184.70   (5.4%)                   182.84   (5.1%)     -1.0%  ( -10% - 10%)   0.762
TermDTSort                         102.46   (2.4%)                   101.69   (3.6%)     -0.8%  ( -6% - 5%)     0.700
MedSloppyPhrase                      3.36  (11.1%)                     3.34  (11.0%)     -0.5%  ( -20% - 24%)   0.946
MedTerm                            446.96   (6.0%)                   445.14   (5.7%)     -0.4%  ( -11% - 11%)   0.912
BrowseRandomLabelSSDVFacets          2.13   (4.9%)                     2.12   (5.0%)     -0.4%  ( -9% - 10%)    0.904
Wildcard                            72.04   (1.9%)                    71.89   (1.8%)     -0.2%  ( -3% - 3%)     0.859
Respell                             33.84   (0.7%)                    33.78   (0.9%)     -0.2%  ( -1% - 1%)     0.730
LowTerm                            348.43   (4.9%)                   348.30   (3.2%)     -0.0%  ( -7% - 8%)     0.988
LowSloppyPhrase                     13.10   (2.9%)                    13.10   (3.4%)      0.0%  ( -6% - 6%)     0.999
HighTermTitleBDVSort                 4.63   (2.8%)                     4.63   (2.0%)      0.0%  ( -4% - 5%)     0.986
AndHighMedDayTaxoFacets             32.99   (0.5%)                    33.00   (1.3%)      0.0%  ( -1% - 1%)     0.942
OrNotHighLow                       323.13   (0.7%)                   323.29   (1.7%)      0.1%  ( -2% - 2%)     0.951
LowSpanNear                         43.88   (0.7%)                    43.95   (1.4%)      0.2%  ( -1% - 2%)     0.823
Prefix3                            263.17   (0.6%)                   263.60   (1.7%)      0.2%  ( -2% - 2%)     0.839
AndHighHighDayTaxoFacets             7.51   (1.4%)                     7.53   (1.5%)      0.2%  ( -2% - 3%)     0.850
Fuzzy1                              56.47   (1.2%)                    56.58   (1.1%)      0.2%  ( -2% - 2%)     0.802
MedIntervalsOrdered                  6.49   (3.0%)                     6.52   (3.1%)      0.3%  ( -5% - 6%)     0.865
PKLookup                           122.42   (2.6%)                   123.00   (2.5%)      0.5%  ( -4% - 5%)     0.774
LowIntervalsOrdered                 17.71   (3.3%)                    17.79   (3.3%)      0.5%  ( -5% - 7%)     0.820
MedPhrase                           73.52   (3.2%)                    74.02   (4.4%)      0.7%  ( -6% - 8%)     0.781
OrHighMedDayTaxoFacets               3.17   (4.7%)                     3.19   (5.8%)      0.7%  ( -9% - 11%)    0.836
HighSpanNear                         3.00   (0.8%)                     3.02   (2.5%)      0.7%  ( -2% - 4%)     0.539
HighTermDayOfYearSort              196.79   (1.1%)                   198.64   (0.9%)      0.9%  ( -1% - 2%)     0.132
AndHighLow                         256.98   (3.6%)                   259.46   (2.4%)      1.0%  ( -4% - 7%)     0.617
HighPhrase                          24.61   (4.0%)                    24.89   (4.3%)      1.1%  ( -6% - 9%)     0.667
MedSpanNear                         11.10   (1.5%)                    11.23   (3.6%)      1.1%  ( -3% - 6%)     0.516
BrowseDateSSDVFacets                 0.73   (3.7%)                     0.74   (4.6%)      1.1%  ( -6% - 9%)     0.665
Fuzzy2                              55.52   (1.8%)                    56.47   (1.3%)      1.7%  ( -1% - 4%)     0.077
HighTermTitleSort                   53.70   (2.8%)                    54.69   (4.1%)      1.9%  ( -4% - 8%)     0.402
LowPhrase                            8.64   (5.1%)                     8.89   (5.0%)      2.9%  ( -6% - 13%)    0.363
BrowseDayOfYearSSDVFacets            2.77   (3.0%)                     2.91  (11.6%)      5.0%  ( -9% - 20%)    0.346
IntNRQ                              39.03   (6.5%)                    41.43   (7.8%)      6.2%  ( -7% - 21%)    0.175
BrowseRandomLabelTaxoFacets          2.60   (3.0%)                     3.09  (34.7%)     18.6%  ( -18% - 58%)   0.234
BrowseDateTaxoFacets                 3.15   (2.7%)                     3.83  (38.6%)     21.6%  ( -19% - 64%)   0.211
BrowseDayOfYearTaxoFacets            3.15   (2.7%)                     3.85  (38.6%)     22.0%  ( -18% - 65%)   0.204
BrowseMonthTaxoFacets                3.24   (1.6%)                     4.83  (58.6%)     49.3%  ( -10% - 111%)  0.060
```
The benchmark results do not seem to show any improvement. 🤔
Should I add relevant test cases?