zacharymorn commented on code in PR #1018: URL: https://github.com/apache/lucene/pull/1018#discussion_r922641944
########## lucene/core/src/java/org/apache/lucene/search/BooleanWeight.java: ########## @@ -191,6 +191,69 @@ public long cost() { // or null if it is not applicable // pkg-private for forcing use of BooleanScorer in tests BulkScorer optionalBulkScorer(LeafReaderContext context) throws IOException { + if (scoreMode == ScoreMode.TOP_SCORES) { + if (query.getMinimumNumberShouldMatch() > 1 || weightedClauses.size() > 2) { + return null; + } + + List<ScorerSupplier> optional = new ArrayList<>(); + for (WeightedBooleanClause wc : weightedClauses) { + Weight w = wc.weight; + BooleanClause c = wc.clause; + if (c.getOccur() != Occur.SHOULD) { + continue; + } + ScorerSupplier scorer = w.scorerSupplier(context); + if (scorer != null) { + optional.add(scorer); + } + } + + if (optional.size() <= 1) { + return null; + } + + List<Scorer> optionalScorers = new ArrayList<>(); + for (ScorerSupplier ss : optional) { + optionalScorers.add(ss.get(Long.MAX_VALUE)); + } + + return new BulkScorer() { Review Comment: Thanks for the suggestion! I gave that a try and it did work, but it would reduce the performance boost for OrHighMed from around 110+% to 70+%, most likely due to the extra logic inside `DefaultBulkScorer`. I guess my preference would be to use the anonymous bulk scorer to maintain the performance advantage, but I'm also good with using `DefaultBulkScorer` if reducing potentially duplicated code and keeping things consistent are preferred? ``` TaskQPS baseline StdDevQPS my_modified_version StdDev Pct diff p-value TermBGroup1M1P 55.89 (7.1%) 54.01 (6.1%) -3.4% ( -15% - 10%) 0.108 TermDateFacets 34.46 (5.8%) 33.58 (5.0%) -2.5% ( -12% - 8%) 0.138 AndHighOrMedMed 90.90 (5.6%) 88.59 (4.6%) -2.5% ( -12% - 8%) 0.115 BrowseDayOfYearSSDVFacets 28.63 (12.5%) 28.01 (14.5%) -2.2% ( -25% - 28%) 0.612 MedTermDayTaxoFacets 79.74 (5.1%) 78.04 (4.3%) -2.1% ( -10% - 7%) 0.150 TermGroup100 36.28 (3.5%) 35.54 (3.1%) -2.1% ( -8% - 4%) 0.050 TermBGroup1M 30.37 (3.7%) 29.87 (3.8%) -1.6% ( -8% - 6%) 0.165 TermGroup10K 41.33 (3.2%) 40.66 (3.3%) -1.6% ( -7% - 5%) 0.117 PKLookup 330.84 (5.1%) 326.20 (4.3%) -1.4% ( -10% - 8%) 0.349 SloppyPhrase 13.56 (2.8%) 13.39 (2.5%) -1.2% ( -6% - 4%) 0.139 TermGroup1M 39.76 (3.2%) 39.32 (3.2%) -1.1% ( -7% - 5%) 0.272 AndMedOrHighHigh 88.13 (5.5%) 87.22 (4.4%) -1.0% ( -10% - 9%) 0.511 BrowseDateSSDVFacets 4.17 (29.0%) 4.13 (29.0%) -0.8% ( -45% - 80%) 0.933 SpanNear 169.70 (2.6%) 168.59 (2.0%) -0.7% ( -5% - 4%) 0.369 Fuzzy2 83.59 (2.4%) 83.12 (2.2%) -0.6% ( -5% - 4%) 0.442 Respell 96.22 (3.1%) 95.85 (2.6%) -0.4% ( -5% - 5%) 0.672 IntervalsOrdered 23.02 (4.3%) 22.94 (4.2%) -0.3% ( -8% - 8%) 0.799 Wildcard 231.51 (4.5%) 230.74 (5.0%) -0.3% ( -9% - 9%) 0.827 AndHighMed 143.73 (5.8%) 143.48 (4.0%) -0.2% ( -9% - 10%) 0.914 AndHighHighDayTaxoFacets 54.80 (1.3%) 54.71 (1.5%) -0.2% ( -2% - 2%) 0.717 Fuzzy1 154.41 (2.8%) 154.44 (1.9%) 0.0% ( -4% - 4%) 0.981 BrowseMonthSSDVFacets 28.06 (10.4%) 28.06 (13.3%) 0.0% ( -21% - 26%) 0.995 OrHighMedDayTaxoFacets 7.38 (5.7%) 7.39 (5.4%) 0.0% ( -10% - 11%) 0.981 AndHighMedDayTaxoFacets 134.39 (2.0%) 134.59 (1.9%) 0.2% ( -3% - 4%) 0.809 Phrase 38.80 (1.8%) 38.86 (2.5%) 0.2% ( -4% - 4%) 0.823 TermMonthSort 357.58 (5.2%) 359.31 (6.7%) 0.5% ( -10% - 13%) 0.801 TermTitleSort 274.32 (5.1%) 275.77 (6.8%) 0.5% ( -10% - 13%) 0.781 TermDayOfYearSort 259.44 (2.5%) 261.32 (5.9%) 0.7% ( -7% - 9%) 0.615 TermDTSort 208.81 (2.6%) 210.65 (5.8%) 0.9% ( -7% - 9%) 0.532 Term 2535.91 (3.4%) 2562.20 (3.1%) 1.0% ( -5% - 7%) 0.317 AndHighHigh 32.15 (3.3%) 32.55 (3.1%) 1.2% ( -5% - 7%) 0.231 BrowseMonthTaxoFacets 32.02 (34.2%) 32.57 (35.8%) 1.7% ( -50% - 109%) 0.876 Prefix3 404.00 (9.7%) 411.80 (8.5%) 1.9% ( -14% - 22%) 0.505 IntNRQ 86.95 (9.5%) 88.74 (8.7%) 2.1% ( -14% - 22%) 0.475 BrowseRandomLabelSSDVFacets 19.82 (7.6%) 20.34 (9.2%) 2.7% ( -13% - 21%) 0.318 BrowseDayOfYearTaxoFacets 28.80 (33.0%) 31.25 (33.8%) 8.5% ( -43% - 112%) 0.420 BrowseDateTaxoFacets 28.68 (33.1%) 31.15 (33.9%) 8.6% ( -43% - 113%) 0.415 BrowseRandomLabelTaxoFacets 28.81 (45.1%) 31.60 (47.4%) 9.7% ( -57% - 186%) 0.508 OrHighHigh 23.12 (3.5%) 29.67 (6.8%) 28.3% ( 17% - 39%) 0.000 OrHighMed 103.75 (3.7%) 180.03 (11.5%) 73.5% ( 56% - 92%) 0.000 ``` ``` TaskQPS baseline StdDevQPS my_modified_version StdDev Pct diff p-value BrowseDateSSDVFacets 4.27 (27.4%) 3.92 (26.3%) -8.2% ( -48% - 62%) 0.333 BrowseRandomLabelTaxoFacets 32.05 (49.5%) 30.51 (47.1%) -4.8% ( -67% - 181%) 0.753 BrowseDateTaxoFacets 30.22 (32.6%) 29.15 (32.4%) -3.6% ( -51% - 91%) 0.730 BrowseDayOfYearTaxoFacets 30.27 (32.4%) 29.24 (32.4%) -3.4% ( -51% - 90%) 0.738 TermDateFacets 34.74 (4.2%) 33.81 (5.9%) -2.7% ( -12% - 7%) 0.101 Prefix3 475.23 (5.7%) 462.80 (5.7%) -2.6% ( -13% - 9%) 0.146 TermBGroup1M1P 55.99 (7.5%) 54.91 (8.1%) -1.9% ( -16% - 14%) 0.434 MedTermDayTaxoFacets 77.78 (3.6%) 76.33 (4.7%) -1.9% ( -9% - 6%) 0.160 BrowseMonthTaxoFacets 30.33 (37.8%) 29.78 (35.6%) -1.8% ( -54% - 115%) 0.877 PKLookup 332.94 (5.3%) 327.77 (5.3%) -1.6% ( -11% - 9%) 0.351 BrowseRandomLabelSSDVFacets 21.30 (9.6%) 21.01 (8.8%) -1.3% ( -18% - 18%) 0.648 AndHighMedDayTaxoFacets 140.47 (1.7%) 138.62 (4.1%) -1.3% ( -6% - 4%) 0.186 IntervalsOrdered 130.34 (5.8%) 128.98 (6.8%) -1.0% ( -12% - 12%) 0.602 SloppyPhrase 97.23 (7.7%) 96.30 (6.1%) -1.0% ( -13% - 13%) 0.667 AndMedOrHighHigh 35.81 (4.0%) 35.48 (3.3%) -0.9% ( -7% - 6%) 0.431 TermGroup100 36.34 (4.1%) 36.02 (3.9%) -0.9% ( -8% - 7%) 0.481 AndHighOrMedMed 92.43 (6.3%) 91.64 (5.2%) -0.9% ( -11% - 11%) 0.635 AndHighMed 151.18 (5.3%) 150.06 (5.0%) -0.7% ( -10% - 10%) 0.651 AndHighHighDayTaxoFacets 13.90 (2.1%) 13.82 (3.0%) -0.6% ( -5% - 4%) 0.461 Phrase 53.96 (4.9%) 53.66 (4.6%) -0.5% ( -9% - 9%) 0.716 Wildcard 382.71 (4.6%) 380.67 (4.5%) -0.5% ( -9% - 8%) 0.710 TermDTSort 142.16 (3.3%) 141.51 (3.2%) -0.5% ( -6% - 6%) 0.659 Term 3700.77 (3.7%) 3686.00 (3.5%) -0.4% ( -7% - 7%) 0.726 TermBGroup1M 36.19 (4.7%) 36.07 (3.5%) -0.3% ( -8% - 8%) 0.808 TermTitleSort 191.76 (6.3%) 191.17 (7.4%) -0.3% ( -13% - 14%) 0.888 TermDayOfYearSort 211.60 (3.5%) 210.97 (4.6%) -0.3% ( -8% - 8%) 0.819 TermMonthSort 224.12 (6.1%) 223.53 (7.6%) -0.3% ( -13% - 14%) 0.902 TermGroup1M 23.75 (3.9%) 23.72 (3.0%) -0.1% ( -6% - 7%) 0.924 OrHighMedDayTaxoFacets 15.65 (5.1%) 15.64 (5.3%) -0.0% ( -9% - 10%) 0.985 IntNRQ 110.02 (0.4%) 109.99 (0.4%) -0.0% ( 0% - 0%) 0.823 Fuzzy1 164.30 (1.3%) 164.42 (2.0%) 0.1% ( -3% - 3%) 0.889 Fuzzy2 83.64 (1.4%) 83.85 (2.0%) 0.2% ( -3% - 3%) 0.649 AndHighHigh 44.91 (3.5%) 45.09 (3.7%) 0.4% ( -6% - 7%) 0.724 Respell 117.19 (2.2%) 117.69 (2.7%) 0.4% ( -4% - 5%) 0.585 SpanNear 28.79 (2.9%) 29.02 (3.4%) 0.8% ( -5% - 7%) 0.426 TermGroup10K 43.33 (3.8%) 43.73 (2.4%) 0.9% ( -5% - 7%) 0.357 BrowseDayOfYearSSDVFacets 26.67 (12.7%) 26.94 (8.8%) 1.0% ( -18% - 25%) 0.767 BrowseMonthSSDVFacets 27.30 (8.6%) 28.47 (10.1%) 4.3% ( -13% - 25%) 0.149 OrHighHigh 25.20 (4.7%) 39.24 (4.8%) 55.7% ( 44% - 68%) 0.000 OrHighMed 103.99 (5.1%) 184.11 (5.5%) 77.0% ( 63% - 92%) 0.000 ``` ``` TaskQPS baseline StdDevQPS my_modified_version StdDev Pct diff p-value BrowseMonthSSDVFacets 29.26 (10.2%) 28.24 (5.9%) -3.5% ( -17% - 13%) 0.185 BrowseRandomLabelSSDVFacets 20.62 (6.1%) 20.36 (7.4%) -1.2% ( -13% - 13%) 0.565 Prefix3 475.09 (6.1%) 470.79 (6.7%) -0.9% ( -12% - 12%) 0.655 IntervalsOrdered 13.21 (3.1%) 13.12 (2.9%) -0.7% ( -6% - 5%) 0.461 TermDateFacets 48.80 (6.2%) 48.56 (6.2%) -0.5% ( -12% - 12%) 0.800 PKLookup 329.55 (4.9%) 328.09 (3.9%) -0.4% ( -8% - 8%) 0.750 AndHighOrMedMed 117.25 (3.9%) 116.74 (4.9%) -0.4% ( -8% - 8%) 0.759 MedTermDayTaxoFacets 78.41 (5.5%) 78.16 (5.5%) -0.3% ( -10% - 11%) 0.855 Phrase 232.22 (2.8%) 231.49 (2.3%) -0.3% ( -5% - 4%) 0.694 TermTitleSort 224.25 (6.4%) 223.78 (6.3%) -0.2% ( -12% - 13%) 0.916 TermBGroup1M 39.69 (3.2%) 39.60 (3.9%) -0.2% ( -6% - 7%) 0.852 TermMonthSort 277.13 (6.3%) 276.61 (6.4%) -0.2% ( -12% - 13%) 0.925 TermDayOfYearSort 259.60 (3.2%) 259.15 (3.6%) -0.2% ( -6% - 6%) 0.872 AndHighMed 185.96 (3.9%) 185.74 (5.3%) -0.1% ( -8% - 9%) 0.936 TermGroup100 51.51 (3.3%) 51.52 (3.8%) 0.0% ( -6% - 7%) 0.976 TermGroup1M 39.83 (2.7%) 39.85 (3.4%) 0.1% ( -5% - 6%) 0.957 TermGroup10K 41.38 (2.9%) 41.40 (3.1%) 0.1% ( -5% - 6%) 0.950 Wildcard 238.58 (4.1%) 238.92 (3.4%) 0.1% ( -7% - 7%) 0.904 SloppyPhrase 4.75 (4.1%) 4.76 (3.0%) 0.2% ( -6% - 7%) 0.834 IntNRQ 265.85 (1.0%) 266.60 (0.7%) 0.3% ( -1% - 1%) 0.298 AndHighHighDayTaxoFacets 54.33 (1.6%) 54.49 (1.5%) 0.3% ( -2% - 3%) 0.554 TermBGroup1M1P 62.07 (4.4%) 62.41 (5.6%) 0.5% ( -9% - 11%) 0.735 AndHighMedDayTaxoFacets 202.86 (2.6%) 204.19 (2.6%) 0.7% ( -4% - 6%) 0.434 Respell 111.78 (3.4%) 112.52 (3.1%) 0.7% ( -5% - 7%) 0.519 Fuzzy1 155.04 (3.6%) 156.09 (3.4%) 0.7% ( -6% - 7%) 0.541 SpanNear 13.68 (1.7%) 13.77 (1.3%) 0.7% ( -2% - 3%) 0.142 Fuzzy2 144.74 (2.9%) 145.81 (2.8%) 0.7% ( -4% - 6%) 0.419 TermDTSort 214.37 (7.3%) 216.04 (8.8%) 0.8% ( -14% - 18%) 0.761 AndMedOrHighHigh 35.72 (3.2%) 36.03 (3.8%) 0.9% ( -5% - 8%) 0.441 AndHighHigh 44.40 (2.8%) 44.83 (4.7%) 1.0% ( -6% - 8%) 0.429 OrHighMedDayTaxoFacets 4.34 (5.5%) 4.38 (7.7%) 1.1% ( -11% - 15%) 0.621 BrowseRandomLabelTaxoFacets 29.93 (46.8%) 30.38 (47.5%) 1.5% ( -63% - 179%) 0.921 BrowseDateSSDVFacets 3.98 (33.0%) 4.04 (32.0%) 1.6% ( -47% - 99%) 0.879 Term 2674.87 (3.8%) 2721.55 (4.1%) 1.7% ( -5% - 10%) 0.164 BrowseDayOfYearTaxoFacets 29.20 (31.2%) 29.75 (33.2%) 1.9% ( -47% - 96%) 0.853 BrowseDateTaxoFacets 29.07 (31.3%) 29.63 (33.1%) 1.9% ( -47% - 96%) 0.851 BrowseDayOfYearSSDVFacets 26.74 (9.6%) 27.76 (9.4%) 3.8% ( -13% - 25%) 0.205 BrowseMonthTaxoFacets 30.73 (33.3%) 32.84 (35.6%) 6.9% ( -46% - 113%) 0.528 OrHighHigh 117.70 (4.1%) 161.58 (10.4%) 37.3% ( 21% - 54%) 0.000 OrHighMed 105.09 (4.3%) 186.95 (10.2%) 77.9% ( 60% - 96%) 0.000 ``` -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org --------------------------------------------------------------------- To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org