jpountz commented on PR #964: URL: https://github.com/apache/lucene/pull/964#issuecomment-1158911923
Thanks for looking @romseygeek. To make sure this new API has more than one use-case, I migrated `TopScoreDocCollector` and `TopFieldCollector` to it too. The immediate benefit is that collectors that pass a `totalHitsThreshold` of `Integer.MAX_VALUE` can still skip non-competitive hits if the weight supports counting hits.

In addition, I fixed some tests that assumed `TotalHitCountCollector` would naively iterate over matches; they now use a new `DummyTotalHitCountCollector` instead.

I verified that there is no performance impact on luceneutil using `wikimedium10m`:

```
Task                   QPS baseline  StdDev  QPS my_modified_version  StdDev  Pct diff                p-value
HighTerm                    2374.78  (5.1%)                  2297.55  (5.2%)     -3.3% (-12% -   7%)    0.047
MedTerm                     2795.30  (5.4%)                  2704.66  (5.6%)     -3.2% (-13% -   8%)    0.063
OrNotHighMed                1448.25  (3.9%)                  1427.48  (4.5%)     -1.4% ( -9% -   7%)    0.286
OrNotHighHigh                996.35  (3.1%)                   982.37  (4.6%)     -1.4% ( -8% -   6%)    0.255
OrHighNotMed                1898.69  (3.8%)                  1876.02  (4.7%)     -1.2% ( -9% -   7%)    0.375
AndHighLow                  1049.40  (3.3%)                  1042.92  (3.8%)     -0.6% ( -7% -   6%)    0.583
HighSloppyPhrase              21.77  (4.0%)                    21.66  (4.8%)     -0.5% ( -8% -   8%)    0.716
LowTerm                     2640.20  (6.3%)                  2629.11  (4.2%)     -0.4% (-10% -  10%)    0.803
OrHighNotLow                1667.62  (4.2%)                  1660.75  (5.6%)     -0.4% ( -9% -   9%)    0.794
OrNotHighLow                1663.32  (3.0%)                  1658.41  (4.2%)     -0.3% ( -7% -   7%)    0.801
LowSloppyPhrase               54.27  (3.1%)                    54.15  (3.6%)     -0.2% ( -6% -   6%)    0.834
OrHighNotHigh               1259.39  (3.7%)                  1257.03  (4.7%)     -0.2% ( -8% -   8%)    0.889
MedSloppyPhrase              115.91  (4.3%)                   115.79  (6.1%)     -0.1% (-10% -  10%)    0.952
PKLookup                     249.41  (1.2%)                   249.32  (1.5%)     -0.0% ( -2% -   2%)    0.934
Fuzzy2                       118.47  (1.1%)                   118.75  (1.2%)      0.2% ( -2% -   2%)    0.538
Respell                       74.59  (1.1%)                    74.90  (1.5%)      0.4% ( -2% -   3%)    0.323
IntNRQ                       682.36  (2.8%)                   685.81  (3.7%)      0.5% ( -5% -   7%)    0.628
Fuzzy1                       124.32  (1.1%)                   125.09  (1.1%)      0.6% ( -1% -   2%)    0.079
MedPhrase                    623.13  (3.3%)                   627.26  (3.0%)      0.7% ( -5% -   7%)    0.502
OrHighMed                    130.02  (3.7%)                   130.94  (4.2%)      0.7% ( -6% -   8%)    0.571
LowPhrase                    110.49  (3.6%)                   111.30  (2.5%)      0.7% ( -5% -   7%)    0.459
Wildcard                      40.65  (1.6%)                    40.95  (1.8%)      0.7% ( -2% -   4%)    0.167
OrHighLow                   1092.12  (3.0%)                  1101.15  (2.7%)      0.8% ( -4% -   6%)    0.360
AndHighMed                   234.73  (4.5%)                   236.77  (5.3%)      0.9% ( -8% -  11%)    0.575
MedSpanNear                   28.83  (4.1%)                    29.14  (3.3%)      1.1% ( -6% -   8%)    0.369
LowSpanNear                   16.20  (4.2%)                    16.38  (3.4%)      1.1% ( -6% -   9%)    0.363
HighSpanNear                   7.51  (4.7%)                     7.59  (3.5%)      1.1% ( -6% -   9%)    0.405
AndHighHigh                   70.69  (5.3%)                    71.60  (6.4%)      1.3% ( -9% -  13%)    0.486
OrHighHigh                    30.64  (3.2%)                    31.07  (4.3%)      1.4% ( -5% -   9%)    0.244
HighPhrase                    22.89  (3.8%)                    23.25  (3.6%)      1.6% ( -5% -   9%)    0.178
Prefix3                      421.34  (3.5%)                   430.69  (4.4%)      2.2% ( -5% -  10%)    0.078
LowIntervalsOrdered           67.14  (4.8%)                    69.35  (5.5%)      3.3% ( -6% -  14%)    0.043
HighIntervalsOrdered           6.49  (7.8%)                     6.73  (7.1%)      3.7% (-10% -  20%)    0.112
MedIntervalsOrdered           37.02  (7.8%)                    38.45  (7.3%)      3.9% (-10% -  20%)    0.108
HighTermDayOfYearSort        144.92  (3.7%)                   150.78  (4.6%)      4.0% ( -4% -  12%)    0.002
TermDTSort                   204.11  (7.0%)                   213.24  (7.7%)      4.5% ( -9% -  20%)    0.055
HighTermMonthSort            154.26  (4.0%)                   161.70  (4.9%)      4.8% ( -3% -  14%)    0.001
HighTermTitleBDVSort         248.08  (3.7%)                   262.32  (8.8%)      5.7% ( -6% -  18%)    0.007
```
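
For readers unfamiliar with the counting shortcut mentioned in the comment, here is a minimal, self-contained sketch of the general idea. It does not use Lucene or reproduce the API added in this PR: `SegmentWeight`, `CountingCollector`, `collectSegment`, and `weight` are hypothetical names invented for illustration. The `-1` "cannot count cheaply" convention is loosely modelled on `Weight#count` in Lucene. A collector that only needs the hit count asks the weight first, and falls back to iterating matches only when no cheap count is available.

```java
import java.util.function.IntConsumer;

public class CountShortcutSketch {

  /** Hypothetical stand-in for a per-segment query weight (not a Lucene type). */
  public interface SegmentWeight {
    /** Returns the number of matches if it is known cheaply, or -1 otherwise. */
    int count();

    /** Feeds every matching doc ID to the consumer (the slow path). */
    void forEachMatch(IntConsumer consumer);
  }

  /** Hypothetical collector that only tracks the total number of hits. */
  public static final class CountingCollector {
    public int totalHits;
    public int docsVisited; // how many docs were individually iterated

    public void collectSegment(SegmentWeight weight) {
      int cheapCount = weight.count();
      if (cheapCount >= 0) {
        // Fast path: the weight already knows the count, so skip iteration.
        totalHits += cheapCount;
        return;
      }
      // Slow path: visit each match one by one.
      weight.forEachMatch(doc -> {
        totalHits++;
        docsVisited++;
      });
    }
  }

  /** Builds a toy weight over fixed doc IDs; canCount toggles the fast path. */
  public static SegmentWeight weight(int[] docs, boolean canCount) {
    return new SegmentWeight() {
      @Override
      public int count() {
        return canCount ? docs.length : -1;
      }

      @Override
      public void forEachMatch(IntConsumer consumer) {
        for (int doc : docs) {
          consumer.accept(doc);
        }
      }
    };
  }

  public static void main(String[] args) {
    CountingCollector collector = new CountingCollector();
    collector.collectSegment(weight(new int[] {1, 4, 9}, true));  // counted cheaply
    collector.collectSegment(weight(new int[] {2, 3}, false));    // iterated
    System.out.println("totalHits=" + collector.totalHits
        + " docsVisited=" + collector.docsVisited);
  }
}
```

In this toy run, only the second segment's documents are visited individually, while the total still covers both segments.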