jpountz opened a new pull request, #11903: URL: https://github.com/apache/lucene/pull/11903
Since the number of hits retrieved in nightly benchmarks was increased from 10 to 100, the performance of sorting documents by title has dropped back to the level it had before dynamic pruning was introduced. This is not too surprising: `title` is a unique field, so the optimization only kicks in once the current 100th hit has an ordinal below 128, which only happens after most hits have been collected.

This change increases the threshold to 1024, so that the optimization kicks in once the current 100th hit has an ordinal below 1024, which happens a bit sooner.

Title sort performance chart: http://people.apache.org/~mikemccand/lucenebench/TermTitleSort.html

Results on wikimedium10m:

```
Task                         QPS baseline   StdDev  QPS my_modified_version   StdDev  Pct diff                p-value
HighTermMonthSort                 3565.28   (4.1%)                  3463.05   (5.2%)     -2.9%  (-11% -  6%)    0.052
BrowseMonthTaxoFacets               26.41  (14.7%)                    26.09  (14.2%)     -1.2%  (-26% - 32%)    0.790
OrHighMedDayTaxoFacets              11.43   (5.8%)                    11.32   (5.4%)     -1.0%  (-11% - 10%)    0.569
AndHighLow                        1956.19   (3.5%)                  1938.62   (4.4%)     -0.9%  ( -8% -  7%)    0.473
BrowseDayOfYearSSDVFacets           20.51   (7.6%)                    20.35   (6.1%)     -0.8%  (-13% - 13%)    0.720
OrHighNotLow                       557.09   (8.6%)                   552.97   (6.9%)     -0.7%  (-14% - 16%)    0.763
BrowseMonthSSDVFacets               21.68   (9.3%)                    21.52   (9.0%)     -0.7%  (-17% - 19%)    0.801
TermDTSort                         190.07   (3.3%)                   188.83   (2.0%)     -0.7%  ( -5% -  4%)    0.451
MedTermDayTaxoFacets                47.63   (7.7%)                    47.34   (6.1%)     -0.6%  (-13% - 14%)    0.784
AndHighMed                         241.34   (5.8%)                   239.93   (4.7%)     -0.6%  (-10% - 10%)    0.726
OrHighNotHigh                      386.32   (7.3%)                   384.46   (6.1%)     -0.5%  (-12% - 13%)    0.821
Fuzzy2                              89.94   (2.1%)                    89.57   (1.4%)     -0.4%  ( -3% -  3%)    0.461
OrHighMed                          219.18   (4.9%)                   218.28   (3.0%)     -0.4%  ( -7% -  7%)    0.748
BrowseDateSSDVFacets                 4.98  (10.0%)                     4.96  (10.9%)     -0.4%  (-19% - 22%)    0.906
AndHighHigh                        123.21   (6.5%)                   122.74   (6.2%)     -0.4%  (-12% - 13%)    0.847
BrowseRandomLabelSSDVFacets         14.74   (5.4%)                    14.70   (5.2%)     -0.3%  (-10% - 10%)    0.872
Fuzzy1                             126.17   (1.6%)                   125.86   (1.2%)     -0.2%  ( -2% -  2%)    0.576
OrNotHighMed                       447.90   (4.8%)                   446.93   (4.5%)     -0.2%  ( -9% -  9%)    0.883
OrNotHighHigh                      380.27   (6.6%)                   379.44   (5.8%)     -0.2%  (-11% - 12%)    0.912
OrHighNotMed                       439.48   (8.3%)                   438.86   (7.1%)     -0.1%  (-14% - 16%)    0.953
OrHighHigh                          45.25   (5.5%)                    45.19   (4.5%)     -0.1%  ( -9% - 10%)    0.935
PKLookup                           239.58   (3.7%)                   239.31   (3.5%)     -0.1%  ( -7% -  7%)    0.923
OrHighLow                          218.38   (5.8%)                   218.17   (4.8%)     -0.1%  (-10% - 11%)    0.954
AndHighHighDayTaxoFacets            21.48   (4.0%)                    21.47   (2.6%)     -0.1%  ( -6% -  6%)    0.945
HighIntervalsOrdered                38.37   (3.7%)                    38.35   (4.3%)     -0.1%  ( -7% -  8%)    0.958
AndHighMedDayTaxoFacets             42.81   (2.2%)                    42.78   (1.8%)     -0.1%  ( -3% -  4%)    0.923
HighTermDayOfYearSort              370.49   (3.6%)                   370.29   (2.2%)     -0.1%  ( -5% -  5%)    0.956
MedTerm                            724.30   (8.8%)                   723.98   (5.5%)     -0.0%  (-13% - 15%)    0.985
LowSloppyPhrase                     21.63   (4.5%)                    21.63   (4.2%)     -0.0%  ( -8% -  9%)    0.998
HighTerm                           546.02   (8.7%)                   546.09   (6.5%)      0.0%  (-13% - 16%)    0.996
MedIntervalsOrdered                 80.04   (4.0%)                    80.14   (3.8%)      0.1%  ( -7% -  8%)    0.919
LowTerm                            868.07   (5.9%)                   869.19   (6.0%)      0.1%  (-11% - 12%)    0.946
Respell                             88.18   (2.2%)                    88.41   (1.6%)      0.3%  ( -3% -  4%)    0.659
Wildcard                           199.91   (3.7%)                   200.66   (2.9%)      0.4%  ( -6% -  7%)    0.724
LowIntervalsOrdered                132.39   (4.6%)                   133.23   (4.9%)      0.6%  ( -8% - 10%)    0.676
LowSpanNear                        146.08   (4.6%)                   147.12   (4.6%)      0.7%  ( -8% - 10%)    0.626
OrNotHighLow                       956.83   (4.5%)                   964.01   (4.0%)      0.8%  ( -7% -  9%)    0.579
MedPhrase                           86.15   (3.9%)                    86.80   (4.2%)      0.8%  ( -7% -  9%)    0.558
HighTermTitleBDVSort                20.71   (4.6%)                    20.87   (3.2%)      0.8%  ( -6% -  9%)    0.522
MedSpanNear                        125.92   (2.9%)                   127.18   (2.5%)      1.0%  ( -4% -  6%)    0.244
HighPhrase                         115.28   (4.8%)                   116.49   (4.5%)      1.1%  ( -7% - 10%)    0.471
LowPhrase                          195.71   (4.1%)                   197.79   (3.9%)      1.1%  ( -6% -  9%)    0.405
Prefix3                            235.38   (2.5%)                   237.95   (2.3%)      1.1%  ( -3% -  6%)    0.153
MedSloppyPhrase                     68.60   (3.3%)                    69.47   (2.5%)      1.3%  ( -4% -  7%)    0.176
BrowseDayOfYearTaxoFacets           37.87  (18.0%)                    38.48  (19.3%)      1.6%  (-30% - 47%)    0.785
BrowseDateTaxoFacets                36.92  (17.7%)                    37.52  (19.0%)      1.6%  (-29% - 46%)    0.781
HighSloppyPhrase                    19.05   (6.2%)                    19.36   (5.8%)      1.7%  ( -9% - 14%)    0.382
HighSpanNear                        46.29   (4.6%)                    47.14   (3.2%)      1.8%  ( -5% - 10%)    0.139
IntNRQ                             125.61  (22.6%)                   128.45  (21.5%)      2.3%  (-34% - 59%)    0.746
BrowseRandomLabelTaxoFacets         29.19  (16.5%)                    29.88  (18.7%)      2.4%  (-28% - 44%)    0.671
HighTermTitleSort                  141.76   (3.6%)                   172.58   (3.4%)     21.7%  ( 14% - 29%)    0.000
```
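To illustrate why a larger threshold lets pruning start earlier on a unique field, here is a minimal, self-contained sketch. It is not Lucene code: the class `PruningThresholdSketch` and the method `hitsBeforePruning` are hypothetical names, and the model simply assumes an ascending sort where each hit carries a distinct ordinal and pruning becomes possible once the current k-th best ordinal falls below the threshold.

```java
import java.util.Collections;
import java.util.PriorityQueue;

// Hypothetical illustration only; this does not use or mirror Lucene's API.
class PruningThresholdSketch {

  /**
   * Returns how many hits have been collected at the moment the k-th smallest
   * ordinal seen so far first drops below {@code threshold}, or -1 if that
   * never happens for the given collection order.
   */
  static int hitsBeforePruning(int[] ordinals, int k, int threshold) {
    // Max-heap of the k smallest ordinals seen so far; its head is the
    // ordinal of the current k-th hit (the "bottom" of the top-k queue).
    PriorityQueue<Integer> topK = new PriorityQueue<>(Collections.reverseOrder());
    for (int i = 0; i < ordinals.length; i++) {
      int ord = ordinals[i];
      if (topK.size() < k) {
        topK.add(ord);
      } else if (ord < topK.peek()) {
        // The new hit is more competitive than the current k-th hit.
        topK.poll();
        topK.add(ord);
      }
      if (topK.size() == k && topK.peek() < threshold) {
        return i + 1;
      }
    }
    return -1;
  }

  public static void main(String[] args) {
    // Toy data: six hits with distinct ordinals, top-2 requested.
    int[] ordinals = {5, 3, 9, 1, 0, 2};
    System.out.println(hitsBeforePruning(ordinals, 2, 2)); // prints 5
    System.out.println(hitsBeforePruning(ordinals, 2, 4)); // prints 4
  }
}
```

In this toy model, raising the threshold can only make the condition hold at the same point or earlier, which matches the intuition behind moving from 128 to 1024 when 100 hits are requested.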