jpountz opened a new pull request, #11903:
URL: https://github.com/apache/lucene/pull/11903

   Since the number of hits retrieved in nightly benchmarks was increased from 
10 to 100, the performance of sorting documents by title has dropped back to 
the level it had before dynamic pruning was introduced. This is not too 
surprising: the `title` field is unique, so the optimization only kicks in once 
the current 100th hit has an ordinal below 128, which only happens after most 
hits have been collected.
   
   This change increases the threshold to 1024, so the optimization kicks in 
as soon as the current 100th hit has an ordinal below 1024, which happens 
somewhat sooner.
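   
   To illustrate why the threshold matters on a unique field, here is a minimal toy simulation. It is not Lucene's comparator code: the class name `PruningThresholdSketch`, its methods, and the uniformly shuffled ordinals are all assumptions made for the sketch. It collects the top `k` hits by ascending ordinal and reports how many documents are visited before the bottom (k-th) hit has an ordinal below the threshold.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.PriorityQueue;
import java.util.Random;

// Toy model only: hypothetical names, not part of Lucene.
public class PruningThresholdSketch {

    /**
     * Returns the number of documents collected before the k-th best hit
     * has an ordinal strictly below {@code threshold}, or -1 if that never happens.
     */
    static int docsCollectedBeforePruning(int[] ordByDoc, int k, int threshold) {
        // Max-heap of the k smallest ordinals seen so far; peek() is the bottom hit.
        PriorityQueue<Integer> topK = new PriorityQueue<>(Collections.reverseOrder());
        for (int doc = 0; doc < ordByDoc.length; doc++) {
            int ord = ordByDoc[doc];
            if (topK.size() < k) {
                topK.add(ord);
            } else if (ord < topK.peek()) {
                topK.poll();
                topK.add(ord);
            }
            if (topK.size() == k && topK.peek() < threshold) {
                return doc + 1;
            }
        }
        return -1;
    }

    /** Assigns each document a unique ordinal in random order (assumption: no correlation with doc ID). */
    static int[] shuffledUniqueOrds(int numDocs, long seed) {
        List<Integer> ords = new ArrayList<>(numDocs);
        for (int i = 0; i < numDocs; i++) {
            ords.add(i);
        }
        Collections.shuffle(ords, new Random(seed));
        int[] result = new int[numDocs];
        for (int i = 0; i < numDocs; i++) {
            result[i] = ords.get(i);
        }
        return result;
    }

    public static void main(String[] args) {
        int[] ords = shuffledUniqueOrds(100_000, 42L);
        for (int threshold : new int[] {128, 1024}) {
            int docs = docsCollectedBeforePruning(ords, 100, threshold);
            System.out.println("threshold=" + threshold
                + " -> condition met after " + docs + " of " + ords.length + " docs");
        }
    }
}
```

   Under these assumptions, a threshold of 128 with `k = 100` requires 100 of the 128 smallest ordinals to have been seen, which takes most of the index, whereas a threshold of 1024 only requires 100 of the 1024 smallest, which is reached much earlier.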
   
   Title sort performance chart: 
http://people.apache.org/~mikemccand/lucenebench/TermTitleSort.html
   
   Results on wikimedium10m:
   
   ```
                               Task QPS baseline      StdDev QPS my_modified_version      StdDev                Pct diff p-value
                  HighTermMonthSort     3565.28      (4.1%)     3463.05      (5.2%)   -2.9% ( -11% -    6%) 0.052
              BrowseMonthTaxoFacets       26.41     (14.7%)       26.09     (14.2%)   -1.2% ( -26% -   32%) 0.790
             OrHighMedDayTaxoFacets       11.43      (5.8%)       11.32      (5.4%)   -1.0% ( -11% -   10%) 0.569
                         AndHighLow     1956.19      (3.5%)     1938.62      (4.4%)   -0.9% (  -8% -    7%) 0.473
          BrowseDayOfYearSSDVFacets       20.51      (7.6%)       20.35      (6.1%)   -0.8% ( -13% -   13%) 0.720
                       OrHighNotLow      557.09      (8.6%)      552.97      (6.9%)   -0.7% ( -14% -   16%) 0.763
              BrowseMonthSSDVFacets       21.68      (9.3%)       21.52      (9.0%)   -0.7% ( -17% -   19%) 0.801
                         TermDTSort      190.07      (3.3%)      188.83      (2.0%)   -0.7% (  -5% -    4%) 0.451
               MedTermDayTaxoFacets       47.63      (7.7%)       47.34      (6.1%)   -0.6% ( -13% -   14%) 0.784
                         AndHighMed      241.34      (5.8%)      239.93      (4.7%)   -0.6% ( -10% -   10%) 0.726
                      OrHighNotHigh      386.32      (7.3%)      384.46      (6.1%)   -0.5% ( -12% -   13%) 0.821
                             Fuzzy2       89.94      (2.1%)       89.57      (1.4%)   -0.4% (  -3% -    3%) 0.461
                          OrHighMed      219.18      (4.9%)      218.28      (3.0%)   -0.4% (  -7% -    7%) 0.748
               BrowseDateSSDVFacets        4.98     (10.0%)        4.96     (10.9%)   -0.4% ( -19% -   22%) 0.906
                        AndHighHigh      123.21      (6.5%)      122.74      (6.2%)   -0.4% ( -12% -   13%) 0.847
        BrowseRandomLabelSSDVFacets       14.74      (5.4%)       14.70      (5.2%)   -0.3% ( -10% -   10%) 0.872
                             Fuzzy1      126.17      (1.6%)      125.86      (1.2%)   -0.2% (  -2% -    2%) 0.576
                       OrNotHighMed      447.90      (4.8%)      446.93      (4.5%)   -0.2% (  -9% -    9%) 0.883
                      OrNotHighHigh      380.27      (6.6%)      379.44      (5.8%)   -0.2% ( -11% -   12%) 0.912
                       OrHighNotMed      439.48      (8.3%)      438.86      (7.1%)   -0.1% ( -14% -   16%) 0.953
                         OrHighHigh       45.25      (5.5%)       45.19      (4.5%)   -0.1% (  -9% -   10%) 0.935
                           PKLookup      239.58      (3.7%)      239.31      (3.5%)   -0.1% (  -7% -    7%) 0.923
                          OrHighLow      218.38      (5.8%)      218.17      (4.8%)   -0.1% ( -10% -   11%) 0.954
           AndHighHighDayTaxoFacets       21.48      (4.0%)       21.47      (2.6%)   -0.1% (  -6% -    6%) 0.945
               HighIntervalsOrdered       38.37      (3.7%)       38.35      (4.3%)   -0.1% (  -7% -    8%) 0.958
            AndHighMedDayTaxoFacets       42.81      (2.2%)       42.78      (1.8%)   -0.1% (  -3% -    4%) 0.923
              HighTermDayOfYearSort      370.49      (3.6%)      370.29      (2.2%)   -0.1% (  -5% -    5%) 0.956
                            MedTerm      724.30      (8.8%)      723.98      (5.5%)   -0.0% ( -13% -   15%) 0.985
                    LowSloppyPhrase       21.63      (4.5%)       21.63      (4.2%)   -0.0% (  -8% -    9%) 0.998
                           HighTerm      546.02      (8.7%)      546.09      (6.5%)    0.0% ( -13% -   16%) 0.996
                MedIntervalsOrdered       80.04      (4.0%)       80.14      (3.8%)    0.1% (  -7% -    8%) 0.919
                            LowTerm      868.07      (5.9%)      869.19      (6.0%)    0.1% ( -11% -   12%) 0.946
                            Respell       88.18      (2.2%)       88.41      (1.6%)    0.3% (  -3% -    4%) 0.659
                           Wildcard      199.91      (3.7%)      200.66      (2.9%)    0.4% (  -6% -    7%) 0.724
                LowIntervalsOrdered      132.39      (4.6%)      133.23      (4.9%)    0.6% (  -8% -   10%) 0.676
                        LowSpanNear      146.08      (4.6%)      147.12      (4.6%)    0.7% (  -8% -   10%) 0.626
                       OrNotHighLow      956.83      (4.5%)      964.01      (4.0%)    0.8% (  -7% -    9%) 0.579
                          MedPhrase       86.15      (3.9%)       86.80      (4.2%)    0.8% (  -7% -    9%) 0.558
               HighTermTitleBDVSort       20.71      (4.6%)       20.87      (3.2%)    0.8% (  -6% -    9%) 0.522
                        MedSpanNear      125.92      (2.9%)      127.18      (2.5%)    1.0% (  -4% -    6%) 0.244
                         HighPhrase      115.28      (4.8%)      116.49      (4.5%)    1.1% (  -7% -   10%) 0.471
                          LowPhrase      195.71      (4.1%)      197.79      (3.9%)    1.1% (  -6% -    9%) 0.405
                            Prefix3      235.38      (2.5%)      237.95      (2.3%)    1.1% (  -3% -    6%) 0.153
                    MedSloppyPhrase       68.60      (3.3%)       69.47      (2.5%)    1.3% (  -4% -    7%) 0.176
          BrowseDayOfYearTaxoFacets       37.87     (18.0%)       38.48     (19.3%)    1.6% ( -30% -   47%) 0.785
               BrowseDateTaxoFacets       36.92     (17.7%)       37.52     (19.0%)    1.6% ( -29% -   46%) 0.781
                   HighSloppyPhrase       19.05      (6.2%)       19.36      (5.8%)    1.7% (  -9% -   14%) 0.382
                       HighSpanNear       46.29      (4.6%)       47.14      (3.2%)    1.8% (  -5% -   10%) 0.139
                             IntNRQ      125.61     (22.6%)      128.45     (21.5%)    2.3% ( -34% -   59%) 0.746
        BrowseRandomLabelTaxoFacets       29.19     (16.5%)       29.88     (18.7%)    2.4% ( -28% -   44%) 0.671
                  HighTermTitleSort      141.76      (3.6%)      172.58      (3.4%)   21.7% (  14% -   29%) 0.000
   ```
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

