[GitHub] [lucene] jpountz commented on pull request #964: LUCENE-10620: Pass the Weight to Collectors.

GitBox Mon, 20 Jun 2022 00:34:37 -0700


jpountz commented on PR #964:
URL: https://github.com/apache/lucene/pull/964#issuecomment-1160079281


   Here are the results when collectors need to count hits too (I changed IndexSearcher's 
`TOTAL_HITS_THRESHOLD` to `Integer.MAX_VALUE`):
   
   ```
                               Task QPS baseline      StdDev QPS my_modified_version      StdDev                Pct diff p-value
                          OrHighLow       90.31      (6.7%)       35.49      (1.8%)  -60.7% ( -64% -  -55%) 0.000
                         OrHighHigh       40.98      (5.6%)       21.17      (2.4%)  -48.3% ( -53% -  -42%) 0.000
                       OrHighNotLow      143.61      (8.2%)       76.04      (5.4%)  -47.1% ( -56% -  -36%) 0.000
                       OrHighNotMed       88.77      (7.7%)       49.41      (5.6%)  -44.3% ( -53% -  -33%) 0.000
                      OrHighNotHigh       18.24      (7.4%)       10.59      (5.9%)  -41.9% ( -51% -  -30%) 0.000
                          OrHighMed       80.82      (5.0%)       48.18      (2.8%)  -40.4% ( -45% -  -34%) 0.000
                      OrNotHighHigh       51.35      (5.7%)       39.11      (5.6%)  -23.8% ( -33% -  -13%) 0.000
                        AndHighHigh       53.49      (1.9%)       41.97      (4.1%)  -21.5% ( -27% -  -15%) 0.000
                         AndHighMed      321.43      (2.4%)      258.39      (4.5%)  -19.6% ( -25% -  -13%) 0.000
                         AndHighLow     1777.06      (2.7%)     1474.52      (3.1%)  -17.0% ( -22% -  -11%) 0.000
                          MedPhrase      391.41      (5.9%)      332.93      (5.1%)  -14.9% ( -24% -   -4%) 0.000
                       OrNotHighMed      313.44      (6.7%)      269.25      (5.3%)  -14.1% ( -24% -   -2%) 0.000
                       OrNotHighLow     1977.65      (4.2%)     1803.88      (4.7%)   -8.8% ( -16% -    0%) 0.000
           AndHighHighDayTaxoFacets       25.28      (1.7%)       23.30      (1.9%)   -7.8% ( -11% -   -4%) 0.000
               MedTermDayTaxoFacets       79.97      (2.6%)       74.42      (3.6%)   -6.9% ( -12% -    0%) 0.000
                            Prefix3       27.72      (6.2%)       25.83      (5.2%)   -6.8% ( -17% -    4%) 0.000
                          LowPhrase      159.63      (5.0%)      148.90      (3.3%)   -6.7% ( -14% -    1%) 0.000
             OrHighMedDayTaxoFacets       19.30      (5.7%)       18.11      (4.2%)   -6.2% ( -15% -    3%) 0.000
                         HighPhrase       16.15      (5.7%)       15.30      (4.6%)   -5.2% ( -14% -    5%) 0.001
                           Wildcard       79.98      (2.3%)       76.50      (3.0%)   -4.4% (  -9% -    1%) 0.000
            AndHighMedDayTaxoFacets       72.60      (2.1%)       69.79      (1.9%)   -3.9% (  -7% -    0%) 0.000
                       HighSpanNear       44.71      (4.9%)       43.00      (4.7%)   -3.8% ( -12% -    6%) 0.012
          BrowseDayOfYearTaxoFacets       47.80      (2.0%)       46.05     (12.3%)   -3.7% ( -17% -   10%) 0.189
                             Fuzzy2      103.64      (2.1%)      100.39      (1.7%)   -3.1% (  -6% -    0%) 0.000
               BrowseDateTaxoFacets       46.13      (1.9%)       44.69     (12.0%)   -3.1% ( -16% -   10%) 0.249
        BrowseRandomLabelTaxoFacets       37.71      (2.1%)       36.53     (10.5%)   -3.1% ( -15% -    9%) 0.195
                        MedSpanNear       68.62      (3.0%)       66.69      (3.0%)   -2.8% (  -8% -    3%) 0.003
                        LowSpanNear       57.05      (3.0%)       55.49      (2.8%)   -2.7% (  -8% -    3%) 0.003
              BrowseMonthTaxoFacets       29.68      (7.3%)       28.87     (12.8%)   -2.7% ( -21% -   18%) 0.410
                             Fuzzy1      128.59      (2.2%)      125.27      (1.8%)   -2.6% (  -6% -    1%) 0.000
                LowIntervalsOrdered      219.66      (4.2%)      216.18      (3.4%)   -1.6% (  -8% -    6%) 0.184
               HighIntervalsOrdered       35.55      (5.7%)       35.03      (4.3%)   -1.5% ( -10% -    9%) 0.361
                   HighSloppyPhrase        8.33     (15.0%)        8.22     (13.6%)   -1.3% ( -25% -   32%) 0.775
          BrowseDayOfYearSSDVFacets       21.93      (9.9%)       21.80      (9.3%)   -0.6% ( -17% -   20%) 0.841
              BrowseMonthSSDVFacets       23.61      (8.5%)       23.53      (8.2%)   -0.3% ( -15% -   17%) 0.904
                            Respell       77.59      (2.0%)       77.42      (2.3%)   -0.2% (  -4% -    4%) 0.740
        BrowseRandomLabelSSDVFacets       15.20      (5.5%)       15.19      (5.7%)   -0.1% ( -10% -   11%) 0.971
                MedIntervalsOrdered       43.08      (5.0%)       43.14      (4.6%)    0.1% (  -9% -   10%) 0.934
                    LowSloppyPhrase       54.27     (10.7%)       54.76     (10.1%)    0.9% ( -17% -   24%) 0.782
               BrowseDateSSDVFacets        4.21     (12.4%)        4.26     (11.8%)    1.0% ( -20% -   28%) 0.784
                           PKLookup      240.85      (2.3%)      244.63      (1.9%)    1.6% (  -2% -    5%) 0.018
                    MedSloppyPhrase       15.08      (9.3%)       15.47      (9.8%)    2.6% ( -15% -   24%) 0.384
                         TermDTSort      104.10      (2.0%)      108.13      (3.5%)    3.9% (  -1% -    9%) 0.000
              HighTermDayOfYearSort      103.69      (2.0%)      107.85      (3.5%)    4.0% (  -1% -    9%) 0.000
               HighTermTitleBDVSort      106.26      (2.3%)      112.05      (4.1%)    5.5% (   0% -   12%) 0.000
                  HighTermMonthSort      210.13      (2.5%)      224.43     (12.5%)    6.8% (  -7% -   22%) 0.017
                            LowTerm      754.49     (16.4%)     2902.39     (24.5%)  284.7% ( 209% -  389%) 0.000
                            MedTerm      251.51      (3.4%)     2585.44     (48.9%)  928.0% ( 846% - 1015%) 0.000
                           HighTerm      124.40      (3.1%)     1782.93     (74.1%) 1333.3% (1217% - 1456%) 0.000
                             IntNRQ       12.42     (10.8%)      308.10    (193.8%) 2380.8% (1963% - 2899%) 0.000
   ```
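
For reading the table: `Pct diff` is the relative change in QPS from the baseline to the modified version, so negative values are slowdowns. A minimal sketch of that arithmetic follows, using the OrHighLow row. The class and method names are made up for illustration, and the benchmark tool presumably works from unrounded measurements, so recomputing other rows from the rounded QPS values shown here may differ in the last digit.

```java
import java.util.Locale;

public class PctDiff {
    // Relative change in QPS, in percent. Negative means the modified version is slower.
    static double pctDiff(double baselineQps, double modifiedQps) {
        return 100.0 * (modifiedQps - baselineQps) / baselineQps;
    }

    public static void main(String[] args) {
        // OrHighLow row: 90.31 QPS for the baseline, 35.49 QPS for the modified version.
        System.out.println(String.format(Locale.ROOT, "OrHighLow: %.1f%%", pctDiff(90.31, 35.49)));
        // prints "OrHighLow: -60.7%"
    }
}
```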
   
    - IntNRQ and term queries benefit the most from this change because 
`Weight#count` gives the hit count up-front, which then enables skipping 
non-competitive hits.
    - Pure disjunctions suffer the most because BS1 (`BooleanScorer`) is no longer used, since the 
weight doesn't know whether hits will be skipped based on scores.
    - Other scoring queries are impacted because they need to read impacts in 
case the collector would like to skip based on scores.
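
To illustrate the first and third points, here is a toy, self-contained sketch of why knowing the hit count up-front lets a collector skip non-competitive hits while still reporting an exact total. Everything in it (`ToyWeight`, `Result`, `searchTop1`, the block layout) is hypothetical and only models the idea. It is not Lucene's API and not the code in this PR. The per-block maximum score stands in for impacts, and a count of `-1` means "not known cheaply".

```java
public class CountUpFrontSketch {

    /** Toy stand-in for per-segment postings. Hypothetical, NOT Lucene's API. */
    static final class ToyWeight {
        final int count;         // hit count if cheaply known up-front, else -1
        final float[][] blocks;  // scores of matching docs, grouped into blocks
        final float[] blockMax;  // precomputed max score per block (plays the role of impacts)

        ToyWeight(int count, float[][] blocks, float[] blockMax) {
            this.count = count;
            this.blocks = blocks;
            this.blockMax = blockMax;
        }
    }

    /** Outcome of a top-1 search: exact total hits, best score, docs actually scored. */
    static final class Result {
        final int totalHits;
        final float bestScore;
        final int docsVisited;

        Result(int totalHits, float bestScore, int docsVisited) {
            this.totalHits = totalHits;
            this.bestScore = bestScore;
            this.docsVisited = docsVisited;
        }
    }

    static Result searchTop1(ToyWeight w) {
        boolean countKnown = w.count != -1;
        int counted = 0;
        int visited = 0;
        float best = Float.NEGATIVE_INFINITY;
        for (int b = 0; b < w.blocks.length; b++) {
            // Skipping a block is only safe for an exact total when the total
            // does not depend on visiting every hit, i.e. when it is already known.
            if (countKnown && w.blockMax[b] <= best) {
                continue; // no doc in this block can beat the current best
            }
            for (float score : w.blocks[b]) {
                visited++;
                counted++;
                best = Math.max(best, score);
            }
        }
        return new Result(countKnown ? w.count : counted, best, visited);
    }

    public static void main(String[] args) {
        float[][] blocks = {{1.0f, 2.0f}, {0.5f, 0.3f}, {3.0f}, {0.1f, 0.2f, 0.4f}};
        float[] blockMax = {2.0f, 0.5f, 3.0f, 0.4f};
        Result unknown = searchTop1(new ToyWeight(-1, blocks, blockMax));
        Result known = searchTop1(new ToyWeight(8, blocks, blockMax));
        System.out.println("count unknown: hits=" + unknown.totalHits + " visited=" + unknown.docsVisited);
        System.out.println("count known:   hits=" + known.totalHits + " visited=" + known.docsVisited);
    }
}
```

In this toy run both searches report 8 hits and the same best score, but the one with a known count scores only 3 of the 8 docs. The sketch also hints at the cost side: the skipping path has to consult `blockMax` for every block, which loosely mirrors the overhead of reading impacts even when little ends up being skipped.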


