[PR] Wraps all iterator with likelyImpactsEnum under BlockMaxConjunctionBulkScorer [lucene]

via GitHub Tue, 29 Jul 2025 05:07:28 -0700


HUSTERGS opened a new pull request, #15004:
URL: https://github.com/apache/lucene/pull/15004


   ### Description
   Like #14023, this PR proposes wrapping all iterators (not just the lead) with 
`ScorerUtil::likelyImpactsEnum`. My guess is that this helps 
`ScorerUtil.applyRequiredClause`. 
   As before, I ran luceneutil on `wikimediumall` with `searchConcurrency=0, 
taskCountPerCat=5, taskRepeatCount=50`; the results after 20 iterations are shown 
below:
   
   ```
                          Task  QPS baseline   StdDev  QPS my_modified_version   StdDev  Pct diff               p-value
                    OrHighHigh         21.32   (1.9%)                    20.70  (13.5%)  -2.9% ( -17% -   12%)    0.343
                     OrHighMed         67.84   (3.6%)                    66.34  (12.4%)  -2.2% ( -17% -   14%)    0.445
               DismaxOrHighMed         50.02   (3.3%)                    49.17   (8.1%)  -1.7% ( -12% -    9%)    0.379
                   OrStopWords          8.93   (1.9%)                     8.81  (10.5%)  -1.3% ( -13% -   11%)    0.577
              DismaxOrHighHigh         35.32   (2.1%)                    34.85   (7.6%)  -1.3% ( -10% -    8%)    0.458
              AndMedOrHighHigh         16.37   (3.9%)                    16.16   (4.8%)  -1.3% (  -9% -    7%)    0.348
              IntervalsOrdered          2.43   (3.2%)                     2.41   (3.9%)  -0.8% (  -7% -    6%)    0.501
                      Or3Terms         64.13   (3.8%)                    63.74  (10.2%)  -0.6% ( -14% -   13%)    0.803
                   CountPhrase          2.63   (4.2%)                     2.62   (4.4%)  -0.6% (  -8% -    8%)    0.673
                   AndHighHigh         21.66   (9.0%)                    21.58  (13.0%)  -0.4% ( -20% -   23%)    0.921
               CountAndHighMed         74.71   (2.8%)                    74.58   (2.3%)  -0.2% (  -5% -    5%)    0.837
           CombinedAndHighHigh          5.69   (2.6%)                     5.68   (1.7%)  -0.1% (  -4% -    4%)    0.839
                    OrHighRare         95.49   (4.7%)                    95.39   (5.2%)  -0.1% (  -9% -   10%)    0.946
                  SloppyPhrase          1.11   (4.8%)                     1.11   (5.4%)  -0.0% (  -9% -   10%)    0.994
                    AndHighMed         52.26   (8.6%)                    52.32  (12.0%)   0.1% ( -18% -   22%)    0.970
                    DismaxTerm        493.97   (5.6%)                   494.70   (6.5%)   0.1% ( -11% -   12%)    0.939
                          Term        455.45   (6.8%)                   456.19   (8.7%)   0.2% ( -14% -   16%)    0.947
                CountOrHighMed         77.56   (2.6%)                    77.69   (1.9%)   0.2% (  -4% -    4%)    0.820
       CountFilteredOrHighHigh         15.73   (0.9%)                    15.77   (0.8%)   0.2% (  -1% -    1%)    0.443
                       Prefix3         73.94   (3.9%)                    74.13   (4.0%)   0.3% (  -7% -    8%)    0.839
        CountFilteredOrHighMed         17.81   (0.8%)                    17.86   (0.7%)   0.3% (  -1% -    1%)    0.277
                        Phrase          7.50   (3.2%)                     7.52   (2.5%)   0.3% (  -5% -    6%)    0.774
            Or2Terms2StopWords         61.00   (5.8%)                    61.19   (9.1%)   0.3% ( -13% -   16%)    0.900
                       TermB1M        454.62   (6.7%)                   456.15   (8.6%)   0.3% ( -14% -   16%)    0.890
           FilteredOrStopWords          8.09   (1.8%)                     8.12   (1.7%)   0.4% (  -3% -    3%)    0.510
                        IntSet        295.46   (4.5%)                   296.57   (4.3%)   0.4% (  -8% -    9%)    0.789
                       Term100        454.42   (6.7%)                   456.17   (8.7%)   0.4% ( -14% -   16%)    0.875
                        OrMany          4.63   (5.2%)                     4.65   (6.5%)   0.4% ( -10% -   12%)    0.834
                     TermB1M1P        454.90   (6.9%)                   456.70   (8.8%)   0.4% ( -14% -   17%)    0.874
           CountFilteredPhrase          9.05   (3.2%)                     9.09   (2.0%)   0.4% (  -4% -    5%)    0.632
              CountAndHighHigh         48.41   (2.2%)                    48.61   (1.9%)   0.4% (  -3% -    4%)    0.515
                        Term1M        454.46   (6.7%)                   456.41   (8.9%)   0.4% ( -14% -   17%)    0.864
               CountOrHighHigh         49.92   (2.5%)                    50.15   (2.0%)   0.5% (  -3% -    5%)    0.517
               AndHighOrMedMed         14.07   (3.4%)                    14.14   (3.2%)   0.5% (  -5% -    7%)    0.648
           CountFilteredIntNRQ         16.29   (1.4%)                    16.37   (1.0%)   0.5% (  -1% -    2%)    0.188
                       Respell         36.65   (2.9%)                    36.86   (2.8%)   0.5% (  -5% -    6%)    0.545
                       Term10K        453.90   (6.7%)                   456.41   (8.9%)   0.6% ( -14% -   17%)    0.823
                FilteredPhrase          9.68   (2.8%)                     9.73   (2.3%)   0.6% (  -4% -    5%)    0.473
                      SpanNear          2.46   (4.8%)                     2.48   (4.2%)   0.6% (  -7% -   10%)    0.677
               FilteredPrefix3         69.05   (3.8%)                    69.47   (3.5%)   0.6% (  -6% -    8%)    0.600
                FilteredOrMany          3.98   (2.0%)                     4.00   (2.2%)   0.6% (  -3% -    4%)    0.350
                        Fuzzy1         39.99   (3.8%)                    40.26   (3.7%)   0.7% (  -6% -    8%)    0.574
                      Wildcard         46.83   (3.1%)                    47.16   (2.8%)   0.7% (  -5% -    6%)    0.448
            FilteredOrHighHigh         12.88   (2.5%)                    12.97   (1.7%)   0.7% (  -3% -    5%)    0.297
             TermDayOfYearSort        260.15   (2.3%)                   262.15   (2.1%)   0.8% (  -3% -    5%)    0.276
                FilteredIntNRQ         42.41   (3.3%)                    42.74   (2.4%)   0.8% (  -4% -    6%)    0.390
                   CountOrMany          4.98   (3.7%)                     5.02   (3.3%)   0.8% (  -5% -    8%)    0.471
                        IntNRQ         42.73   (3.2%)                    43.10   (2.5%)   0.9% (  -4% -    6%)    0.345
                        Fuzzy2         36.05   (3.5%)                    36.36   (3.0%)   0.9% (  -5% -    7%)    0.407
                    TermDTSort        135.02   (2.5%)                   136.28   (2.1%)   0.9% (  -3% -    5%)    0.203
           CountFilteredOrMany          4.39   (2.8%)                     4.43   (2.1%)   1.0% (  -3% -    6%)    0.225
                     And3Terms         70.30   (7.1%)                    71.03   (9.8%)   1.0% ( -14% -   19%)    0.701
             FilteredOrHighMed         38.48   (3.3%)                    38.94   (2.4%)   1.2% (  -4% -    7%)    0.194
            CombinedOrHighHigh          5.58   (4.3%)                     5.64   (3.1%)   1.2% (  -5% -    9%)    0.307
                     CountTerm       5742.44   (5.3%)                  5814.16   (4.5%)   1.2% (  -8% -   11%)    0.422
              FilteredOr3Terms         43.18   (3.2%)                    43.73   (2.4%)   1.3% (  -4% -    7%)    0.159
                 TermTitleSort         50.08   (5.9%)                    50.72   (5.0%)   1.3% (  -9% -   12%)    0.463
            CombinedAndHighMed         21.29   (4.9%)                    21.59   (3.3%)   1.4% (  -6% -   10%)    0.287
                  FilteredTerm         63.73   (3.8%)                    64.72   (2.8%)   1.5% (  -4% -    8%)    0.142
                 TermMonthSort       2060.78   (3.6%)                  2092.88   (3.5%)   1.6% (  -5% -    8%)    0.164
    FilteredOr2Terms2StopWords         49.37   (4.3%)                    50.16   (3.2%)   1.6% (  -5% -    9%)    0.188
                  AndStopWords          8.60   (6.3%)                     8.74   (9.9%)   1.6% ( -13% -   18%)    0.540
            FilteredAndHighMed         31.03   (3.2%)                    31.54   (3.8%)   1.6% (  -5% -    8%)    0.137
           And2Terms2StopWords         58.34   (7.6%)                    59.46   (8.5%)   1.9% ( -13% -   19%)    0.449
             CombinedOrHighMed         20.91   (5.7%)                    21.36   (4.3%)   2.2% (  -7% -   12%)    0.170
                  CombinedTerm         11.07   (4.5%)                    11.33   (3.2%)   2.4% (  -5% -   10%)    0.053
   FilteredAnd2Terms2StopWords         58.94   (5.0%)                    60.51   (5.3%)   2.7% (  -7% -   13%)    0.101
          FilteredAndStopWords          8.30   (3.1%)                     8.57   (1.8%)   3.3% (  -1% -    8%)    0.000
           FilteredAndHighHigh         10.23   (3.0%)                    10.58   (2.0%)   3.4% (  -1% -    8%)    0.000
             FilteredAnd3Terms        100.40   (2.7%)                   103.97   (2.3%)   3.6% (  -1% -    8%)    0.000
   ```
   
   What I'm curious about is that many `Or`-type tasks are clustered at the top 
of the table (the largest slowdowns), which might not be a coincidence?
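
One way to check whether that clustering is signal or noise is to filter the rows by p-value. The following is a minimal sketch, assuming a 0.05 significance threshold (this PR does not state one) and using a handful of rows copied from the table above. The `Row` type and the helper names are hypothetical and are not part of luceneutil.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Row:
    task: str
    qps_baseline: float
    qps_candidate: float
    p_value: float


def pct_diff(baseline: float, candidate: float) -> float:
    """Relative QPS change of the candidate versus the baseline, in percent."""
    return (candidate - baseline) / baseline * 100.0


def significant(rows, alpha=0.05):
    """Keep only rows whose p-value is below alpha (an assumed threshold)."""
    return [r for r in rows if r.p_value < alpha]


# A few rows copied from the table above (mean QPS and p-value only).
ROWS = [
    Row("OrHighHigh", 21.32, 20.70, 0.343),
    Row("OrHighMed", 67.84, 66.34, 0.445),
    Row("FilteredAndStopWords", 8.30, 8.57, 0.000),
    Row("FilteredAndHighHigh", 10.23, 10.58, 0.000),
    Row("FilteredAnd3Terms", 100.40, 103.97, 0.000),
]

if __name__ == "__main__":
    for r in significant(ROWS):
        print(f"{r.task}: {pct_diff(r.qps_baseline, r.qps_candidate):+.1f}%")
```

With these rows, only the three `FilteredAnd*` tasks pass the filter. The `Or` rows sampled here have p-values of 0.34 and above, which is consistent with noise but does not rule out a real regression.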
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


