zacharymorn opened a new pull request, #1006:
URL: https://github.com/apache/lucene/pull/1006

   ### Description
   
   Follow-up changes for https://issues.apache.org/jira/browse/LUCENE-10480 to improve performance for disjunctions within conjunction queries.
   
   Benchmark results with `wikinightly.tasks` boolean queries below:
   
   ```
   AndHighHigh: +be +up # freq=2115632 freq=824628
   AndHighHigh: +cite +had # freq=1367577 freq=1223103
   AndHighHigh: +is +he # freq=4214104 freq=1663980
   AndHighHigh: +no +4 # freq=1060681 freq=944177
   AndHighHigh: +title +see # freq=2077102 freq=1100862
   AndHighMed: +2010 +16 # freq=933686 freq=531050
   AndHighMed: +5 +power # freq=849829 freq=257919
   AndHighMed: +only +particularly # freq=895806 freq=100045
   AndHighMed: +united +1983 # freq=1185528 freq=150075
   AndHighMed: +who +ed # freq=1201585 freq=127497
   OrHighHigh: are last # freq=1921211 freq=830278
   OrHighHigh: at united # freq=2834104 freq=1185528
   OrHighHigh: but year # freq=1484398 freq=1098425
   OrHighHigh: name its # freq=2577591 freq=1160703
   OrHighHigh: to but # freq=6105155 freq=1484398
   OrHighMed: at mostly # freq=2834104 freq=89401
   OrHighMed: his interview # freq=1771920 freq=94736
   OrHighMed: http 9 # freq=3289683 freq=541405
   OrHighMed: they hard # freq=1031516 freq=92045
   OrHighMed: title bay # freq=2077102 freq=117167
   AndHighOrMedMed: +be +(mostly interview) # freq=2115632 freq=89401 freq=94736
   AndHighOrMedMed: +cite +(9 hard) # freq=1367577 freq=541405 freq=92045
   AndHighOrMedMed: +is +(bay 16) # freq=4214104 freq=117167 freq=531050
   AndHighOrMedMed: +no +(power particularly) # freq=1060681 freq=257919 freq=100045
   AndHighOrMedMed: +title +(1983 ed) # freq=2077102 freq=150075 freq=127497
   AndMedOrHighHigh: +mostly +(are last) # freq=89401 freq=1921211 freq=830278
   AndMedOrHighHigh: +interview +(at united) # freq=94736 freq=2834104 freq=1185528
   AndMedOrHighHigh: +hard +(but year) # freq=92045 freq=1484398 freq=1098425
   AndMedOrHighHigh: +9 +(name its) # freq=541405 freq=2577591 freq=1160703
   AndMedOrHighHigh: +bay +(to but) # freq=117167 freq=6105155 freq=1484398
   ```
   
   ```
   Task              QPS baseline   StdDev  QPS my_modified_version   StdDev  Pct diff                p-value
   AndHighHigh              40.93   (2.8%)                    40.72   (4.2%)   -0.5% (  -7% -    6%)    0.659
   AndHighMed              150.71   (3.4%)                   152.22   (3.7%)    1.0% (  -5% -    8%)    0.371
   PKLookup                250.85   (8.7%)                   257.51   (8.9%)    2.7% ( -13% -   22%)    0.340
   AndHighOrMedMed          66.87   (4.0%)                    68.70   (2.7%)    2.7% (  -3% -    9%)    0.012
   AndMedOrHighHigh         89.04   (2.6%)                    93.28   (3.1%)    4.8% (   0% -   10%)    0.000
   OrHighHigh               21.71   (6.0%)                    34.50   (6.8%)   58.9% (  43% -   76%)    0.000
   OrHighMed                85.11   (5.0%)                   189.37   (8.0%)  122.5% ( 104% -  142%)    0.000
   ```
   ```
   Task              QPS baseline   StdDev  QPS my_modified_version   StdDev  Pct diff                p-value
   AndMedOrHighHigh         68.90   (4.5%)                    67.15   (4.3%)   -2.5% ( -10% -    6%)    0.074
   AndHighHigh              73.07   (3.0%)                    72.11   (3.5%)   -1.3% (  -7% -    5%)    0.212
   AndHighMed              146.94   (4.7%)                   145.56   (4.9%)   -0.9% ( -10% -    9%)    0.550
   PKLookup                252.01   (9.3%)                   249.71  (13.2%)   -0.9% ( -21% -   23%)    0.806
   AndHighOrMedMed          65.49   (5.8%)                    66.09   (4.9%)    0.9% (  -9% -   12%)    0.600
   OrHighHigh               21.34   (6.7%)                    29.63   (6.7%)   38.8% (  23% -   55%)    0.000
   OrHighMed               122.61   (8.2%)                   227.04   (9.0%)   85.2% (  62% -  111%)    0.000
   ```
   ```
   Task              QPS baseline   StdDev  QPS my_modified_version   StdDev  Pct diff                p-value
   AndHighMed              113.58   (2.8%)                   113.98   (4.8%)    0.3% (  -7% -    8%)    0.779
   AndHighHigh              51.37   (3.2%)                    51.58   (5.2%)    0.4% (  -7% -    9%)    0.759
   PKLookup                272.05   (8.9%)                   276.89  (12.6%)    1.8% ( -18% -   25%)    0.605
   AndHighOrMedMed         102.86   (5.1%)                   107.47   (5.4%)    4.5% (  -5% -   15%)    0.007
   AndMedOrHighHigh         91.55   (3.8%)                    96.43   (5.2%)    5.3% (  -3% -   14%)    0.000
   OrHighHigh               27.08   (6.5%)                    47.16  (11.3%)   74.2% (  52% -   98%)    0.000
   OrHighMed                78.78   (5.9%)                   153.46  (12.1%)   94.8% (  72% -  119%)    0.000
   ```
   ```
   Task              QPS baseline   StdDev  QPS my_modified_version   StdDev  Pct diff                p-value
   PKLookup                260.41   (9.8%)                   261.79  (10.0%)    0.5% ( -17% -   22%)    0.866
   AndHighHigh             122.91   (4.0%)                   124.37   (5.0%)    1.2% (  -7% -   10%)    0.406
   AndHighMed              112.99   (4.6%)                   114.77   (5.9%)    1.6% (  -8% -   12%)    0.345
   AndHighOrMedMed          81.97   (5.6%)                    83.37   (5.9%)    1.7% (  -9% -   13%)    0.342
   AndMedOrHighHigh         91.34   (4.7%)                    98.16   (5.8%)    7.5% (  -2% -   18%)    0.000
   OrHighHigh               21.05   (5.5%)                    30.30   (5.7%)   43.9% (  31% -   58%)    0.000
   OrHighMed                98.48   (6.3%)                   274.14  (11.2%)  178.4% ( 151% -  208%)    0.000
   ```
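
As a reading aid for the tables above: the `Pct diff` column appears to be the relative change in mean QPS from the baseline to the modified version. The sketch below recomputes it for three rows of the first table. The helper name `pct_diff` is hypothetical and not part of the benchmark tooling, and because the printed QPS values are already rounded, a recomputed figure may differ from the reported one in the last digit.

```python
def pct_diff(baseline_qps: float, modified_qps: float) -> float:
    """Relative change of the modified QPS versus the baseline, in percent.

    Hypothetical helper for illustration; it assumes Pct diff is defined as
    (modified - baseline) / baseline * 100.
    """
    return (modified_qps - baseline_qps) / baseline_qps * 100.0


# Rounded mean QPS values (baseline, modified) copied from the first table.
first_run = {
    "AndHighHigh": (40.93, 40.72),
    "OrHighHigh": (21.71, 34.50),
    "OrHighMed": (85.11, 189.37),
}

for task, (base, mod) in first_run.items():
    # Print the task name and the recomputed percentage with one decimal.
    print(f"{task:>16} {pct_diff(base, mod):7.1f}%")
```

The bracketed range and the p-value columns depend on the per-run samples, which are not shown here, so they cannot be recomputed from the table alone.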

