zacharymorn opened a new pull request, #1006: URL: https://github.com/apache/lucene/pull/1006
### Description (or a Jira issue link if you have one)

Follow-up changes for https://issues.apache.org/jira/browse/LUCENE-10480 to improve performance for disjunction within conjunction queries.

Benchmark results with `wikinightly.tasks` boolean queries below:

```
AndHighHigh: +be +up # freq=2115632 freq=824628
AndHighHigh: +cite +had # freq=1367577 freq=1223103
AndHighHigh: +is +he # freq=4214104 freq=1663980
AndHighHigh: +no +4 # freq=1060681 freq=944177
AndHighHigh: +title +see # freq=2077102 freq=1100862
AndHighMed: +2010 +16 # freq=933686 freq=531050
AndHighMed: +5 +power # freq=849829 freq=257919
AndHighMed: +only +particularly # freq=895806 freq=100045
AndHighMed: +united +1983 # freq=1185528 freq=150075
AndHighMed: +who +ed # freq=1201585 freq=127497
OrHighHigh: are last # freq=1921211 freq=830278
OrHighHigh: at united # freq=2834104 freq=1185528
OrHighHigh: but year # freq=1484398 freq=1098425
OrHighHigh: name its # freq=2577591 freq=1160703
OrHighHigh: to but # freq=6105155 freq=1484398
OrHighMed: at mostly # freq=2834104 freq=89401
OrHighMed: his interview # freq=1771920 freq=94736
OrHighMed: http 9 # freq=3289683 freq=541405
OrHighMed: they hard # freq=1031516 freq=92045
OrHighMed: title bay # freq=2077102 freq=117167
AndHighOrMedMed: +be +(mostly interview) # freq=2115632 freq=89401 freq=94736
AndHighOrMedMed: +cite +(9 hard) # freq=1367577 freq=541405 freq=92045
AndHighOrMedMed: +is +(bay 16) # freq=4214104 freq=117167 freq=531050
AndHighOrMedMed: +no +(power particularly) # freq=1060681 freq=257919 freq=100045
AndHighOrMedMed: +title +(1983 ed) # freq=2077102 freq=150075 freq=127497
AndMedOrHighHigh: +mostly +(are last) # freq=89401 freq=1921211 freq=830278
AndMedOrHighHigh: +interview +(at united) # freq=94736 freq=2834104 freq=1185528
AndMedOrHighHigh: +hard +(but year) # freq=92045 freq=1484398 freq=1098425
AndMedOrHighHigh: +9 +(name its) # freq=541405 freq=2577591 freq=1160703
AndMedOrHighHigh: +bay +(to but) # freq=117167 freq=6105155 freq=1484398
```

```
Task              QPS baseline   StdDev  QPS my_modified_version   StdDev  Pct diff                p-value
AndHighHigh              40.93   (2.8%)                    40.72   (4.2%)    -0.5% ( -7% - 6%)     0.659
AndHighMed              150.71   (3.4%)                   152.22   (3.7%)     1.0% ( -5% - 8%)     0.371
PKLookup                250.85   (8.7%)                   257.51   (8.9%)     2.7% ( -13% - 22%)   0.340
AndHighOrMedMed          66.87   (4.0%)                    68.70   (2.7%)     2.7% ( -3% - 9%)     0.012
AndMedOrHighHigh         89.04   (2.6%)                    93.28   (3.1%)     4.8% ( 0% - 10%)     0.000
OrHighHigh               21.71   (6.0%)                    34.50   (6.8%)    58.9% ( 43% - 76%)    0.000
OrHighMed                85.11   (5.0%)                   189.37   (8.0%)   122.5% ( 104% - 142%)  0.000
```

```
Task              QPS baseline   StdDev  QPS my_modified_version   StdDev  Pct diff                p-value
AndMedOrHighHigh         68.90   (4.5%)                    67.15   (4.3%)    -2.5% ( -10% - 6%)    0.074
AndHighHigh              73.07   (3.0%)                    72.11   (3.5%)    -1.3% ( -7% - 5%)     0.212
AndHighMed              146.94   (4.7%)                   145.56   (4.9%)    -0.9% ( -10% - 9%)    0.550
PKLookup                252.01   (9.3%)                   249.71  (13.2%)    -0.9% ( -21% - 23%)   0.806
AndHighOrMedMed          65.49   (5.8%)                    66.09   (4.9%)     0.9% ( -9% - 12%)    0.600
OrHighHigh               21.34   (6.7%)                    29.63   (6.7%)    38.8% ( 23% - 55%)    0.000
OrHighMed               122.61   (8.2%)                   227.04   (9.0%)    85.2% ( 62% - 111%)   0.000
```

```
Task              QPS baseline   StdDev  QPS my_modified_version   StdDev  Pct diff                p-value
AndHighMed              113.58   (2.8%)                   113.98   (4.8%)     0.3% ( -7% - 8%)     0.779
AndHighHigh              51.37   (3.2%)                    51.58   (5.2%)     0.4% ( -7% - 9%)     0.759
PKLookup                272.05   (8.9%)                   276.89  (12.6%)     1.8% ( -18% - 25%)   0.605
AndHighOrMedMed         102.86   (5.1%)                   107.47   (5.4%)     4.5% ( -5% - 15%)    0.007
AndMedOrHighHigh         91.55   (3.8%)                    96.43   (5.2%)     5.3% ( -3% - 14%)    0.000
OrHighHigh               27.08   (6.5%)                    47.16  (11.3%)    74.2% ( 52% - 98%)    0.000
OrHighMed                78.78   (5.9%)                   153.46  (12.1%)    94.8% ( 72% - 119%)   0.000
```

```
Task              QPS baseline   StdDev  QPS my_modified_version   StdDev  Pct diff                p-value
PKLookup                260.41   (9.8%)                   261.79  (10.0%)     0.5% ( -17% - 22%)   0.866
AndHighHigh             122.91   (4.0%)                   124.37   (5.0%)     1.2% ( -7% - 10%)    0.406
AndHighMed              112.99   (4.6%)                   114.77   (5.9%)     1.6% ( -8% - 12%)    0.345
AndHighOrMedMed          81.97   (5.6%)                    83.37   (5.9%)     1.7% ( -9% - 13%)    0.342
AndMedOrHighHigh         91.34   (4.7%)                    98.16   (5.8%)     7.5% ( -2% - 18%)    0.000
OrHighHigh               21.05   (5.5%)                    30.30   (5.7%)    43.9% ( 31% - 58%)    0.000
OrHighMed                98.48   (6.3%)                   274.14  (11.2%)   178.4% ( 151% - 208%)  0.000
```
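
For readers checking the numbers: the `Pct diff` column appears to be the relative change in mean QPS from baseline to the modified version. The sketch below recomputes it for three rows of the first benchmark table. The `pct_diff` helper is hypothetical and written only for this illustration; it is not part of the benchmark tooling, and it assumes the column is derived from the two mean QPS values alone.

```python
def pct_diff(baseline_qps: float, modified_qps: float) -> float:
    """Relative change of modified_qps over baseline_qps, in percent.

    Hypothetical helper: assumes 'Pct diff' is computed from mean QPS only.
    """
    return (modified_qps - baseline_qps) / baseline_qps * 100.0


# (baseline QPS, modified QPS) copied from the first benchmark table.
rows = {
    "AndHighHigh": (40.93, 40.72),
    "OrHighHigh": (21.71, 34.50),
    "OrHighMed": (85.11, 189.37),
}

for task, (baseline, modified) in rows.items():
    # Print each task with its signed percentage change, one decimal place.
    print(f"{task:<12} {pct_diff(baseline, modified):+.1f}%")
```

Under that assumption the recomputed values round to the reported -0.5%, 58.9%, and 122.5%.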