jpountz commented on PR #12490: URL: https://github.com/apache/lucene/pull/12490#issuecomment-1666598365
Opened this PR as a draft to get feedback on the API (if any). Existing tests pass, but I plan on adding more tests before merging as well.

Here are the results of this PR on wikimedium10m. Top-k queries on disjunctions and conjunctions get a significant performance boost by removing this overhead.

```
Task                  QPS baseline   StdDev  QPS my_modified_version   StdDev  Pct diff                   p-value
OrHighNotMed                416.61   (4.8%)                   399.18   (5.6%)     -4.2%  ( -13% -    6%)    0.074
OrHighNotLow                534.35   (4.8%)                   512.73   (5.7%)     -4.0%  ( -13% -    6%)    0.087
OrHighNotHigh               340.89   (4.4%)                   327.66   (5.1%)     -3.9%  ( -12% -    5%)    0.069
OrNotHighHigh               395.37   (3.9%)                   380.50   (4.7%)     -3.8%  ( -11% -    5%)    0.051
OrNotHighMed                511.30   (1.9%)                   497.11   (3.7%)     -2.8%  (  -8% -    2%)    0.034
HighTerm                    597.37   (6.3%)                   580.82   (4.1%)     -2.8%  ( -12% -    8%)    0.244
MedTerm                     802.79   (5.3%)                   783.67   (3.4%)     -2.4%  ( -10% -    6%)    0.235
LowTerm                    1092.20   (4.6%)                  1069.37   (4.6%)     -2.1%  ( -10% -    7%)    0.306
IntNRQ                       61.95  (14.7%)                    61.33  (16.1%)     -1.0%  ( -27% -   34%)    0.883
HighTermMonthSort          5037.79   (3.0%)                  4992.75   (2.8%)     -0.9%  (  -6% -    5%)    0.494
MedIntervalsOrdered          23.71   (3.8%)                    23.55   (7.2%)     -0.7%  ( -11% -   10%)    0.793
PKLookup                    247.34   (3.2%)                   245.97   (3.0%)     -0.6%  (  -6% -    5%)    0.688
HighTermTitleSort           177.34   (4.9%)                   176.39   (3.9%)     -0.5%  (  -8% -    8%)    0.785
LowIntervalsOrdered          35.58   (3.2%)                    35.43   (5.2%)     -0.4%  (  -8% -    8%)    0.829
OrNotHighLow               1495.12   (2.1%)                  1489.60   (2.4%)     -0.4%  (  -4% -    4%)    0.718
HighTermDayOfYearSort       407.38   (1.4%)                   405.96   (1.1%)     -0.4%  (  -2% -    2%)    0.537
LowPhrase                    68.81   (2.4%)                    68.66   (1.3%)     -0.2%  (  -3% -    3%)    0.796
Respell                      85.14   (1.2%)                    85.05   (2.1%)     -0.1%  (  -3% -    3%)    0.884
MedSpanNear                  28.29   (2.6%)                    28.27   (3.3%)     -0.1%  (  -5% -    5%)    0.944
Wildcard                    157.04   (3.0%)                   157.00   (3.5%)     -0.0%  (  -6% -    6%)    0.987
TermDTSort                  194.42   (2.2%)                   194.53   (1.1%)      0.1%  (  -3% -    3%)    0.943
LowSpanNear                  84.26   (2.5%)                    84.32   (3.0%)      0.1%  (  -5% -    5%)    0.950
HighPhrase                   42.16   (3.0%)                    42.23   (2.2%)      0.2%  (  -4% -    5%)    0.892
MedPhrase                   132.52   (2.5%)                   132.78   (1.4%)      0.2%  (  -3% -    4%)    0.827
HighSpanNear                 21.37   (3.7%)                    21.44   (4.5%)      0.3%  (  -7% -    8%)    0.857
LowSloppyPhrase              64.26   (2.1%)                    64.62   (2.4%)      0.6%  (  -3% -    5%)    0.575
HighTermTitleBDVSort         21.59   (1.6%)                    21.79   (2.3%)      0.9%  (  -2% -    4%)    0.281
Prefix3                     299.62   (2.8%)                   302.58   (3.9%)      1.0%  (  -5% -    7%)    0.517
MedSloppyPhrase              73.65   (3.0%)                    74.38   (3.7%)      1.0%  (  -5% -    7%)    0.503
HighIntervalsOrdered          6.22   (4.5%)                     6.30   (4.7%)      1.2%  (  -7% -   10%)    0.565
HighSloppyPhrase             11.09   (3.7%)                    11.22   (3.4%)      1.2%  (  -5% -    8%)    0.449
AndHighLow                 1246.88   (2.8%)                  1262.41   (2.7%)      1.2%  (  -4% -    6%)    0.311
Fuzzy2                       86.08   (1.1%)                    87.67   (1.2%)      1.9%  (   0% -    4%)    0.000
Fuzzy1                      121.95   (1.0%)                   124.27   (1.3%)      1.9%  (   0% -    4%)    0.000
AndHighMed                  199.47   (5.3%)                   234.03   (3.0%)     17.3%  (   8% -   27%)    0.000
OrHighLow                   413.43   (7.0%)                   485.29   (4.6%)     17.4%  (   5% -   31%)    0.000
OrHighMed                   191.17   (5.6%)                   225.07   (3.7%)     17.7%  (   8% -   28%)    0.000
AndHighHigh                  92.26   (5.6%)                   108.84   (3.2%)     18.0%  (   8% -   28%)    0.000
OrHighHigh                   59.24   (9.2%)                    73.86   (6.9%)     24.7%  (   7% -   44%)    0.000
```

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For queries about this service, please contact Infrastructure at: us...@infra.apache.org