jpountz opened a new pull request, #13958: URL: https://github.com/apache/lucene/pull/13958
PR #13692 tried to speed up advancing by using branchless binary search. While this yielded a speedup on my machine, it yielded a slowdown on nightly benchmarks.

This PR tries a different approach using vectorization. Experimentation suggests that it slightly slows down queries where advancing often goes to the very next doc ID, such as term queries and `OrHighNotXXX` tasks, but speeds up queries that advance to the next few doc IDs, such as `AndHighHigh`.

I think this is a good trade-off, since it slows down some already fast queries in exchange for a speedup on some more expensive queries.

Here is a `luceneutil` run on `wikibigall` with `-searchConcurrency 0`:

```
Task                   QPS baseline  StdDev  QPS my_modified_version  StdDev  Pct diff                 p-value
OrHighNotHigh                302.78  (2.4%)                   283.75  (2.9%)     -6.3%  (-11% -  -1%)    0.000
OrHighNotMed                 384.69  (3.0%)                   363.33  (2.8%)     -5.6%  (-10% -   0%)    0.000
MedTerm                      564.86  (2.2%)                   537.04  (3.5%)     -4.9%  (-10% -   0%)    0.000
LowTerm                     1014.02  (2.2%)                   967.37  (3.6%)     -4.6%  (-10% -   1%)    0.000
OrHighNotLow                 446.38  (3.4%)                   427.10  (3.3%)     -4.3%  (-10% -   2%)    0.000
HighTerm                     485.41  (1.9%)                   464.49  (3.2%)     -4.3%  ( -9% -   0%)    0.000
OrNotHighHigh                229.78  (2.4%)                   221.51  (3.1%)     -3.6%  ( -8% -   1%)    0.000
OrNotHighMed                 396.63  (2.7%)                   382.41  (3.1%)     -3.6%  ( -9% -   2%)    0.000
Prefix3                      145.65  (3.6%)                   142.39  (3.7%)     -2.2%  ( -9% -   5%)    0.051
IntNRQ                       158.04  (4.7%)                   154.77  (5.6%)     -2.1%  (-11% -   8%)    0.205
CountTerm                   8320.96  (3.2%)                  8198.56  (4.7%)     -1.5%  ( -9% -   6%)    0.246
PKLookup                     273.35  (3.6%)                   269.71  (5.2%)     -1.3%  ( -9% -   7%)    0.345
Wildcard                      83.30  (3.4%)                    82.28  (3.1%)     -1.2%  ( -7% -   5%)    0.234
HighTermMonthSort           3235.98  (3.1%)                  3198.04  (2.9%)     -1.2%  ( -6% -   4%)    0.215
HighTermTitleSort            148.94  (2.5%)                   148.38  (2.6%)     -0.4%  ( -5% -   4%)    0.638
CountOrHighMed               104.51  (2.0%)                   104.22  (1.7%)     -0.3%  ( -3% -   3%)    0.640
HighTermTitleBDVSort          14.67  (5.3%)                    14.64  (5.9%)     -0.2%  (-10% -  11%)    0.899
AndStopWords                  30.68  (3.0%)                    30.66  (2.7%)     -0.1%  ( -5% -   5%)    0.941
CountOrHighHigh               50.17  (2.0%)                    50.19  (1.9%)      0.0%  ( -3% -   3%)    0.947
OrHighRare                   273.82  (4.5%)                   273.96  (3.8%)      0.0%  ( -7% -   8%)    0.971
TermDTSort                   353.37  (6.4%)                   354.23  (6.7%)      0.2%  (-12% -  14%)    0.907
Fuzzy1                        77.85  (2.6%)                    78.12  (2.0%)      0.3%  ( -4% -   4%)    0.633
Fuzzy2                        73.23  (2.5%)                    73.50  (1.9%)      0.4%  ( -3% -   4%)    0.594
HighTermDayOfYearSort        836.62  (3.1%)                   841.07  (4.0%)      0.5%  ( -6% -   7%)    0.639
And2Terms2StopWords          154.49  (1.8%)                   155.41  (2.1%)      0.6%  ( -3% -   4%)    0.340
OrHighLow                    771.90  (2.0%)                   778.20  (2.2%)      0.8%  ( -3% -   5%)    0.217
And3Terms                    167.63  (2.3%)                   169.23  (2.2%)      1.0%  ( -3% -   5%)    0.176
OrStopWords                   33.99  (4.6%)                    34.39  (4.1%)      1.2%  ( -7% -  10%)    0.388
CountAndHighMed              148.01  (2.4%)                   149.91  (1.0%)      1.3%  ( -2% -   4%)    0.025
Or2Terms2StopWords           156.93  (2.8%)                   159.21  (3.0%)      1.5%  ( -4% -   7%)    0.117
AndHighHigh                   67.06  (1.3%)                    68.07  (1.6%)      1.5%  ( -1% -   4%)    0.001
OrMany                        18.67  (2.9%)                    18.96  (2.9%)      1.5%  ( -4% -   7%)    0.089
AndHighMed                   185.02  (1.6%)                   189.06  (1.3%)      2.2%  (  0% -   5%)    0.000
AndHighLow                   948.34  (2.6%)                   970.47  (2.6%)      2.3%  ( -2% -   7%)    0.004
OrHighHigh                    68.42  (1.4%)                    70.08  (1.3%)      2.4%  (  0% -   5%)    0.000
Or3Terms                     166.47  (2.7%)                   171.10  (3.1%)      2.8%  ( -2% -   8%)    0.003
OrNotHighLow                 964.69  (3.1%)                   994.46  (3.3%)      3.1%  ( -3% -   9%)    0.002
OrHighMed                    222.32  (2.1%)                   230.93  (1.5%)      3.9%  (  0% -   7%)    0.000
CountAndHighHigh              48.88  (2.4%)                    52.87  (1.3%)      8.2%  (  4% -  12%)    0.000
```
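For readers who want a feel for the two advancing strategies this PR description compares, here is a minimal scalar sketch in Java. It is not the code from this PR or from #13692: the class name `AdvanceSketch`, both method names, and the sample doc IDs are hypothetical, and the array is assumed to hold strictly increasing, non-negative doc IDs. The first method locates the target with a binary search; the second counts the entries smaller than the target in a loop with a fixed trip count and no early exit, which is the kind of loop shape that lends itself to SIMD. Whether the JIT actually vectorizes such a loop is not guaranteed.

```java
import java.util.Arrays;

/** Hypothetical illustration only; not the implementation from the PR. */
public final class AdvanceSketch {
  private AdvanceSketch() {}

  /**
   * Binary-search variant: returns the index in docs[from, to) of the first
   * entry >= target, or `to` if there is none. Assumes strictly increasing values.
   */
  public static int binarySearchGEQ(int[] docs, int target, int from, int to) {
    int idx = Arrays.binarySearch(docs, from, to, target);
    // A negative result encodes the insertion point as -(insertionPoint) - 1.
    return idx >= 0 ? idx : -1 - idx;
  }

  /**
   * Counting variant: since the range is sorted, the number of entries that are
   * < target is also the offset of the first entry that is >= target.
   * The loop always visits every entry and never exits early.
   */
  public static int countingGEQ(int[] docs, int target, int from, int to) {
    int index = from;
    for (int i = from; i < to; ++i) {
      index += docs[i] < target ? 1 : 0;
    }
    return index;
  }

  public static void main(String[] args) {
    int[] docs = {3, 7, 8, 15, 21, 40, 41, 90};
    System.out.println(countingGEQ(docs, 8, 0, docs.length));      // 2
    System.out.println(countingGEQ(docs, 22, 0, docs.length));     // 5
    System.out.println(binarySearchGEQ(docs, 22, 0, docs.length)); // 5
  }
}
```

The counting variant does the same amount of work wherever the target falls. That fits the observation in the description that advancing to the very next doc ID gets somewhat slower while advancing a few doc IDs ahead gets faster.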