HUSTERGS commented on PR #14896:
URL: https://github.com/apache/lucene/pull/14896#issuecomment-3041718485

Here are the results comparing the branchless version (candidate) against the main branch (baseline) under an identical setup:

```
Task                         QPS baseline  StdDev  QPS my_modified_version  StdDev  Pct diff               p-value
FilteredOr2Terms2StopWords          67.28  (6.6%)                    65.96  (6.8%)   -2.0% ( -14% - 12%)   0.355
OrHighMed                           92.82  (3.8%)                    91.23  (4.1%)   -1.7% ( -9% - 6%)     0.171
CombinedOrHighMed                   28.37  (6.3%)                    27.89  (5.8%)   -1.7% ( -13% - 11%)   0.376
CombinedAndHighMed                  29.11  (5.4%)                    28.61  (4.5%)   -1.7% ( -11% - 8%)    0.280
AndHighOrMedMed                     18.44  (3.3%)                    18.14  (2.8%)   -1.6% ( -7% - 4%)     0.089
DismaxOrHighMed                     67.66  (3.7%)                    66.60  (3.9%)   -1.6% ( -8% - 6%)     0.194
TermB1M                            595.71  (3.6%)                   586.50  (4.9%)   -1.5% ( -9% - 7%)     0.257
Term1M                             596.06  (3.6%)                   586.93  (5.0%)   -1.5% ( -9% - 7%)     0.266
Term                               595.68  (3.6%)                   586.74  (5.0%)   -1.5% ( -9% - 7%)     0.274
Term10K                            594.97  (3.8%)                   586.18  (4.9%)   -1.5% ( -9% - 7%)     0.283
TermB1M1P                          595.60  (3.7%)                   586.84  (4.9%)   -1.5% ( -9% - 7%)     0.286
Term100                            595.04  (3.6%)                   586.49  (5.0%)   -1.4% ( -9% - 7%)     0.294
CountTerm                         7970.70  (5.6%)                  7856.82  (5.6%)   -1.4% ( -12% - 10%)   0.422
FilteredTerm                        87.37  (4.3%)                    86.13  (4.6%)   -1.4% ( -9% - 7%)     0.314
FilteredOrHighMed                   54.11  (5.0%)                    53.35  (5.3%)   -1.4% ( -11% - 9%)    0.388
OrHighRare                         121.49  (3.2%)                   119.92  (3.4%)   -1.3% ( -7% - 5%)     0.220
CountOrHighHigh                     64.26  (2.6%)                    63.43  (1.9%)   -1.3% ( -5% - 3%)     0.075
DismaxOrHighHigh                    47.55  (2.5%)                    46.94  (2.9%)   -1.3% ( -6% - 4%)     0.135
FilteredAndStopWords                12.03  (2.8%)                    11.89  (3.7%)   -1.2% ( -7% - 5%)     0.270
FilteredOr3Terms                    59.60  (4.5%)                    58.92  (4.7%)   -1.2% ( -9% - 8%)     0.426
TermTitleSort                       73.24  (4.2%)                    72.44  (4.2%)   -1.1% ( -9% - 7%)     0.410
CountOrMany                          6.48  (2.3%)                     6.41  (1.9%)   -1.1% ( -5% - 3%)     0.108
CountFilteredOrMany                  6.08  (1.9%)                     6.02  (1.5%)   -1.0% ( -4% - 2%)     0.061
CountOrHighMed                      97.07  (2.8%)                    96.11  (2.6%)   -1.0% ( -6% - 4%)     0.250
TermDTSort                         193.19  (2.7%)                   191.42  (2.4%)   -0.9% ( -5% - 4%)     0.252
TermMonthSort                     2982.24  (3.7%)                  2958.62  (3.4%)   -0.8% ( -7% - 6%)     0.482
FilteredAndHighHigh                 15.23  (2.8%)                    15.12  (3.6%)   -0.8% ( -7% - 5%)     0.446
Fuzzy1                              50.54  (3.2%)                    50.15  (3.7%)   -0.8% ( -7% - 6%)     0.473
FilteredOrHighHigh                  18.16  (3.2%)                    18.02  (3.2%)   -0.8% ( -6% - 5%)     0.447
CountAndHighMed                     94.00  (2.8%)                    93.32  (2.8%)   -0.7% ( -6% - 5%)     0.416
DismaxTerm                         657.56  (3.5%)                   652.99  (3.8%)   -0.7% ( -7% - 6%)     0.547
CountAndHighHigh                    62.08  (1.8%)                    61.67  (1.1%)   -0.7% ( -3% - 2%)     0.161
Fuzzy2                              45.36  (1.8%)                    45.11  (2.4%)   -0.6% ( -4% - 3%)     0.399
FilteredPhrase                      12.66  (2.3%)                    12.59  (2.3%)   -0.6% ( -5% - 4%)     0.433
AndHighHigh                         28.01  (3.5%)                    27.85  (4.2%)   -0.5% ( -8% - 7%)     0.658
CombinedTerm                        14.62  (3.8%)                    14.55  (4.9%)   -0.5% ( -8% - 8%)     0.705
Phrase                               9.91  (2.1%)                     9.86  (2.2%)   -0.5% ( -4% - 3%)     0.453
Or2Terms2StopWords                  79.00  (7.2%)                    78.61  (7.4%)   -0.5% ( -14% - 15%)   0.829
IntervalsOrdered                     2.96  (2.4%)                     2.94  (2.6%)   -0.5% ( -5% - 4%)     0.564
FilteredAnd3Terms                  132.86  (3.0%)                   132.29  (3.4%)   -0.4% ( -6% - 6%)     0.670
FilteredOrStopWords                 11.15  (2.2%)                    11.11  (2.4%)   -0.4% ( -4% - 4%)     0.591
And2Terms2StopWords                 76.63  (8.2%)                    76.34  (8.3%)   -0.4% ( -15% - 17%)   0.882
CountFilteredOrHighHigh             25.24  (1.3%)                    25.16  (1.3%)   -0.3% ( -2% - 2%)     0.451
CountFilteredOrHighMed              29.64  (1.4%)                    29.56  (1.3%)   -0.2% ( -2% - 2%)     0.570
IntNRQ                              48.50  (2.2%)                    48.38  (2.3%)   -0.2% ( -4% - 4%)     0.741
FilteredIntNRQ                      48.16  (2.2%)                    48.10  (2.3%)   -0.1% ( -4% - 4%)     0.868
CountFilteredIntNRQ                 22.17  (1.3%)                    22.14  (1.8%)   -0.1% ( -3% - 3%)     0.829
FilteredOrMany                       5.11  (3.0%)                     5.11  (2.7%)    0.0% ( -5% - 5%)     0.993
FilteredAnd2Terms2StopWords         77.42  (6.3%)                    77.46  (6.5%)    0.0% ( -12% - 13%)   0.984
SloppyPhrase                         1.47  (4.1%)                     1.47  (4.6%)    0.1% ( -8% - 9%)     0.928
AndMedOrHighHigh                    21.15  (1.9%)                    21.19  (1.8%)    0.2% ( -3% - 3%)     0.719
IntSet                             401.08  (4.4%)                   402.07  (3.9%)    0.2% ( -7% - 8%)     0.851
TermDayOfYearSort                  356.48  (1.1%)                   357.37  (1.2%)    0.2% ( -2% - 2%)     0.490
SpanNear                             3.07  (4.9%)                     3.08  (4.9%)    0.3% ( -9% - 10%)    0.858
AndHighMed                          70.36  (3.1%)                    70.60  (3.0%)    0.3% ( -5% - 6%)     0.722
Respell                             43.66  (2.9%)                    43.81  (2.3%)    0.3% ( -4% - 5%)     0.674
FilteredAndHighMed                  45.14  (2.9%)                    45.33  (3.3%)    0.4% ( -5% - 6%)     0.671
CountFilteredPhrase                 11.53  (2.0%)                    11.59  (2.2%)    0.5% ( -3% - 4%)     0.435
FilteredPrefix3                     93.08  (3.5%)                    93.69  (2.6%)    0.7% ( -5% - 6%)     0.494
CombinedOrHighHigh                   7.09  (5.4%)                     7.14  (5.8%)    0.7% ( -9% - 12%)    0.700
OrHighHigh                          26.79  (2.4%)                    26.97  (4.2%)    0.7% ( -5% - 7%)     0.530
Prefix3                             99.61  (3.8%)                   100.33  (2.5%)    0.7% ( -5% - 7%)     0.483
CombinedAndHighHigh                  7.30  (1.8%)                     7.36  (1.9%)    0.8% ( -2% - 4%)     0.161
Wildcard                            58.26  (3.3%)                    58.86  (2.9%)    1.0% ( -4% - 7%)     0.287
OrStopWords                         10.76  (5.4%)                    10.88  (7.2%)    1.1% ( -10% - 14%)   0.574
CountPhrase                          3.30  (3.6%)                     3.34  (2.1%)    1.4% ( -4% - 7%)     0.131
OrMany                               5.87  (3.0%)                     6.03  (4.3%)    2.7% ( -4% - 10%)    0.019
Or3Terms                            83.12  (2.8%)                    85.65  (4.4%)    3.0% ( -4% - 10%)    0.009
And3Terms                           91.23  (3.3%)                    94.32  (4.2%)    3.4% ( -3% - 11%)    0.004
AndStopWords                        10.01  (3.7%)                    10.48  (6.1%)    4.7% ( -4% - 15%)    0.003
```

I think this change shows a good enough speedup: the only tasks with p < 0.05 are OrMany (+2.7%), Or3Terms (+3.0%), And3Terms (+3.4%) and AndStopWords (+4.7%), and all four improve. I will run another luceneutil benchmark comparing the explicitly vectorized version against the branchless one.

BTW, should I keep the vectorized code, or keep only the branchless version if we are about to merge this?

(IMHO, it might be worth figuring out a way to enable these more complex vectorized operations (of course, not in this PR) without slowing down machines that don't support the underlying instructions, or where they are not enabled in the JVM, because there may be other places where we could benefit from vectorization.)

-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org