HUSTERGS commented on PR #14896: URL: https://github.com/apache/lucene/pull/14896#issuecomment-3043216222
> honestly, to make it reasonable and prevent traps like this, a good approach would be to support less stuff: e.g. only support 256-bit SVE on ARM and 256-bit AVX2 on x86.

Good idea, I think. Sometimes the preferred bit size is still 256 bits even on a machine that supports AVX-512, and 512-bit vectors also carry a risk of thermal throttling.

> Would you like to open a PR that switches main to the branchless impl?

Of course! I've opened a new PR, #14906.

> and then see if / how much explicit vectorization can further help.

I also ran luceneutil with the branchless implementation as the baseline and the explicitly vectorized one as the candidate, with an identical setup. Here is the result, if it helps:

```
Task                        QPS baseline  StdDev  QPS my_modified_version  StdDev  Pct diff  p-value
CountTerm                    7866.13 (4.7%)  7645.58 (3.8%)  -2.8% (-10% -   5%) 0.038
FilteredOr2Terms2StopWords     65.25 (6.6%)    63.67 (4.7%)  -2.4% (-12% -   9%) 0.180
OrHighRare                    120.97 (7.1%)   118.20 (6.3%)  -2.3% (-14% -  11%) 0.283
CombinedTerm                   14.54 (3.5%)    14.24 (4.4%)  -2.0% ( -9% -   6%) 0.108
TermB1M1P                     585.82 (5.3%)   574.47 (4.8%)  -1.9% (-11% -   8%) 0.222
DismaxTerm                    648.69 (4.1%)   636.40 (3.8%)  -1.9% ( -9% -   6%) 0.131
FilteredOrHighMed              52.82 (5.1%)    51.83 (3.4%)  -1.9% ( -9% -   7%) 0.175
Term10K                       584.69 (5.4%)   574.22 (4.8%)  -1.8% (-11% -   8%) 0.271
Term1M                        585.51 (5.3%)   575.04 (5.0%)  -1.8% (-11% -   8%) 0.272
Term                          585.35 (5.2%)   575.06 (4.9%)  -1.8% (-11% -   8%) 0.273
Term100                       585.37 (5.3%)   575.21 (4.9%)  -1.7% (-11% -   8%) 0.285
TermB1M                       585.01 (5.2%)   574.89 (4.9%)  -1.7% (-11% -   8%) 0.276
FilteredTerm                   85.08 (4.6%)    83.72 (3.7%)  -1.6% ( -9% -   7%) 0.227
Fuzzy1                         50.46 (3.7%)    49.68 (2.9%)  -1.5% ( -7% -   5%) 0.143
And2Terms2StopWords            76.00 (9.0%)    74.86 (7.2%)  -1.5% (-16% -  16%) 0.561
TermMonthSort                2924.72 (3.3%)  2885.32 (3.1%)  -1.3% ( -7% -   5%) 0.188
FilteredOr3Terms               58.33 (4.6%)    57.56 (2.9%)  -1.3% ( -8% -   6%) 0.273
CombinedOrHighMed              27.77 (5.0%)    27.40 (6.2%)  -1.3% (-11% -  10%) 0.457
CombinedAndHighMed             28.30 (5.2%)    27.95 (4.9%)  -1.3% (-10% -   9%) 0.426
FilteredOrHighHigh             17.87 (3.6%)    17.66 (2.3%)  -1.2% ( -6% -   4%) 0.202
FilteredAnd2Terms2StopWords    77.04 (6.6%)    76.25 (5.4%)  -1.0% (-12% -  11%) 0.590
FilteredAnd3Terms             132.17 (2.8%)   130.83 (1.7%)  -1.0% ( -5% -   3%) 0.166
Fuzzy2                         45.46 (2.6%)    45.07 (2.0%)  -0.9% ( -5% -   3%) 0.234
FilteredAndHighHigh            14.97 (2.4%)    14.84 (3.2%)  -0.9% ( -6% -   4%) 0.326
FilteredAndStopWords           11.75 (2.6%)    11.65 (3.4%)  -0.9% ( -6% -   5%) 0.374
TermTitleSort                  71.29 (5.1%)    70.68 (3.4%)  -0.9% ( -8% -   8%) 0.534
Prefix3                       100.81 (3.0%)   100.06 (2.9%)  -0.7% ( -6% -   5%) 0.428
FilteredOrStopWords            11.02 (2.9%)    10.94 (2.1%)  -0.7% ( -5% -   4%) 0.374
SloppyPhrase                    1.45 (4.4%)     1.44 (5.1%)  -0.7% ( -9% -   9%) 0.655
FilteredPhrase                 12.59 (1.8%)    12.51 (2.1%)  -0.7% ( -4% -   3%) 0.272
FilteredPrefix3                94.17 (2.9%)    93.55 (2.7%)  -0.7% ( -6% -   5%) 0.460
Phrase                          9.84 (2.7%)     9.78 (3.8%)  -0.6% ( -6% -   6%) 0.556
Or2Terms2StopWords             78.52 (8.2%)    78.09 (6.5%)  -0.5% (-14% -  15%) 0.815
CountOrHighMed                 96.16 (2.8%)    95.66 (2.2%)  -0.5% ( -5% -   4%) 0.514
TermDTSort                    192.18 (2.6%)   191.26 (2.0%)  -0.5% ( -4% -   4%) 0.516
CountPhrase                     3.26 (5.5%)     3.25 (4.5%)  -0.4% ( -9% -  10%) 0.802
IntNRQ                         49.15 (1.4%)    48.96 (1.8%)  -0.4% ( -3% -   2%) 0.458
CountAndHighMed                93.17 (3.1%)    92.86 (2.6%)  -0.3% ( -5% -   5%) 0.716
IntervalsOrdered                2.95 (3.1%)     2.94 (3.8%)  -0.3% ( -7% -   6%) 0.770
FilteredOrMany                  5.08 (2.6%)     5.07 (1.5%)  -0.3% ( -4% -   3%) 0.633
DismaxOrHighMed                66.40 (4.1%)    66.21 (2.9%)  -0.3% ( -7% -   7%) 0.798
CountOrHighHigh                64.28 (2.2%)    64.11 (2.1%)  -0.3% ( -4% -   4%) 0.700
FilteredAndHighMed             45.07 (3.0%)    44.96 (3.0%)  -0.3% ( -6% -   5%) 0.789
FilteredIntNRQ                 48.82 (1.4%)    48.76 (2.0%)  -0.1% ( -3% -   3%) 0.799
SpanNear                        3.08 (3.9%)     3.08 (4.5%)   0.0% ( -8% -   8%) 0.985
CountAndHighHigh               62.29 (1.5%)    62.33 (1.4%)   0.1% ( -2% -   3%) 0.869
CountFilteredOrHighMed         29.65 (1.7%)    29.70 (1.7%)   0.2% ( -3% -   3%) 0.732
CountFilteredOrHighHigh        25.21 (1.4%)    25.27 (1.6%)   0.2% ( -2% -   3%) 0.633
CountOrMany                     6.49 (1.9%)     6.50 (2.5%)   0.2% ( -4% -   4%) 0.735
Respell                        44.51 (2.0%)    44.62 (2.0%)   0.3% ( -3% -   4%) 0.683
CountFilteredOrMany             6.09 (1.6%)     6.11 (2.0%)   0.3% ( -3% -   3%) 0.612
IntSet                        405.88 (4.3%)   407.55 (4.7%)   0.4% ( -8% -   9%) 0.773
AndHighOrMedMed                18.11 (2.9%)    18.18 (2.9%)   0.4% ( -5% -   6%) 0.640
CombinedOrHighHigh              7.13 (4.0%)     7.17 (6.0%)   0.5% ( -9% -  10%) 0.771
DismaxOrHighHigh               47.02 (3.7%)    47.26 (3.1%)   0.5% ( -6% -   7%) 0.633
Wildcard                       59.04 (3.5%)    59.35 (3.3%)   0.5% ( -6% -   7%) 0.623
CountFilteredIntNRQ            22.32 (1.7%)    22.46 (1.9%)   0.6% ( -2% -   4%) 0.290
TermDayOfYearSort             358.17 (0.8%)   360.43 (1.2%)   0.6% ( -1% -   2%) 0.052
CountFilteredPhrase            11.64 (2.0%)    11.71 (1.5%)   0.7% ( -2% -   4%) 0.246
CombinedAndHighHigh             7.28 (2.8%)     7.35 (3.1%)   1.0% ( -4% -   7%) 0.286
AndMedOrHighHigh               21.26 (2.3%)    21.48 (2.2%)   1.0% ( -3% -   5%) 0.148
OrStopWords                    11.13 (5.3%)    11.27 (6.1%)   1.3% ( -9% -  13%) 0.476
And3Terms                      94.41 (4.5%)    95.72 (3.5%)   1.4% ( -6% -   9%) 0.281
AndStopWords                   10.63 (4.4%)    10.79 (5.0%)   1.6% ( -7% -  11%) 0.287
OrHighMed                      91.11 (4.8%)    92.89 (3.7%)   2.0% ( -6% -  10%) 0.145
Or3Terms                       86.02 (4.3%)    87.73 (3.4%)   2.0% ( -5% -  10%) 0.103
OrMany                          6.09 (4.9%)     6.25 (4.3%)   2.5% ( -6% -  12%) 0.081
AndHighMed                     70.55 (3.9%)    72.50 (3.1%)   2.8% ( -4% -  10%) 0.013
AndHighHigh                    28.24 (4.1%)    29.23 (4.1%)   3.5% ( -4% -  12%) 0.007
OrHighHigh                     27.26 (4.3%)    28.43 (3.4%)   4.3% ( -3% -  12%) 0.000
```
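For anyone who wants to skim a luceneutil table like the one above for the rows that stand out, here is a minimal sketch that parses result rows and keeps those whose p-value falls below a threshold. The 0.05 cutoff and the names `ROW`, `parse_rows` and `significant` are assumptions made for illustration; they are not part of luceneutil, and the sample only repeats a few rows from the table.

```python
import re

# One luceneutil result row: task, baseline QPS (stddev%), candidate QPS (stddev%),
# percent difference, (low% - high%) range, p-value.
ROW = re.compile(
    r"^\s*(?P<task>\S+)\s+"
    r"(?P<base_qps>[\d.]+)\s+\(\s*(?P<base_sd>[\d.]+)%\)\s+"
    r"(?P<cand_qps>[\d.]+)\s+\(\s*(?P<cand_sd>[\d.]+)%\)\s+"
    r"(?P<pct>-?[\d.]+)%\s+"
    r"\(\s*(?P<lo>-?\d+)%\s+-\s+(?P<hi>-?\d+)%\)\s+"
    r"(?P<p>[\d.]+)\s*$"
)


def parse_rows(text):
    """Return one dict per result row; lines that do not match are skipped."""
    rows = []
    for line in text.splitlines():
        m = ROW.match(line)
        if m is None:
            continue  # header, blank line, or anything that is not a result row
        rows.append({
            "task": m["task"],
            "base_qps": float(m["base_qps"]),
            "cand_qps": float(m["cand_qps"]),
            "pct": float(m["pct"]),
            "p": float(m["p"]),
        })
    return rows


def significant(rows, alpha=0.05):
    """Keep rows whose p-value is below alpha (0.05 is an assumed cutoff)."""
    return [r for r in rows if r["p"] < alpha]


# A few rows copied from the table above.
SAMPLE = """\
CountTerm 7866.13 (4.7%) 7645.58 (3.8%) -2.8% ( -10% - 5%) 0.038
Term 585.35 (5.2%) 575.06 (4.9%) -1.8% ( -11% - 8%) 0.273
AndHighMed 70.55 (3.9%) 72.50 (3.1%) 2.8% ( -4% - 10%) 0.013
AndHighHigh 28.24 (4.1%) 29.23 (4.1%) 3.5% ( -4% - 12%) 0.007
OrHighHigh 27.26 (4.3%) 28.43 (3.4%) 4.3% ( -3% - 12%) 0.000
"""

for r in significant(parse_rows(SAMPLE)):
    print(f'{r["task"]:<12} {r["pct"]:+.1f}%  p={r["p"]:.3f}')
```

On these five sample rows the filter keeps CountTerm, AndHighMed, AndHighHigh and OrHighHigh, and drops Term (p = 0.273).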