HUSTERGS commented on PR #14735: URL: https://github.com/apache/lucene/pull/14735#issuecomment-2922761475
I added a single-block pre-check, so if the target is located in the first block, we do not pay the double cost. This change seems quite useful because it reduces the cost of the common case, and the worst case is also mitigated.

`AdvanceBenchmark` now shows a better result (without this check, the result was worse than the original implementation):

```
AdvanceBenchmark.vectorUtilSearch  thrpt  15  752.037 ± 33.398  ops/ms  (no expand, current implementation)
AdvanceBenchmark.vectorUtilSearch  thrpt  15  625.892 ± 20.849  ops/ms  (expand 2, previous proposed implementation)
AdvanceBenchmark.vectorUtilSearch  thrpt  15  802.733 ± 25.096  ops/ms  (expand 2, currently proposed implementation by this PR)
AdvanceBenchmark.vectorUtilSearch  thrpt  15  893.295 ± 17.879  ops/ms  (expand 3)
AdvanceBenchmark.vectorUtilSearch  thrpt  15  955.528 ± 20.036  ops/ms  (expand 4)
```

The results from luceneutil also confirm this, with `taskCountPerCat` set to 5 and concurrent search disabled:

```
Task                         QPS baseline   StdDev  QPS my_modified_version   StdDev  Pct diff                  p-value
CountAndHighMed                     93.29   (3.5%)                    90.14   (4.3%)     -3.4%  ( -10% - 4%)    0.086
CountOrHighMed                      97.01   (2.4%)                    94.42   (4.3%)     -2.7%  ( -9% - 4%)     0.125
FilteredAndHighHigh                 13.18   (2.2%)                    12.94   (1.6%)     -1.8%  ( -5% - 2%)     0.062
FilteredOrMany                       5.10   (1.9%)                     5.01   (2.8%)     -1.7%  ( -6% - 3%)     0.164
Prefix3                             98.88   (2.1%)                    97.33   (4.5%)     -1.6%  ( -7% - 5%)     0.371
FilteredPrefix3                     92.03   (2.0%)                    90.85   (3.9%)     -1.3%  ( -7% - 4%)     0.408
FilteredOr3Terms                    59.68   (1.9%)                    59.00   (2.6%)     -1.1%  ( -5% - 3%)     0.318
SloppyPhrase                         1.54   (4.2%)                     1.52   (3.3%)     -1.1%  ( -8% - 6%)     0.579
AndHighOrMedMed                     17.85   (2.0%)                    17.66   (1.6%)     -1.0%  ( -4% - 2%)     0.263
FilteredAndStopWords                10.36   (2.1%)                    10.25   (1.6%)     -1.0%  ( -4% - 2%)     0.288
CombinedAndHighMed                  24.68   (3.8%)                    24.44   (2.9%)     -1.0%  ( -7% - 5%)     0.554
CombinedAndHighHigh                  6.15   (2.7%)                     6.10   (1.9%)     -0.9%  ( -5% - 3%)     0.457
FilteredOrHighHigh                  18.16   (1.8%)                    18.02   (1.7%)     -0.8%  ( -4% - 2%)     0.379
FilteredOrHighMed                   54.23   (2.9%)                    53.82   (2.6%)     -0.8%  ( -6% - 4%)     0.582
CountOrMany                          6.85   (1.7%)                     6.80   (3.0%)     -0.7%  ( -5% - 4%)     0.589
TermMonthSort                     2508.14   (3.7%)                  2494.67   (3.7%)     -0.5%  ( -7% - 7%)     0.770
PKLookup                           185.70   (8.8%)                   185.00   (8.7%)     -0.4%  ( -16% - 18%)   0.932
TermTitleSort                       71.42   (3.2%)                    71.20   (3.3%)     -0.3%  ( -6% - 6%)     0.851
TermDTSort                         179.37   (4.6%)                   178.92   (2.1%)     -0.3%  ( -6% - 6%)     0.887
TermDayOfYearSort                  375.84   (1.6%)                   375.20   (0.7%)     -0.2%  ( -2% - 2%)     0.786
OrHighRare                         123.25   (4.9%)                   123.10   (2.1%)     -0.1%  ( -6% - 7%)     0.948
FilteredOr2Terms2StopWords          69.66   (5.5%)                    69.60   (5.6%)     -0.1%  ( -10% - 11%)   0.977
Wildcard                            57.13   (3.1%)                    57.12   (3.1%)     -0.0%  ( -6% - 6%)     0.988
CountPhrase                          3.25   (4.4%)                     3.25   (2.8%)     -0.0%  ( -6% - 7%)     0.994
FilteredTerm                        88.15   (3.3%)                    88.15   (4.2%)      0.0%  ( -7% - 7%)     0.999
CountOrHighHigh                     63.31   (2.4%)                    63.34   (2.9%)      0.0%  ( -5% - 5%)     0.976
FilteredOrStopWords                 11.01   (3.3%)                    11.03   (2.0%)      0.2%  ( -4% - 5%)     0.900
FilteredPhrase                      12.75   (2.4%)                    12.78   (2.6%)      0.2%  ( -4% - 5%)     0.880
CombinedTerm                        13.47   (3.7%)                    13.50   (3.9%)      0.2%  ( -7% - 8%)     0.918
IntervalsOrdered                     2.95   (3.3%)                     2.96   (3.5%)      0.3%  ( -6% - 7%)     0.871
Respell                             43.81   (1.9%)                    43.97   (2.1%)      0.4%  ( -3% - 4%)     0.703
CountAndHighHigh                    60.80   (2.5%)                    61.04   (2.8%)      0.4%  ( -4% - 5%)     0.763
SpanNear                             3.10   (2.3%)                     3.12   (1.9%)      0.6%  ( -3% - 4%)     0.552
IntNRQ                              48.14   (1.8%)                    48.46   (2.4%)      0.7%  ( -3% - 4%)     0.522
CountFilteredIntNRQ                 22.06   (1.2%)                    22.21   (1.5%)      0.7%  ( -2% - 3%)     0.306
CountTerm                         7031.70   (2.8%)                  7085.76   (5.2%)      0.8%  ( -7% - 9%)     0.715
CountFilteredOrHighHigh             25.07   (1.1%)                    25.27   (1.4%)      0.8%  ( -1% - 3%)     0.194
IntSet                             391.92   (4.6%)                   395.19   (5.4%)      0.8%  ( -8% - 11%)    0.739
CountFilteredOrMany                  6.01   (1.8%)                     6.06   (2.2%)      0.9%  ( -3% - 4%)     0.394
FilteredIntNRQ                      47.96   (2.1%)                    48.37   (2.3%)      0.9%  ( -3% - 5%)     0.443
CountFilteredOrHighMed              29.55   (0.8%)                    29.81   (1.7%)      0.9%  ( -1% - 3%)     0.196
Phrase                               9.76   (2.7%)                     9.87   (2.5%)      1.1%  ( -4% - 6%)     0.414
Fuzzy2                              45.67   (3.3%)                    46.25   (4.1%)      1.3%  ( -5% - 8%)     0.495
CountFilteredPhrase                 11.67   (3.4%)                    11.82   (3.9%)      1.3%  ( -5% - 8%)     0.483
Fuzzy1                              50.95   (4.0%)                    51.97   (3.5%)      2.0%  ( -5% - 9%)     0.286
CombinedOrHighMed                   24.10   (9.1%)                    24.73   (4.2%)      2.6%  ( -9% - 17%)    0.464
CombinedOrHighHigh                   5.99   (9.2%)                     6.18   (2.6%)      3.1%  ( -7% - 16%)    0.355
FilteredAndHighMed                  38.19   (8.4%)                    39.39   (2.1%)      3.1%  ( -6% - 14%)    0.306
AndMedOrHighHigh                    19.76   (7.8%)                    20.66   (2.3%)      4.6%  ( -5% - 15%)    0.112
OrMany                               5.44   (8.2%)                     5.70   (4.3%)      4.7%  ( -7% - 18%)    0.149
FilteredAnd2Terms2StopWords         66.23  (12.4%)                    69.62   (5.8%)      5.1%  ( -11% - 26%)   0.290
DismaxOrHighMed                     63.25  (11.6%)                    66.58   (3.5%)      5.3%  ( -8% - 22%)    0.219
DismaxTerm                         593.96   (8.5%)                   625.85   (6.5%)      5.4%  ( -8% - 22%)    0.157
And2Terms2StopWords                 73.30  (12.9%)                    77.34   (8.0%)      5.5%  ( -13% - 30%)   0.305
DismaxOrHighHigh                    43.52  (12.1%)                    46.09   (3.1%)      5.9%  ( -8% - 23%)    0.180
TermB1M                            526.21  (13.7%)                   561.74  (12.2%)      6.8%  ( -16% - 37%)   0.298
Term10K                            527.47  (13.8%)                   563.17  (12.3%)      6.8%  ( -16% - 38%)   0.300
FilteredAnd3Terms                   80.07  (15.7%)                    85.52   (2.0%)      6.8%  ( -9% - 29%)    0.224
TermB1M1P                          529.07  (13.9%)                   565.30  (12.8%)      6.8%  ( -17% - 38%)   0.305
AndStopWords                         8.52  (14.5%)                     9.10   (2.9%)      6.9%  ( -9% - 28%)    0.190
Or2Terms2StopWords                  71.07  (14.9%)                    76.05   (7.5%)      7.0%  ( -13% - 34%)   0.235
Term1M                             526.93  (13.9%)                   565.31  (11.8%)      7.3%  ( -16% - 38%)   0.258
Term100                            525.57  (13.3%)                   563.91  (12.0%)      7.3%  ( -15% - 37%)   0.249
And3Terms                           78.11  (15.0%)                    83.90   (1.8%)      7.4%  ( -8% - 28%)    0.167
Term                               526.25  (13.5%)                   566.72  (12.1%)      7.7%  ( -15% - 38%)   0.230
Or3Terms                            72.68  (16.4%)                    78.71   (1.6%)      8.3%  ( -8% - 31%)    0.155
OrStopWords                          8.89  (19.6%)                     9.76   (3.1%)      9.7%  ( -10% - 40%)   0.165
AndHighMed                          61.59  (18.7%)                    67.79   (2.1%)     10.1%  ( -9% - 38%)    0.131
OrHighMed                           80.86  (19.6%)                    89.47   (2.8%)     10.6%  ( -9% - 41%)    0.129
OrHighHigh                          22.38  (21.7%)                    24.98   (1.6%)     11.6%  ( -9% - 44%)    0.131
AndHighHigh                         23.37  (21.6%)                    26.10   (1.7%)     11.7%  ( -9% - 44%)    0.128
```

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For queries about this service, please contact Infrastructure at: us...@infra.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org
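
As a rough illustration of the single-block pre-check described in the comment above, here is a scalar Java sketch. It is not the code from PR #14735: the class name `BlockSearchSketch`, the constants `BLOCK` and `EXPAND`, and the helper `scan` are all hypothetical, and the real implementation presumably compares whole vector lanes instead of looping over elements. The sketch assumes a sorted buffer and that "expand 2" means probing two blocks per loop iteration.

```java
final class BlockSearchSketch {
  // Hypothetical sizes: BLOCK stands in for the vector lane count,
  // EXPAND for the number of blocks probed per loop iteration.
  static final int BLOCK = 8;
  static final int EXPAND = 2;

  /**
   * Returns the first index i in [from, to) with buffer[i] >= target,
   * or `to` if there is none. buffer[from..to) must be sorted ascending.
   */
  static int findNextGEQ(int[] buffer, int target, int from, int to) {
    // Pre-check: if the last element of the first block already reaches
    // the target, the answer lies in that block, so only one block is scanned.
    if (from + BLOCK <= to) {
      if (buffer[from + BLOCK - 1] >= target) {
        return scan(buffer, target, from, from + BLOCK);
      }
      from += BLOCK; // the first block is ruled out
    }

    // Expanded loop: probe the last element of each group of EXPAND blocks.
    int step = BLOCK * EXPAND;
    while (from + step <= to) {
      if (buffer[from + step - 1] >= target) {
        // The answer is somewhere in this group; scanning it costs up to
        // EXPAND blocks, which is the "double cost" when EXPAND is 2.
        return scan(buffer, target, from, from + step);
      }
      from += step;
    }

    // Tail shorter than one group: plain linear scan.
    return scan(buffer, target, from, to);
  }

  /** Linear scan standing in for a per-block vector comparison. */
  static int scan(int[] buffer, int target, int from, int to) {
    for (int i = from; i < to; i++) {
      if (buffer[i] >= target) {
        return i;
      }
    }
    return to;
  }
}
```

With this shape, a target that falls in the first block costs one probe and one block scan, while farther targets are reached in fewer loop iterations than with a one-block step.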