easyice commented on PR #12954:
URL: https://github.com/apache/lucene/pull/12954#issuecomment-1866446008

Here are the benchmark results for the new approach (avoiding the for-loop in `reset()`). The `PKLookup` task still shows a speedup, but the speedup for the `Wildcard` task has disappeared. I checked the memory allocation flamegraph for the `Wildcard` task: the `BlockDocsEnum.<init>` call accounts for about 8%, so allocating the 128-element long array has a performance impact on this task. This is different from `PKLookup`. I'm sorry, I only checked the memory allocation for `PKLookup` yesterday.

So:
* `PKLookup` is sped up by avoiding the for-loop in `reset()`
* `Wildcard` is sped up by reducing frequencies buffer allocation

Benchmark delta for the new approach:
* baseline: main
* my_modified_version: new approach (avoid for-loop in `reset()`)

<details>
<summary>Benchmark result</summary>

```
Task                   QPS baseline   StdDev  QPS my_modified_version   StdDev  Pct diff                  p-value
OrHighHigh                    15.94   (4.3%)                    15.56   (5.1%)     -2.4%  ( -11% -   7%)    0.112
OrHighMed                     49.58   (5.0%)                    48.77   (4.7%)     -1.6%  ( -10% -   8%)    0.293
OrNotHighHigh                146.95   (3.5%)                   144.64   (4.8%)     -1.6%  (  -9% -   6%)    0.237
LowSloppyPhrase               15.35   (4.1%)                    15.13   (3.6%)     -1.4%  (  -8% -   6%)    0.234
HighSloppyPhrase               5.61   (2.8%)                     5.53   (3.6%)     -1.4%  (  -7% -   5%)    0.176
AndHighMed                    21.66   (3.9%)                    21.43   (4.1%)     -1.1%  (  -8% -   7%)    0.390
MedPhrase                      9.56   (5.0%)                     9.47   (5.0%)     -1.0%  ( -10% -   9%)    0.512
AndHighHigh                   15.95   (3.6%)                    15.80   (3.7%)     -1.0%  (  -8% -   6%)    0.409
MedSloppyPhrase                2.03   (2.7%)                     2.02   (3.4%)     -0.9%  (  -6% -   5%)    0.344
OrHighLow                    179.58   (3.1%)                   177.95   (3.7%)     -0.9%  (  -7% -   6%)    0.399
LowTerm                      177.24   (6.0%)                   176.00   (5.2%)     -0.7%  ( -11% -  11%)    0.695
HighTermDayOfYearSort        128.70   (3.0%)                   127.85   (3.0%)     -0.7%  (  -6% -   5%)    0.482
OrHighNotLow                 204.26   (3.4%)                   203.10   (3.5%)     -0.6%  (  -7% -   6%)    0.601
MedTerm                      235.64   (3.5%)                   234.40   (4.0%)     -0.5%  (  -7% -   7%)    0.662
OrNotHighMed                 123.29   (4.0%)                   122.67   (3.9%)     -0.5%  (  -8% -   7%)    0.689
HighTerm                     286.35   (4.1%)                   284.95   (4.3%)     -0.5%  (  -8% -   8%)    0.713
TermDTSort                    60.32  (10.4%)                    60.06  (10.8%)     -0.4%  ( -19% -  23%)    0.895
OrHighNotMed                 135.57   (4.2%)                   135.07   (5.3%)     -0.4%  (  -9% -   9%)    0.806
OrHighNotHigh                118.48   (3.6%)                   118.06   (5.5%)     -0.4%  (  -9% -   9%)    0.808
LowIntervalsOrdered           37.27   (4.7%)                    37.15   (4.9%)     -0.3%  (  -9% -   9%)    0.834
MedIntervalsOrdered            2.24   (4.5%)                     2.24   (5.0%)     -0.1%  (  -9% -   9%)    0.928
LowPhrase                     20.47   (3.3%)                    20.44   (3.3%)     -0.1%  (  -6% -   6%)    0.916
Fuzzy2                        30.51   (2.2%)                    30.50   (1.3%)     -0.0%  (  -3% -   3%)    0.948
HighIntervalsOrdered           2.55   (5.1%)                     2.55   (5.4%)      0.1%  (  -9% -  11%)    0.975
IntNRQ                        24.26   (3.5%)                    24.27   (3.7%)      0.1%  (  -6% -   7%)    0.962
AndHighLow                   270.76   (3.7%)                   271.29   (5.0%)      0.2%  (  -8% -   9%)    0.889
HighPhrase                    32.64   (4.5%)                    32.71   (4.1%)      0.2%  (  -8% -   9%)    0.878
HighTermTitleSort             72.27   (5.1%)                    72.44   (4.1%)      0.2%  (  -8% -   9%)    0.871
MedSpanNear                    6.40   (4.3%)                     6.42   (3.7%)      0.4%  (  -7% -   8%)    0.782
HighTermTitleBDVSort           2.31   (3.4%)                     2.32   (3.5%)      0.4%  (  -6% -   7%)    0.716
OrNotHighLow                 235.32   (3.0%)                   236.80   (4.7%)      0.6%  (  -6% -   8%)    0.613
Respell                       25.22   (2.7%)                    25.40   (2.2%)      0.7%  (  -4% -   5%)    0.353
Fuzzy1                        12.56   (2.0%)                    12.65   (2.0%)      0.7%  (  -3% -   4%)    0.251
HighSpanNear                   4.94   (4.5%)                     4.98   (4.1%)      0.8%  (  -7% -   9%)    0.566
LowSpanNear                    1.10   (6.4%)                     1.11   (5.0%)      0.8%  (  -9% -  13%)    0.646
Wildcard                      50.34   (2.5%)                    51.06   (2.5%)      1.4%  (  -3% -   6%)    0.071
HighTermMonthSort            924.52   (3.1%)                   938.49   (4.1%)      1.5%  (  -5% -   8%)    0.188
Prefix3                       44.92   (2.9%)                    48.21   (3.0%)      7.3%  (   1% -  13%)    0.000
PKLookup                      86.57   (2.6%)                    98.05   (2.3%)     13.3%  (   8% -  18%)    0.000
```

</details>

I also ran a diff between the two approaches:
* baseline: new approach (avoid for-loop in `reset()`)
* my_modified_version: PR

<details>
<summary>Benchmark result</summary>

```
Task                   QPS baseline   StdDev  QPS my_modified_version   StdDev  Pct diff                  p-value
MedIntervalsOrdered            2.03   (6.6%)                     2.00   (5.9%)     -1.4%  ( -13% -  11%)    0.473
LowSpanNear                    3.44   (3.6%)                     3.39   (4.1%)     -1.4%  (  -8% -   6%)    0.259
HighIntervalsOrdered           2.40   (6.2%)                     2.37   (4.3%)     -1.3%  ( -11% -   9%)    0.432
PKLookup                     103.25   (3.9%)                   101.88   (2.5%)     -1.3%  (  -7% -   5%)    0.205
OrNotHighMed                 164.60   (3.3%)                   163.19   (3.5%)     -0.9%  (  -7% -   6%)    0.421
AndHighHigh                   19.66   (5.3%)                    19.50   (5.4%)     -0.8%  ( -10% -  10%)    0.635
MedSpanNear                   21.13   (3.6%)                    20.98   (3.7%)     -0.7%  (  -7% -   6%)    0.541
HighPhrase                     2.61   (4.8%)                     2.59   (7.7%)     -0.7%  ( -12% -  12%)    0.728
OrHighMed                     43.75   (2.4%)                    43.55   (2.4%)     -0.5%  (  -5% -   4%)    0.537
HighSloppyPhrase               3.66   (3.7%)                     3.65   (5.6%)     -0.4%  (  -9% -   9%)    0.770
Fuzzy1                        33.55   (2.5%)                    33.40   (2.5%)     -0.4%  (  -5% -   4%)    0.583
LowPhrase                     14.64   (4.7%)                    14.58   (6.7%)     -0.4%  ( -11% -  11%)    0.823
Fuzzy2                        21.99   (2.8%)                    21.91   (2.8%)     -0.4%  (  -5% -   5%)    0.658
HighSpanNear                   4.27   (4.6%)                     4.25   (4.1%)     -0.3%  (  -8% -   8%)    0.812
OrHighNotLow                 139.60   (4.5%)                   139.17   (5.3%)     -0.3%  (  -9% -   9%)    0.845
HighTerm                     175.84   (6.3%)                   175.37   (5.4%)     -0.3%  ( -11% -  12%)    0.886
MedTerm                      239.54   (4.3%)                   239.38   (3.5%)     -0.1%  (  -7% -   8%)    0.959
MedPhrase                     14.05   (4.8%)                    14.04   (5.2%)     -0.0%  (  -9% -  10%)    0.989
LowTerm                      311.40   (3.8%)                   312.41   (3.3%)      0.3%  (  -6% -   7%)    0.771
LowIntervalsOrdered           14.48   (3.4%)                    14.55   (3.0%)      0.4%  (  -5% -   7%)    0.674
OrNotHighLow                 242.84   (2.5%)                   244.11   (2.6%)      0.5%  (  -4% -   5%)    0.517
Respell                       37.79   (2.1%)                    38.01   (2.3%)      0.6%  (  -3% -   5%)    0.410
OrHighNotHigh                150.48   (4.2%)                   151.44   (4.8%)      0.6%  (  -8% -  10%)    0.656
LowSloppyPhrase                2.58   (3.8%)                     2.60   (6.2%)      0.6%  (  -9% -  11%)    0.690
OrHighHigh                    18.16   (6.5%)                    18.29   (4.1%)      0.7%  (  -9% -  12%)    0.688
OrHighNotMed                 180.47   (5.1%)                   181.73   (5.2%)      0.7%  (  -9% -  11%)    0.665
OrHighLow                    248.00   (2.4%)                   249.97   (2.7%)      0.8%  (  -4% -   6%)    0.327
AndHighMed                    53.28   (5.2%)                    53.78   (3.2%)      0.9%  (  -7% -   9%)    0.496
OrNotHighHigh                153.56   (4.7%)                   155.04   (5.2%)      1.0%  (  -8% -  11%)    0.540
AndHighLow                   835.08   (2.9%)                   843.49   (2.7%)      1.0%  (  -4% -   6%)    0.258
TermDTSort                    61.75   (6.7%)                    62.50   (6.6%)      1.2%  ( -11% -  15%)    0.563
MedSloppyPhrase                7.40   (4.1%)                     7.50   (4.3%)      1.4%  (  -6% -  10%)    0.311
Prefix3                       67.01   (8.3%)                    67.98   (8.4%)      1.4%  ( -14% -  19%)    0.584
HighTermTitleBDVSort           2.72   (5.2%)                     2.77   (5.3%)      1.7%  (  -8% -  12%)    0.303
IntNRQ                        13.99   (7.0%)                    14.32   (6.4%)      2.4%  ( -10% -  16%)    0.258
HighTermDayOfYearSort        119.72   (6.7%)                   123.20   (7.2%)      2.9%  ( -10% -  17%)    0.185
Wildcard                       8.39   (4.7%)                     8.67   (6.4%)      3.3%  (  -7% -  15%)    0.061
HighTermMonthSort           1104.51   (3.5%)                  1153.19   (2.5%)      4.4%  (  -1% -  10%)    0.000
HighTermTitleSort             64.12   (5.8%)                    69.63   (5.4%)      8.6%  (  -2% -  21%)    0.000
```

</details>

The code for the new approach is [here](https://github.com/easyice/lucene/commit/8aede033bd64167a0e0a6cca687ecadf04284b43).

The memory allocation flamegraphs for the `Wildcard` and `Prefix3` tasks in baseline: [mem_flamegraph.zip](https://github.com/apache/lucene/files/13743254/mem_flamegraph.zip)
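
For readers who want a concrete picture of the two effects discussed in this comment (skipping the fill loop in `reset()` and not allocating the 128-element frequencies buffer up front), here is a minimal Java sketch. It is not the code from the PR or from the linked commit. The class `LazyFreqBufferSketch`, its fields, and the stubbed `decodeBlock` are hypothetical names used only for illustration, and the sketch assumes that the frequency is always 1 whenever frequencies are not indexed or not requested.

```java
// Hypothetical illustration only; names and structure are NOT Lucene's.
final class LazyFreqBufferSketch {
  static final int BLOCK_SIZE = 128;

  private long[] freqBuffer;      // stays null until a frequency is really needed
  private boolean needsFreq;      // true only if freqs are indexed AND requested
  private boolean blockDecoded;   // whether freqBuffer holds the current block
  private int allocationCount;    // bookkeeping so the effect is observable

  // Called once per term. No fill loop and no array allocation happen here.
  void reset(boolean indexHasFreq, boolean freqsRequested) {
    needsFreq = indexHasFreq && freqsRequested;
    blockDecoded = false;
  }

  // Returns the frequency at the given position inside the current block.
  long freq(int indexInBlock) {
    if (!needsFreq) {
      return 1L; // constant answer instead of a buffer pre-filled with 1s
    }
    if (freqBuffer == null) {
      freqBuffer = new long[BLOCK_SIZE]; // allocate lazily, reuse afterwards
      allocationCount++;
    }
    if (!blockDecoded) {
      decodeBlock(freqBuffer);
      blockDecoded = true;
    }
    return freqBuffer[indexInBlock];
  }

  // Stand-in for real block decoding: writes synthetic values i + 2.
  private static void decodeBlock(long[] dst) {
    for (int i = 0; i < dst.length; i++) {
      dst[i] = i + 2;
    }
  }

  int allocationCount() {
    return allocationCount;
  }
}
```

In this sketch, an enum that is reset without frequencies never touches the buffer at all, which mirrors the allocation cost attributed to `BlockDocsEnum.<init>` above, while an enum that is reset repeatedly with frequencies allocates the array only once.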