gf2121 opened a new pull request, #12699: URL: https://github.com/apache/lucene/pull/12699
## Description

The previous discussion became too long to follow, so I opened a new PR to summarize it here. More details are available in https://github.com/apache/lucene/pull/12661.

After https://github.com/apache/lucene/pull/12631 was merged, we found a [regression](https://home.apache.org/~mikemccand/lucenebench/2023.10.10.18.03.55.html) in term-dictionary-related tasks like PKLookup, Fuzzy, etc. After some digging, I suspect the regression is caused by more `Outputs#add` and `Outputs#read` calls on the reading side, because the MSB VLong output format makes the FST share more output prefixes. The number of calls to `Outputs#add` and `Outputs#read` under each format is shown below:

| | LSB VLong | MSB VLong | diff |
| -- | -- | -- | -- |
| Outputs#read times | 116097 | 149803 | 29.03% |
| Outputs#add times | 144 | 111568 | 77377.78% |

## Solution

This patch applies two optimizations to win back the speed:

* Instead of combining all outputs into a single one, this patch collects all BytesRefs into an array and builds a DataInput view over it, reducing the cost of object construction and memory copying.
* Instead of copying bytes into the floor data, this patch uses the last output directly as the floor data. Floor data is guaranteed to be contiguous within a single arc, because two identical `fp` values cannot be encoded before the floor data.
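The first optimization can be illustrated with a minimal, self-contained sketch. It is not the code in this patch: `OutputAccumulatorSketch`, `Slice`, `push` and `readMSBVLong` are hypothetical names, a plain byte-array slice stands in for `BytesRef`, and the reader stands in for a `DataInput` view. The VLong decoder assumes the most significant 7-bit group is written first, with the high bit set on every byte except the last.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch: reads sequentially across several byte slices without merging them. */
final class OutputAccumulatorSketch {

  /** Stand-in for a BytesRef: a view of bytes[offset, offset + length). */
  private static final class Slice {
    final byte[] bytes;
    final int offset;
    final int length;

    Slice(byte[] bytes, int offset, int length) {
      this.bytes = bytes;
      this.offset = offset;
      this.length = length;
    }
  }

  private final List<Slice> slices = new ArrayList<>();
  private int sliceIndex; // slice currently being read
  private int posInSlice; // read position inside that slice

  /** Records a slice by reference; no bytes are copied. Empty slices are skipped. */
  void push(byte[] bytes, int offset, int length) {
    if (length > 0) {
      slices.add(new Slice(bytes, offset, length));
    }
  }

  /** Returns the next byte, moving on to the next slice when the current one is exhausted. */
  byte readByte() {
    if (sliceIndex < slices.size() && posInSlice == slices.get(sliceIndex).length) {
      sliceIndex++;
      posInSlice = 0;
    }
    if (sliceIndex >= slices.size()) {
      throw new IllegalStateException("read past the end of the accumulated outputs");
    }
    Slice s = slices.get(sliceIndex);
    return s.bytes[s.offset + posInSlice++];
  }

  /** Decodes a variable-length long whose most significant 7-bit group comes first. */
  long readMSBVLong() {
    long value = 0L;
    while (true) {
      byte b = readByte();
      value = (value << 7) | (b & 0x7FL);
      if ((b & 0x80) == 0) {
        return value; // a clear high bit marks the last byte
      }
    }
  }

  public static void main(String[] args) {
    OutputAccumulatorSketch acc = new OutputAccumulatorSketch();
    // Output found on a first arc: the shared prefix of the encoded value.
    acc.push(new byte[] {(byte) 0x81, (byte) 0xFF}, 0, 2);
    // Output found on a later arc, stored at offset 1 of a larger array.
    acc.push(new byte[] {0x00, 0x7F, 0x05}, 1, 2);
    System.out.println(acc.readMSBVLong()); // the value straddles both slices
    System.out.println(acc.readByte()); // trailing byte left for the caller
  }
}
```

In this sketch, pushing an output only stores a reference, so the per-arc cost is one small object rather than a newly allocated, merged byte array. The value `0x7FFF` is decoded even though its three bytes are split across two slices, which is the situation that becomes more common once outputs share longer prefixes.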
## Benchmark

> Both runs used 30 JVM rounds and 50 tasks per JVM.

### Comparing to after merging of https://github.com/apache/lucene/pull/12631

Baseline (**after** merging of https://github.com/apache/lucene/pull/12631)

> $ git log
> Write MSB VLong for better outputs sharing in block tree index (#12631)
> DeletedTerms#clear should reset ByteBlockPool (#12630)

Candidate

> $ git log
> Optimize output accumulating
> Write MSB VLong for better outputs sharing in block tree index
> DeletedTerms#clear should reset ByteBlockPool

```
Task                  QPS baseline StdDev   QPS my_modified_version StdDev   Pct diff   p-value
OrHighNotHigh         166.79 (4.5%)   165.12 (4.3%)   -1.0% ( -9% -   8%) 0.376
OrHighNotLow          258.47 (4.3%)   256.46 (3.3%)   -0.8% ( -7% -   7%) 0.430
HighIntervalsOrdered    0.38 (3.3%)     0.38 (3.1%)   -0.8% ( -6% -   5%) 0.359
LowSloppyPhrase         1.64 (3.4%)     1.63 (4.9%)   -0.7% ( -8% -   7%) 0.544
OrNotHighHigh         150.83 (5.1%)   149.86 (4.4%)   -0.6% ( -9% -   9%) 0.597
HighSloppyPhrase        3.67 (3.3%)     3.64 (3.6%)   -0.6% ( -7% -   6%) 0.470
Prefix3                67.16 (4.3%)    66.78 (2.8%)   -0.6% ( -7% -   6%) 0.543
OrNotHighMed          135.65 (4.4%)   134.95 (3.7%)   -0.5% ( -8% -   7%) 0.620
OrNotHighLow          257.14 (2.0%)   255.85 (1.7%)   -0.5% ( -4% -   3%) 0.291
AndHighHigh            19.33 (2.8%)    19.25 (2.2%)   -0.4% ( -5% -   4%) 0.517
OrHighNotMed          128.51 (5.3%)   128.06 (4.2%)   -0.3% ( -9% -   9%) 0.778
MedSloppyPhrase         9.56 (3.6%)     9.54 (3.8%)   -0.3% ( -7% -   7%) 0.788
OrHighHigh             13.23 (3.3%)    13.20 (3.2%)   -0.2% ( -6% -   6%) 0.801
OrHighLow             199.40 (1.9%)   199.18 (1.7%)   -0.1% ( -3% -   3%) 0.812
Wildcard               69.22 (2.7%)    69.19 (1.9%)   -0.0% ( -4% -   4%) 0.952
AndHighMed             42.14 (3.7%)    42.14 (2.9%)    0.0% ( -6% -   6%) 0.993
HighSpanNear            7.00 (3.1%)     7.00 (3.1%)    0.1% ( -5% -   6%) 0.945
Fuzzy1                 44.05 (1.8%)    44.08 (1.7%)    0.1% ( -3% -   3%) 0.872
OrHighMed              38.54 (2.6%)    38.59 (3.2%)    0.1% ( -5% -   6%) 0.871
Respell                28.40 (2.2%)    28.45 (2.2%)    0.2% ( -4% -   4%) 0.790
LowSpanNear             4.23 (1.7%)     4.24 (1.5%)    0.2% ( -2% -   3%) 0.580
MedIntervalsOrdered     3.47 (6.6%)     3.49 (7.3%)    0.3% (-12% -  15%) 0.856
LowIntervalsOrdered     2.76 (4.3%)     2.77 (5.0%)    0.4% ( -8% -  10%) 0.761
Fuzzy2                 35.28 (1.6%)    35.42 (1.6%)    0.4% ( -2% -   3%) 0.325
HighPhrase             38.91 (2.6%)    39.07 (2.5%)    0.4% ( -4% -   5%) 0.532
MedTerm               248.39 (4.2%)   249.48 (4.2%)    0.4% ( -7% -   9%) 0.685
LowPhrase              26.44 (2.0%)    26.57 (2.3%)    0.5% ( -3% -   4%) 0.385
MedPhrase               9.51 (2.3%)     9.57 (2.6%)    0.6% ( -4% -   5%) 0.365
AndHighLow            338.26 (3.0%)   340.61 (2.6%)    0.7% ( -4% -   6%) 0.339
LowTerm               242.60 (4.5%)   244.38 (4.8%)    0.7% ( -8% -  10%) 0.545
MedSpanNear             1.11 (1.4%)     1.12 (1.5%)    0.8% ( -2% -   3%) 0.042
HighTerm              230.63 (3.8%)   233.01 (4.6%)    1.0% ( -7% -   9%) 0.339
PKLookup              100.76 (2.9%)   105.17 (3.2%)    4.4% ( -1% -  10%) 0.000
```

### Comparing to before merging of https://github.com/apache/lucene/pull/12631

Baseline (**before** merging of https://github.com/apache/lucene/pull/12631)

> $ git log
> DeletedTerms#clear should reset ByteBlockPool

Candidate

> $ git log
> Optimize output accumulating
> Write MSB VLong for better outputs sharing in block tree index
> DeletedTerms#clear should reset ByteBlockPool

(I am not sure why the Orxxx tasks show a tiny speedup. I suspect it is noise.)

```
Task                  QPS baseline StdDev   QPS my_modified_version StdDev   Pct diff   p-value
MedIntervalsOrdered     3.50 (7.9%)     3.43 (7.4%)   -2.2% (-16% -  14%) 0.269
Wildcard               70.27 (2.9%)    69.22 (2.2%)   -1.5% ( -6% -   3%) 0.023
HighIntervalsOrdered    0.38 (4.0%)     0.37 (4.0%)   -1.3% ( -8% -   6%) 0.194
LowIntervalsOrdered     2.78 (5.3%)     2.75 (5.2%)   -1.1% (-10% -   9%) 0.417
Prefix3                66.95 (4.3%)    66.25 (3.1%)   -1.0% ( -8% -   6%) 0.282
LowTerm               246.48 (8.3%)   244.15 (6.8%)   -0.9% (-14% -  15%) 0.631
MedPhrase               9.57 (3.6%)     9.50 (3.1%)   -0.8% ( -7% -   6%) 0.366
HighSloppyPhrase        3.63 (3.5%)     3.62 (3.8%)   -0.3% ( -7% -   7%) 0.761
HighPhrase             39.07 (3.5%)    38.96 (3.3%)   -0.3% ( -6% -   6%) 0.749
PKLookup              105.27 (2.8%)   105.13 (3.0%)   -0.1% ( -5% -   5%) 0.859
MedSloppyPhrase         9.49 (3.2%)     9.48 (3.9%)   -0.1% ( -6% -   7%) 0.920
Respell                28.45 (2.6%)    28.46 (2.2%)    0.0% ( -4% -   5%) 0.957
LowSloppyPhrase         1.63 (5.0%)     1.63 (5.6%)    0.1% ( -9% -  11%) 0.929
LowPhrase              26.31 (3.3%)    26.36 (2.1%)    0.2% ( -5% -   5%) 0.782
Fuzzy1                 43.71 (1.3%)    43.82 (1.7%)    0.3% ( -2% -   3%) 0.509
OrHighNotHigh         164.97 (3.7%)   165.40 (4.4%)    0.3% ( -7% -   8%) 0.806
AndHighHigh            19.17 (2.1%)    19.23 (2.3%)    0.3% ( -3% -   4%) 0.569
MedSpanNear             1.11 (2.0%)     1.12 (2.0%)    0.6% ( -3% -   4%) 0.266
AndHighLow            338.06 (2.1%)   340.30 (1.9%)    0.7% ( -3% -   4%) 0.196
LowSpanNear             4.20 (2.4%)     4.23 (2.1%)    0.7% ( -3% -   5%) 0.194
OrHighHigh             13.15 (4.0%)    13.26 (4.5%)    0.8% ( -7% -   9%) 0.454
OrNotHighHigh         148.59 (3.9%)   149.94 (5.1%)    0.9% ( -7% -  10%) 0.442
HighTerm              229.24 (3.7%)   231.60 (4.2%)    1.0% ( -6% -   9%) 0.314
HighSpanNear            6.94 (3.3%)     7.01 (3.6%)    1.1% ( -5% -   8%) 0.224
OrHighNotMed          126.33 (4.3%)   127.74 (4.9%)    1.1% ( -7% -  10%) 0.343
OrHighNotLow          254.19 (3.1%)   257.19 (3.9%)    1.2% ( -5% -   8%) 0.200
AndHighMed             41.35 (5.0%)    42.00 (3.2%)    1.6% ( -6% -  10%) 0.148
Fuzzy2                 34.61 (1.8%)    35.34 (2.1%)    2.1% ( -1% -   6%) 0.000
OrHighMed              37.84 (5.8%)    38.65 (4.2%)    2.1% ( -7% -  12%) 0.107
OrNotHighLow          248.52 (2.8%)   254.75 (2.5%)    2.5% ( -2% -   7%) 0.000
OrHighLow             193.81 (2.3%)   200.07 (2.5%)    3.2% ( -1% -   8%) 0.000
OrNotHighMed          130.29 (3.6%)   134.66 (4.7%)    3.4% ( -4% -  12%) 0.002
MedTerm               241.15 (3.8%)   249.46 (4.5%)    3.4% ( -4% -  12%) 0.001
```

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at: us...@infra.apache.org