[PR] Optimize outputs accumulating for SegmentTermsEnum and IntersectTermsEnum [lucene]

via GitHub Wed, 18 Oct 2023 22:09:27 -0700


gf2121 opened a new pull request, #12699:
URL: https://github.com/apache/lucene/pull/12699


   ## Description
   Previous talk is too long to track so i opened a new PR to make a summery 
here. More details are available in https://github.com/apache/lucene/pull/12661.
   
   After merging of https://github.com/apache/lucene/pull/12631, we found a 
[regression](https://home.apache.org/~mikemccand/lucenebench/2023.10.10.18.03.55.html)
 for term-dictionary-related tasks like PKLookup, Fuzzy ... After some digging 
I suspect that the regression is caused by  more `Outputs#add` and 
`Outputs#read` on reading side, as the 'MSB VLong output format' making FST 
sharing more output prefix. The difference of calling times of `Outputs#add` 
and `Outputs#read` is shown below:
   
     | LSB VLong | MSB VLong | diff
   -- | -- | -- | --
   Outputs#read times | 116097 | 149803 | 29.03%
   Outputs#add times | 144 | 111568 | 77377.78%
   
   ## Solution
   
   This patch tries to do two optimizations to get back the speed:
   * Instead of combining all outputs into single one, this patch collect all 
BytesRefs into an array and build a DataInput view over it, reducing the cost 
of objects construction and memory copy.
   * Instead of copying bytes into floor data, this patch directly uses the 
last output as floor data. Floor data is guaranteed continuous in single arc 
because we can not have two same `fp` encoded before floor data.
   
   ## Benchmark
   
   > Both run 30 JVM round and 50 tasks per JVM:
   
   ### Comparing to after merging of https://github.com/apache/lucene/pull/12631
   
   BaseLine ( **After** merging of https://github.com/apache/lucene/pull/12631 )
   
   > $ git log
   > Write MSB VLong for better outputs sharing in block tree index (#12631)
   > DeletedTerms#clear should reset ByteBlockPool (#12630)
   
   Candidate
   
   > $ git log
   > Optimize output accumulating
   > Write MSB VLong for better outputs sharing in block tree index
   > DeletedTerms#clear should reset ByteBlockPool
   
   ```
                               TaskQPS baseline      StdDevQPS 
my_modified_version      StdDev                Pct diff p-value
                      OrHighNotHigh      166.79      (4.5%)      165.12      
(4.3%)   -1.0% (  -9% -    8%) 0.376
                       OrHighNotLow      258.47      (4.3%)      256.46      
(3.3%)   -0.8% (  -7% -    7%) 0.430
               HighIntervalsOrdered        0.38      (3.3%)        0.38      
(3.1%)   -0.8% (  -6% -    5%) 0.359
                    LowSloppyPhrase        1.64      (3.4%)        1.63      
(4.9%)   -0.7% (  -8% -    7%) 0.544
                      OrNotHighHigh      150.83      (5.1%)      149.86      
(4.4%)   -0.6% (  -9% -    9%) 0.597
                   HighSloppyPhrase        3.67      (3.3%)        3.64      
(3.6%)   -0.6% (  -7% -    6%) 0.470
                            Prefix3       67.16      (4.3%)       66.78      
(2.8%)   -0.6% (  -7% -    6%) 0.543
                       OrNotHighMed      135.65      (4.4%)      134.95      
(3.7%)   -0.5% (  -8% -    7%) 0.620
                       OrNotHighLow      257.14      (2.0%)      255.85      
(1.7%)   -0.5% (  -4% -    3%) 0.291
                        AndHighHigh       19.33      (2.8%)       19.25      
(2.2%)   -0.4% (  -5% -    4%) 0.517
                       OrHighNotMed      128.51      (5.3%)      128.06      
(4.2%)   -0.3% (  -9% -    9%) 0.778
                    MedSloppyPhrase        9.56      (3.6%)        9.54      
(3.8%)   -0.3% (  -7% -    7%) 0.788
                         OrHighHigh       13.23      (3.3%)       13.20      
(3.2%)   -0.2% (  -6% -    6%) 0.801
                          OrHighLow      199.40      (1.9%)      199.18      
(1.7%)   -0.1% (  -3% -    3%) 0.812
                           Wildcard       69.22      (2.7%)       69.19      
(1.9%)   -0.0% (  -4% -    4%) 0.952
                         AndHighMed       42.14      (3.7%)       42.14      
(2.9%)    0.0% (  -6% -    6%) 0.993
                       HighSpanNear        7.00      (3.1%)        7.00      
(3.1%)    0.1% (  -5% -    6%) 0.945
                             Fuzzy1       44.05      (1.8%)       44.08      
(1.7%)    0.1% (  -3% -    3%) 0.872
                          OrHighMed       38.54      (2.6%)       38.59      
(3.2%)    0.1% (  -5% -    6%) 0.871
                            Respell       28.40      (2.2%)       28.45      
(2.2%)    0.2% (  -4% -    4%) 0.790
                        LowSpanNear        4.23      (1.7%)        4.24      
(1.5%)    0.2% (  -2% -    3%) 0.580
                MedIntervalsOrdered        3.47      (6.6%)        3.49      
(7.3%)    0.3% ( -12% -   15%) 0.856
                LowIntervalsOrdered        2.76      (4.3%)        2.77      
(5.0%)    0.4% (  -8% -   10%) 0.761
                             Fuzzy2       35.28      (1.6%)       35.42      
(1.6%)    0.4% (  -2% -    3%) 0.325
                         HighPhrase       38.91      (2.6%)       39.07      
(2.5%)    0.4% (  -4% -    5%) 0.532
                            MedTerm      248.39      (4.2%)      249.48      
(4.2%)    0.4% (  -7% -    9%) 0.685
                          LowPhrase       26.44      (2.0%)       26.57      
(2.3%)    0.5% (  -3% -    4%) 0.385
                          MedPhrase        9.51      (2.3%)        9.57      
(2.6%)    0.6% (  -4% -    5%) 0.365
                         AndHighLow      338.26      (3.0%)      340.61      
(2.6%)    0.7% (  -4% -    6%) 0.339
                            LowTerm      242.60      (4.5%)      244.38      
(4.8%)    0.7% (  -8% -   10%) 0.545
                        MedSpanNear        1.11      (1.4%)        1.12      
(1.5%)    0.8% (  -2% -    3%) 0.042
                           HighTerm      230.63      (3.8%)      233.01      
(4.6%)    1.0% (  -7% -    9%) 0.339
                           PKLookup      100.76      (2.9%)      105.17      
(3.2%)    4.4% (  -1% -   10%) 0.000
   ```
   
   ### Comparing to before merging of 
https://github.com/apache/lucene/pull/12631
   
   BaseLine ( **Before** merging of https://github.com/apache/lucene/pull/12631 
)
   
   > $ git log
   > DeletedTerms#clear should reset ByteBlockPool 
   
   Candidate
   
   > $ git log
   > Optimize output accumulating
   > Write MSB VLong for better outputs sharing in block tree index
   > DeletedTerms#clear should reset ByteBlockPool
   
   (Not sure why seeing tiny speed up for Orxxx tasks. I suspect it is a noise)
   ```
                               TaskQPS baseline      StdDevQPS 
my_modified_version      StdDev                Pct diff p-value
                MedIntervalsOrdered        3.50      (7.9%)        3.43      
(7.4%)   -2.2% ( -16% -   14%) 0.269
                           Wildcard       70.27      (2.9%)       69.22      
(2.2%)   -1.5% (  -6% -    3%) 0.023
               HighIntervalsOrdered        0.38      (4.0%)        0.37      
(4.0%)   -1.3% (  -8% -    6%) 0.194
                LowIntervalsOrdered        2.78      (5.3%)        2.75      
(5.2%)   -1.1% ( -10% -    9%) 0.417
                            Prefix3       66.95      (4.3%)       66.25      
(3.1%)   -1.0% (  -8% -    6%) 0.282
                            LowTerm      246.48      (8.3%)      244.15      
(6.8%)   -0.9% ( -14% -   15%) 0.631
                          MedPhrase        9.57      (3.6%)        9.50      
(3.1%)   -0.8% (  -7% -    6%) 0.366
                   HighSloppyPhrase        3.63      (3.5%)        3.62      
(3.8%)   -0.3% (  -7% -    7%) 0.761
                         HighPhrase       39.07      (3.5%)       38.96      
(3.3%)   -0.3% (  -6% -    6%) 0.749
                           PKLookup      105.27      (2.8%)      105.13      
(3.0%)   -0.1% (  -5% -    5%) 0.859
                    MedSloppyPhrase        9.49      (3.2%)        9.48      
(3.9%)   -0.1% (  -6% -    7%) 0.920
                            Respell       28.45      (2.6%)       28.46      
(2.2%)    0.0% (  -4% -    5%) 0.957
                    LowSloppyPhrase        1.63      (5.0%)        1.63      
(5.6%)    0.1% (  -9% -   11%) 0.929
                          LowPhrase       26.31      (3.3%)       26.36      
(2.1%)    0.2% (  -5% -    5%) 0.782
                             Fuzzy1       43.71      (1.3%)       43.82      
(1.7%)    0.3% (  -2% -    3%) 0.509
                      OrHighNotHigh      164.97      (3.7%)      165.40      
(4.4%)    0.3% (  -7% -    8%) 0.806
                        AndHighHigh       19.17      (2.1%)       19.23      
(2.3%)    0.3% (  -3% -    4%) 0.569
                        MedSpanNear        1.11      (2.0%)        1.12      
(2.0%)    0.6% (  -3% -    4%) 0.266
                         AndHighLow      338.06      (2.1%)      340.30      
(1.9%)    0.7% (  -3% -    4%) 0.196
                        LowSpanNear        4.20      (2.4%)        4.23      
(2.1%)    0.7% (  -3% -    5%) 0.194
                         OrHighHigh       13.15      (4.0%)       13.26      
(4.5%)    0.8% (  -7% -    9%) 0.454
                      OrNotHighHigh      148.59      (3.9%)      149.94      
(5.1%)    0.9% (  -7% -   10%) 0.442
                           HighTerm      229.24      (3.7%)      231.60      
(4.2%)    1.0% (  -6% -    9%) 0.314
                       HighSpanNear        6.94      (3.3%)        7.01      
(3.6%)    1.1% (  -5% -    8%) 0.224
                       OrHighNotMed      126.33      (4.3%)      127.74      
(4.9%)    1.1% (  -7% -   10%) 0.343
                       OrHighNotLow      254.19      (3.1%)      257.19      
(3.9%)    1.2% (  -5% -    8%) 0.200
                         AndHighMed       41.35      (5.0%)       42.00      
(3.2%)    1.6% (  -6% -   10%) 0.148
                             Fuzzy2       34.61      (1.8%)       35.34      
(2.1%)    2.1% (  -1% -    6%) 0.000
                          OrHighMed       37.84      (5.8%)       38.65      
(4.2%)    2.1% (  -7% -   12%) 0.107
                       OrNotHighLow      248.52      (2.8%)      254.75      
(2.5%)    2.5% (  -2% -    7%) 0.000
                          OrHighLow      193.81      (2.3%)      200.07      
(2.5%)    3.2% (  -1% -    8%) 0.000
                       OrNotHighMed      130.29      (3.6%)      134.66      
(4.7%)    3.4% (  -4% -   12%) 0.002
                            MedTerm      241.15      (3.8%)      249.46      
(4.5%)    3.4% (  -4% -   12%) 0.001
   ```
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org

[PR] Optimize outputs accumulating for SegmentTermsEnum and IntersectTermsEnum [lucene]

Reply via email to