[ 
https://issues.apache.org/jira/browse/LUCENE-9850?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17314116#comment-17314116
 ] 

Greg Miller commented on LUCENE-9850:
-------------------------------------

I think I've actually made good progress on optimizing this. The last bit to 
crack was reading the exception bytes in all at once instead of byte-by-byte. 
The benchmark results look pretty promising to me. I'd be curious what others 
think of these results. If others agree they look promising, I'll tidy up the 
[code|https://github.com/apache/lucene/compare/main...gsmiller:LUCENE-9850/pfordocids2],
 write some tests and put a PR out there. And just as a reminder, this change 
reduces the doc ID data in the Wikipedia benchmark index by *~12%*, resulting 
in a *~3.3% overall index size reduction*.
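
For readers who have not looked at the branch, the difference between byte-by-byte and bulk reading of the exception bytes can be sketched roughly as below. This is an illustration only, not the code on the branch: the class and method names are hypothetical, it uses {{java.io.DataInputStream}} rather than Lucene's own input classes, and it assumes each exception is stored as a pair of bytes (index within the block, then the high bits to OR back in).

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical sketch: two ways of applying PFOR exceptions to decoded low bits.
final class ExceptionPatchSketch {

  // Two single-byte reads per exception.
  static void patchByteByByte(DataInputStream in, int numExceptions, int bitsPerValue,
                              long[] values) throws IOException {
    for (int i = 0; i < numExceptions; i++) {
      int index = in.readUnsignedByte();      // position of the exception in the block
      long highBits = in.readUnsignedByte();  // bits that did not fit in bitsPerValue
      values[index] |= highBits << bitsPerValue;
    }
  }

  // One bulk read into a scratch buffer, then patching from memory.
  static void patchBulk(DataInputStream in, int numExceptions, int bitsPerValue,
                        long[] values, byte[] scratch) throws IOException {
    in.readFully(scratch, 0, numExceptions * 2);
    for (int i = 0; i < numExceptions; i++) {
      int index = Byte.toUnsignedInt(scratch[2 * i]);
      long highBits = Byte.toUnsignedLong(scratch[2 * i + 1]);
      values[index] |= highBits << bitsPerValue;
    }
  }

  // Convenience wrapper for experiments: returns a patched copy of lowBits.
  static long[] patch(byte[] exceptionBytes, int bitsPerValue, long[] lowBits, boolean bulk) {
    long[] values = lowBits.clone();
    int numExceptions = exceptionBytes.length / 2;
    try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(exceptionBytes))) {
      if (bulk) {
        patchBulk(in, numExceptions, bitsPerValue, values, new byte[exceptionBytes.length]);
      } else {
        patchByteByByte(in, numExceptions, bitsPerValue, values);
      }
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    return values;
  }
}
```

Both paths produce the same patched values; the bulk variant simply replaces many small reads with one, which is the kind of change the flame charts further down are measuring.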

Benchmark results are as follows. There are still statistically significant 
(p < 0.05) regressions in some tasks (LowSpanNear, AndHighMed, MedSpanNear and 
AndHighHigh, each around -2%), but they are much smaller than before. There are 
now also significant QPS improvements for some other tasks (HighTerm and 
LowTerm at +3-5%, IntNRQ at +1%). These seem like decent results overall, but 
I'm sure others will have more opinions on this.
{code:java}
                     Task  QPS baseline   StdDev  QPS pfordocids   StdDev                Pct diff  p-value
    HighTermDayOfYearSort         76.66  (13.5%)           74.15  (12.0%)   -3.3% ( -25% -   25%)    0.418
        HighTermMonthSort        105.44  (10.3%)          102.66   (8.5%)   -2.6% ( -19% -   18%)    0.379
              LowSpanNear         38.95   (2.3%)           38.05   (1.7%)   -2.3% (  -6% -    1%)    0.000
               AndHighMed         85.41   (2.6%)           83.58   (2.5%)   -2.1% (  -7% -    3%)    0.008
               OrHighHigh         16.29   (1.9%)           16.00   (6.1%)   -1.8% (  -9% -    6%)    0.209
               TermDTSort        186.08  (11.7%)          182.87  (10.0%)   -1.7% ( -20% -   22%)    0.616
              MedSpanNear         12.45   (2.2%)           12.24   (1.6%)   -1.7% (  -5% -    2%)    0.005
     HighIntervalsOrdered          6.11   (2.8%)            6.00   (2.8%)   -1.7% (  -7% -    4%)    0.061
     HighTermTitleBDVSort         61.20  (10.8%)           60.19  (10.2%)   -1.7% ( -20% -   21%)    0.617
          MedSloppyPhrase         29.64   (4.6%)           29.16   (4.5%)   -1.6% ( -10% -    7%)    0.262
              AndHighHigh         58.98   (1.9%)           58.05   (2.2%)   -1.6% (  -5% -    2%)    0.015
                OrHighMed         62.28   (2.6%)           61.33   (6.0%)   -1.5% (  -9% -    7%)    0.293
          LowSloppyPhrase         11.15   (3.2%)           10.98   (3.2%)   -1.5% (  -7% -    5%)    0.134
             HighSpanNear          7.33   (2.7%)            7.25   (2.3%)   -1.2% (  -6% -    3%)    0.143
                  Prefix3        239.67  (15.4%)          237.28  (18.6%)   -1.0% ( -30% -   38%)    0.853
                LowPhrase        131.88   (2.3%)          130.75   (2.6%)   -0.9% (  -5% -    4%)    0.264
                OrHighLow        264.48   (5.4%)          262.67   (5.7%)   -0.7% ( -11% -   10%)    0.695
BrowseDayOfYearSSDVFacets         11.57   (4.5%)           11.49   (5.0%)   -0.7% (  -9% -    9%)    0.661
    BrowseMonthSSDVFacets         12.38   (6.6%)           12.31   (7.9%)   -0.6% ( -14% -   14%)    0.799
                   Fuzzy2         52.44  (12.0%)           52.24  (12.3%)   -0.4% ( -22% -   27%)    0.920
                 PKLookup        146.58   (3.8%)          146.05   (4.1%)   -0.4% (  -7% -    7%)    0.774
         HighSloppyPhrase          1.78   (5.1%)            1.77   (4.2%)   -0.3% (  -9% -    9%)    0.839
BrowseDayOfYearTaxoFacets          4.15   (3.4%)            4.15   (2.9%)   -0.0% (  -6% -    6%)    0.962
     BrowseDateTaxoFacets          4.15   (3.4%)            4.14   (2.8%)   -0.0% (  -6% -    6%)    0.964
    BrowseMonthTaxoFacets          5.02   (2.7%)            5.02   (2.4%)    0.1% (  -4% -    5%)    0.920
                MedPhrase         87.31   (3.1%)           87.43   (3.1%)    0.1% (  -5% -    6%)    0.893
                  Respell         44.49   (2.2%)           44.74   (2.8%)    0.6% (  -4% -    5%)    0.484
                 Wildcard         54.13   (3.3%)           54.49   (3.8%)    0.7% (  -6% -    7%)    0.557
             OrHighNotMed        521.75   (5.5%)          526.04   (6.2%)    0.8% ( -10% -   13%)    0.659
                   Fuzzy1         54.77   (9.9%)           55.25   (7.8%)    0.9% ( -15% -   20%)    0.753
                   IntNRQ         94.71   (0.5%)           95.68   (1.3%)    1.0% (   0% -    2%)    0.001
             OrNotHighLow        618.18   (5.1%)          624.93   (6.1%)    1.1% (  -9% -   12%)    0.537
             OrNotHighMed        473.89   (3.2%)          479.65   (5.4%)    1.2% (  -7% -   10%)    0.388
                  MedTerm       1115.74   (4.5%)         1130.77   (6.8%)    1.3% (  -9% -   13%)    0.463
            OrHighNotHigh        428.11   (4.0%)          436.25   (6.9%)    1.9% (  -8% -   13%)    0.286
               HighPhrase         86.34   (2.6%)           87.99   (3.8%)    1.9% (  -4% -    8%)    0.064
               AndHighLow        612.62   (4.5%)          625.92   (6.6%)    2.2% (  -8% -   13%)    0.223
             OrHighNotLow        551.28   (4.5%)          567.10   (7.0%)    2.9% (  -8% -   15%)    0.125
                 HighTerm        917.35   (2.5%)          945.58   (4.8%)    3.1% (  -4% -   10%)    0.011
            OrNotHighHigh        473.03   (6.0%)          494.08   (8.5%)    4.4% (  -9% -   20%)    0.056
                  LowTerm       1060.12   (3.2%)         1111.79   (6.5%)    4.9% (  -4% -   15%)    0.003
{code}
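
As a reading aid for the table, the "Pct diff" column is consistent with the relative change in mean QPS from the baseline to the pfordocids run. A minimal sketch of that arithmetic follows; the class and method names are hypothetical and this is not the benchmark tool's own code.

```java
// Hypothetical helper: relative QPS change of a candidate run vs. a baseline, in percent.
final class PctDiffSketch {
  static double pctDiff(double baselineQps, double candidateQps) {
    return 100.0 * (candidateQps - baselineQps) / baselineQps;
  }
}
```

For example, the LowTerm row (1060.12 vs. 1111.79 QPS) works out to roughly +4.9%, and the HighTermDayOfYearSort row (76.66 vs. 74.15 QPS) to roughly -3.3%, matching the table.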
Note the difference in the flame charts for the "applyExceptionsIn32Space" 
step now that all exception bytes are read at once (compare with my previous 
comment above); the change seems to have helped. Reading the exception bytes 
now accounts for ~1/2 of the time spent applying the exceptions, vs. ~2/3 before:

!bulk_read_1.png|width=845,height=109!

!bulk_read_2.png|width=839,height=84!

> Explore PFOR for Doc ID delta encoding (instead of FOR)
> -------------------------------------------------------
>
>                 Key: LUCENE-9850
>                 URL: https://issues.apache.org/jira/browse/LUCENE-9850
>             Project: Lucene - Core
>          Issue Type: Task
>          Components: core/codecs
>    Affects Versions: main (9.0)
>            Reporter: Greg Miller
>            Priority: Minor
>         Attachments: apply_exceptions.png, bulk_read_1.png, bulk_read_2.png, 
> for.png, pfor.png
>
>
> It'd be interesting to explore using PFOR instead of FOR for doc ID encoding. 
> Right now PFOR is used for positions, frequencies and payloads, but FOR is 
> used for doc ID deltas. From a recent 
> [conversation|http://mail-archives.apache.org/mod_mbox/lucene-dev/202103.mbox/%3CCAPsWd%2BOp7d_GxNosB5r%3DQMPA-v0SteHWjXUmG3gwQot4gkubWw%40mail.gmail.com%3E]
>  on the dev mailing list, it sounds like this decision was made based on the 
> optimization possible when expanding the deltas.
> I'd be interested in measuring the index size reduction possible with 
> switching to PFOR compared to the performance reduction we might see by no 
> longer being able to apply the deltas in as optimal a way.
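
To make the size side of that trade-off concrete, here is a rough estimate of how many bytes one block of doc ID deltas needs under FOR and under PFOR. It is a sketch under stated assumptions, not Lucene's implementation: the names are hypothetical, each block is assumed to carry a one-byte header, PFOR is assumed to allow at most 7 exceptions per block, and each exception is assumed to cost two bytes (index plus up to 8 high bits).

```java
import java.util.Arrays;

// Hypothetical sketch: estimated encoded size of one block of deltas under FOR vs. PFOR.
final class PforSizeSketch {
  static final int MAX_EXCEPTIONS = 7; // assumption for this sketch

  // Number of bits needed to represent v (at least 1).
  static int bitsRequired(long v) {
    return Math.max(1, 64 - Long.numberOfLeadingZeros(v));
  }

  // FOR: every value is packed with the width of the largest value.
  static int forBytes(long[] deltas) {
    long max = 0;
    for (long d : deltas) {
      max = Math.max(max, d);
    }
    return 1 + (deltas.length * bitsRequired(max) + 7) / 8;
  }

  // PFOR: pack with a narrower width and patch the few values that do not fit.
  static int pforBytes(long[] deltas) {
    long[] sorted = deltas.clone();
    Arrays.sort(sorted);
    int maxBits = bitsRequired(sorted[sorted.length - 1]);
    // Largest value that must fit without being an exception.
    int cutoff = Math.max(0, sorted.length - 1 - MAX_EXCEPTIONS);
    // High bits are stored in one byte, so the width can shrink by at most 8 bits.
    int patchedBits = Math.max(bitsRequired(sorted[cutoff]), maxBits - 8);
    int numExceptions = 0;
    for (long d : deltas) {
      if (bitsRequired(d) > patchedBits) {
        numExceptions++;
      }
    }
    return 1 + (deltas.length * patchedBits + 7) / 8 + 2 * numExceptions;
  }
}
```

For a block of 128 deltas where 127 values are 3 and one is 1000, this estimate gives 161 bytes for FOR (10 bits per value) and 35 bytes for PFOR (2 bits per value plus one exception). The cost is the extra exception-patching step at decode time, which is the performance side of the question above.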


