[ https://issues.apache.org/jira/browse/LUCENE-9613?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17404419#comment-17404419 ]

Adrien Grand commented on LUCENE-9613:
--------------------------------------

I pushed some more specialization, which gave the following results on
wikimedium10m:
{noformat}
                    Task QPS baseline      StdDev   QPS patch      StdDev                Pct diff p-value
         HighSloppyPhrase       24.30      (6.9%)       23.50      (9.3%)   -3.3% ( -18% -   13%) 0.204
          MedSloppyPhrase       70.81      (4.4%)       69.37      (5.4%)   -2.0% ( -11% -    8%) 0.191
          LowSloppyPhrase       30.96      (3.3%)       30.49      (3.6%)   -1.5% (  -8% -    5%) 0.163
     HighTermTitleBDVSort       19.49      (2.8%)       19.38      (2.6%)   -0.6% (  -5% -    4%) 0.503
                MedPhrase      322.19      (3.8%)      321.87      (9.1%)   -0.1% ( -12% -   13%) 0.964
BrowseDayOfYearTaxoFacets        3.18      (3.7%)        3.18      (3.3%)   -0.0% (  -6% -    7%) 0.981
     BrowseDateTaxoFacets        3.18      (3.7%)        3.18      (3.2%)    0.1% (  -6% -    7%) 0.952
    BrowseMonthTaxoFacets        3.45      (4.7%)        3.46      (4.3%)    0.2% (  -8% -    9%) 0.895
                   IntNRQ       91.30     (43.5%)       91.51     (43.4%)    0.2% ( -60% -  154%) 0.987
      LowIntervalsOrdered       17.60      (6.6%)       17.64      (7.1%)    0.2% ( -12% -   14%) 0.915
               AndHighLow     1005.24      (4.1%)     1008.36      (4.0%)    0.3% (  -7% -    8%) 0.808
                  Prefix3      378.76     (11.9%)      380.28     (10.5%)    0.4% ( -19% -   25%) 0.910
                LowPhrase      112.90      (2.8%)      113.37      (3.8%)    0.4% (  -6% -    7%) 0.694
             HighSpanNear       51.40      (3.0%)       51.64      (3.0%)    0.5% (  -5% -    6%) 0.621
             OrHighNotLow     1445.33      (4.9%)     1456.37      (4.6%)    0.8% (  -8% -   10%) 0.614
                  MedTerm     2527.24      (6.3%)     2548.62      (4.6%)    0.8% (  -9% -   12%) 0.628
             OrNotHighMed     1157.13      (2.7%)     1167.00      (3.3%)    0.9% (  -5% -    7%) 0.370
              LowSpanNear       44.09      (2.0%)       44.48      (2.1%)    0.9% (  -3% -    5%) 0.184
      MedIntervalsOrdered       10.95      (3.4%)       11.04      (3.5%)    0.9% (  -5% -    8%) 0.420
     HighIntervalsOrdered       25.53      (3.6%)       25.77      (4.1%)    1.0% (  -6% -    8%) 0.435
              MedSpanNear      109.47      (2.0%)      110.57      (2.7%)    1.0% (  -3% -    5%) 0.183
            OrHighNotHigh     1095.98      (4.0%)     1107.45      (3.4%)    1.0% (  -6% -    8%) 0.373
                   Fuzzy1      212.12      (6.8%)      214.37      (6.3%)    1.1% ( -11% -   15%) 0.609
               OrHighHigh       34.88      (4.7%)       35.26      (3.2%)    1.1% (  -6% -    9%) 0.392
                OrHighMed      124.51      (4.6%)      125.91      (2.5%)    1.1% (  -5% -    8%) 0.339
                  Respell      271.84      (3.0%)      274.94      (2.8%)    1.1% (  -4% -    7%) 0.210
             OrHighNotMed     1397.92      (4.0%)     1414.46      (3.9%)    1.2% (  -6% -    9%) 0.344
               HighPhrase      674.43      (2.1%)      682.48      (4.1%)    1.2% (  -4% -    7%) 0.245
              AndHighHigh       53.28      (3.4%)       53.92      (4.0%)    1.2% (  -6% -    8%) 0.308
                OrHighLow      477.86      (4.1%)      483.78      (3.5%)    1.2% (  -6% -    9%) 0.308
             OrNotHighLow     1223.79      (3.8%)     1239.31      (4.3%)    1.3% (  -6% -    9%) 0.321
               AndHighMed      106.80      (3.5%)      108.17      (3.9%)    1.3% (  -5% -    8%) 0.271
                  LowTerm     2514.56      (5.8%)     2549.48      (6.6%)    1.4% ( -10% -   14%) 0.478
                 Wildcard      157.42      (3.9%)      159.71      (4.1%)    1.5% (  -6% -    9%) 0.246
            OrNotHighHigh     1013.51      (3.2%)     1028.66      (3.7%)    1.5% (  -5% -    8%) 0.176
                   Fuzzy2      154.94      (8.8%)      157.37      (8.4%)    1.6% ( -14% -   20%) 0.565
                 HighTerm     1590.75      (4.9%)     1624.88      (4.9%)    2.1% (  -7% -   12%) 0.168
        HighTermMonthSort       78.11      (7.6%)       81.58      (9.1%)    4.4% ( -11% -   22%) 0.093
               TermDTSort       84.05      (7.2%)       87.88      (7.1%)    4.5% (  -9% -   20%) 0.044
    HighTermDayOfYearSort      116.77      (6.1%)      122.33      (6.8%)    4.8% (  -7% -   18%) 0.020
    BrowseMonthSSDVFacets       12.98      (3.1%)       14.45      (5.1%)   11.3% (   2% -   20%) 0.000
BrowseDayOfYearSSDVFacets       12.38      (3.5%)       15.52     (12.7%)   25.3% (   8% -   43%) 0.000
{noformat}

> Create blocks for ords when it helps in Lucene80DocValuesFormat
> ---------------------------------------------------------------
>
>                 Key: LUCENE-9613
>                 URL: https://issues.apache.org/jira/browse/LUCENE-9613
>             Project: Lucene - Core
>          Issue Type: Improvement
>            Reporter: Adrien Grand
>            Priority: Minor
>             Fix For: main (9.0)
>
>          Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> Currently, for sorted(-set) values, we always write ords using
> log2(valueCount) bits per entry. However, in several cases, such as when the
> field is used in the index sort or when one value is _very_ common, splitting
> ords into blocks like we do for numerics would help.
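
To make the trade-off in the description concrete, here is a rough sketch that
compares the two storage costs. It is not the code from the patch: the class
name, the method names, the block size, and the choice to encode each block as
deltas from its minimum are all assumptions made for illustration, and
per-block header overhead (the block minimum and the bit width) is ignored.

```java
// Illustrative sketch only: every name here is hypothetical and this is not
// the Lucene implementation.
class OrdBlockSketch {

    // Number of bits needed to represent any value in [0, maxValue].
    static int bitsRequired(long maxValue) {
        return 64 - Long.numberOfLeadingZeros(maxValue);
    }

    // Cost in bits when every ord uses one fixed width derived from
    // valueCount, i.e. roughly log2(valueCount) bits per entry.
    static long globalBits(long[] ords, long valueCount) {
        return (long) ords.length * bitsRequired(valueCount - 1);
    }

    // Cost in bits when ords are split into blocks and each block stores its
    // entries as deltas from the block minimum (assumed encoding), using only
    // as many bits as that block's own value range needs.
    static long blockBits(long[] ords, int blockSize) {
        long total = 0;
        for (int start = 0; start < ords.length; start += blockSize) {
            int end = Math.min(start + blockSize, ords.length);
            long min = Long.MAX_VALUE;
            long max = Long.MIN_VALUE;
            for (int i = start; i < end; i++) {
                min = Math.min(min, ords[i]);
                max = Math.max(max, ords[i]);
            }
            total += (long) (end - start) * bitsRequired(max - min);
        }
        return total;
    }
}
```

With ords that follow the index sort, for example {0, 0, 0, 0, 1, 1, 1, 1, 2,
2, 3, 3} with valueCount = 4 and a block size of 4, globalBits gives 24 bits
while blockBits gives 4 bits, because the first two blocks each hold a single
repeated value and need 0 bits per entry.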



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
