[jira] [Commented] (LUCENE-10639) WANDScorer performs better without two-phase

Greg Miller (Jira) Sat, 02 Jul 2022 09:58:04 -0700


    [ 
https://issues.apache.org/jira/browse/LUCENE-10639?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17561772#comment-17561772
 ]


Greg Miller commented on LUCENE-10639:
--------------------------------------

{quote}I suspected there was some overhead to two-phase iteration but not as 
much as this.
{quote}
Right. I was so surprised by the performance shift that I assumed there must 
be an interesting second phase happening. But from what you're saying, it 
sounds like these {{OrHighLow/Med/High}} tasks aren't doing that, and the 
performance change is purely a side effect of running the two phases instead 
of doing all the checks in the first phase. I should have dug into what these 
tasks are doing.
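
To make the distinction concrete, here is a rough sketch of the two shapes 
being compared. Everything in it ({{TwoPhaseSketch}}, {{Approximation}}, 
{{Confirmed}}) is a hypothetical stand-in, not Lucene's {{WANDScorer}} or 
{{TwoPhaseIterator}}; it only shows that both shapes return the same docs and 
differ in where the confirmation step runs.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.IntPredicate;

/** Hypothetical illustration only; none of these types are Lucene classes. */
public class TwoPhaseSketch {
    static final int NO_MORE_DOCS = Integer.MAX_VALUE;

    /** Cheap candidate iterator over a sorted array of doc IDs. */
    static final class Approximation {
        private final int[] docs;
        private int idx = -1;

        Approximation(int[] docs) {
            this.docs = docs;
        }

        int nextDoc() {
            idx++;
            return idx < docs.length ? docs[idx] : NO_MORE_DOCS;
        }
    }

    /** Two-phase shape: the caller advances the approximation, then confirms each candidate. */
    static List<Integer> collectTwoPhase(int[] candidates, IntPredicate matches) {
        Approximation approx = new Approximation(candidates);
        List<Integer> hits = new ArrayList<>();
        for (int doc = approx.nextDoc(); doc != NO_MORE_DOCS; doc = approx.nextDoc()) {
            if (matches.test(doc)) { // second phase, driven by the caller
                hits.add(doc);
            }
        }
        return hits;
    }

    /** Single-phase shape: confirmation is folded into nextDoc(), so only true matches are returned. */
    static final class Confirmed {
        private final Approximation approx;
        private final IntPredicate matches;

        Confirmed(int[] candidates, IntPredicate matches) {
            this.approx = new Approximation(candidates);
            this.matches = matches;
        }

        int nextDoc() {
            int doc = approx.nextDoc();
            while (doc != NO_MORE_DOCS && !matches.test(doc)) {
                doc = approx.nextDoc(); // skip candidates that fail confirmation
            }
            return doc;
        }
    }

    static List<Integer> collectSinglePhase(int[] candidates, IntPredicate matches) {
        Confirmed it = new Confirmed(candidates, matches);
        List<Integer> hits = new ArrayList<>();
        for (int doc = it.nextDoc(); doc != NO_MORE_DOCS; doc = it.nextDoc()) {
            hits.add(doc);
        }
        return hits;
    }
}
```

Both collectors run the same confirmation logic on the same candidates, so any 
throughput difference between the two shapes would have to come from how the 
calls are structured rather than from extra work in the second phase.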
{quote}Hotspot was not always able to optimize "if (liveDocs == null)" checks
{quote}
Interesting. Seems worth a shot.

Thanks for the quick thoughts!

> WANDScorer performs better without two-phase
> --------------------------------------------
>
>                 Key: LUCENE-10639
>                 URL: https://issues.apache.org/jira/browse/LUCENE-10639
>             Project: Lucene - Core
>          Issue Type: Improvement
>          Components: core/search
>            Reporter: Greg Miller
>            Priority: Major
>
> After looking at the recent improvement [~jpountz] made to WAND scoring in 
> LUCENE-10634, which does additional work during match confirmation to not 
> confirm a match whose score wouldn't be competitive, I wanted to see how 
> performance would shift if we squashed the two-phase iteration completely and 
> only returned true matches (that were also known to be competitive by score) 
> in the "approximation" phase. I was a bit surprised to find that luceneutil 
> benchmarks (run with {{wikimediumall}}) improve significantly on some 
> disjunction tasks and don't show significant regressions anywhere else.
> Note that I used LUCENE-10634 as a baseline, and built my candidate change on 
> top of that. The diff can be seen here: 
> [DIFF|https://github.com/gsmiller/lucene/compare/b2d46440998fe4a972e8cc8c948580111359ed0f..c5bab794c92dbc66e70f9389948c1bdfe9b45231]
> A simple conclusion here might be that we shouldn't do two-phase iteration in 
> WANDScorer, but I'm pretty sure that's not right. I wonder if what's really 
> going on is that we're under-estimating the cost of confirming a match? Right 
> now we just return the tail size as the cost. While the cost of confirming a 
> match is proportional to the tail size, the actual work involved can be quite 
> significant (having to advance tail iterators to new blocks and decompress 
> them). I wonder if the WAND second phase is being run too early on 
> approximate candidates, and if less expensive (and possibly even more 
> restrictive?) second phases could/should be running first?
> I'm raising this here as more of a curiosity to see if it sparks ideas on how 
> to move forward. Again, I'm not proposing we do away with two-phase 
> iteration, but it seems we might be able to improve things. Maybe I'll 
> explore changing the cost heuristic next. Also, maybe there's some different 
> benchmarking that would be useful here that I may not be familiar with?
> Benchmark results on wikimediumall:
> {code:java}
>                           Task  QPS baseline    StdDev QPS candidate      StdDev                Pct diff p-value
>             HighTermTitleBDVSort       22.52     (18.9%)       21.66     (15.6%)   -3.8% ( -32% -   37%) 0.485
>                          Prefix3        9.38      (9.2%)        9.09     (10.6%)   -3.1% ( -20% -   18%) 0.326
>                HighTermMonthSort       25.37     (16.0%)       24.87     (17.1%)   -2.0% ( -30% -   37%) 0.710
>             MedTermDayTaxoFacets        9.62      (4.2%)        9.51      (4.1%)   -1.2% (  -9% -    7%) 0.368
>                       TermDTSort       74.69     (18.0%)       74.13     (18.2%)   -0.7% ( -31% -   43%) 0.897
>            HighTermDayOfYearSort       52.64     (16.1%)       52.32     (15.4%)   -0.6% ( -27% -   36%) 0.903
>            BrowseMonthTaxoFacets        8.64     (19.1%)        8.59     (19.8%)   -0.6% ( -33% -   47%) 0.926
>             BrowseDateSSDVFacets        0.86      (9.5%)        0.86     (13.1%)   -0.4% ( -20% -   24%) 0.914
>                         PKLookup      147.18      (3.9%)      146.66      (3.3%)   -0.3% (  -7% -    7%) 0.759
>        BrowseDayOfYearSSDVFacets        3.47      (4.5%)        3.45      (4.8%)   -0.3% (  -9% -    9%) 0.822
>                         Wildcard       36.36      (4.4%)       36.26      (5.2%)   -0.3% (  -9% -    9%) 0.866
>            BrowseMonthSSDVFacets        4.15     (12.7%)        4.13     (12.8%)   -0.3% ( -22% -   28%) 0.950
>          AndHighMedDayTaxoFacets       15.21      (2.7%)       15.18      (2.9%)   -0.2% (  -5% -    5%) 0.819
>                           Fuzzy1       68.33      (1.8%)       68.22      (2.0%)   -0.2% (  -3% -    3%) 0.783
>           OrHighMedDayTaxoFacets        2.90      (4.1%)        2.89      (4.0%)   -0.1% (  -7% -    8%) 0.930
>                        MedPhrase       52.81      (2.3%)       52.76      (1.8%)   -0.1% (  -4% -    4%) 0.878
>                          Respell       36.80      (1.9%)       36.78      (1.9%)   -0.1% (  -3% -    3%) 0.933
>                           Fuzzy2       63.06      (1.9%)       63.05      (2.1%)   -0.0% (  -3% -    4%) 0.971
>                        LowPhrase       74.60      (1.9%)       74.61      (1.8%)    0.0% (  -3% -    3%) 0.987
>         AndHighHighDayTaxoFacets        4.54      (2.3%)        4.55      (2.0%)    0.0% (  -4% -    4%) 0.960
>                       HighPhrase      353.13      (2.6%)      353.28      (2.5%)    0.0% (  -4% -    5%) 0.958
>                    OrNotHighHigh      761.72      (4.0%)      762.48      (3.6%)    0.1% (  -7% -    8%) 0.935
>                     OrHighNotLow     1129.94      (4.1%)     1131.56      (3.6%)    0.1% (  -7% -    8%) 0.906
>                          LowTerm     1315.90      (2.9%)     1318.61      (2.5%)    0.2% (  -5% -    5%) 0.810
>                           IntNRQ      192.33      (2.8%)      192.93      (2.3%)    0.3% (  -4% -    5%) 0.701
>                      LowSpanNear       23.60      (2.2%)       23.68      (1.6%)    0.3% (  -3% -    4%) 0.592
>                     OrNotHighMed      867.21      (2.3%)      870.27      (2.8%)    0.4% (  -4% -    5%) 0.664
>      BrowseRandomLabelSSDVFacets        2.53      (1.6%)        2.54      (1.9%)    0.4% (  -3% -    3%) 0.494
>                       AndHighMed      105.33      (4.5%)      105.83      (4.6%)    0.5% (  -8% -    9%) 0.739
>                         HighTerm     1030.35      (5.7%)     1035.54      (5.9%)    0.5% ( -10% -   12%) 0.783
>                  MedSloppyPhrase       41.07      (3.0%)       41.28      (2.9%)    0.5% (  -5% -    6%) 0.581
>                       AndHighLow      287.51      (3.2%)      289.03      (4.3%)    0.5% (  -6% -    8%) 0.657
>                     OrHighNotMed      910.71      (3.9%)      915.93      (4.1%)    0.6% (  -7% -    8%) 0.651
>                      AndHighHigh       28.96      (5.0%)       29.15      (5.3%)    0.6% (  -9% -   11%) 0.695
>                     OrNotHighLow      679.21      (2.7%)      683.68      (4.1%)    0.7% (  -6% -    7%) 0.551
>                          MedTerm     1425.49      (4.8%)     1435.41      (5.1%)    0.7% (  -8% -   11%) 0.657
>                      MedSpanNear        8.74      (3.0%)        8.80      (2.8%)    0.7% (  -4% -    6%) 0.448
>      BrowseRandomLabelTaxoFacets        6.11     (14.4%)        6.16     (15.2%)    0.7% ( -25% -   35%) 0.875
>                    OrHighNotHigh      674.18      (4.1%)      679.40      (4.5%)    0.8% (  -7% -    9%) 0.569
>                  LowSloppyPhrase        5.08      (3.3%)        5.12      (3.5%)    0.8% (  -5% -    7%) 0.445
>                     HighSpanNear        2.22      (5.4%)        2.25      (4.2%)    1.3% (  -7% -   11%) 0.398
>                 HighSloppyPhrase        5.27      (7.8%)        5.34      (9.0%)    1.3% ( -14% -   19%) 0.622
>              LowIntervalsOrdered       17.88      (4.8%)       18.21      (3.1%)    1.9% (  -5% -   10%) 0.144
>             BrowseDateTaxoFacets        6.51     (14.4%)        6.65     (17.4%)    2.3% ( -25% -   39%) 0.652
>        BrowseDayOfYearTaxoFacets        6.52     (14.4%)        6.68     (17.7%)    2.5% ( -25% -   40%) 0.624
>              MedIntervalsOrdered       14.43      (7.8%)       14.80      (4.5%)    2.6% (  -9% -   16%) 0.205
>                        OrHighLow      158.48      (3.2%)      162.94      (4.2%)    2.8% (  -4% -   10%) 0.017
>             HighIntervalsOrdered        1.56      (9.4%)        1.60      (5.2%)    3.0% ( -10% -   19%) 0.215
>                        OrHighMed       65.32      (4.2%)       71.62      (4.1%)    9.6% (   1% -   18%) 0.000
>                       OrHighHigh       14.04      (4.5%)       15.68      (3.9%)   11.7% (   3% -   21%) 0.000
> {code}
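
The cost-heuristic question in the description above can be sketched in a few 
lines. This is only an illustration under assumptions: {{MatchCostSketch}}, 
{{Check}}, {{cheapestFirst}} and {{confirm}} are hypothetical names, the cost 
values are made up, and none of this is how {{WANDScorer}} is implemented. 
Lucene's {{TwoPhaseIterator}} does expose a {{matchCost()}} estimate; the 
sketch just shows why the accuracy of such an estimate matters when several 
confirmation steps are ordered by it.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.function.IntPredicate;

/** Hypothetical illustration only; none of these types are Lucene classes. */
public class MatchCostSketch {

    /** A second-phase check paired with an estimated per-document cost. */
    static final class Check {
        final float matchCost;      // assumed cost estimate, arbitrary units
        final IntPredicate matches; // the actual confirmation logic
        int calls = 0;              // how many times this check was run

        Check(float matchCost, IntPredicate matches) {
            this.matchCost = matchCost;
            this.matches = matches;
        }

        boolean test(int doc) {
            calls++;
            return matches.test(doc);
        }
    }

    /** Returns a copy of the checks sorted so the lowest estimated cost comes first. */
    static List<Check> cheapestFirst(List<Check> checks) {
        List<Check> ordered = new ArrayList<>(checks);
        ordered.sort(Comparator.comparingDouble((Check c) -> c.matchCost));
        return ordered;
    }

    /** Confirms doc only if every check accepts it, stopping at the first rejection. */
    static boolean confirm(int doc, List<Check> ordered) {
        for (Check c : ordered) {
            if (!c.test(doc)) {
                return false;
            }
        }
        return true;
    }
}
```

If the expensive check reported an estimate lower than the cheap one, it would 
sort first and run on every candidate, including ones a cheaper and more 
restrictive check would have rejected.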



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org
