[jira] [Commented] (LUCENE-10639) WANDScorer performs better without two-phase

Adrien Grand (Jira) Sat, 02 Jul 2022 06:48:06 -0700


    [ 
https://issues.apache.org/jira/browse/LUCENE-10639?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17561751#comment-17561751
 ]


Adrien Grand commented on LUCENE-10639:
---------------------------------------

I suspected there was some overhead to two-phase iteration but not as much as 
this. Two-phase iteration doesn't aim at improving the performance of queries 
on their own, but when combined with other queries through conjunctions: 
conjunctions make sure to reach agreement across approximations before they 
proceed with the match phase. This is the feature that makes Lucene perform 
better than other search libraries on the query `+"the who" +uk` at 
https://tantivy-search.github.io/bench/, because Lucene makes sure that 
documents contain all of "the", "who" and "uk" before it starts checking 
positions. I would also expect two-phase iteration to help on 
[AndMedOrHighHigh on nightly benchmarks|https://home.apache.org/~mikemccand/lucenebench/AndMedOrHighHigh.html], 
since WANDScorer will do less work to return the next candidate on or beyond 
the lead doc ID produced by the "Med" term.
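
To make the conjunction behaviour described above concrete, here is a minimal, self-contained sketch. It is a toy model and not Lucene's actual API: the class names `TwoPhaseClause` and `TwoPhaseConjunction`, the `confirmCalls` counter, and the doc IDs in the usage note are all assumptions made for illustration. Each clause exposes a cheap approximation (a sorted list of candidate doc IDs) and an expensive confirmation, and the conjunction only confirms a document once every approximation has landed on it.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.IntPredicate;

/** Toy clause (hypothetical, not a Lucene class): cheap approximation plus costly confirmation. */
final class TwoPhaseClause {
    static final int NO_MORE_DOCS = Integer.MAX_VALUE;

    private final int[] candidates;     // sorted doc IDs that may match (the approximation)
    private final IntPredicate confirm; // expensive check, e.g. verifying positions
    private int idx = 0;
    int confirmCalls = 0;               // counts how often the expensive check ran

    TwoPhaseClause(int[] sortedCandidates, IntPredicate confirm) {
        this.candidates = sortedCandidates;
        this.confirm = confirm;
    }

    /** Moves the approximation forward to the first candidate >= target. */
    int advance(int target) {
        while (idx < candidates.length && candidates[idx] < target) {
            idx++;
        }
        return idx < candidates.length ? candidates[idx] : NO_MORE_DOCS;
    }

    /** Second phase: confirms that the candidate really matches. */
    boolean matches(int doc) {
        confirmCalls++;
        return confirm.test(doc);
    }
}

final class TwoPhaseConjunction {
    /** Returns the docs matched by every clause, confirming only after all approximations agree. */
    static List<Integer> search(TwoPhaseClause... clauses) {
        List<Integer> hits = new ArrayList<>();
        int doc = clauses[0].advance(0);
        while (doc != TwoPhaseClause.NO_MORE_DOCS) {
            // Phase 1: leapfrog the approximations until they all sit on the same doc.
            boolean agreed = true;
            for (TwoPhaseClause clause : clauses) {
                int next = clause.advance(doc);
                if (next != doc) {
                    doc = next;
                    agreed = false;
                    break;
                }
            }
            if (!agreed) {
                continue;
            }
            // Phase 2: run the expensive confirmations, stopping at the first failure.
            boolean match = true;
            for (TwoPhaseClause clause : clauses) {
                if (!clause.matches(doc)) {
                    match = false;
                    break;
                }
            }
            if (match) {
                hits.add(doc);
            }
            doc = clauses[0].advance(doc + 1);
        }
        return hits;
    }
}
```

For example, with a phrase-like clause whose approximation yields docs 1, 3, 5 and 8 and a term-like clause that yields docs 3, 4 and 9, the approximations agree only on doc 3, so the expensive phrase confirmation runs once instead of four times.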

> WANDScorer performs better without two-phase
> --------------------------------------------
>
>                 Key: LUCENE-10639
>                 URL: https://issues.apache.org/jira/browse/LUCENE-10639
>             Project: Lucene - Core
>          Issue Type: Improvement
>          Components: core/search
>            Reporter: Greg Miller
>            Priority: Major
>
> After looking at the recent improvement [~jpountz] made to WAND scoring in 
> LUCENE-10634, which does additional work during match confirmation to not 
> confirm a match whose score wouldn't be competitive, I wanted to see how 
> performance would shift if we squashed the two-phase iteration completely and 
> only returned true matches (that were also known to be competitive by score) 
> in the "approximation" phase. I was a bit surprised to find that luceneutil 
> benchmarks (run with {{wikimediumall}}) improve significantly on some 
> disjunction tasks and don't show significant regressions anywhere else.
> Note that I used LUCENE-10634 as a baseline, and built my candidate change on 
> top of that. The diff can be seen here: 
> [DIFF|https://github.com/gsmiller/lucene/compare/b2d46440998fe4a972e8cc8c948580111359ed0f..c5bab794c92dbc66e70f9389948c1bdfe9b45231]
> A simple conclusion here might be that we shouldn't do two-phase iteration in 
> WANDScorer, but I'm pretty sure that's not right. I wonder if what's really 
> going on is that we're under-estimating the cost of confirming a match? Right 
> now we just return the tail size as the cost. While the cost of confirming a 
> match is proportional to the tail size, the actual work involved can be quite 
> significant (having to advance tail iterators to new blocks and decompress 
> them). I wonder if the WAND second phase is being run too early on 
> approximate candidates, and if less-expensive, (and even possibly more 
> restrictive?), second phases could/should be running first?
> I'm raising this here as more of a curiosity to see if it sparks ideas on how 
> to move forward. Again, I'm not proposing we do away with two-phase 
> iteration, but it seems we might be able to improve things. Maybe I'll 
> explore changing the cost heuristic next. Also, maybe there's some different 
> benchmarking that would be useful here that I may not be familiar with?
> Benchmark results on wikimediumall:
> {code:java}
>                             Task  QPS baseline      StdDev QPS candidate      StdDev                Pct diff p-value
>             HighTermTitleBDVSort       22.52     (18.9%)       21.66     (15.6%)   -3.8% ( -32% -   37%) 0.485
>                          Prefix3        9.38      (9.2%)        9.09     (10.6%)   -3.1% ( -20% -   18%) 0.326
>                HighTermMonthSort       25.37     (16.0%)       24.87     (17.1%)   -2.0% ( -30% -   37%) 0.710
>             MedTermDayTaxoFacets        9.62      (4.2%)        9.51      (4.1%)   -1.2% (  -9% -    7%) 0.368
>                       TermDTSort       74.69     (18.0%)       74.13     (18.2%)   -0.7% ( -31% -   43%) 0.897
>            HighTermDayOfYearSort       52.64     (16.1%)       52.32     (15.4%)   -0.6% ( -27% -   36%) 0.903
>            BrowseMonthTaxoFacets        8.64     (19.1%)        8.59     (19.8%)   -0.6% ( -33% -   47%) 0.926
>             BrowseDateSSDVFacets        0.86      (9.5%)        0.86     (13.1%)   -0.4% ( -20% -   24%) 0.914
>                         PKLookup      147.18      (3.9%)      146.66      (3.3%)   -0.3% (  -7% -    7%) 0.759
>        BrowseDayOfYearSSDVFacets        3.47      (4.5%)        3.45      (4.8%)   -0.3% (  -9% -    9%) 0.822
>                         Wildcard       36.36      (4.4%)       36.26      (5.2%)   -0.3% (  -9% -    9%) 0.866
>            BrowseMonthSSDVFacets        4.15     (12.7%)        4.13     (12.8%)   -0.3% ( -22% -   28%) 0.950
>          AndHighMedDayTaxoFacets       15.21      (2.7%)       15.18      (2.9%)   -0.2% (  -5% -    5%) 0.819
>                           Fuzzy1       68.33      (1.8%)       68.22      (2.0%)   -0.2% (  -3% -    3%) 0.783
>           OrHighMedDayTaxoFacets        2.90      (4.1%)        2.89      (4.0%)   -0.1% (  -7% -    8%) 0.930
>                        MedPhrase       52.81      (2.3%)       52.76      (1.8%)   -0.1% (  -4% -    4%) 0.878
>                          Respell       36.80      (1.9%)       36.78      (1.9%)   -0.1% (  -3% -    3%) 0.933
>                           Fuzzy2       63.06      (1.9%)       63.05      (2.1%)   -0.0% (  -3% -    4%) 0.971
>                        LowPhrase       74.60      (1.9%)       74.61      (1.8%)    0.0% (  -3% -    3%) 0.987
>         AndHighHighDayTaxoFacets        4.54      (2.3%)        4.55      (2.0%)    0.0% (  -4% -    4%) 0.960
>                       HighPhrase      353.13      (2.6%)      353.28      (2.5%)    0.0% (  -4% -    5%) 0.958
>                    OrNotHighHigh      761.72      (4.0%)      762.48      (3.6%)    0.1% (  -7% -    8%) 0.935
>                     OrHighNotLow     1129.94      (4.1%)     1131.56      (3.6%)    0.1% (  -7% -    8%) 0.906
>                          LowTerm     1315.90      (2.9%)     1318.61      (2.5%)    0.2% (  -5% -    5%) 0.810
>                           IntNRQ      192.33      (2.8%)      192.93      (2.3%)    0.3% (  -4% -    5%) 0.701
>                      LowSpanNear       23.60      (2.2%)       23.68      (1.6%)    0.3% (  -3% -    4%) 0.592
>                     OrNotHighMed      867.21      (2.3%)      870.27      (2.8%)    0.4% (  -4% -    5%) 0.664
>      BrowseRandomLabelSSDVFacets        2.53      (1.6%)        2.54      (1.9%)    0.4% (  -3% -    3%) 0.494
>                       AndHighMed      105.33      (4.5%)      105.83      (4.6%)    0.5% (  -8% -    9%) 0.739
>                         HighTerm     1030.35      (5.7%)     1035.54      (5.9%)    0.5% ( -10% -   12%) 0.783
>                  MedSloppyPhrase       41.07      (3.0%)       41.28      (2.9%)    0.5% (  -5% -    6%) 0.581
>                       AndHighLow      287.51      (3.2%)      289.03      (4.3%)    0.5% (  -6% -    8%) 0.657
>                     OrHighNotMed      910.71      (3.9%)      915.93      (4.1%)    0.6% (  -7% -    8%) 0.651
>                      AndHighHigh       28.96      (5.0%)       29.15      (5.3%)    0.6% (  -9% -   11%) 0.695
>                     OrNotHighLow      679.21      (2.7%)      683.68      (4.1%)    0.7% (  -6% -    7%) 0.551
>                          MedTerm     1425.49      (4.8%)     1435.41      (5.1%)    0.7% (  -8% -   11%) 0.657
>                      MedSpanNear        8.74      (3.0%)        8.80      (2.8%)    0.7% (  -4% -    6%) 0.448
>      BrowseRandomLabelTaxoFacets        6.11     (14.4%)        6.16     (15.2%)    0.7% ( -25% -   35%) 0.875
>                    OrHighNotHigh      674.18      (4.1%)      679.40      (4.5%)    0.8% (  -7% -    9%) 0.569
>                  LowSloppyPhrase        5.08      (3.3%)        5.12      (3.5%)    0.8% (  -5% -    7%) 0.445
>                     HighSpanNear        2.22      (5.4%)        2.25      (4.2%)    1.3% (  -7% -   11%) 0.398
>                 HighSloppyPhrase        5.27      (7.8%)        5.34      (9.0%)    1.3% ( -14% -   19%) 0.622
>              LowIntervalsOrdered       17.88      (4.8%)       18.21      (3.1%)    1.9% (  -5% -   10%) 0.144
>             BrowseDateTaxoFacets        6.51     (14.4%)        6.65     (17.4%)    2.3% ( -25% -   39%) 0.652
>        BrowseDayOfYearTaxoFacets        6.52     (14.4%)        6.68     (17.7%)    2.5% ( -25% -   40%) 0.624
>              MedIntervalsOrdered       14.43      (7.8%)       14.80      (4.5%)    2.6% (  -9% -   16%) 0.205
>                        OrHighLow      158.48      (3.2%)      162.94      (4.2%)    2.8% (  -4% -   10%) 0.017
>             HighIntervalsOrdered        1.56      (9.4%)        1.60      (5.2%)    3.0% ( -10% -   19%) 0.215
>                        OrHighMed       65.32      (4.2%)       71.62      (4.1%)    9.6% (   1% -   18%) 0.000
>                       OrHighHigh       14.04      (4.5%)       15.68      (3.9%)   11.7% (   3% -   21%) 0.000
> {code}
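
The question raised in the issue description, whether an under-estimated confirmation cost makes an expensive second phase run too early, can be illustrated with a small toy model. This sketch assumes that second-phase checks run in ascending order of their estimated cost and stop at the first failure. The names `CostedCheck` and `MatchCostOrdering`, the cost figures, and the predicates in the usage note are hypothetical and are not measurements or Lucene code.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.function.IntPredicate;

/** Hypothetical second-phase check with an estimated cost and a (made-up) actual cost per call. */
final class CostedCheck {
    final String name;
    final float estimatedCost; // what the cost heuristic reports
    final float actualCost;    // illustrative real work per confirmation
    final IntPredicate check;

    CostedCheck(String name, float estimatedCost, float actualCost, IntPredicate check) {
        this.name = name;
        this.estimatedCost = estimatedCost;
        this.actualCost = actualCost;
        this.check = check;
    }
}

final class MatchCostOrdering {
    /**
     * Confirms each candidate doc by running the checks in ascending estimated cost,
     * short-circuiting on the first failure, and returns the total actual work spent.
     */
    static float confirmAll(List<CostedCheck> checks, int[] docs) {
        List<CostedCheck> ordered = new ArrayList<>(checks);
        ordered.sort(Comparator.comparingDouble((CostedCheck c) -> c.estimatedCost));
        float work = 0f;
        for (int doc : docs) {
            for (CostedCheck c : ordered) {
                work += c.actualCost;
                if (!c.check.test(doc)) {
                    break; // later, supposedly more expensive checks are skipped
                }
            }
        }
        return work;
    }
}
```

In this model, take ten candidate docs (0 to 9), a WAND-like check that really costs 50 units and passes even docs, and a phrase-like check that costs 10 units and passes multiples of 5. If the WAND-like check reports an estimate of 2, it runs first on every candidate and the total work is 550 units. If it reports 50, the cheaper and more restrictive check runs first and the total drops to 200 units.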



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org
