[ https://issues.apache.org/jira/browse/LUCENE-10639?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17561751#comment-17561751 ]
Adrien Grand commented on LUCENE-10639:
---------------------------------------

I suspected there was some overhead to two-phase iteration, but not as much as this.

Two-phase iteration doesn't aim at improving the performance of queries on their own, but of queries that are combined with other queries through conjunctions: conjunctions make sure to reach agreement across approximations before they proceed with the match phase. This is the feature that makes Lucene perform better than other search libraries on the query `+"the who" +uk` at https://tantivy-search.github.io/bench/, because Lucene makes sure that documents contain all of "the", "who" and "uk" before it starts checking positions.

I would also expect two-phase iteration to help on [AndMedOrHighHigh on nightly benchmarks|https://home.apache.org/~mikemccand/lucenebench/AndMedOrHighHigh.html], since WANDScorer will do less work to return the next candidate on or beyond the lead doc ID produced by the "Med" term.

> WANDScorer performs better without two-phase
> --------------------------------------------
>
>                 Key: LUCENE-10639
>                 URL: https://issues.apache.org/jira/browse/LUCENE-10639
>             Project: Lucene - Core
>          Issue Type: Improvement
>          Components: core/search
>            Reporter: Greg Miller
>            Priority: Major
>
> After looking at the recent improvement [~jpountz] made to WAND scoring in
> LUCENE-10634, which does additional work during match confirmation to not
> confirm a match whose score wouldn't be competitive, I wanted to see how
> performance would shift if we squashed the two-phase iteration completely and
> only returned true matches (that were also known to be competitive by score)
> in the "approximation" phase. I was a bit surprised to find that luceneutil
> benchmarks (run with {{wikimediumall}}) improve significantly on some
> disjunction tasks and don't show significant regressions anywhere else.
> Note that I used LUCENE-10634 as a baseline, and built my candidate change on
> top of that.
> The diff can be seen here:
> [DIFF|https://github.com/gsmiller/lucene/compare/b2d46440998fe4a972e8cc8c948580111359ed0f..c5bab794c92dbc66e70f9389948c1bdfe9b45231]
>
> A simple conclusion here might be that we shouldn't do two-phase iteration in
> WANDScorer, but I'm pretty sure that's not right. I wonder if what's really
> going on is that we're under-estimating the cost of confirming a match? Right
> now we just return the tail size as the cost. While the cost of confirming a
> match is proportional to the tail size, the actual work involved can be quite
> significant (having to advance tail iterators to new blocks and decompress
> them). I wonder if the WAND second phase is being run too early on
> approximate candidates, and if less expensive (and possibly even more
> restrictive?) second phases could/should be running first?
>
> I'm raising this here as more of a curiosity to see if it sparks ideas on how
> to move forward. Again, I'm not proposing we do away with two-phase
> iteration, but it seems we might be able to improve things. Maybe I'll
> explore changing the cost heuristic next. Also, maybe there's some different
> benchmarking that would be useful here that I may not be familiar with?
>
> Benchmark results on wikimediumall:
> {code:java}
> Task                        QPS baseline   StdDev  QPS candidate   StdDev  Pct diff                 p-value
> HighTermTitleBDVSort               22.52  (18.9%)          21.66  (15.6%)     -3.8%  (-32% -  37%)    0.485
> Prefix3                             9.38   (9.2%)           9.09  (10.6%)     -3.1%  (-20% -  18%)    0.326
> HighTermMonthSort                  25.37  (16.0%)          24.87  (17.1%)     -2.0%  (-30% -  37%)    0.710
> MedTermDayTaxoFacets                9.62   (4.2%)           9.51   (4.1%)     -1.2%  ( -9% -   7%)    0.368
> TermDTSort                         74.69  (18.0%)          74.13  (18.2%)     -0.7%  (-31% -  43%)    0.897
> HighTermDayOfYearSort              52.64  (16.1%)          52.32  (15.4%)     -0.6%  (-27% -  36%)    0.903
> BrowseMonthTaxoFacets               8.64  (19.1%)           8.59  (19.8%)     -0.6%  (-33% -  47%)    0.926
> BrowseDateSSDVFacets                0.86   (9.5%)           0.86  (13.1%)     -0.4%  (-20% -  24%)    0.914
> PKLookup                          147.18   (3.9%)         146.66   (3.3%)     -0.3%  ( -7% -   7%)    0.759
> BrowseDayOfYearSSDVFacets           3.47   (4.5%)           3.45   (4.8%)     -0.3%  ( -9% -   9%)    0.822
> Wildcard                           36.36   (4.4%)          36.26   (5.2%)     -0.3%  ( -9% -   9%)    0.866
> BrowseMonthSSDVFacets               4.15  (12.7%)           4.13  (12.8%)     -0.3%  (-22% -  28%)    0.950
> AndHighMedDayTaxoFacets            15.21   (2.7%)          15.18   (2.9%)     -0.2%  ( -5% -   5%)    0.819
> Fuzzy1                             68.33   (1.8%)          68.22   (2.0%)     -0.2%  ( -3% -   3%)    0.783
> OrHighMedDayTaxoFacets              2.90   (4.1%)           2.89   (4.0%)     -0.1%  ( -7% -   8%)    0.930
> MedPhrase                          52.81   (2.3%)          52.76   (1.8%)     -0.1%  ( -4% -   4%)    0.878
> Respell                            36.80   (1.9%)          36.78   (1.9%)     -0.1%  ( -3% -   3%)    0.933
> Fuzzy2                             63.06   (1.9%)          63.05   (2.1%)     -0.0%  ( -3% -   4%)    0.971
> LowPhrase                          74.60   (1.9%)          74.61   (1.8%)      0.0%  ( -3% -   3%)    0.987
> AndHighHighDayTaxoFacets            4.54   (2.3%)           4.55   (2.0%)      0.0%  ( -4% -   4%)    0.960
> HighPhrase                        353.13   (2.6%)         353.28   (2.5%)      0.0%  ( -4% -   5%)    0.958
> OrNotHighHigh                     761.72   (4.0%)         762.48   (3.6%)      0.1%  ( -7% -   8%)    0.935
> OrHighNotLow                     1129.94   (4.1%)        1131.56   (3.6%)      0.1%  ( -7% -   8%)    0.906
> LowTerm                          1315.90   (2.9%)        1318.61   (2.5%)      0.2%  ( -5% -   5%)    0.810
> IntNRQ                            192.33   (2.8%)         192.93   (2.3%)      0.3%  ( -4% -   5%)    0.701
> LowSpanNear                        23.60   (2.2%)          23.68   (1.6%)      0.3%  ( -3% -   4%)    0.592
> OrNotHighMed                      867.21   (2.3%)         870.27   (2.8%)      0.4%  ( -4% -   5%)    0.664
> BrowseRandomLabelSSDVFacets         2.53   (1.6%)           2.54   (1.9%)      0.4%  ( -3% -   3%)    0.494
> AndHighMed                        105.33   (4.5%)         105.83   (4.6%)      0.5%  ( -8% -   9%)    0.739
> HighTerm                         1030.35   (5.7%)        1035.54   (5.9%)      0.5%  (-10% -  12%)    0.783
> MedSloppyPhrase                    41.07   (3.0%)          41.28   (2.9%)      0.5%  ( -5% -   6%)    0.581
> AndHighLow                        287.51   (3.2%)         289.03   (4.3%)      0.5%  ( -6% -   8%)    0.657
> OrHighNotMed                      910.71   (3.9%)         915.93   (4.1%)      0.6%  ( -7% -   8%)    0.651
> AndHighHigh                        28.96   (5.0%)          29.15   (5.3%)      0.6%  ( -9% -  11%)    0.695
> OrNotHighLow                      679.21   (2.7%)         683.68   (4.1%)      0.7%  ( -6% -   7%)    0.551
> MedTerm                          1425.49   (4.8%)        1435.41   (5.1%)      0.7%  ( -8% -  11%)    0.657
> MedSpanNear                         8.74   (3.0%)           8.80   (2.8%)      0.7%  ( -4% -   6%)    0.448
> BrowseRandomLabelTaxoFacets         6.11  (14.4%)           6.16  (15.2%)      0.7%  (-25% -  35%)    0.875
> OrHighNotHigh                     674.18   (4.1%)         679.40   (4.5%)      0.8%  ( -7% -   9%)    0.569
> LowSloppyPhrase                     5.08   (3.3%)           5.12   (3.5%)      0.8%  ( -5% -   7%)    0.445
> HighSpanNear                        2.22   (5.4%)           2.25   (4.2%)      1.3%  ( -7% -  11%)    0.398
> HighSloppyPhrase                    5.27   (7.8%)           5.34   (9.0%)      1.3%  (-14% -  19%)    0.622
> LowIntervalsOrdered                17.88   (4.8%)          18.21   (3.1%)      1.9%  ( -5% -  10%)    0.144
> BrowseDateTaxoFacets                6.51  (14.4%)           6.65  (17.4%)      2.3%  (-25% -  39%)    0.652
> BrowseDayOfYearTaxoFacets           6.52  (14.4%)           6.68  (17.7%)      2.5%  (-25% -  40%)    0.624
> MedIntervalsOrdered                14.43   (7.8%)          14.80   (4.5%)      2.6%  ( -9% -  16%)    0.205
> OrHighLow                         158.48   (3.2%)         162.94   (4.2%)      2.8%  ( -4% -  10%)    0.017
> HighIntervalsOrdered                1.56   (9.4%)           1.60   (5.2%)      3.0%  (-10% -  19%)    0.215
> OrHighMed                          65.32   (4.2%)          71.62   (4.1%)      9.6%  (  1% -  18%)    0.000
> OrHighHigh                         14.04   (4.5%)          15.68   (3.9%)     11.7%  (  3% -  21%)    0.000
> {code}
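
For readers less familiar with the mechanism discussed in the comment, here is a minimal, self-contained sketch of how a conjunction can first reach agreement across cheap approximations and only then run the costly match phase, cheapest confirmation first. It is a toy model, not Lucene code: `TwoPhaseSketch`, `Clause`, `conjunction`, the `confirmations` counter and the sample doc IDs are all hypothetical names and values chosen for illustration. The three pieces of `Clause` loosely mirror what Lucene's `TwoPhaseIterator` exposes as `approximation()`, `matches()` and `matchCost()`, but the real conjunction logic is more involved.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.function.IntPredicate;

/** Toy model of two-phase iteration. Hypothetical names; these are not Lucene classes. */
public class TwoPhaseSketch {
    static final int NO_MORE_DOCS = Integer.MAX_VALUE;

    /** A clause with a cheap approximation and a costly confirmation step. */
    static final class Clause {
        final int[] docs;            // approximation: sorted candidate doc IDs
        final IntPredicate matches;  // second phase: confirms a candidate
        final float matchCost;       // assumed cost estimate of one confirmation
        int idx = -1;                // current position in docs
        int confirmations = 0;       // how many times the second phase ran

        Clause(int[] docs, IntPredicate matches, float matchCost) {
            this.docs = docs;
            this.matches = matches;
            this.matchCost = matchCost;
        }

        int docID() {
            if (idx < 0) return -1;
            return idx < docs.length ? docs[idx] : NO_MORE_DOCS;
        }

        /** Moves the approximation to the first candidate on or beyond target. */
        int advance(int target) {
            while (docID() < target) idx++;
            return docID();
        }
    }

    /** Returns the doc IDs that every clause both approximates and confirms. */
    static List<Integer> conjunction(Clause... clauses) {
        List<Integer> hits = new ArrayList<>();
        if (clauses.length == 0) return hits;

        // Confirm with the cheapest second phase first, so an expensive
        // confirmation is skipped whenever a cheaper one already rejects.
        Clause[] byCost = clauses.clone();
        Arrays.sort(byCost, Comparator.comparingDouble((Clause c) -> c.matchCost));

        int candidate = 0;
        while (true) {
            // Phase 1: leapfrog until all approximations sit on the same doc.
            boolean agreed;
            do {
                agreed = true;
                for (Clause c : clauses) {
                    int d = c.advance(candidate);
                    if (d > candidate) {
                        candidate = d;
                        agreed = false;
                    }
                }
            } while (!agreed && candidate != NO_MORE_DOCS);
            if (candidate == NO_MORE_DOCS) return hits;

            // Phase 2: only now pay for confirmation.
            boolean match = true;
            for (Clause c : byCost) {
                c.confirmations++;
                if (!c.matches.test(candidate)) {
                    match = false;
                    break;
                }
            }
            if (match) hits.add(candidate);
            candidate++;
        }
    }

    public static void main(String[] args) {
        // Stand-in for "the who": five docs contain both terms, three as a phrase.
        Clause phrase = new Clause(new int[] {1, 3, 5, 8, 9},
                d -> d == 3 || d == 8 || d == 9, 10f);
        // Stand-in for the term "uk": the approximation is already exact.
        Clause uk = new Clause(new int[] {3, 4, 9, 12}, d -> true, 0f);

        System.out.println("hits = " + conjunction(phrase, uk));
        System.out.println("phrase confirmations = " + phrase.confirmations);
    }
}
```

In this made-up data set the phrase clause has five approximate candidates, yet its confirmation step only runs on the two docs that the "uk" clause also contains. That is the saving described for `+"the who" +uk`. Sorting by `matchCost` is also where an under-estimated cost, as suspected in the issue description, would cause an expensive second phase to run earlier than it should.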