[ https://issues.apache.org/jira/browse/LUCENE-10639?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17561772#comment-17561772 ]
Greg Miller commented on LUCENE-10639:
--------------------------------------

{quote}I suspected there was some overhead to two-phase iteration but not as much as this.
{quote}
Right. Yeah, I guess I was so surprised by the performance shift that I assumed there must be an interesting second phase happening. But from what you're saying, it sounds like these {{OrHighLow/Med/High}} tasks aren't doing that, and that the performance change is purely a side effect of running the two phases instead of doing all the checks in the first phase. I should have dug into what these tasks are doing.

{quote}Hotspot was not always able to optimize "if (liveDocs == null)" checks
{quote}
Interesting. Seems worth a shot. Thanks for the quick thoughts!

> WANDScorer performs better without two-phase
> --------------------------------------------
>
>                 Key: LUCENE-10639
>                 URL: https://issues.apache.org/jira/browse/LUCENE-10639
>             Project: Lucene - Core
>          Issue Type: Improvement
>          Components: core/search
>            Reporter: Greg Miller
>            Priority: Major
>
> After looking at the recent improvement [~jpountz] made to WAND scoring in
> LUCENE-10634, which does additional work during match confirmation to not
> confirm a match whose score wouldn't be competitive, I wanted to see how
> performance would shift if we squashed the two-phase iteration completely and
> only returned true matches (that were also known to be competitive by score)
> in the "approximation" phase. I was a bit surprised to find that luceneutil
> benchmarks (run with {{wikimediumall}}) improve significantly on some
> disjunction tasks and don't show significant regressions anywhere else.
> Note that I used LUCENE-10634 as a baseline, and built my candidate change on
> top of that. The diff can be seen here:
> [DIFF|https://github.com/gsmiller/lucene/compare/b2d46440998fe4a972e8cc8c948580111359ed0f..c5bab794c92dbc66e70f9389948c1bdfe9b45231]
>
> A simple conclusion here might be that we shouldn't do two-phase iteration in
> WANDScorer, but I'm pretty sure that's not right. I wonder if what's really
> going on is that we're under-estimating the cost of confirming a match? Right
> now we just return the tail size as the cost. While the cost of confirming a
> match is proportional to the tail size, the actual work involved can be quite
> significant (having to advance tail iterators to new blocks and decompress
> them). I wonder if the WAND second phase is being run too early on
> approximate candidates, and if less expensive (and possibly even more
> restrictive?) second phases could/should be running first?
>
> I'm raising this here as more of a curiosity to see if it sparks ideas on how
> to move forward. Again, I'm not proposing we do away with two-phase
> iteration, but it seems we might be able to improve things. Maybe I'll
> explore changing the cost heuristic next. Also, maybe there's some different
> benchmarking that would be useful here that I may not be familiar with?
>
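The cost-heuristic question raised in the description can be illustrated with a small, self-contained sketch. This is a toy model and not Lucene code: `Check`, `declaredCost`, `countMatches` and `wandCalls` are hypothetical names, and the two predicates are arbitrary stand-ins for an expensive WAND confirmation and a cheaper, more selective check. The only assumption is that second phases are tried in order of their declared cost, which is the kind of comparison `TwoPhaseIterator#matchCost()` is documented to support.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.function.IntPredicate;

// Toy model only: none of these types are Lucene classes.
final class TwoPhaseCostSketch {

  /** A second-phase check with a declared (estimated) cost and a call counter. */
  static final class Check {
    final String name;
    final float declaredCost;   // what the check reports, in the spirit of matchCost()
    final IntPredicate confirm; // the actual per-document confirmation
    int calls = 0;

    Check(String name, float declaredCost, IntPredicate confirm) {
      this.name = name;
      this.declaredCost = declaredCost;
      this.confirm = confirm;
    }

    boolean matches(int doc) {
      calls++;
      return confirm.test(doc);
    }
  }

  /** Confirms docs [0, maxDoc), trying checks in ascending declared cost; returns the hit count. */
  static int countMatches(List<Check> checks, int maxDoc) {
    List<Check> ordered = new ArrayList<>(checks);
    ordered.sort(Comparator.comparingDouble((Check c) -> c.declaredCost));
    int hits = 0;
    for (int doc = 0; doc < maxDoc; doc++) {
      boolean ok = true;
      for (Check c : ordered) {
        if (!c.matches(doc)) { // stop at the first check that rejects the doc
          ok = false;
          break;
        }
      }
      if (ok) {
        hits++;
      }
    }
    return hits;
  }

  /** Returns how often the stand-in "wand" check runs when it declares the given cost. */
  static int wandCalls(float declaredWandCost) {
    Check wand = new Check("wand", declaredWandCost, doc -> doc % 2 == 0);
    Check filter = new Check("filter", 5f, doc -> doc % 10 == 0);
    countMatches(Arrays.asList(wand, filter), 1000);
    return wand.calls;
  }

  public static void main(String[] args) {
    System.out.println("wand declared cheap:  " + wandCalls(2f) + " calls");
    System.out.println("wand declared costly: " + wandCalls(50f) + " calls");
  }
}
```

In this toy setup, a "wand" check that declares a cost below the filter's runs on all 1000 candidates, while one that declares a higher cost runs only on the 100 candidates that survive the filter. The set of confirmed documents is the same either way; only the amount of expensive work changes.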
> Benchmark results on wikimediumall:
> {code:java}
> Task                        QPS baseline   StdDev  QPS candidate   StdDev  Pct diff                p-value
> HighTermTitleBDVSort               22.52  (18.9%)          21.66  (15.6%)     -3.8%  (-32% - 37%)    0.485
> Prefix3                             9.38   (9.2%)           9.09  (10.6%)     -3.1%  (-20% - 18%)    0.326
> HighTermMonthSort                  25.37  (16.0%)          24.87  (17.1%)     -2.0%  (-30% - 37%)    0.710
> MedTermDayTaxoFacets                9.62   (4.2%)           9.51   (4.1%)     -1.2%  ( -9% -  7%)    0.368
> TermDTSort                         74.69  (18.0%)          74.13  (18.2%)     -0.7%  (-31% - 43%)    0.897
> HighTermDayOfYearSort              52.64  (16.1%)          52.32  (15.4%)     -0.6%  (-27% - 36%)    0.903
> BrowseMonthTaxoFacets               8.64  (19.1%)           8.59  (19.8%)     -0.6%  (-33% - 47%)    0.926
> BrowseDateSSDVFacets                0.86   (9.5%)           0.86  (13.1%)     -0.4%  (-20% - 24%)    0.914
> PKLookup                          147.18   (3.9%)         146.66   (3.3%)     -0.3%  ( -7% -  7%)    0.759
> BrowseDayOfYearSSDVFacets           3.47   (4.5%)           3.45   (4.8%)     -0.3%  ( -9% -  9%)    0.822
> Wildcard                           36.36   (4.4%)          36.26   (5.2%)     -0.3%  ( -9% -  9%)    0.866
> BrowseMonthSSDVFacets               4.15  (12.7%)           4.13  (12.8%)     -0.3%  (-22% - 28%)    0.950
> AndHighMedDayTaxoFacets            15.21   (2.7%)          15.18   (2.9%)     -0.2%  ( -5% -  5%)    0.819
> Fuzzy1                             68.33   (1.8%)          68.22   (2.0%)     -0.2%  ( -3% -  3%)    0.783
> OrHighMedDayTaxoFacets              2.90   (4.1%)           2.89   (4.0%)     -0.1%  ( -7% -  8%)    0.930
> MedPhrase                          52.81   (2.3%)          52.76   (1.8%)     -0.1%  ( -4% -  4%)    0.878
> Respell                            36.80   (1.9%)          36.78   (1.9%)     -0.1%  ( -3% -  3%)    0.933
> Fuzzy2                             63.06   (1.9%)          63.05   (2.1%)     -0.0%  ( -3% -  4%)    0.971
> LowPhrase                          74.60   (1.9%)          74.61   (1.8%)      0.0%  ( -3% -  3%)    0.987
> AndHighHighDayTaxoFacets            4.54   (2.3%)           4.55   (2.0%)      0.0%  ( -4% -  4%)    0.960
> HighPhrase                        353.13   (2.6%)         353.28   (2.5%)      0.0%  ( -4% -  5%)    0.958
> OrNotHighHigh                     761.72   (4.0%)         762.48   (3.6%)      0.1%  ( -7% -  8%)    0.935
> OrHighNotLow                     1129.94   (4.1%)        1131.56   (3.6%)      0.1%  ( -7% -  8%)    0.906
> LowTerm                          1315.90   (2.9%)        1318.61   (2.5%)      0.2%  ( -5% -  5%)    0.810
> IntNRQ                            192.33   (2.8%)         192.93   (2.3%)      0.3%  ( -4% -  5%)    0.701
> LowSpanNear                        23.60   (2.2%)          23.68   (1.6%)      0.3%  ( -3% -  4%)    0.592
> OrNotHighMed                      867.21   (2.3%)         870.27   (2.8%)      0.4%  ( -4% -  5%)    0.664
> BrowseRandomLabelSSDVFacets         2.53   (1.6%)           2.54   (1.9%)      0.4%  ( -3% -  3%)    0.494
> AndHighMed                        105.33   (4.5%)         105.83   (4.6%)      0.5%  ( -8% -  9%)    0.739
> HighTerm                         1030.35   (5.7%)        1035.54   (5.9%)      0.5%  (-10% - 12%)    0.783
> MedSloppyPhrase                    41.07   (3.0%)          41.28   (2.9%)      0.5%  ( -5% -  6%)    0.581
> AndHighLow                        287.51   (3.2%)         289.03   (4.3%)      0.5%  ( -6% -  8%)    0.657
> OrHighNotMed                      910.71   (3.9%)         915.93   (4.1%)      0.6%  ( -7% -  8%)    0.651
> AndHighHigh                        28.96   (5.0%)          29.15   (5.3%)      0.6%  ( -9% - 11%)    0.695
> OrNotHighLow                      679.21   (2.7%)         683.68   (4.1%)      0.7%  ( -6% -  7%)    0.551
> MedTerm                          1425.49   (4.8%)        1435.41   (5.1%)      0.7%  ( -8% - 11%)    0.657
> MedSpanNear                         8.74   (3.0%)           8.80   (2.8%)      0.7%  ( -4% -  6%)    0.448
> BrowseRandomLabelTaxoFacets         6.11  (14.4%)           6.16  (15.2%)      0.7%  (-25% - 35%)    0.875
> OrHighNotHigh                     674.18   (4.1%)         679.40   (4.5%)      0.8%  ( -7% -  9%)    0.569
> LowSloppyPhrase                     5.08   (3.3%)           5.12   (3.5%)      0.8%  ( -5% -  7%)    0.445
> HighSpanNear                        2.22   (5.4%)           2.25   (4.2%)      1.3%  ( -7% - 11%)    0.398
> HighSloppyPhrase                    5.27   (7.8%)           5.34   (9.0%)      1.3%  (-14% - 19%)    0.622
> LowIntervalsOrdered                17.88   (4.8%)          18.21   (3.1%)      1.9%  ( -5% - 10%)    0.144
> BrowseDateTaxoFacets                6.51  (14.4%)           6.65  (17.4%)      2.3%  (-25% - 39%)    0.652
> BrowseDayOfYearTaxoFacets           6.52  (14.4%)           6.68  (17.7%)      2.5%  (-25% - 40%)    0.624
> MedIntervalsOrdered                14.43   (7.8%)          14.80   (4.5%)      2.6%  ( -9% - 16%)    0.205
> OrHighLow                         158.48   (3.2%)         162.94   (4.2%)      2.8%  ( -4% - 10%)    0.017
> HighIntervalsOrdered                1.56   (9.4%)           1.60   (5.2%)      3.0%  (-10% - 19%)    0.215
> OrHighMed                          65.32   (4.2%)          71.62   (4.1%)      9.6%  (  1% - 18%)    0.000
> OrHighHigh                         14.04   (4.5%)          15.68   (3.9%)     11.7%  (  3% - 21%)    0.000
> {code}

--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org