[ https://issues.apache.org/jira/browse/LUCENE-9125?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17017978#comment-17017978 ]
Bruno Roustant commented on LUCENE-9125:
----------------------------------------

In the benchmark above I mistakenly used wikimedium10k (I edited the comment to mention that). Here is the benchmark for wikimediumall:

Task                       QPS trunk   StdDev  QPS patch   StdDev  Pct diff
OrHighNotHigh                 769.84   (4.8%)     756.84   (5.0%)     -1.7%  (-10% -   8%)
OrNotHighLow                  664.03   (4.2%)     653.64   (3.4%)     -1.6%  ( -8% -   6%)
OrNotHighMed                  574.56   (3.0%)     566.90   (2.5%)     -1.3%  ( -6% -   4%)
MedTerm                      1373.80   (3.9%)    1359.30   (5.1%)     -1.1%  ( -9% -   8%)
AndHighHigh                    19.84   (3.6%)      19.67   (2.9%)     -0.9%  ( -7% -   5%)
AndHighLow                    474.49   (2.9%)     470.36   (3.6%)     -0.9%  ( -7% -   5%)
Fuzzy1                         69.27  (10.7%)      68.75  (11.0%)     -0.7%  (-20% -  23%)
OrNotHighHigh                 569.30   (3.4%)     565.26   (5.0%)     -0.7%  ( -8% -   7%)
MedPhrase                      36.97   (2.4%)      36.76   (2.7%)     -0.6%  ( -5% -   4%)
HighTerm                     1133.65   (4.2%)    1128.30   (4.3%)     -0.5%  ( -8% -   8%)
OrHighLow                     227.08   (2.9%)     226.24   (3.3%)     -0.4%  ( -6% -   6%)
OrHighHigh                     24.17   (2.6%)      24.08   (2.4%)     -0.4%  ( -5% -   4%)
Prefix3                        25.30   (3.8%)      25.22   (3.7%)     -0.3%  ( -7% -   7%)
OrHighMed                      48.26   (3.1%)      48.11   (3.1%)     -0.3%  ( -6% -   6%)
LowTerm                      1087.75   (3.4%)    1084.44   (3.3%)     -0.3%  ( -6% -   6%)
AndHighMed                     69.62   (3.9%)      69.44   (4.1%)     -0.3%  ( -7% -   7%)
HighSloppyPhrase               15.11   (2.6%)      15.08   (2.6%)     -0.2%  ( -5% -   5%)
Respell                        43.34   (2.0%)      43.28   (2.3%)     -0.1%  ( -4% -   4%)
OrHighNotLow                  666.79   (3.4%)     665.98   (4.9%)     -0.1%  ( -8% -   8%)
HighSpanNear                    8.21   (1.8%)       8.20   (2.0%)     -0.1%  ( -3% -   3%)
HighIntervalsOrdered           14.46   (1.2%)      14.45   (1.4%)     -0.1%  ( -2% -   2%)
HighPhrase                    333.99   (3.3%)     333.74   (3.9%)     -0.1%  ( -7% -   7%)
MedSpanNear                    12.08   (1.8%)      12.07   (2.0%)     -0.1%  ( -3% -   3%)
LowPhrase                     481.10   (2.5%)     481.14   (3.4%)      0.0%  ( -5% -   6%)
MedSloppyPhrase                 6.78   (2.9%)       6.78   (2.9%)      0.0%  ( -5% -   6%)
PKLookup                      157.80   (2.5%)     157.83   (2.5%)      0.0%  ( -4% -   5%)
LowSpanNear                    21.48   (2.1%)      21.48   (2.3%)      0.0%  ( -4% -   4%)
OrHighNotMed                  590.59   (3.9%)     591.21   (3.8%)      0.1%  ( -7% -   8%)
BrowseMonthTaxoFacets           1.06   (1.1%)       1.06   (0.9%)      0.1%  ( -1% -   2%)
LowSloppyPhrase                40.57   (2.1%)      40.63   (2.2%)      0.1%  ( -4% -   4%)
IntNRQ                        124.31   (4.2%)     124.53   (4.9%)      0.2%  ( -8% -   9%)
BrowseDateTaxoFacets            1.00   (1.0%)       1.00   (0.7%)      0.2%  ( -1% -   1%)
BrowseDayOfYearTaxoFacets       0.99   (0.9%)       1.00   (0.7%)      0.2%  ( -1% -   1%)
HighTermDayOfYearSort          18.57   (6.2%)      18.62   (6.0%)      0.3%  (-11% -  13%)
BrowseMonthSSDVFacets           4.38   (1.0%)       4.40   (0.9%)      0.4%  ( -1% -   2%)
BrowseDayOfYearSSDVFacets       3.92   (0.7%)       3.94   (0.7%)      0.5%  (  0% -   1%)
Wildcard                       52.17   (4.0%)      52.47   (5.0%)      0.6%  ( -8% -   9%)
Fuzzy2                         57.57   (9.5%)      58.32   (9.3%)      1.3%  (-16% -  22%)
HighTermMonthSort              40.51  (14.2%)      41.47  (13.9%)      2.4%  (-22% -  35%)

{quote}There's an option for lucene-util to format the output for JIRA{quote}
Last time I used this option, Jira interpreted some of the tags and the resulting display was no better than this basic one.

{quote}Looking at the results you posted, the optimization seems fairly invisible{quote}
Yes. The change only optimizes the construction of the CompiledAutomaton, which is a tiny part of the fuzzy query execution.

{quote}that's 4.7% of "noise"{quote}
Yes, there is noise. I tried baseline vs baseline and got the same noise. Maybe there is less noise this time with wikimediumall.

> Improve Automaton.step() with binary search and introduce Automaton.next()
> --------------------------------------------------------------------------
>
>                 Key: LUCENE-9125
>                 URL: https://issues.apache.org/jira/browse/LUCENE-9125
>             Project: Lucene - Core
>          Issue Type: Improvement
>            Reporter: Bruno Roustant
>            Assignee: Bruno Roustant
>            Priority: Major
>             Fix For: 8.5
>
>          Time Spent: 40m
>  Remaining Estimate: 0h
>
> Implement the existing todo in Automaton.step() (lookup a transition from a
> source state depending on a given label) to use binary search since the
> transitions are sorted.
> Introduce new method Automaton.next() to optimize iteration & lookup over all
> the transitions of a state. This will be used in RunAutomaton constructor and
> in MinimizationOperations.minimize().
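
For illustration, the binary-search lookup described in the issue could look roughly like the sketch below. This is not the patch and not Lucene's Automaton API: the class SortedTransitions and its three arrays are hypothetical, and the sketch assumes that the transitions of a state are sorted by label and that their label ranges do not overlap.

```java
/** Hypothetical stand-in for the outgoing transitions of one state (not Lucene's Automaton). */
final class SortedTransitions {
    private final int[] mins;   // inclusive lower label bound of each transition, ascending
    private final int[] maxs;   // inclusive upper label bound of each transition
    private final int[] dests;  // destination state of each transition

    SortedTransitions(int[] mins, int[] maxs, int[] dests) {
        this.mins = mins;
        this.maxs = maxs;
        this.dests = dests;
    }

    /** Returns the destination state for the given label, or -1 if no transition accepts it. */
    int step(int label) {
        int lo = 0;
        int hi = mins.length - 1;
        while (lo <= hi) {
            int mid = (lo + hi) >>> 1;
            if (label < mins[mid]) {
                hi = mid - 1;           // label lies before this range
            } else if (label > maxs[mid]) {
                lo = mid + 1;           // label lies after this range
            } else {
                return dests[mid];      // mins[mid] <= label <= maxs[mid]
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        // Three transitions: [a-c] -> 1, [m-p] -> 2, [x-z] -> 3
        SortedTransitions t = new SortedTransitions(
            new int[] {'a', 'm', 'x'},
            new int[] {'c', 'p', 'z'},
            new int[] {1, 2, 3});
        System.out.println(t.step('b')); // prints 1
        System.out.println(t.step('n')); // prints 2
        System.out.println(t.step('e')); // prints -1 (falls between two ranges)
    }
}
```

Compared with a linear scan, this lookup takes O(log n) comparisons for a state with n transitions, which mostly matters for states with many outgoing transitions.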