[ https://issues.apache.org/jira/browse/LUCENE-9049?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16979422#comment-16979422 ]
Jack Conradson commented on LUCENE-9049:
----------------------------------------

[~bruno.roustant] I have uploaded a patch. It passes both precommit and test locally for me. I used luceneutil to run benchmarks using wikimedium10m for this change to ensure that direct-addressing did indeed make cachedRootArcs redundant. From what I can tell the numbers look reasonable:

Task                       QPS baseline   StdDev QPS candidate   StdDev  Pct diff
LowTerm                         1682.82   (3.9%)       1657.84   (5.3%)     -1.5%  ( -10% - 7%)
MedTerm                         1339.95   (4.0%)       1321.72   (3.4%)     -1.4%  ( -8% - 6%)
Fuzzy2                            40.25   (7.4%)         39.80   (7.3%)     -1.1%  ( -14% - 14%)
HighTermDayOfYearSort             68.44  (10.7%)         67.87  (10.6%)     -0.8%  ( -20% - 22%)
OrNotHighMed                     553.72   (5.2%)        550.82   (5.1%)     -0.5%  ( -10% - 10%)
OrNotHighLow                     670.08   (5.2%)        668.20   (5.2%)     -0.3%  ( -10% - 10%)
HighPhrase                       383.50   (1.8%)        382.51   (2.1%)     -0.3%  ( -4% - 3%)
AndHighLow                       748.51   (4.1%)        746.82   (5.0%)     -0.2%  ( -8% - 9%)
Prefix3                          128.89   (4.3%)        128.66   (4.6%)     -0.2%  ( -8% - 9%)
PKLookup                         190.04   (1.5%)        189.89   (1.6%)     -0.1%  ( -3% - 3%)
Wildcard                         125.05   (4.1%)        125.02   (3.9%)     -0.0%  ( -7% - 8%)
OrNotHighHigh                    642.63   (6.1%)        642.55   (5.3%)     -0.0%  ( -10% - 12%)
OrHighNotLow                     683.21   (7.1%)        683.29   (6.1%)      0.0%  ( -12% - 14%)
HighSpanNear                      30.22   (2.5%)         30.23   (2.5%)      0.0%  ( -4% - 5%)
BrowseDayOfYearTaxoFacets      10190.78   (2.3%)      10194.55   (1.7%)      0.0%  ( -3% - 4%)
AndHighHigh                       77.77   (2.0%)         77.81   (1.9%)      0.0%  ( -3% - 4%)
BrowseDateTaxoFacets               3.61   (1.4%)          3.62   (1.4%)      0.1%  ( -2% - 2%)
OrHighMed                         78.14   (2.5%)         78.25   (2.8%)      0.1%  ( -4% - 5%)
AndHighMed                       206.06   (2.4%)        206.45   (2.6%)      0.2%  ( -4% - 5%)
OrHighLow                        560.24   (5.1%)        561.72   (5.2%)      0.3%  ( -9% - 11%)
MedSpanNear                       37.18   (2.2%)         37.31   (1.9%)      0.4%  ( -3% - 4%)
IntNRQ                           133.13   (1.5%)        133.65   (1.6%)      0.4%  ( -2% - 3%)
HighIntervalsOrdered              53.77   (2.0%)         53.98   (1.8%)      0.4%  ( -3% - 4%)
LowSloppyPhrase                   91.00   (2.8%)         91.36   (3.1%)      0.4%  ( -5% - 6%)
HighTerm                        1675.67   (5.1%)       1683.33   (4.9%)      0.5%  ( -9% - 11%)
BrowseMonthTaxoFacets          10228.66   (2.0%)      10277.31   (1.3%)      0.5%  ( -2% - 3%)
OrHighNotHigh                    650.24   (6.0%)        653.55   (6.4%)      0.5%  ( -11% - 13%)
Respell                           93.92   (1.8%)         94.48   (2.1%)      0.6%  ( -3% - 4%)
MedSloppyPhrase                   41.38   (2.7%)         41.63   (3.0%)      0.6%  ( -5% - 6%)
BrowseDayOfYearSSDVFacets         14.25   (7.4%)         14.34   (7.5%)      0.6%  ( -13% - 16%)
LowSpanNear                       39.78   (2.3%)         40.03   (1.7%)      0.6%  ( -3% - 4%)
OrHighHigh                        37.28   (2.4%)         37.53   (2.6%)      0.7%  ( -4% - 5%)
HighSloppyPhrase                  40.95   (4.6%)         41.25   (4.7%)      0.7%  ( -8% - 10%)
MedPhrase                         68.73   (2.9%)         69.28   (2.8%)      0.8%  ( -4% - 6%)
LowPhrase                        404.79   (3.0%)        409.52   (2.8%)      1.2%  ( -4% - 7%)
OrHighNotMed                     674.63   (5.3%)        683.79   (5.7%)      1.4%  ( -9% - 13%)
BrowseMonthSSDVFacets             15.97   (5.5%)         16.19   (3.3%)      1.4%  ( -6% - 10%)
Fuzzy1                           102.56  (11.7%)        104.14  (11.2%)      1.5%  ( -19% - 27%)
HighTermMonthSort                209.12  (15.9%)        212.36  (13.5%)      1.6%  ( -23% - 36%)

> Remove FST cachedRootArcs now redundant with direct-addressing
> --------------------------------------------------------------
>
>                 Key: LUCENE-9049
>                 URL: https://issues.apache.org/jira/browse/LUCENE-9049
>             Project: Lucene - Core
>          Issue Type: Improvement
>            Reporter: Bruno Roustant
>            Priority: Major
>         Attachments: LUCENE-9049.patch
>
>
> With LUCENE-8920 FST most often encodes top level nodes with
> direct-addressing (instead of array for binary search). This probably made
> the cachedRootArcs redundant. So they should be removed, and this will reduce
> the code.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org