[ https://issues.apache.org/jira/browse/LUCENE-10438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17497616#comment-17497616 ]
Greg Miller commented on LUCENE-10438:
--------------------------------------

I experimented with this a bit for taxo- and ssdv-faceting but didn't get particularly far. I quickly discovered that {{luceneutil}} doesn't seem to exercise the {{Facets#getSpecificValue}} code path, which is where I think the optimization opportunity might be. To try the optimization there, though, I had to defer counting to an "on demand" approach instead of counting during initialization. The good news is that this change doesn't seem to have regressed the existing benchmark tasks (see below).

I think the next step here is to augment {{luceneutil}} to exercise {{getSpecificValue}} so we can measure the impact. I'll see if I can find some time to poke into that, but if anyone else is interested in getting involved, feel free to jump in!

{code:java}
Task                          QPS baseline   StdDev  QPS candidate   StdDev  Pct diff                   p-value
BrowseMonthSSDVFacets                16.42  (27.7%)          15.16  (24.5%)     -7.7%  ( -46% - 61%)    0.354
OrHighMedDayTaxoFacets                6.38   (6.7%)           6.28   (6.4%)     -1.5%  ( -13% - 12%)    0.463
TermDTSort                           93.64  (12.5%)          92.45  (11.9%)     -1.3%  ( -22% - 26%)    0.742
HighTermTitleBDVSort                142.12  (14.2%)         140.36  (13.0%)     -1.2%  ( -24% - 30%)    0.773
MedTermDayTaxoFacets                 38.39   (4.2%)          37.92   (4.1%)     -1.2%  ( -9% - 7%)      0.356
OrHighHigh                           42.40   (4.6%)          42.04   (3.5%)     -0.9%  ( -8% - 7%)      0.510
HighTermMonthSort                   104.42  (18.0%)         103.57  (17.0%)     -0.8%  ( -30% - 41%)    0.882
Prefix3                             270.23   (7.9%)         268.54  (11.0%)     -0.6%  ( -18% - 19%)    0.837
OrHighMed                            79.38   (4.5%)          79.00   (3.6%)     -0.5%  ( -8% - 7%)      0.709
HighSpanNear                         18.50   (2.4%)          18.43   (2.4%)     -0.4%  ( -5% - 4%)      0.586
IntNRQ                              135.21   (0.5%)         134.77   (1.6%)     -0.3%  ( -2% - 1%)      0.371
OrNotHighLow                       1056.43   (2.7%)        1055.39   (3.2%)     -0.1%  ( -5% - 5%)      0.916
PKLookup                            169.34   (3.5%)         169.19   (3.6%)     -0.1%  ( -6% - 7%)      0.937
AndHighMedDayTaxoFacets              34.87   (1.8%)          34.85   (1.9%)     -0.0%  ( -3% - 3%)      0.939
OrNotHighMed                        930.52   (3.9%)         930.70   (4.0%)      0.0%  ( -7% - 8%)      0.988
Wildcard                             93.02   (4.9%)          93.05   (6.7%)      0.0%  ( -10% - 12%)    0.984
LowTerm                            1992.53   (5.1%)        1993.41   (4.3%)      0.0%  ( -8% - 9%)      0.976
AndHighHigh                          52.14   (4.9%)          52.17   (4.0%)      0.1%  ( -8% - 9%)      0.969
HighSloppyPhrase                     27.70   (4.0%)          27.72   (3.8%)      0.1%  ( -7% - 8%)      0.933
HighTermDayOfYearSort                82.23  (13.3%)          82.35  (14.7%)      0.2%  ( -24% - 32%)    0.973
OrNotHighHigh                       923.35   (3.6%)         925.08   (4.8%)      0.2%  ( -7% - 8%)      0.889
AndHighHighDayTaxoFacets             19.09   (2.3%)          19.16   (1.9%)      0.3%  ( -3% - 4%)      0.622
LowSloppyPhrase                      28.20   (2.4%)          28.31   (2.6%)      0.4%  ( -4% - 5%)      0.624
LowSpanNear                          11.96   (3.9%)          12.01   (2.5%)      0.4%  ( -5% - 7%)      0.666
LowPhrase                           241.84   (4.3%)         242.98   (4.0%)      0.5%  ( -7% - 9%)      0.721
MedSpanNear                          22.00   (3.3%)          22.11   (2.0%)      0.5%  ( -4% - 6%)      0.568
BrowseDayOfYearSSDVFacets            12.00  (15.6%)          12.06  (14.4%)      0.5%  ( -25% - 36%)    0.909
MedPhrase                            20.64   (4.9%)          20.75   (4.4%)      0.6%  ( -8% - 10%)     0.709
Fuzzy2                               60.95   (1.7%)          61.29   (1.8%)      0.6%  ( -2% - 4%)      0.304
HighPhrase                           19.65   (4.8%)          19.77   (4.3%)      0.6%  ( -8% - 10%)     0.678
MedSloppyPhrase                      30.43   (2.3%)          30.63   (2.3%)      0.7%  ( -3% - 5%)      0.354
Fuzzy1                               67.61   (1.6%)          68.07   (2.0%)      0.7%  ( -2% - 4%)      0.246
OrHighNotMed                       1150.70   (3.7%)        1159.51   (3.7%)      0.8%  ( -6% - 8%)      0.516
OrHighLow                           745.90   (2.9%)         751.76   (1.7%)      0.8%  ( -3% - 5%)      0.292
OrHighNotHigh                       898.58   (4.1%)         906.01   (4.7%)      0.8%  ( -7% - 9%)      0.551
OrHighNotLow                       1349.46   (3.4%)        1361.12   (4.0%)      0.9%  ( -6% - 8%)      0.463
Respell                              46.64   (1.9%)          47.06   (2.0%)      0.9%  ( -2% - 4%)      0.152
AndHighMed                          164.73   (5.7%)         166.39   (4.4%)      1.0%  ( -8% - 11%)     0.531
BrowseDateSSDVFacets                  2.40   (7.2%)           2.43   (7.7%)      1.1%  ( -12% - 17%)    0.643
BrowseRandomLabelSSDVFacets           9.15   (2.4%)           9.25   (2.6%)      1.1%  ( -3% - 6%)      0.150
AndHighLow                          873.95   (4.0%)         885.61   (2.2%)      1.3%  ( -4% - 7%)      0.192
BrowseMonthTaxoFacets                28.32  (24.4%)          28.70  (24.5%)      1.4%  ( -38% - 66%)    0.860
HighIntervalsOrdered                  7.13   (4.9%)           7.24   (3.1%)      1.6%  ( -6% - 10%)     0.219
LowIntervalsOrdered                 116.22   (4.2%)         118.09   (2.7%)      1.6%  ( -5% - 8%)      0.148
HighTerm                           2742.56   (6.1%)        2794.10   (4.1%)      1.9%  ( -7% - 12%)     0.251
MedIntervalsOrdered                  55.87   (5.3%)          56.93   (3.4%)      1.9%  ( -6% - 11%)     0.173
MedTerm                            1679.85   (6.3%)        1718.62   (5.2%)      2.3%  ( -8% - 14%)     0.206
BrowseRandomLabelTaxoFacets          18.04  (17.3%)          18.59  (17.9%)      3.0%  ( -27% - 46%)    0.585
BrowseDateTaxoFacets                 21.38  (21.2%)          22.94  (22.7%)      7.3%  ( -30% - 65%)    0.295
BrowseDayOfYearTaxoFacets            21.40  (21.5%)          23.01  (23.2%)      7.5%  ( -30% - 66%)    0.286
{code}

> Leverage Weight#count in lucene/facets
> --------------------------------------
>
>                 Key: LUCENE-10438
>                 URL: https://issues.apache.org/jira/browse/LUCENE-10438
>             Project: Lucene - Core
>          Issue Type: Task
>          Components: modules/facet
>            Reporter: Adrien Grand
>            Assignee: Greg Miller
>            Priority: Minor
>
> The facet module could leverage Weight#count in order to give fast counts for
> the browsing use-case?
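
For readers following along, the "on demand" counting idea from the comment can be sketched roughly as below. This is not the patch and not Lucene code: {{OnDemandFacetCounts}}, {{hitLabels}}, {{fastCount}} and {{fullScans}} are hypothetical names invented for illustration. The {{fastCount}} function only stands in for a cheap count source such as {{Weight#count}}, which returns -1 when it cannot produce a count in sub-linear time.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.ToIntFunction;

/** Hypothetical sketch of deferred ("on demand") facet counting; not a Lucene class. */
final class OnDemandFacetCounts {
    private final List<String> hitLabels;          // assumption: one facet label per matching doc
    private final ToIntFunction<String> fastCount; // assumption: returns -1 when no cheap count exists
    private Map<String, Integer> counts;           // stays null until a full count is actually needed
    int fullScans = 0;                             // exposed only so the laziness can be observed

    OnDemandFacetCounts(List<String> hitLabels, ToIntFunction<String> fastCount) {
        // Nothing is counted here: counting is deferred until first use.
        this.hitLabels = hitLabels;
        this.fastCount = fastCount;
    }

    /** Count for a single label: try the cheap path first, fall back to a full count. */
    int getSpecificValue(String label) {
        int cheap = fastCount.applyAsInt(label);
        if (cheap >= 0) {
            return cheap; // answered without visiting any hits
        }
        return allCounts().getOrDefault(label, 0);
    }

    /** Counts every label once, on first demand, and caches the result. */
    Map<String, Integer> allCounts() {
        if (counts == null) {
            counts = new HashMap<>();
            for (String label : hitLabels) {
                counts.merge(label, 1, Integer::sum);
            }
            fullScans++;
        }
        return counts;
    }
}
```

With counting done during initialization, the cheap path could never save any work, because the full pass over the hits would already have happened. Deferring it means a lookup answered by the fast path touches no hits at all, and the full pass is paid for at most once.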