[ https://issues.apache.org/jira/browse/LUCENE-10250?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17447645#comment-17447645 ]
Greg Miller commented on LUCENE-10250:
--------------------------------------

I can't think of any reason off the top of my head that SSDV facet counting couldn't support hierarchical dimensions, but here are a few places I'd suggest digging into:

# SortedSetDocValuesFacetField, which is used to add these fields at indexing time, appears to only support a single "flat" value, so that would need some thought (along with the code in FacetsConfig that helps with the indexing).
# I _think_ DefaultSortedSetDocValuesReaderState has some baked-in assumptions around "flat" data. I would poke around in there as a first stop to see if there's anything fundamentally preventing the extension to hierarchies.

I'd poked around this code a fair amount a few months back, so I'll see if I can refresh my memory a bit more and will add some additional info here if I come up with something.

> Add hierarchical labels to SSDV facets
> --------------------------------------
>
> Key: LUCENE-10250
> URL: https://issues.apache.org/jira/browse/LUCENE-10250
> Project: Lucene - Core
> Issue Type: Improvement
> Reporter: Marc D'Mello
> Priority: Major
> Labels: discussion
>
> Hi all,
> I recently [added a new benchmarking task|https://github.com/mikemccand/luceneutil/issues/141] to {{luceneutil}} to count facets on a random word chosen from each document which would give us a very high cardinality facet benchmarking compared to the faceting benchmarks we already had. After being merged, [~mikemccand] pointed out some [interesting results|https://home.apache.org/~mikemccand/lucenebench/BrowseRandomLabelTaxoFacets.html] in the nightly benchmarks where the {{BrowseRandomLabelSSDVFacets}} task was much faster than the {{BrowseRandomLabelTaxoFacets}} task.
> I was thinking that using SSDV facets instead of taxonomy facets for our use case at Amazon Product Search could potentially lead to some increases in QPS and decreases in index size, but the issue is we use hierarchical labels, and as I understand it, SSDV faceting only supports a 2-level hierarchy as of today. This leads to my question: why is there a limitation like this on SSDV facets? Are hierarchical labels just a feature that hasn't been implemented in SSDV facets yet, or is there some more complex reason that we can't add hierarchical labels to SSDV facets?
> Thanks!
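
To make the "flat value" point from the comment more concrete, here is a minimal, Lucene-free sketch of one way a hierarchical path could be carried by flat string labels: each path is expanded into one label per prefix, and because a sorted term dictionary keeps all strings that share a prefix next to each other, the children of a node can be found by scanning a contiguous range. Everything here is an assumption for illustration only: the class {{HierarchicalLabelSketch}}, the methods {{expandPath}} and {{directChildren}}, and the {{/}} delimiter are hypothetical names, not Lucene APIs, and this is not a claim about how SSDV faceting does or would implement hierarchies.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

public class HierarchicalLabelSketch {
    // Hypothetical delimiter (an assumption): any character that never occurs
    // inside a dimension or label component would work.
    static final char DELIM = '/';

    /**
     * Expands a dimension plus path into one flat label per prefix, e.g.
     * ("Category", "Electronics", "Phones") becomes
     * "Category", "Category/Electronics", "Category/Electronics/Phones".
     */
    public static List<String> expandPath(String dim, String... path) {
        List<String> labels = new ArrayList<>();
        StringBuilder sb = new StringBuilder(dim);
        labels.add(sb.toString());
        for (String component : path) {
            sb.append(DELIM).append(component);
            labels.add(sb.toString());
        }
        return labels;
    }

    /**
     * Returns the direct children of {@code parent} from a sorted set of flat
     * labels. All descendants share the prefix parent + DELIM, so they form a
     * contiguous range in sorted order and the scan can stop at the first miss.
     */
    public static List<String> directChildren(TreeSet<String> terms, String parent) {
        String prefix = parent + DELIM;
        List<String> children = new ArrayList<>();
        for (String term : terms.tailSet(prefix, false)) {
            if (!term.startsWith(prefix)) {
                break; // left the contiguous range of descendants
            }
            // A direct child has no further delimiter after the prefix.
            if (term.indexOf(DELIM, prefix.length()) < 0) {
                children.add(term);
            }
        }
        return children;
    }

    public static void main(String[] args) {
        // The TreeSet stands in for a sorted term dictionary.
        TreeSet<String> terms = new TreeSet<>();
        terms.addAll(expandPath("Category", "Electronics", "Phones"));
        terms.addAll(expandPath("Category", "Electronics", "Laptops"));
        terms.addAll(expandPath("Category", "Books"));

        System.out.println(directChildren(terms, "Category"));
        System.out.println(directChildren(terms, "Category/Electronics"));
    }
}
```

The trade-off in this sketch is that every document stores one label per ancestor, which grows the set of values per document in exchange for keeping each stored value flat.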