Re: [I] Leverage sparse doc value indexes for range and value facet collection [lucene]

via GitHub Thu, 27 Mar 2025 13:13:16 -0700


gsmiller commented on issue #14406:
URL: https://github.com/apache/lucene/issues/14406#issuecomment-2759371300


   > Out of curiosity, is it common for the union of the configured ranges to 
only match a small subset of the index? I would naively expect users to want to 
collect stats about all their data, so there would be one open-ended range as a 
"case else" and such an optimization would never kick in in practice?
   
   Agreed that it seems rare, but I do have one example from Amazon's product 
search engine. We support a "Customer Reviews" filter today that lets customers 
filter by a product's "star rating", but the only option we currently expose is 
"4 stars & up" (star ratings for products run from 1 to 5 stars). We use 
faceting to compute aggregations over products with a review value of `4-` 
(4 stars and up), and don't care about docs with < 4 stars (for the purpose of 
faceting).
   
   The problem with this example is that we're also faceting on other fields 
for the same match set, so even if we exposed a competitive iterator for the 
reviews case, it wouldn't actually let us skip anything in practice (at least 
not in the sandbox faceting module where we're computing many aggregations 
while collecting). But I suppose our traditional faceting implementation could 
leverage a sparse index directly to potentially skip over some blocks of 
documents.
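
   A minimal sketch of the block-skipping idea, assuming a per-block min/max 
summary stands in for the sparse doc value index. The class `SparseSkipSketch`, 
its `Block` type, and both methods are hypothetical names made up for 
illustration and are not Lucene APIs. Skipping only pays off when values are 
clustered so that whole blocks fall outside the range.

```java
/** Hypothetical illustration only; none of these names are Lucene APIs. */
public class SparseSkipSketch {

    /** Min/max summary for docs in [start, end), standing in for a sparse index entry. */
    public static final class Block {
        public final int start, end, min, max;

        public Block(int start, int end, int min, int max) {
            this.start = start;
            this.end = end;
            this.min = min;
            this.max = max;
        }
    }

    /** Builds one min/max summary per fixed-size block of doc values. */
    public static Block[] buildBlocks(int[] values, int blockSize) {
        int numBlocks = (values.length + blockSize - 1) / blockSize;
        Block[] blocks = new Block[numBlocks];
        for (int i = 0; i < numBlocks; i++) {
            int start = i * blockSize;
            int end = Math.min(start + blockSize, values.length);
            int min = Integer.MAX_VALUE;
            int max = Integer.MIN_VALUE;
            for (int doc = start; doc < end; doc++) {
                min = Math.min(min, values[doc]);
                max = Math.max(max, values[doc]);
            }
            blocks[i] = new Block(start, end, min, max);
        }
        return blocks;
    }

    /**
     * Counts docs whose value is >= lowerBound.
     * Returns {count, docsVisited}, where docsVisited is the number of
     * individual values that had to be read.
     */
    public static int[] countAtLeast(int[] values, Block[] blocks, int lowerBound) {
        int count = 0;
        int visited = 0;
        for (Block b : blocks) {
            if (b.max < lowerBound) {
                continue; // whole block is below the range: skip it
            }
            if (b.min >= lowerBound) {
                count += b.end - b.start; // whole block is inside the range
                continue;
            }
            for (int doc = b.start; doc < b.end; doc++) {
                visited++;
                if (values[doc] >= lowerBound) {
                    count++;
                }
            }
        }
        return new int[] {count, visited};
    }

    public static void main(String[] args) {
        int[] ratings = {1, 2, 3, 2, 4, 5, 4, 5, 3, 4, 1, 5};
        Block[] blocks = buildBlocks(ratings, 4);
        int[] result = countAtLeast(ratings, blocks, 4);
        // prints "count=6 visited=4"
        System.out.println("count=" + result[0] + " visited=" + result[1]);
    }
}
```

   In this toy data only the mixed third block is read value by value. When 
other fields are aggregated over the same match set, every doc still has to be 
visited for those fields, so this kind of skip does not help.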
   
   Agreed that this optimization is unlikely to have much value in practice :)


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


