gsmiller commented on issue #14406:
URL: https://github.com/apache/lucene/issues/14406#issuecomment-2759371300
> Out of curiosity, is it common for the union of the configured ranges to only match a small subset of the index? I would naively expect users to want to collect stats about all their data, so there would be one open-ended range as a "case else" and such an optimization would never kick in in practice?

Agreed it seems rare/unlikely, but I have one example of this from Amazon's product search engine. We support a "Customer Reviews" filter today that lets customers filter by a product's star rating, but the only option we currently expose is "4 stars & up" (star ratings run from 1 to 5). We use faceting to compute aggregations over products with a review value of `4-`, and don't care about docs with < 4 stars (for the purpose of faceting).

The problem with this example is that we're also faceting on other fields for the same match set, so even if we exposed a competitive iterator for the reviews case, it wouldn't actually let us skip anything in practice (at least not in the sandbox faceting module, where we're computing many aggregations while collecting). But I suppose our traditional faceting implementation could leverage a sparse index directly to potentially skip over some blocks of documents.

Agreed it's an unlikely optimization to have much value in practice :)

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org
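As a rough illustration of the block-skipping idea mentioned in the comment above, here is a minimal, self-contained sketch in plain Java. It does not use any Lucene API: `BlockStats`, `countInRange`, the block size of 4, and the sample star ratings are all hypothetical names and values chosen for illustration. The sketch assumes per-block min/max values are available for the field (a toy stand-in for a sparse index) and that only one range, `[4, 5]`, is being counted.

```java
import java.util.Arrays;

public class SparseRangeFacetSketch {

    /** Per-block min/max of a numeric field: a toy stand-in for a sparse index (hypothetical). */
    static final class BlockStats {
        final int blockSize;
        final long[] min;
        final long[] max;

        BlockStats(long[] values, int blockSize) {
            this.blockSize = blockSize;
            int numBlocks = (values.length + blockSize - 1) / blockSize;
            min = new long[numBlocks];
            max = new long[numBlocks];
            Arrays.fill(min, Long.MAX_VALUE);
            Arrays.fill(max, Long.MIN_VALUE);
            for (int doc = 0; doc < values.length; doc++) {
                int block = doc / blockSize;
                min[block] = Math.min(min[block], values[doc]);
                max[block] = Math.max(max[block], values[doc]);
            }
        }
    }

    /**
     * Counts matching docs whose value lies in [lo, hi], skipping any block
     * whose min/max show it cannot contain a value in that range.
     * Returns {count, docsVisited}.
     */
    static int[] countInRange(long[] values, boolean[] matches, BlockStats stats, long lo, long hi) {
        int count = 0;
        int visited = 0;
        for (int block = 0; block < stats.min.length; block++) {
            if (stats.max[block] < lo || stats.min[block] > hi) {
                continue; // no doc in this block can fall in the range
            }
            int start = block * stats.blockSize;
            int end = Math.min(start + stats.blockSize, values.length);
            for (int doc = start; doc < end; doc++) {
                if (!matches[doc]) {
                    continue; // doc is not in the query's match set
                }
                visited++;
                if (values[doc] >= lo && values[doc] <= hi) {
                    count++;
                }
            }
        }
        return new int[] {count, visited};
    }

    public static void main(String[] args) {
        // Hypothetical star ratings, roughly clustered so that block skipping can help.
        long[] stars = {1, 2, 2, 3, 3, 3, 4, 5, 4, 4, 5, 5};
        boolean[] matches = new boolean[stars.length];
        Arrays.fill(matches, true);
        BlockStats stats = new BlockStats(stars, 4);
        int[] result = countInRange(stars, matches, stats, 4, 5);
        // Prints: count=6 docsVisited=8 (the first block is skipped entirely)
        System.out.println("count=" + result[0] + " docsVisited=" + result[1]);
    }
}
```

The skipping only pays off when the range counter is the sole consumer of the match set. If other aggregations need every matching document, as in the sandbox faceting case described in the comment, each document must still be visited and nothing is saved.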