jpountz commented on issue #13188: URL: https://github.com/apache/lucene/issues/13188#issuecomment-2085447524
At first sight I don't like the fact that this seems to plug in a whole new way of doing things. Either you don't use a star tree index and you do things the usual way with filters and collectors, or you want to use a star tree index and then you need to craft queries in a very specific way if you want to be able to take advantage of the optimization for aggregations. Since this optimization is about aggregating data, I'd like this to mostly require changes on the collector side.

It would be somewhat less efficient, but an alternative I'm contemplating would consist of the following:
- Queries can optionally match ranges of doc IDs at once.
- A new `LeafCollector#collectRange(int docIdStart, int docIdEnd)` API for these queries to use.
- Doc values formats don't know about dimensions, but create pre-aggregates for blocks of N (e.g. 128) doc IDs.
- A sum aggregate could be implemented by asking doc values for the sum of all blocks that are contained within the range of doc IDs passed to `LeafCollector#collectRange`.
- Users are responsible for configuring an index sort that is likely to allow queries to match ranges of doc IDs at once, typically by sorting on these dimensions in some order.
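
To make the range-based idea above concrete, here is a rough, self-contained sketch in plain Java. It does not use Lucene, and none of these types exist in Lucene today: `RangeLeafCollector`, `BlockSummedValues` and `SumCollector` are hypothetical names made up for illustration. It also assumes `docIdEnd` is exclusive and that a range never extends past the last doc ID, neither of which the proposal specifies.

```java
/** Hypothetical stand-in for the proposed LeafCollector#collectRange; not a Lucene API. */
interface RangeLeafCollector {
    void collect(int doc);

    /** Collects every doc in [docIdStart, docIdEnd). The exclusive end is an assumption. */
    default void collectRange(int docIdStart, int docIdEnd) {
        // Fallback for collectors that cannot exploit ranges: visit docs one by one.
        for (int doc = docIdStart; doc < docIdEnd; doc++) {
            collect(doc);
        }
    }
}

/** Toy in-memory "doc values": one long per doc, plus a pre-aggregated sum per block. */
final class BlockSummedValues {
    private final long[] values;
    private final long[] blockSums;
    private final int blockSize;

    BlockSummedValues(long[] values, int blockSize) {
        this.values = values;
        this.blockSize = blockSize;
        this.blockSums = new long[(values.length + blockSize - 1) / blockSize];
        for (int doc = 0; doc < values.length; doc++) {
            blockSums[doc / blockSize] += values[doc];
        }
    }

    long value(int doc) {
        return values[doc];
    }

    /** Sum over [start, end): whole blocks use the pre-aggregate, the edges are read per doc. */
    long sumRange(int start, int end) {
        long sum = 0;
        int doc = start;
        while (doc < end) {
            if (doc % blockSize == 0 && doc + blockSize <= end) {
                sum += blockSums[doc / blockSize]; // block fully contained in the range
                doc += blockSize;
            } else {
                sum += values[doc]; // partial block at either edge of the range
                doc++;
            }
        }
        return sum;
    }
}

/** A sum aggregation that takes the fast path when a query hands it a doc ID range. */
final class SumCollector implements RangeLeafCollector {
    private final BlockSummedValues docValues;
    long sum;

    SumCollector(BlockSummedValues docValues) {
        this.docValues = docValues;
    }

    @Override
    public void collect(int doc) {
        sum += docValues.value(doc);
    }

    @Override
    public void collectRange(int docIdStart, int docIdEnd) {
        sum += docValues.sumRange(docIdStart, docIdEnd);
    }
}

final class CollectRangeSketch {
    public static void main(String[] args) {
        long[] values = new long[1000];
        for (int i = 0; i < values.length; i++) {
            values[i] = i; // each doc's value equals its doc ID
        }
        SumCollector collector = new SumCollector(new BlockSummedValues(values, 128));
        collector.collectRange(100, 900); // a query that matched one contiguous run
        collector.collect(950);           // plus an isolated hit
        System.out.println(collector.sum);
    }
}
```

With blocks of 128 docs, the range `[100, 900)` fully contains the six blocks starting at doc IDs 128 through 768, so only the 32 docs on the edges are read individually. How often queries can produce such ranges depends on the index sort, which is why the last bullet leaves that configuration to the user.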