jainankitk commented on PR #14439: URL: https://github.com/apache/lucene/pull/14439#issuecomment-2794969733
> I didn't mean to imply that the two solutions are the same, apologies if that's how it came across.

Not at all. I was also confused by the skipper logic at first, and only after spending some time on it did I realize that this approach is slightly different. So, thanks for reiterating the question.

> I think you could start in HistogramCollector.getLeafCollector ([code](https://github.com/apache/lucene/blob/4957766fcee52c534d786e3948fadf6d36c9779f/lucene/sandbox/src/java/org/apache/lucene/sandbox/facet/plain/histograms/HistogramCollector.java#L50)). Right now we throw an exception if the field we're using isn't doc values ([code](https://github.com/apache/lucene/blob/4957766fcee52c534d786e3948fadf6d36c9779f/lucene/sandbox/src/java/org/apache/lucene/sandbox/facet/plain/histograms/HistogramCollector.java#L59)).

Currently, a `Collector` doesn't need to be aware of the `Query` itself. Collectors are designed to collect individual doc IDs, or a `DocIdStream`, from the scorer. This `CustomCollector`, however, does not need the scorer to provide documents: it can `BulkCollect` documents when the query is `MATCH_ALL` or a `PointRangeQuery` (where `PointRangeQuery.field == histogram.field`). Otherwise, it should fall back to the traditional methods for collecting matching documents.

> At a higher level, I'm curious if you had a use-case in mind.

This optimization can be applied to the following use cases:

* Number of sales by price range (0-50, 50-100, 100-250, ...)
* Number of visits to a website for each day in a month

Just as a data point, this change helped us improve date histogram latency from 5168 ms to 160 ms (~32x!!) for the big5 workload in OpenSearch.
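To make the fallback rule above concrete, here is a minimal sketch of the decision only. Everything in it is hypothetical and for illustration: `QueryKind`, `Strategy`, and `chooseStrategy` are not Lucene classes or methods, and the real change would inspect the actual `Query` rather than an enum.

```java
public class BulkCollectDispatchSketch {

    // Hypothetical query categories; illustrative stand-ins, not Lucene types.
    enum QueryKind { MATCH_ALL, POINT_RANGE, OTHER }

    // Hypothetical collection strategies the collector could choose between.
    enum Strategy { BULK_COLLECT, PER_DOC_COLLECT }

    /**
     * Picks bulk collection for a match-all query, or for a point range query
     * on the same field as the histogram. Anything else falls back to
     * collecting the documents provided by the scorer.
     */
    static Strategy chooseStrategy(QueryKind kind, String queryField, String histogramField) {
        switch (kind) {
            case MATCH_ALL:
                return Strategy.BULK_COLLECT;
            case POINT_RANGE:
                // Bulk collection only applies when both sides use the same field.
                return histogramField.equals(queryField)
                    ? Strategy.BULK_COLLECT
                    : Strategy.PER_DOC_COLLECT;
            default:
                return Strategy.PER_DOC_COLLECT;
        }
    }

    public static void main(String[] args) {
        System.out.println(chooseStrategy(QueryKind.MATCH_ALL, null, "price"));
        System.out.println(chooseStrategy(QueryKind.POINT_RANGE, "price", "price"));
        System.out.println(chooseStrategy(QueryKind.POINT_RANGE, "timestamp", "price"));
        System.out.println(chooseStrategy(QueryKind.OTHER, null, "price"));
    }
}
```

The point of keeping the decision in one place is that the per-document path stays the default, so any query shape the sketch does not recognize still gets collected the traditional way.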
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org
For queries about this service, please contact Infrastructure at: us...@infra.apache.org