Re: [PR] Enable collectors to take advantage of pre-aggregated data. [lucene]

2025-04-05 Thread via GitHub
gf2121 commented on code in PR #14401: URL: https://github.com/apache/lucene/pull/14401#discussion_r2019735302 ## lucene/test-framework/src/java/org/apache/lucene/tests/search/AssertingLeafCollector.java: ## @@ -50,6 +50,14 @@ public void collect(DocIdStream stream) throws IOExc

Re: [PR] Enable collectors to take advantage of pre-aggregated data. [lucene]

2025-04-04 Thread via GitHub
jpountz commented on PR #14401: URL: https://github.com/apache/lucene/pull/14401#issuecomment-2766473986 It is unexpected indeed! I'll fix this and add a CHANGES entry. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use t

Re: [PR] Enable collectors to take advantage of pre-aggregated data. [lucene]

2025-03-31 Thread via GitHub
jpountz merged PR #14401: URL: https://github.com/apache/lucene/pull/14401

Re: [PR] Enable collectors to take advantage of pre-aggregated data. [lucene]

2025-03-31 Thread via GitHub
gf2121 commented on PR #14401: URL: https://github.com/apache/lucene/pull/14401#issuecomment-2765918007 The change from https://github.com/apache/lucene/pull/14421 is also included, which seems unexpected?

Re: [PR] Enable collectors to take advantage of pre-aggregated data. [lucene]

2025-03-28 Thread via GitHub
gsmiller commented on PR #14401: URL: https://github.com/apache/lucene/pull/14401#issuecomment-2761941694 I prefer `collectRange` as well to make usage a little less error-prone. I don't have a strong opinion though.

Re: [PR] Enable collectors to take advantage of pre-aggregated data. [lucene]

2025-03-28 Thread via GitHub
jpountz commented on PR #14401: URL: https://github.com/apache/lucene/pull/14401#issuecomment-2761666414 Any opinion on `collect(int min, int max)` vs. `collectRange(int min, int max)`? I leaned towards `collectRange` since we already have `collect(int doc)` and it wouldn't be obvious from
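
To illustrate the naming question, here is a minimal Java sketch. It uses a hypothetical `SketchLeafCollector` interface as a stand-in, not Lucene's actual `LeafCollector`, and it assumes `min` is inclusive and `max` is exclusive. The default `collectRange` falls back to per-document collection, while a counting collector overrides it to handle a whole range in constant time.

```java
// Hypothetical stand-in interface for illustration only; NOT Lucene's LeafCollector.
interface SketchLeafCollector {
  // Collect a single matching doc ID.
  void collect(int doc);

  // Collect every doc ID in [min, max). Assumption: min inclusive, max exclusive.
  // The default simply falls back to per-document collection.
  default void collectRange(int min, int max) {
    for (int doc = min; doc < max; doc++) {
      collect(doc);
    }
  }
}

// Counts matches; a range of docs is counted in O(1) instead of O(max - min).
final class CountingCollector implements SketchLeafCollector {
  long count;

  @Override
  public void collect(int doc) {
    count++;
  }

  @Override
  public void collectRange(int min, int max) {
    count += max - min;
  }
}

// Sums doc IDs; relies on the default collectRange, so it visits each doc.
final class SumCollector implements SketchLeafCollector {
  long sum;

  @Override
  public void collect(int doc) {
    sum += doc;
  }
}

public class CollectRangeSketch {
  public static void main(String[] args) {
    CountingCollector counter = new CountingCollector();
    counter.collectRange(10, 20); // ten docs: 10..19
    counter.collect(42);          // one more doc
    System.out.println(counter.count);

    SumCollector summer = new SumCollector();
    summer.collectRange(0, 4);    // visits 0, 1, 2, 3 one by one
    System.out.println(summer.sum);
  }
}
```

With a distinct method name, a call such as `collectRange(10, 20)` cannot be mistaken at the call site for an overload of `collect(int doc)`, which is the readability concern raised in the comment above.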

Re: [PR] Enable collectors to take advantage of pre-aggregated data. [lucene]

2025-03-28 Thread via GitHub
jpountz commented on PR #14401: URL: https://github.com/apache/lucene/pull/14401#issuecomment-2761651163 Ah, that's right. We have a good number of queries that are already covered, in my opinion the next natural step is to look into making ranges collect ranges when any clause would collec

Re: [PR] Enable collectors to take advantage of pre-aggregated data. [lucene]

2025-03-28 Thread via GitHub
gsmiller commented on PR #14401: URL: https://github.com/apache/lucene/pull/14401#issuecomment-2761478694 > I don't think so, or rather taking advantage of range collection shouldn't help more than what https://github.com/apache/lucene/pull/14273 does with RangeDocIdStream? My thinki

Re: [PR] Enable collectors to take advantage of pre-aggregated data. [lucene]

2025-03-28 Thread via GitHub
jpountz commented on code in PR #14401: URL: https://github.com/apache/lucene/pull/14401#discussion_r2018638902 ## lucene/core/src/java/org/apache/lucene/search/RangeDocIdStream.java: ## @@ -0,0 +1,44 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more

Re: [PR] Enable collectors to take advantage of pre-aggregated data. [lucene]

2025-03-28 Thread via GitHub
gf2121 commented on code in PR #14401: URL: https://github.com/apache/lucene/pull/14401#discussion_r2018024992 ## lucene/core/src/java/org/apache/lucene/search/LeafCollector.java: ## @@ -83,6 +84,21 @@ public interface LeafCollector { */ void collect(int doc) throws IOExc

Re: [PR] Enable collectors to take advantage of pre-aggregated data. [lucene]

2025-03-27 Thread via GitHub
jpountz commented on PR #14401: URL: https://github.com/apache/lucene/pull/14401#issuecomment-2759494063 > This would also benefit https://github.com/apache/lucene/pull/14273 I don't think so, or rather taking advantage of range collection shouldn't help more than what #14273 does wit

Re: [PR] Enable collectors to take advantage of pre-aggregated data. [lucene]

2025-03-27 Thread via GitHub
gsmiller commented on PR #14401: URL: https://github.com/apache/lucene/pull/14401#issuecomment-2758696779 It makes sense to me to expose the idea of doc range collection as a first-class API on leaf collectors for the reasons you outlined above. This would also benefit #14273, right?

Re: [PR] Enable collectors to take advantage of pre-aggregated data. [lucene]

2025-03-25 Thread via GitHub
jpountz commented on PR #14401: URL: https://github.com/apache/lucene/pull/14401#issuecomment-2751630823 @epotyom You may be interested in this: it allows computing aggregates in sub-linear time with respect to the number of matching docs.