jpountz merged PR #14273:
URL: https://github.com/apache/lucene/pull/14273
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: issues-unsubscr...@lucene.apa
gsmiller commented on code in PR #14273:
URL: https://github.com/apache/lucene/pull/14273#discussion_r2018769975
##
lucene/core/src/java/org/apache/lucene/util/FixedBitSet.java:
##
@@ -204,6 +205,40 @@ public int cardinality() {
return Math.toIntExact(tot);
}
+ /**
+
jpountz commented on code in PR #14273:
URL: https://github.com/apache/lucene/pull/14273#discussion_r2018661757
##
lucene/core/src/java/org/apache/lucene/util/FixedBitSet.java:
##
@@ -204,6 +205,40 @@ public int cardinality() {
return Math.toIntExact(tot);
}
+ /**
+
gsmiller commented on code in PR #14273:
URL: https://github.com/apache/lucene/pull/14273#discussion_r2017767190
##
lucene/core/src/java/org/apache/lucene/search/DISIDocIdStream.java:
##
@@ -0,0 +1,68 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
gsmiller commented on code in PR #14273:
URL: https://github.com/apache/lucene/pull/14273#discussion_r2017721363
##
lucene/core/src/java/org/apache/lucene/util/FixedBitSet.java:
##
@@ -204,6 +205,40 @@ public int cardinality() {
return Math.toIntExact(tot);
}
+ /**
+
jpountz commented on code in PR #14273:
URL: https://github.com/apache/lucene/pull/14273#discussion_r2017547274
##
lucene/core/src/java/org/apache/lucene/search/BitSetDocIdStream.java:
##
@@ -0,0 +1,60 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
jpountz commented on code in PR #14273:
URL: https://github.com/apache/lucene/pull/14273#discussion_r2017556183
##
lucene/core/src/java/org/apache/lucene/search/BitSetDocIdStream.java:
##
@@ -0,0 +1,60 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
jpountz commented on code in PR #14273:
URL: https://github.com/apache/lucene/pull/14273#discussion_r2017552582
##
lucene/core/src/java/org/apache/lucene/search/BooleanScorer.java:
##
@@ -207,8 +164,32 @@ private void scoreWindowIntoBitSetAndReplay(
acceptDocs.applyMask(m
jpountz commented on code in PR #14273:
URL: https://github.com/apache/lucene/pull/14273#discussion_r2017546602
##
lucene/core/src/java/org/apache/lucene/search/DISIDocIdStream.java:
##
@@ -0,0 +1,68 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+
jpountz commented on code in PR #14273:
URL: https://github.com/apache/lucene/pull/14273#discussion_r2017544601
##
lucene/core/src/java/org/apache/lucene/search/DocIdStream.java:
##
@@ -34,12 +33,34 @@ protected DocIdStream() {}
* Iterate over doc IDs contained in this strea
gsmiller commented on code in PR #14273:
URL: https://github.com/apache/lucene/pull/14273#discussion_r2017239514
##
lucene/core/src/java/org/apache/lucene/search/DISIDocIdStream.java:
##
@@ -0,0 +1,68 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
jpountz commented on PR #14273:
URL: https://github.com/apache/lucene/pull/14273#issuecomment-2758283399
I played with the geonames dataset, by filtering out docs that don't have a
value for the `elevation` field (2.3M docs left), enabling index sorting on the
`elevation` field and computin
jpountz commented on PR #14273:
URL: https://github.com/apache/lucene/pull/14273#issuecomment-2755991200
I'll try to run some simple benchmarks next.
jpountz commented on PR #14273:
URL: https://github.com/apache/lucene/pull/14273#issuecomment-2755985826
It should be ready for review now. Now that `DocIdStream` has become more
sophisticated, I extracted impls to proper classes that could be better tested.
This causes some diffs in our bo
jpountz commented on code in PR #14273:
URL: https://github.com/apache/lucene/pull/14273#discussion_r2014552652
##
lucene/core/src/java/org/apache/lucene/search/DocIdStream.java:
##
@@ -34,12 +33,35 @@ protected DocIdStream() {}
* Iterate over doc IDs contained in this strea
jpountz commented on PR #14273:
URL: https://github.com/apache/lucene/pull/14273#issuecomment-2754999433
> If we have a skipper, I think we ought to also be able to use competitive
iterators to jump over blocks of docs we know we won't collect based on their
values?
This is correct.
gsmiller commented on PR #14273:
URL: https://github.com/apache/lucene/pull/14273#issuecomment-2754588145
> I like the idea! Looks like we can do similar trick for range facets and
long values facets?
I _think_ we could optimize these use-cases even further by potentially
skipping ov
gsmiller commented on code in PR #14273:
URL: https://github.com/apache/lucene/pull/14273#discussion_r2014273684
##
lucene/core/src/java/org/apache/lucene/search/DocIdStream.java:
##
@@ -34,12 +33,35 @@ protected DocIdStream() {}
* Iterate over doc IDs contained in this stre
jpountz commented on PR #14273:
URL: https://github.com/apache/lucene/pull/14273#issuecomment-2752610585
Quick update: we now have more queries that collect hits using
`collect(DocIdStream)`, which makes this optimization more appealing.
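To illustrate why a stream-level collection path makes this optimization appealing, here is a minimal sketch of per-bucket counting over a stream of doc IDs. Everything in it is an assumption made for illustration: `DocStream`, `ArrayDocStream`, `count(int upTo)` and `countPerBucket` are hypothetical stand-ins, not Lucene's `DocIdStream` API, and the bucket boundaries are assumed to be expressed as exclusive doc ID upper bounds, as could be the case when the index is sorted on the faceted field.

```java
import java.util.Arrays;

final class HistogramSketch {
  private HistogramSketch() {}

  /** Hypothetical stand-in for a doc ID stream; NOT Lucene's actual API. */
  interface DocStream {
    /** Consumes and counts the remaining doc IDs strictly below upTo. */
    int count(int upTo);
  }

  /** Stream backed by a sorted array of unique doc IDs (assumption for the sketch). */
  static final class ArrayDocStream implements DocStream {
    private final int[] docs;
    private int pos;

    ArrayDocStream(int[] sortedUniqueDocs) {
      this.docs = sortedUniqueDocs;
    }

    @Override
    public int count(int upTo) {
      // Binary search finds the first doc >= upTo without visiting each doc.
      int idx = Arrays.binarySearch(docs, pos, docs.length, upTo);
      if (idx < 0) {
        idx = -idx - 1; // insertion point when upTo is not present
      }
      int counted = idx - pos;
      pos = idx;
      return counted;
    }
  }

  /** bucketEnds holds increasing, exclusive doc ID upper bounds, one per bucket. */
  static int[] countPerBucket(DocStream stream, int[] bucketEnds) {
    int[] counts = new int[bucketEnds.length];
    for (int i = 0; i < bucketEnds.length; i++) {
      counts[i] = stream.count(bucketEnds[i]);
    }
    return counts;
  }
}
```

The point of the sketch is that a collector receiving a whole stream can ask for a count up to a boundary instead of being called back once per matching doc.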
gsmiller commented on PR #14273:
URL: https://github.com/apache/lucene/pull/14273#issuecomment-2751652120
+1 to this optimization. Love the idea!
jpountz commented on PR #14273:
URL: https://github.com/apache/lucene/pull/14273#issuecomment-2678921217
> Looks like we can do similar trick for range facets and long values facets?
This is right.
epotyom commented on code in PR #14273:
URL: https://github.com/apache/lucene/pull/14273#discussion_r1967691866
##
lucene/core/src/java/org/apache/lucene/search/DocIdStream.java:
##
@@ -34,12 +33,35 @@ protected DocIdStream() {}
* Iterate over doc IDs contained in this strea
jpountz commented on PR #14273:
URL: https://github.com/apache/lucene/pull/14273#issuecomment-2674888106
@epotyom You may be interested in taking a look.
jpountz opened a new pull request, #14273:
URL: https://github.com/apache/lucene/pull/14273
This attempts to generalize the `IndexSearcher#count` optimization from PR
#12415 to histogram facets by introducing a specialization for counting the
number of matching docs in a range of doc IDs.
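As a rough illustration of what counting matches in a range of doc IDs can look like on a bit set, here is a minimal sketch. It assumes matches are stored as 64-bit words in a plain `long[]`; the class `RangeBitCount` and the method `countRange` are hypothetical names chosen for this example, and this is not the code from the PR.

```java
final class RangeBitCount {
  private RangeBitCount() {}

  /** Counts the set bits in [from, to) of a bit set stored as 64-bit words. */
  static int countRange(long[] words, int from, int to) {
    if (from < 0 || from > to || to > ((long) words.length << 6)) {
      throw new IllegalArgumentException("invalid range [" + from + ", " + to + ")");
    }
    if (from == to) {
      return 0;
    }
    int firstWord = from >> 6;
    int lastWord = (to - 1) >> 6;
    // Java shifts on long use only the low 6 bits of the shift distance.
    long firstMask = -1L << from; // clears bits below 'from' in the first word
    long lastMask = -1L >>> -to; // keeps bits below 'to' in the last word
    if (firstWord == lastWord) {
      return Long.bitCount(words[firstWord] & firstMask & lastMask);
    }
    int count = Long.bitCount(words[firstWord] & firstMask);
    for (int i = firstWord + 1; i < lastWord; i++) {
      count += Long.bitCount(words[i]); // whole words need no masking
    }
    count += Long.bitCount(words[lastWord] & lastMask);
    return count;
  }
}
```

Only the first and last words need masking; every word in between is counted with a single `Long.bitCount` call, which is what makes a range count cheaper than visiting each matching doc individually.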