jpountz commented on PR #14365:
URL: https://github.com/apache/lucene/pull/14365#issuecomment-2744460409
I pushed an annotation
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific commen
jpountz commented on PR #14365:
URL: https://github.com/apache/lucene/pull/14365#issuecomment-273990
Hurray!
- https://benchmarks.mikemccandless.com/TermDayOfYearSort.html
- https://benchmarks.mikemccandless.com/TermDTSort.html
gf2121 merged PR #14365:
URL: https://github.com/apache/lucene/pull/14365
jpountz commented on code in PR #14365:
URL: https://github.com/apache/lucene/pull/14365#discussion_r2004331440
##
lucene/core/src/java/org/apache/lucene/util/DocIdSetBuilder.java:
##
@@ -47,6 +47,8 @@ public sealed interface BulkAdder permits FixedBitSetAdder,
BufferAdder {
gf2121 commented on PR #14365:
URL: https://github.com/apache/lucene/pull/14365#issuecomment-2737667396
> I remember playing with calling `BulkAdder#grow` on the estimated number of
> matching points (to upgrade to a bitset immediately instead of waiting for docs
> to be collected) a while back a
jpountz commented on PR #14365:
URL: https://github.com/apache/lucene/pull/14365#issuecomment-2736720587
Interesting. I remember playing with calling `BulkAdder#grow` on the
estimated number of matching points (to upgrade to a bitset immediately instead
of waiting for docs to be collected)
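The idea in the comment above (reserving room for the estimated number of matches up front, so the builder switches to a bitset right away) can be illustrated with a minimal sketch. Everything here is hypothetical: `GrowSketch`, its `maxDoc >>> 7` cutoff, and its methods are stand-ins chosen for illustration, not Lucene's `DocIdSetBuilder` or the code in this PR.

```java
import java.util.Arrays;
import java.util.BitSet;

// Hypothetical stand-in for a doc ID builder that starts with a sparse
// int buffer and upgrades to a bitset once enough docs are expected.
class GrowSketch {
    private final int maxDoc;
    private final int threshold; // assumed cutoff, chosen for illustration only
    private int[] buffer = new int[0];
    private int size = 0;
    private BitSet bits = null;

    GrowSketch(int maxDoc) {
        this.maxDoc = maxDoc;
        this.threshold = maxDoc >>> 7;
    }

    // Reserve room for numDocs more docs. Callers must call this before add().
    // If the reservation pushes the total past the threshold, upgrade now.
    void grow(int numDocs) {
        if (bits != null) {
            return; // already dense, nothing to reserve
        }
        if ((long) size + numDocs > threshold) {
            bits = new BitSet(maxDoc);
            for (int i = 0; i < size; i++) {
                bits.set(buffer[i]); // carry over docs collected so far
            }
            buffer = null;
        } else if (size + numDocs > buffer.length) {
            buffer = Arrays.copyOf(buffer, size + numDocs);
        }
    }

    void add(int doc) {
        if (bits != null) {
            bits.set(doc);
        } else {
            buffer[size++] = doc;
        }
    }

    boolean isBitSet() {
        return bits != null;
    }

    boolean contains(int doc) {
        if (bits != null) {
            return bits.get(doc);
        }
        for (int i = 0; i < size; i++) {
            if (buffer[i] == doc) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        GrowSketch builder = new GrowSketch(1280);
        builder.grow(4);   // small reservation: stays on the int buffer
        builder.add(1);
        builder.add(2);
        System.out.println(builder.isBitSet());
        builder.grow(100); // large estimate: upgrades before any of those docs arrive
        System.out.println(builder.isBitSet());
    }
}
```

In this sketch a single large `grow` call with the estimated match count triggers the upgrade immediately, whereas many small `grow` calls would only cross the threshold after docs have accumulated.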
gf2121 commented on PR #14365:
URL: https://github.com/apache/lucene/pull/14365#issuecomment-2736218538
I ran some benchmarks to find the main cause:
**Baseline**: main branch
**Candidate**: collecting docs greater than maxDocVisited into bitset
(instead of `DocIdSetBuilder
gf2121 commented on PR #14365:
URL: https://github.com/apache/lucene/pull/14365#issuecomment-2735314110
Thanks for running the benchmark, the speedup is great!
> Skipping these doc IDs looks like it hurts vectorization, I played with
> disabling these if statements locally and get a good s
jpountz commented on PR #14365:
URL: https://github.com/apache/lucene/pull/14365#issuecomment-2734597577
Maybe we should stop only adding doc IDs to the `BulkAdder` if they are
greater than the max collected doc so far. Skipping these doc IDs looks like it
hurts vectorization, I played with
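To make the trade-off in the comment above concrete, here is a minimal sketch of the two collection loops: one that skips doc IDs at or below the highest doc visited so far, and one that adds everything without a branch. The class and method names are hypothetical, and `java.util.BitSet` stands in for the real `BulkAdder`; this is not the code from the PR, and it does not show whether the JIT actually vectorizes either loop.

```java
import java.util.BitSet;

// Hypothetical illustration: filtered versus unconditional doc ID collection.
class SkipVersusAddAll {

    // Branchy variant: record only doc IDs beyond the highest doc visited so far.
    static void addFiltered(int[] docs, int count, int maxDocVisited, BitSet bits) {
        for (int i = 0; i < count; i++) {
            if (docs[i] > maxDocVisited) {
                bits.set(docs[i]);
            }
        }
    }

    // Branch-free variant: record every doc ID. Entries at or below
    // maxDocVisited are extra, but a reader that only moves forward
    // from maxDocVisited never looks at them.
    static void addAll(int[] docs, int count, BitSet bits) {
        for (int i = 0; i < count; i++) {
            bits.set(docs[i]);
        }
    }

    // First recorded doc strictly after maxDocVisited, or -1 if there is none.
    static int nextAfter(BitSet bits, int maxDocVisited) {
        return bits.nextSetBit(maxDocVisited + 1);
    }

    public static void main(String[] args) {
        int[] docs = {3, 42, 7, 19, 64};
        int maxDocVisited = 10;
        BitSet filtered = new BitSet();
        BitSet all = new BitSet();
        addFiltered(docs, docs.length, maxDocVisited, filtered);
        addAll(docs, docs.length, all);
        // Both sets agree on everything a forward-only reader can observe.
        System.out.println(nextAfter(filtered, maxDocVisited) == nextAfter(all, maxDocVisited));
    }
}
```

The unconditional loop does slightly more work per call but has no data-dependent branch in its body, which is the property the comment is concerned with.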
gf2121 commented on PR #14365:
URL: https://github.com/apache/lucene/pull/14365#issuecomment-2733285724
I'm seeing even results on `wikimediumall`
```
Task    QPS baseline    StdDev    QPS my_modified_version    StdDev    Pct diff    p-value
```
gf2121 opened a new pull request, #14365:
URL: https://github.com/apache/lucene/pull/14365
(no comment)
jpountz commented on code in PR #14365:
URL: https://github.com/apache/lucene/pull/14365#discussion_r1999318407
##
lucene/core/src/java/org/apache/lucene/search/comparators/NumericComparator.java:
##
@@ -251,6 +252,30 @@ public void visit(int docID, byte[] packedValue) {