derreisende77 commented on issue #13959:
URL: https://github.com/apache/lucene/issues/13959#issuecomment-2438658215
I ran some tests on Ubuntu 24.10:
JDK 23: 9.9 seconds
JDK 22: 1.4 seconds
--
This is an automated message from the Apache Git Service.
To respond to the message, ple
github-actions[bot] commented on PR #13888:
URL: https://github.com/apache/lucene/pull/13888#issuecomment-2439076449
This PR has not had activity in the past 2 weeks, labeling it as stale. If
the PR is waiting for review, notify the d...@lucene.apache.org list. Thank you
for your contributi
github-actions[bot] commented on PR #13860:
URL: https://github.com/apache/lucene/pull/13860#issuecomment-2439076472
This PR has not had activity in the past 2 weeks, labeling it as stale. If
the PR is waiting for review, notify the d...@lucene.apache.org list. Thank you
for your contributi
github-actions[bot] commented on PR #13893:
URL: https://github.com/apache/lucene/pull/13893#issuecomment-2439076438
This PR has not had activity in the past 2 weeks, labeling it as stale. If
the PR is waiting for review, notify the d...@lucene.apache.org list. Thank you
for your contributi
rmuir commented on PR #13958:
URL: https://github.com/apache/lucene/pull/13958#issuecomment-2438973598
Maybe it's a bug that it doesn't work on your Mac either, because elsewhere
they have code that looks like it is supposed to be doing this stuff:
https://github.com/openjdk/jdk/blob/f1a9a8d2
goankur commented on code in PR #13572:
URL: https://github.com/apache/lucene/pull/13572#discussion_r1817415010
##
lucene/benchmark-jmh/src/java/org/apache/lucene/benchmark/jmh/VectorUtilBenchmark.java:
##
@@ -84,6 +91,76 @@ public void init() {
floatsA[i] = random.nextFl
rmuir commented on PR #13958:
URL: https://github.com/apache/lucene/pull/13958#issuecomment-2438947715
For these uses of `VectorMask` you are OK with AVX2 (so just use the existing
`FAST_INTEGER_VECTORS` check):
https://github.com/openjdk/jdk/blob/master/src/hotspot/cpu/x86/x86.ad#L1597-L160
rmuir commented on PR #13958:
URL: https://github.com/apache/lucene/pull/13958#issuecomment-2438944785
https://github.com/openjdk/jdk/blob/master/src/hotspot/cpu/aarch64/aarch64_vector.ad#L280-L283
rmuir commented on PR #13958:
URL: https://github.com/apache/lucene/pull/13958#issuecomment-2438925486
You are using `VectorMask`; only use this where it is implemented in hardware
(AVX-512 and ARM SVE).
jpountz commented on PR #13958:
URL: https://github.com/apache/lucene/pull/13958#issuecomment-2438919587
I ran this PR on my Mac laptop (M3), where this gives a massive slowdown, I
imagine because some of the vector operations I'm using are emulated. I need to
find what to check against in
jpountz commented on PR #13958:
URL: https://github.com/apache/lucene/pull/13958#issuecomment-2438911637
And I seem to be getting a better speedup by using `trueCount()` instead of
`firstTrue()`:
```
Task    QPS baseline    StdDev    QPS my_modified_version
```
derreisende77 commented on issue #13959:
URL: https://github.com/apache/lucene/issues/13959#issuecomment-2438907870
@benwtrent I have JProfiler but I am not really experienced in using it - or
profiling at all.
I did two runs on macOS and took screenshots of the hotspot page.
JD
goankur commented on code in PR #13572:
URL: https://github.com/apache/lucene/pull/13572#discussion_r1817385236
##
lucene/benchmark-jmh/src/java/org/apache/lucene/benchmark/jmh/VectorUtilBenchmark.java:
##
@@ -84,6 +91,76 @@ public void init() {
floatsA[i] = random.nextFl
jpountz commented on PR #13958:
URL: https://github.com/apache/lucene/pull/13958#issuecomment-2438737799
Specializing `ImpactsDISI#nextDoc()` helped get rid of the slowdown:
```
Task    QPS baseline    StdDev    QPS my_modified_version    StdDev
```
benwtrent commented on issue #13959:
URL: https://github.com/apache/lucene/issues/13959#issuecomment-2438673337
@derreisende77 do you have profiling of the two different runs? Maybe
through async-profiler? It would be interesting to see where the time is being
spent.
benwtrent commented on PR #13525:
URL: https://github.com/apache/lucene/pull/13525#issuecomment-2438671673
Hey @vigyasharma there is a lot of good work here.
I am going to shift my focus and see about how I can help here more fully.
What are the next steps?
I am guessing handl
goankur commented on code in PR #13572:
URL: https://github.com/apache/lucene/pull/13572#discussion_r1817245059
##
lucene/core/src/java/org/apache/lucene/codecs/lucene99/OffHeapQuantizedByteVectorValues.java:
##
@@ -146,6 +146,7 @@ public float getScoreCorrectionConstant(int tar
jpountz merged PR #13944:
URL: https://github.com/apache/lucene/pull/13944
derreisende77 opened a new issue, #13959:
URL: https://github.com/apache/lucene/issues/13959
### Description
I have been using Lucene in my app happily for several years with JDKs up to 22.
My use case searches through film data and Lucene can return fairly huge
result sets to my app - wh
HoustonPutman commented on code in PR #13914:
URL: https://github.com/apache/lucene/pull/13914#discussion_r1810967900
##
lucene/facet/src/java/org/apache/lucene/facet/range/DynamicRangeUtil.java:
##
@@ -202,66 +208,83 @@ public SegmentOutput(int hitsLength) {
* is used t
jpountz opened a new pull request, #13958:
URL: https://github.com/apache/lucene/pull/13958
PR #13692 tried to speed up advancing by using branchless binary search, but
while this yielded a speedup on my machine, it yielded a slowdown on nightly
benchmarks.
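For context, a branchless binary search of the kind mentioned above can be sketched in Java roughly as follows. This is only an illustration under assumptions, not the code from PR #13692: the names `BranchlessSearch` and `lowerBound` are hypothetical, and whether the JIT really compiles the ternary into a conditional move (rather than a branch) depends on the JVM and the CPU.

```java
/** Hypothetical helper, not part of Lucene: lower-bound search without data-dependent branches. */
final class BranchlessSearch {

  /**
   * Returns the index of the first element in sorted[0, len) that is >= target,
   * or len if every element is smaller than target.
   */
  static int lowerBound(int[] sorted, int len, int target) {
    int base = 0;
    int n = len;
    // The loop trip count depends only on len, never on the data.
    while (n > 1) {
      int half = n >>> 1;
      // Intended to become a conditional move instead of a branch.
      base = sorted[base + half - 1] < target ? base + half : base;
      n -= half;
    }
    // At most one candidate is left; step past it if it is still too small.
    if (n == 1 && sorted[base] < target) {
      base++;
    }
    return base;
  }

  public static void main(String[] args) {
    int[] docs = {2, 4, 4, 7, 9, 12, 15};
    System.out.println(lowerBound(docs, docs.length, 8));
  }
}
```

The trade-off is that the loop always runs about log2(len) iterations, so it gives up early exits in exchange for avoiding branch mispredictions, which may explain why results differ between machines.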
This PR tries a differen
msokolov commented on PR #13872:
URL: https://github.com/apache/lucene/pull/13872#issuecomment-2437752945
> Can you clarify which allocation is the problematic one, and where it's
done on the indexing path?
See Ben's comments from ~2 weeks ago where he calls out the problem of
overal
ljak commented on PR #13944:
URL: https://github.com/apache/lucene/pull/13944#issuecomment-2438170606
Done. Thanks for reviewing!
jpountz commented on PR #13948:
URL: https://github.com/apache/lucene/pull/13948#issuecomment-2437732473
In my experience, binary doc values are more often used to encode structured
data, such as maps that help build scoring signals, geo shapes, etc. than
actual binary content, so this chan
msokolov commented on code in PR #13872:
URL: https://github.com/apache/lucene/pull/13872#discussion_r1816770842
##
lucene/core/src/java21/org/apache/lucene/internal/vectorization/Lucene99MemorySegmentByteVectorScorerSupplier.java:
##
@@ -112,20 +96,20 @@ static final class Cosi
msokolov commented on PR #13872:
URL: https://github.com/apache/lucene/pull/13872#issuecomment-2437853233
Maybe we could add a `RandomVectorScorer.setTarget(int node)` method that
would only be implemented by the Scorers returned from ScorerSuppliers?
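A rough sketch of what such a mutable scorer could look like, in Java. Everything here is hypothetical: `MutableVectorScorer`, `DotProductScorer`, and the in-memory `float[][]` storage are made up for illustration and are not Lucene's actual `RandomVectorScorer` API.

```java
/** Hypothetical interface: a scorer whose target vector can be swapped in place. */
interface MutableVectorScorer {
  /** Makes the vector stored at ordinal {@code node} the new target. */
  void setTarget(int node);

  /** Scores the vector at ordinal {@code node} against the current target. */
  float score(int node);
}

/** Hypothetical implementation using a plain dot product over on-heap vectors. */
final class DotProductScorer implements MutableVectorScorer {
  private final float[][] vectors;
  private float[] target; // reused reference, no per-target allocation

  DotProductScorer(float[][] vectors) {
    this.vectors = vectors;
  }

  @Override
  public void setTarget(int node) {
    target = vectors[node];
  }

  @Override
  public float score(int node) {
    if (target == null) {
      throw new IllegalStateException("setTarget must be called before score");
    }
    float[] other = vectors[node];
    float sum = 0f;
    for (int i = 0; i < target.length; i++) {
      sum += target[i] * other[i];
    }
    return sum;
  }
}
```

With a shape like this, a caller could keep one scorer instance and call `setTarget` for each node instead of requesting a fresh scorer (and any scratch state it allocates) every time.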
original-brownbear commented on PR #13864:
URL: https://github.com/apache/lucene/pull/13864#issuecomment-2437848161
Yeah, that's cool, sorry, I forgot about this one. For starters we just store the
offsets in a more compact form, which will help already. I'll open a PR once I find
a little time :)
original-brownbear closed pull request #13864: Make DirectMonotonicReader.Meta
more compact
URL: https://github.com/apache/lucene/pull/13864
msokolov commented on PR #13872:
URL: https://github.com/apache/lucene/pull/13872#issuecomment-2437836935
Yes, OK I now see quite a bit of this is a "preexisting condition" and maybe
not exacerbated by this change. We are still creating more scratch arrays than
we did before though, I think
ChrisHegarty commented on code in PR #13872:
URL: https://github.com/apache/lucene/pull/13872#discussion_r1816687000
##
lucene/core/src/java21/org/apache/lucene/internal/vectorization/Lucene99MemorySegmentByteVectorScorerSupplier.java:
##
@@ -112,20 +96,20 @@ static final class
benwtrent commented on PR #13872:
URL: https://github.com/apache/lucene/pull/13872#issuecomment-2437820707
I think a "merging scorer" would be good. The only place the "scorer
supplier" is used is during graph building.
My initial concern with a "mutable scorer" is that it would also
ChrisHegarty commented on code in PR #13872:
URL: https://github.com/apache/lucene/pull/13872#discussion_r1816669062
##
lucene/core/src/java21/org/apache/lucene/internal/vectorization/Lucene99MemorySegmentByteVectorScorerSupplier.java:
##
@@ -112,20 +96,20 @@ static final class
ChrisHegarty commented on PR #13872:
URL: https://github.com/apache/lucene/pull/13872#issuecomment-2437761782
> that we instead have a mutable Scorer that can accept a new target vector.
Yes, that is something that I've noodled on for a while now too - a scorer
that accepts two ords,
jpountz commented on PR #13864:
URL: https://github.com/apache/lucene/pull/13864#issuecomment-2437765755
Sorry, I don't feel good about relying on `paddingBitsNeeded` on the read
path. I suggest we close this PR, IMO the better fix would be to change the way
we store terms dictionaries to r
jpountz commented on PR #13872:
URL: https://github.com/apache/lucene/pull/13872#issuecomment-2437740226
Can you clarify which allocation is the problematic one, and where it's done
on the indexing path?
jpountz commented on PR #13951:
URL: https://github.com/apache/lucene/pull/13951#issuecomment-2437616406
> I couldn't think of a clean way to integrate the two... but I'll give it
some more thought
For what it's worth, these classes are package-private, so we can feel free
to change
jpountz merged PR #13954:
URL: https://github.com/apache/lucene/pull/13954
benwtrent closed issue #13946: TestCommonTermsQuery.testMinShouldMatch test
failure
URL: https://github.com/apache/lucene/issues/13946
benwtrent merged PR #13953:
URL: https://github.com/apache/lucene/pull/13953
jpountz commented on PR #13944:
URL: https://github.com/apache/lucene/pull/13944#issuecomment-2437549776
Can you add an entry to `lucene/CHANGES.txt` under version 10.1.0? Then I'll
merge.
jpountz merged PR #13950:
URL: https://github.com/apache/lucene/pull/13950
jpountz merged PR #13955:
URL: https://github.com/apache/lucene/pull/13955
jpountz merged PR #13956:
URL: https://github.com/apache/lucene/pull/13956
jpountz opened a new pull request, #13957:
URL: https://github.com/apache/lucene/pull/13957
`LeafSimScorer` is a specialization of a `SimScorer` for a given segment. It
doesn't add much value, but benchmarks suggest that it adds measurable overhead
to queries sorted by score.
Here is
iverase commented on code in PR #13948:
URL: https://github.com/apache/lucene/pull/13948#discussion_r1816323891
##
lucene/core/src/java/org/apache/lucene/store/RandomAccessInputDataInput.java:
##
@@ -0,0 +1,104 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under on
iverase commented on code in PR #13948:
URL: https://github.com/apache/lucene/pull/13948#discussion_r1816322441
##
lucene/core/src/java/org/apache/lucene/store/RandomAccessInputDataInput.java:
##
@@ -0,0 +1,104 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under on
iverase commented on code in PR #13948:
URL: https://github.com/apache/lucene/pull/13948#discussion_r1816321990
##
lucene/core/src/java/org/apache/lucene/index/BinaryDocValues.java:
##
@@ -33,4 +34,15 @@ protected BinaryDocValues() {}
* @return binary value
*/
public
vsop-479 commented on PR #13915:
URL: https://github.com/apache/lucene/pull/13915#issuecomment-2437267763
I will close it, since it is insignificant.
vsop-479 closed pull request #13915: Early reset scratchBytes in
Lucene90BlockTreeTermsWriter.compileIndex.
URL: https://github.com/apache/lucene/pull/13915
ljak commented on PR #13944:
URL: https://github.com/apache/lucene/pull/13944#issuecomment-2435609721
Ha, I see. Could we say that the new `List orderedQueries` would have
the same behavior that `Query[] disjuncts` had before
https://github.com/apache/lucene/pull/110/files ? If yes, I presume i
mikemccand commented on code in PR #13950:
URL: https://github.com/apache/lucene/pull/13950#discussion_r1814888763
##
lucene/core/src/java/org/apache/lucene/search/BooleanQuery.java:
##
@@ -87,6 +87,28 @@ public Builder add(BooleanClause clause) {
return this;
}
+
shatejas commented on issue #13920:
URL: https://github.com/apache/lucene/issues/13920#issuecomment-2435944343
> @shatejas I think all the required details are present, so are you going
to raise a PR for this?
Yeah I am working on it, I have the changes and I am trying to figure out
jpountz commented on PR #13944:
URL: https://github.com/apache/lucene/pull/13944#issuecomment-2435611867
Yes, exactly.
jpountz commented on code in PR #13950:
URL: https://github.com/apache/lucene/pull/13950#discussion_r1815173658
##
lucene/core/src/java/org/apache/lucene/search/BooleanQuery.java:
##
@@ -136,20 +158,20 @@ public List<BooleanClause> clauses() {
}
/** Return the collection of queries for
yugushihuang commented on PR #13572:
URL: https://github.com/apache/lucene/pull/13572#issuecomment-2435780436
We have measured performance using
[knnPerfTest.py](https://github.com/mikemccand/luceneutil/blob/main/src/python/knnPerfTest.py)
in lucene util with this PR
[commit](https://githu
jpountz commented on code in PR #13899:
URL: https://github.com/apache/lucene/pull/13899#discussion_r1815247300
##
lucene/core/src/java/org/apache/lucene/search/IndexSortSortedNumericDocValuesRangeQuery.java:
##
@@ -186,10 +186,44 @@ public boolean isCacheable(LeafReaderContext
LuXugang merged PR #13899:
URL: https://github.com/apache/lucene/pull/13899
LuXugang closed issue #13890: Check ahead of time if the `count` can be obtained
URL: https://github.com/apache/lucene/issues/13890