shubhamvishu commented on issue #12675:
URL: https://github.com/apache/lucene/issues/12675#issuecomment-1763281638
@jpountz I have raised PR #12682 with the fix to
`MultiSimilarity.MultiSimScorer` and some other candidate scorers I could find
with a similar issue.
--
This is an automated
shubhamvishu opened a new pull request, #12682:
URL: https://github.com/apache/lucene/pull/12682
### Description
Addresses #12675. Along with `MultiSimilarity.MultiSimScorer`, I found some
other candidate scorer implementations for this fix.
rmuir commented on PR #12681:
URL: https://github.com/apache/lucene/pull/12681#issuecomment-1763220499
This also makes this test reproducible from random seed regardless of the
hardware, as `SPECIES_PREFERRED` is not used at all in tests. From a test
perspective, it is like a forbidden-api.
rmuir commented on PR #12681:
URL: https://github.com/apache/lucene/pull/12681#issuecomment-1763217937
for me, when investigating a modification, this works easily enough:
```console
$ for bits in 128 256 512; do ./gradlew -p lucene/core test --tests
TestVectorUtilSupport -Dtests.
rmuir commented on PR #12681:
URL: https://github.com/apache/lucene/pull/12681#issuecomment-1763212967
@uschindler I did the 'fast integer vectors' override differently, and
configured the build to randomize the vector size used for testing.
So it still does the same thing it was doin
rmuir commented on PR #12681:
URL: https://github.com/apache/lucene/pull/12681#issuecomment-1763199813
I tried it out, making species `final` instead of `static final`.
performance completely falls apart, slower than scalar impl even. it is a
non-option... We should keep everything here sta
rmuir commented on PR #12681:
URL: https://github.com/apache/lucene/pull/12681#issuecomment-1763190528
> This can be done in the same way like the "testMode" flag, we should just
extend it to cover more cases. You could also pass an override for the bit size
instead of true/false.
>
uschindler commented on PR #12681:
URL: https://github.com/apache/lucene/pull/12681#issuecomment-1763180967
> We have to think about testing. I don't want to rely upon various hardware
for correctness. I think there's a way to alter the code so that we can test
the correctness of everything
rmuir commented on PR #12681:
URL: https://github.com/apache/lucene/pull/12681#issuecomment-1763173911
Here's the diff of just the commit for this change:
https://github.com/apache/lucene/pull/12681/commits/3ec9c26d672262762f4213c827699bf735409eeb
rmuir opened a new pull request, #12681:
URL: https://github.com/apache/lucene/pull/12681
This builds on https://github.com/apache/lucene/pull/12680 so please review
that one first to make it easier. The advantage there is we split out vector
kernels into smaller manageable methods, making
shubhamvishu commented on code in PR #12679:
URL: https://github.com/apache/lucene/pull/12679#discussion_r1359624510
##
lucene/core/src/java/org/apache/lucene/search/AbstractRnnVectorQuery.java:
##
Review Comment:
Let's add some tests for these going forward?
shubhamvishu commented on PR #12679:
URL: https://github.com/apache/lucene/pull/12679#issuecomment-1763170985
Thanks for adding this @kaivalnp! The idea makes sense to me, looking
forward to the benchmark results. I left some minor comments. Sharing some
thoughts below:
1. Is it ri
shubhamvishu commented on code in PR #12679:
URL: https://github.com/apache/lucene/pull/12679#discussion_r1359606449
##
lucene/core/src/java/org/apache/lucene/search/AbstractRnnVectorQuery.java:
##
@@ -0,0 +1,105 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under
zhaih commented on code in PR #12657:
URL: https://github.com/apache/lucene/pull/12657#discussion_r1359606481
##
lucene/core/src/java/org/apache/lucene/codecs/lucene95/IncrementalHnswGraphMerger.java:
##
@@ -0,0 +1,189 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF)
zhaih commented on code in PR #12657:
URL: https://github.com/apache/lucene/pull/12657#discussion_r1359590587
##
lucene/core/src/java/org/apache/lucene/codecs/lucene95/IncrementalHnswGraphMerger.java:
##
@@ -0,0 +1,189 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF)
zhaih commented on code in PR #12651:
URL: https://github.com/apache/lucene/pull/12651#discussion_r1359572233
##
lucene/core/src/java/org/apache/lucene/util/hnsw/OnHeapHnswGraph.java:
##
@@ -163,45 +185,66 @@ public NodesIterator getNodesOnLevel(int level) {
if (level == 0)
zhaih commented on code in PR #12651:
URL: https://github.com/apache/lucene/pull/12651#discussion_r1359568778
##
lucene/core/src/java/org/apache/lucene/util/hnsw/OnHeapHnswGraph.java:
##
@@ -40,31 +41,39 @@ public final class OnHeapHnswGraph extends HnswGraph
implements Account
rmuir commented on PR #12680:
URL: https://github.com/apache/lucene/pull/12680#issuecomment-1763090784
cosine() ones cleaned up now too. I don't see a perf issue with the array:
guess this whole shebang relies on escape analysis anyway.
Shibi-bala commented on issue #12637:
URL: https://github.com/apache/lucene/issues/12637#issuecomment-1763086904
Yeah exactly. I'd say `userData` isn't metadata so it should get replaced as
well.
shubhamvishu commented on code in PR #12671:
URL: https://github.com/apache/lucene/pull/12671#discussion_r1359519018
##
lucene/core/src/java/org/apache/lucene/search/DoubleValuesSource.java:
##
@@ -43,6 +40,9 @@
* {@link #fromScorer(Scorable)} and passing the resulting DoubleV
shubhamvishu commented on PR #12671:
URL: https://github.com/apache/lucene/pull/12671#issuecomment-1763076574
Thanks @gsmiller for the review! My motivation behind this refactoring was
[this
comment](https://github.com/apache/lucene/pull/12548#discussion_r1357027508)
from Mike which indica
zhaih merged PR #12678:
URL: https://github.com/apache/lucene/pull/12678
rmuir commented on PR #12632:
URL: https://github.com/apache/lucene/pull/12632#issuecomment-1763058511
@benwtrent it isn't a panama thing. these functions are 32-bit (they return
`int` and `float`). There is no hope for these getting faster, I just hope you
understand that.
msfroh commented on issue #12032:
URL: https://github.com/apache/lucene/issues/12032#issuecomment-1763058013
I was looking into this, and the fundamental problem seems to be that the
underlying drillsideways scoring implementations (`doQueryFirstScoring`,
`doDrillDownAdvanceScoring`, and `d
rmuir closed issue #12621: Make `byte[]` vector comparisons faster! (if
possible)
URL: https://github.com/apache/lucene/issues/12621
rmuir commented on issue #12621:
URL: https://github.com/apache/lucene/issues/12621#issuecomment-1763056704
From my analysis, the code being generated is correct. I recommend exploring
half-float instead for better performance and space tradeoffs.
rmuir opened a new pull request, #12680:
URL: https://github.com/apache/lucene/pull/12680
Now that we have integrated benchmarks, it is easier to take care of this
code.
This is a pretty straightforward change:
* split out vectorized loops to avoid huge methods (especially integer
msfroh commented on issue #12637:
URL: https://github.com/apache/lucene/issues/12637#issuecomment-1763041651
I was curious about this one, and whether it is a bug or intentional.
I noticed that the `IndexWriter` constructor that calls
`SegmentInfos.replace()` has a comment saying:
mayya-sharipova commented on code in PR #12582:
URL: https://github.com/apache/lucene/pull/12582#discussion_r1359477741
##
lucene/core/src/java/org/apache/lucene/codecs/lucene99/Lucene99ScalarQuantizedVectorsWriter.java:
##
@@ -0,0 +1,782 @@
+/*
+ * Licensed to the Apache Softwa
benwtrent commented on PR #12632:
URL: https://github.com/apache/lucene/pull/12632#issuecomment-1762993804
Thank y'all so much for digging into this @rmuir @gf2121 @ChrisHegarty
@uschindler !
Maybe one day Panama Vector will mature enough to allow us to do nicer things
with `byte` compari
benwtrent commented on code in PR #12651:
URL: https://github.com/apache/lucene/pull/12651#discussion_r1359458954
##
lucene/core/src/java/org/apache/lucene/util/hnsw/OnHeapHnswGraph.java:
##
@@ -163,45 +185,66 @@ public NodesIterator getNodesOnLevel(int level) {
if (level =
rmuir merged PR #12632:
URL: https://github.com/apache/lucene/pull/12632
rmuir commented on PR #12632:
URL: https://github.com/apache/lucene/pull/12632#issuecomment-1762984112
I'm gonna merge this but we should continue to explore the intel case. Not
sure what we can do there though.
rmuir merged PR #12667:
URL: https://github.com/apache/lucene/pull/12667
benwtrent commented on issue #12579:
URL: https://github.com/apache/lucene/issues/12579#issuecomment-1762970584
@kaivalnp one other thing to think about is
https://weaviate.io/blog/weaviate-1-20-release#autocut
I wonder if we could do something similar by dynamically adjusting the
"t
benwtrent commented on issue #12579:
URL: https://github.com/apache/lucene/issues/12579#issuecomment-1762966693
@kaivalnp yes, `KnnCollector` should be used for something like this :).
Glad it's useful!
One of the tricky things I can see is that it's possible that the bottom
layer entr
gsmiller commented on code in PR #12671:
URL: https://github.com/apache/lucene/pull/12671#discussion_r1359402306
##
lucene/core/src/java/org/apache/lucene/search/VectorSimilarityValuesSource.java:
##
@@ -32,6 +33,52 @@ public VectorSimilarityValuesSource(String fieldName) {
kaivalnp opened a new pull request, #12679:
URL: https://github.com/apache/lucene/pull/12679
### Description
Background in #12579
Add support for getting "all vectors within a radius" as opposed to getting
the "topK closest vectors" in the current system
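To make the contrast concrete, here is a minimal brute-force sketch of the two retrieval modes. It is only an illustration and not the implementation in this PR: the class `RadiusVsTopK`, its method names, and the use of Euclidean distance are all assumptions made for the example, and it does not touch Lucene's HNSW graph or any Lucene API.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical illustration only: brute-force search over an in-memory array,
// not Lucene's graph-based search and not the code proposed in the PR.
public class RadiusVsTopK {

  /** Squared Euclidean distance between two equal-length vectors. */
  static float squaredDistance(float[] a, float[] b) {
    float sum = 0f;
    for (int i = 0; i < a.length; i++) {
      float d = a[i] - b[i];
      sum += d * d;
    }
    return sum;
  }

  /** "topK" mode: always returns min(k, n) ids, however far away they are. */
  static List<Integer> topK(float[][] vectors, float[] query, int k) {
    List<Integer> ids = new ArrayList<>();
    for (int i = 0; i < vectors.length; i++) {
      ids.add(i);
    }
    // Sort ids by ascending distance to the query.
    ids.sort(Comparator.comparingDouble((Integer i) -> squaredDistance(vectors[i], query)));
    return new ArrayList<>(ids.subList(0, Math.min(k, ids.size())));
  }

  /** "radius" mode: returns every id within the radius; the count is data-dependent. */
  static List<Integer> withinRadius(float[][] vectors, float[] query, float radius) {
    float radiusSquared = radius * radius;
    List<Integer> hits = new ArrayList<>();
    for (int i = 0; i < vectors.length; i++) {
      if (squaredDistance(vectors[i], query) <= radiusSquared) {
        hits.add(i);
      }
    }
    return hits;
  }

  public static void main(String[] args) {
    float[][] vectors = {{0.1f, 0f}, {0f, 0.2f}, {3f, 4f}, {0.3f, 0.3f}};
    float[] query = {0f, 0f};
    System.out.println("topK(2):      " + topK(vectors, query, 2));
    System.out.println("radius(0.5):  " + withinRadius(vectors, query, 0.5f));
  }
}
```

With a fixed `k`, the caller has to guess how many results are relevant; with a radius, the number of hits follows from the data, which is the behaviour this description asks for.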
### Consideratio
kaivalnp commented on issue #12579:
URL: https://github.com/apache/lucene/issues/12579#issuecomment-1762822602
Thanks @msokolov, this nicely summarizes what I'm trying to say!
> https://typesense.org/docs/0.25.0/api/vector-search.html#distance-threshold
I took a look here: and [
uschindler commented on issue #12307:
URL: https://github.com/apache/lucene/issues/12307#issuecomment-1762822051
If you want to create a classical classpath application that can be started
with `java -jar application.jar` the correct way is to *NOT* package everything
into a fat `applicatio
uschindler commented on issue #12307:
URL: https://github.com/apache/lucene/issues/12307#issuecomment-1762818363
> @uschindler If fat JARs are not supported or recommended with Lucene, what
_is_ the recommended way to deploy a project incorporating Lucene? I cannot
find any resources on thi
uschindler commented on code in PR #12677:
URL: https://github.com/apache/lucene/pull/12677#discussion_r1359206745
##
lucene/core/src/java/org/apache/lucene/internal/vectorization/VectorizationProvider.java:
##
@@ -120,24 +122,22 @@ static VectorizationProvider lookup(boolean te
dweiss commented on code in PR #12677:
URL: https://github.com/apache/lucene/pull/12677#discussion_r1359206557
##
lucene/core/src/java/org/apache/lucene/internal/vectorization/VectorizationProvider.java:
##
@@ -120,24 +122,22 @@ static VectorizationProvider lookup(boolean testMo