vigyasharma commented on PR #13525:
URL: https://github.com/apache/lucene/pull/13525#issuecomment-2466073115
I tried to find some blogs and benchmarks on other library implementations.
Astra Db, Vespa, faiss and nmslib, all seem to support multi-vectors in some
form.
From what I can
msokolov commented on PR #13872:
URL: https://github.com/apache/lucene/pull/13872#issuecomment-2465450838
Yeah I think I will stop working on this. Where I've ended up is with
suspicions about the ScorerSupplier/Scorer setup. Because we allocate Scorers
*a lot* we rely on that being a chea
benchaplin commented on issue #13724:
URL: https://github.com/apache/lucene/issues/13724#issuecomment-2465431906
I've been interested in working on some HNSW stuff so I took a shot at this.
I added only the two checks you mentioned to start, but I'm planning to look
through some issues and
benchaplin opened a new pull request, #13984:
URL: https://github.com/apache/lucene/pull/13984
### Description
This is a first pass on #13724 - I've added the following checks:
- verify neighbor list is in order
- verify no duplicate neighbors
The code was adapted from
[
ChrisHegarty commented on PR #13872:
URL: https://github.com/apache/lucene/pull/13872#issuecomment-2465408658
> [the last
commit](https://github.com/apache/lucene/pull/13872/commits/ef13bade9b3ab8eb5de35e7297a9468ec45fc745)
demonstrates re-using the resource-intensive bits (RandomVectorScor
msokolov commented on code in PR #13872:
URL: https://github.com/apache/lucene/pull/13872#discussion_r1834763594
##
lucene/core/src/java/org/apache/lucene/index/FloatVectorValues.java:
##
@@ -118,6 +123,9 @@ public Floats vectors() {
public float[] get(int ord) throws
msokolov commented on PR #13872:
URL: https://github.com/apache/lucene/pull/13872#issuecomment-2465298380
[the last
commit](https://github.com/apache/lucene/pull/13872/commits/ef13bade9b3ab8eb5de35e7297a9468ec45fc745)
demonstrates re-using the resource-intensive bits (RandomVectorScorer and
jpountz opened a new issue, #13983:
URL: https://github.com/apache/lucene/issues/13983
### Description
I was looking at the `toString()` of a `KnnFloatVectorQuery` and got
surprised that it was not pre-filtered, when I later discovered that it
actually was, `toString()` was just omit
LuXugang opened a new pull request, #13982:
URL: https://github.com/apache/lucene/pull/13982
### Description
The only diff between `doIterate1 ` and `doIterate2 ` is `bb < b.length()`
in `doIterate1 `, but this condition seems always true, so should we remove it?
--
This is an auto
ChrisHegarty commented on PR #13872:
URL: https://github.com/apache/lucene/pull/13872#issuecomment-2465028558
>As a side note, I see that IndexInput is Closeable -- but I don't think we
ever close() the ones we allocate? Is this a problem?
Not a problem per se. The creator of the scor
krickert commented on PR #13525:
URL: https://github.com/apache/lucene/pull/13525#issuecomment-2464836106
I would love to see a single knn field that supports multiple vectors.
Right now I feel like doing the embedded docs or a child collection to handle
these use cases feel a little too
msokolov commented on PR #13872:
URL: https://github.com/apache/lucene/pull/13872#issuecomment-2464749787
OK; re allocating vectors in supplier, yes that can't work because of
aliasing across threads. I also found we do not adequately test this case.
There are *many* things in here that ar
benwtrent commented on PR #13651:
URL: https://github.com/apache/lucene/pull/13651#issuecomment-2464679186
@ShashwatShivam I don't think there is a "memory column" provided anywhere.
I simply looked at the individual file sizes (veb, vex) and summed their sizes
together.
--
This is an au
ChrisHegarty commented on PR #13872:
URL: https://github.com/apache/lucene/pull/13872#issuecomment-2464638368
I am now completely confused by this change. And I think that my last commit
broke the threading model - even though all tests pass!. I'm not saying that we
should not do some refac
ChrisHegarty commented on PR #13872:
URL: https://github.com/apache/lucene/pull/13872#issuecomment-2464599913
Looking again in a bit more detail, I dislike that it is not possible to get
access to the vector values without *always* creating at least one slice.
Previously this was not necess
dungba88 opened a new pull request, #13981:
URL: https://github.com/apache/lucene/pull/13981
### Description
This PR changes the input of vector reader from IndexInput to
RandomAccessInput, as we mostly random-access the vectors with seek() followed
by readFloats().
One thing
ChrisHegarty commented on PR #13872:
URL: https://github.com/apache/lucene/pull/13872#issuecomment-2464539084
@msokolov ok, this is that I was thinking -
[f1e0007](https://github.com/apache/lucene/pull/13872/commits/f1e0007b15245db3a048ebf5bf340e70929f21b7).
There might be other places too,
ChrisHegarty commented on PR #13872:
URL: https://github.com/apache/lucene/pull/13872#issuecomment-2464371689
The threading and reuse model in the current API is subtle, but works well,
as it mostly minimises the potential for garbage.
Looking at this again, it's almost like anywhere
18 matches
Mail list logo