seanmacavaney closed pull request #13635: Add AbstractKnnVectorQuery.seed for
seeded HNSW
URL: https://github.com/apache/lucene/pull/13635
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specifi
seanmacavaney commented on PR #13635:
URL: https://github.com/apache/lucene/pull/13635#issuecomment-2602464296
closing due to #14084
github-actions[bot] commented on PR #13635:
URL: https://github.com/apache/lucene/pull/13635#issuecomment-2569959328
This PR has not had activity in the past 2 weeks, labeling it as stale. If
the PR is waiting for review, notify the d...@lucene.apache.org list. Thank you
for your contributi
benwtrent opened a new pull request, #14084:
URL: https://github.com/apache/lucene/pull/14084
This is a continuation of https://github.com/apache/lucene/pull/13635
### Description
This PR addresses #13634.
The main changes are in:
- A new "seeded" focused knn collecto
seanmacavaney commented on PR #13635:
URL: https://github.com/apache/lucene/pull/13635#issuecomment-2556076932
Hey @benwtrent -- it's been on my todo list to get back to this, but I've
gotten bogged down with a bunch of other stuff.
If you're willing, please do go ahead and refactor :
benwtrent commented on PR #13635:
URL: https://github.com/apache/lucene/pull/13635#issuecomment-2554686592
@seanmacavaney do you still want this contributed to Apache Lucene? It's
excellent work and I don't want it dying on the vine.
If it's OK with you, I plan on refactoring it (with a
github-actions[bot] commented on PR #13635:
URL: https://github.com/apache/lucene/pull/13635#issuecomment-2448727288
This PR has not had activity in the past 2 weeks, labeling it as stale. If
the PR is waiting for review, notify the d...@lucene.apache.org list. Thank you
for your contributi
seanmacavaney commented on PR #13635:
URL: https://github.com/apache/lucene/pull/13635#issuecomment-2417194138
Not at all-- thanks a lot for the help @benwtrent! I totally agree with the
proposed changes and it's clear how to move forward on this. I'm just occupied
with several other priori
benwtrent commented on PR #13635:
URL: https://github.com/apache/lucene/pull/13635#issuecomment-2417118146
Hey @seanmacavaney, I didn't want this to die on the vine. I think with some
refactoring and adding new experimental queries, this could be a nice
experimental feature for vector search.
cpoerschke commented on code in PR #13635:
URL: https://github.com/apache/lucene/pull/13635#discussion_r1778438135
##
lucene/core/src/java/org/apache/lucene/search/AbstractKnnVectorQuery.java:
##
@@ -156,6 +196,55 @@ private TopDocs getLeafResults(
}
}
+ private DocId
cpoerschke commented on code in PR #13635:
URL: https://github.com/apache/lucene/pull/13635#discussion_r1778462654
##
lucene/core/src/test/org/apache/lucene/document/TestManyKnnDocs.java:
##
@@ -46,27 +54,139 @@ public void testLargeSegment() throws Exception {
mp.setMaxMer
benwtrent commented on PR #13635:
URL: https://github.com/apache/lucene/pull/13635#issuecomment-2379026129
Thinking more and more, I do not like the idea of adding to the leaf
function definition.
But, this does seem useful. I think we can attach it to kNN collectors.
I don't
cpoerschke commented on code in PR #13635:
URL: https://github.com/apache/lucene/pull/13635#discussion_r1778417770
##
lucene/core/src/test/org/apache/lucene/search/BaseKnnVectorQueryTestCase.java:
##
@@ -608,6 +614,94 @@ public void testRandomWithFilter() throws IOException {
cpoerschke commented on code in PR #13635:
URL: https://github.com/apache/lucene/pull/13635#discussion_r1778399477
##
lucene/core/src/test/org/apache/lucene/search/BaseKnnVectorQueryTestCase.java:
##
@@ -608,6 +614,94 @@ public void testRandomWithFilter() throws IOException {
cpoerschke commented on code in PR #13635:
URL: https://github.com/apache/lucene/pull/13635#discussion_r1778468665
##
lucene/core/src/test/org/apache/lucene/document/TestManyKnnDocs.java:
##
@@ -46,27 +54,139 @@ public void testLargeSegment() throws Exception {
mp.setMaxMer
cpoerschke commented on code in PR #13635:
URL: https://github.com/apache/lucene/pull/13635#discussion_r1778463376
##
lucene/core/src/test/org/apache/lucene/document/TestManyKnnDocs.java:
##
@@ -46,27 +54,139 @@ public void testLargeSegment() throws Exception {
mp.setMaxMer
cpoerschke commented on code in PR #13635:
URL: https://github.com/apache/lucene/pull/13635#discussion_r1778453629
##
lucene/core/src/java/org/apache/lucene/search/AbstractKnnVectorQuery.java:
##
@@ -156,6 +196,49 @@ private TopDocs getLeafResults(
}
}
+ private DocId
cpoerschke commented on code in PR #13635:
URL: https://github.com/apache/lucene/pull/13635#discussion_r1778436381
##
lucene/core/src/java/org/apache/lucene/search/AbstractKnnVectorQuery.java:
##
@@ -156,6 +196,55 @@ private TopDocs getLeafResults(
}
}
+ private DocId
cpoerschke commented on code in PR #13635:
URL: https://github.com/apache/lucene/pull/13635#discussion_r1778422918
##
lucene/core/src/test/org/apache/lucene/search/BaseKnnVectorQueryTestCase.java:
##
@@ -608,6 +614,94 @@ public void testRandomWithFilter() throws IOException {
cpoerschke commented on code in PR #13635:
URL: https://github.com/apache/lucene/pull/13635#discussion_r1778419593
##
lucene/core/src/test/org/apache/lucene/search/BaseKnnVectorQueryTestCase.java:
##
@@ -608,6 +614,94 @@ public void testRandomWithFilter() throws IOException {
cpoerschke commented on code in PR #13635:
URL: https://github.com/apache/lucene/pull/13635#discussion_r1778412763
##
lucene/core/src/test/org/apache/lucene/search/BaseKnnVectorQueryTestCase.java:
##
@@ -608,6 +614,94 @@ public void testRandomWithFilter() throws IOException {
seanmacavaney commented on code in PR #13635:
URL: https://github.com/apache/lucene/pull/13635#discussion_r1778379250
##
lucene/backward-codecs/src/java/org/apache/lucene/backward_codecs/lucene94/Lucene94HnswVectorsReader.java:
##
@@ -283,11 +289,17 @@ public void search(String
cpoerschke commented on code in PR #13635:
URL: https://github.com/apache/lucene/pull/13635#discussion_r1778218742
##
lucene/core/src/java/org/apache/lucene/search/AbstractKnnVectorQuery.java:
##
@@ -156,6 +196,55 @@ private TopDocs getLeafResults(
}
}
+ private DocId
cpoerschke commented on code in PR #13635:
URL: https://github.com/apache/lucene/pull/13635#discussion_r1778283453
##
lucene/test-framework/src/java/org/apache/lucene/tests/index/MismatchedLeafReader.java:
##
@@ -68,6 +71,28 @@ public CacheHelper getReaderCacheHelper() {
re
cpoerschke commented on code in PR #13635:
URL: https://github.com/apache/lucene/pull/13635#discussion_r1778255372
##
lucene/core/src/java/org/apache/lucene/codecs/lucene99/Lucene99HnswVectorsReader.java:
##
@@ -247,7 +248,12 @@ public ByteVectorValues getByteVectorValues(String
cpoerschke commented on code in PR #13635:
URL: https://github.com/apache/lucene/pull/13635#discussion_r1778254222
##
lucene/backward-codecs/src/java/org/apache/lucene/backward_codecs/lucene94/Lucene94HnswVectorsReader.java:
##
@@ -283,11 +289,17 @@ public void search(String fie
cpoerschke commented on code in PR #13635:
URL: https://github.com/apache/lucene/pull/13635#discussion_r1778226954
##
lucene/core/src/java/org/apache/lucene/search/KnnByteVectorQuery.java:
##
@@ -133,4 +149,18 @@ public int hashCode() {
public byte[] getTargetCopy() {
re
cpoerschke commented on code in PR #13635:
URL: https://github.com/apache/lucene/pull/13635#discussion_r1778221739
##
lucene/core/src/java/org/apache/lucene/search/AbstractKnnVectorQuery.java:
##
@@ -156,6 +196,55 @@ private TopDocs getLeafResults(
}
}
+ private DocId
cpoerschke commented on code in PR #13635:
URL: https://github.com/apache/lucene/pull/13635#discussion_r1778215992
##
lucene/core/src/java/org/apache/lucene/search/AbstractKnnVectorQuery.java:
##
@@ -156,6 +196,55 @@ private TopDocs getLeafResults(
}
}
+ private DocId
cpoerschke commented on code in PR #13635:
URL: https://github.com/apache/lucene/pull/13635#discussion_r1778215267
##
lucene/core/src/java/org/apache/lucene/search/AbstractKnnVectorQuery.java:
##
@@ -156,6 +196,55 @@ private TopDocs getLeafResults(
}
}
+ private DocId
seanmacavaney commented on PR #13635:
URL: https://github.com/apache/lucene/pull/13635#issuecomment-2376563025
Thanks for the comments @cpoerschke. They should all be addressed now.
seanmacavaney commented on PR #13635:
URL: https://github.com/apache/lucene/pull/13635#issuecomment-2376566988
Thanks @benwtrent!
> One thing that still bugs me is attaching yet another parameter to a
> common method. Especially a parameter that will be rarely used (my gut reaction
> is
github-actions[bot] commented on PR #13635:
URL: https://github.com/apache/lucene/pull/13635#issuecomment-2375493422
This PR has not had activity in the past 2 weeks, labeling it as stale. If
the PR is waiting for review, notify the d...@lucene.apache.org list. Thank you
for your contributi
benwtrent commented on code in PR #13635:
URL: https://github.com/apache/lucene/pull/13635#discussion_r1755241203
##
lucene/core/src/java/org/apache/lucene/util/hnsw/HnswGraphSearcher.java:
##
@@ -70,6 +72,44 @@ public static void search(
search(scorer, knnCollector, graph,
cpoerschke commented on code in PR #13635:
URL: https://github.com/apache/lucene/pull/13635#discussion_r1755202681
##
lucene/core/src/java/org/apache/lucene/util/hnsw/HnswGraphSearcher.java:
##
@@ -70,6 +72,44 @@ public static void search(
search(scorer, knnCollector, graph
cpoerschke commented on code in PR #13635:
URL: https://github.com/apache/lucene/pull/13635#discussion_r1755190170
##
lucene/core/src/java/org/apache/lucene/util/hnsw/HnswGraphSearcher.java:
##
@@ -70,6 +72,44 @@ public static void search(
search(scorer, knnCollector, graph
cpoerschke commented on code in PR #13635:
URL: https://github.com/apache/lucene/pull/13635#discussion_r1755183703
##
lucene/core/src/java/org/apache/lucene/util/hnsw/HnswGraphSearcher.java:
##
@@ -70,6 +72,44 @@ public static void search(
search(scorer, knnCollector, graph
cpoerschke commented on code in PR #13635:
URL: https://github.com/apache/lucene/pull/13635#discussion_r1755173124
##
lucene/core/src/java/org/apache/lucene/index/LeafReader.java:
##
@@ -295,7 +302,13 @@ public final TopDocs searchNearestVectors(
* @lucene.experimental
*
cpoerschke commented on code in PR #13635:
URL: https://github.com/apache/lucene/pull/13635#discussion_r1755172655
##
lucene/core/src/java/org/apache/lucene/index/LeafReader.java:
##
@@ -251,7 +252,13 @@ public final PostingsEnum postings(Term term) throws IOException {
* @
cpoerschke commented on code in PR #13635:
URL: https://github.com/apache/lucene/pull/13635#discussion_r1755169292
##
lucene/core/src/java/org/apache/lucene/index/ByteVectorValues.java:
##
@@ -86,4 +86,16 @@ public static void checkField(LeafReader in, String field) {
* @ret
cpoerschke commented on code in PR #13635:
URL: https://github.com/apache/lucene/pull/13635#discussion_r1755168432
##
lucene/core/src/java/org/apache/lucene/index/FloatVectorValues.java:
##
@@ -87,4 +87,16 @@ public static void checkField(LeafReader in, String field) {
* @re
cpoerschke commented on code in PR #13635:
URL: https://github.com/apache/lucene/pull/13635#discussion_r1755161198
##
lucene/core/src/java/org/apache/lucene/search/AbstractKnnVectorQuery.java:
##
@@ -156,6 +196,58 @@ private TopDocs getLeafResults(
}
}
+ private DocId
cpoerschke commented on code in PR #13635:
URL: https://github.com/apache/lucene/pull/13635#discussion_r1755158934
##
lucene/core/src/java/org/apache/lucene/search/KnnFloatVectorQuery.java:
##
@@ -136,4 +152,18 @@ public int hashCode() {
public float[] getTargetCopy() {
cpoerschke commented on code in PR #13635:
URL: https://github.com/apache/lucene/pull/13635#discussion_r1755158656
##
lucene/core/src/java/org/apache/lucene/search/KnnByteVectorQuery.java:
##
@@ -133,4 +149,18 @@ public int hashCode() {
public byte[] getTargetCopy() {
re
cpoerschke commented on code in PR #13635:
URL: https://github.com/apache/lucene/pull/13635#discussion_r1754517493
##
lucene/core/src/java/org/apache/lucene/search/AbstractKnnVectorQuery.java:
##
@@ -491,4 +594,44 @@ public int hashCode() {
classHash(), contextIdentit
cpoerschke commented on code in PR #13635:
URL: https://github.com/apache/lucene/pull/13635#discussion_r1754492064
##
lucene/core/src/java/org/apache/lucene/search/AbstractKnnVectorQuery.java:
##
@@ -156,6 +196,58 @@ private TopDocs getLeafResults(
}
}
+ private DocId
benwtrent commented on PR #13635:
URL: https://github.com/apache/lucene/pull/13635#issuecomment-2342018139
I haven't been able to review fully, but will soon.
I like this idea. Do you have any scripts & test data where you have tested
that this Lucene implementation works and gives a
benwtrent commented on code in PR #13635:
URL: https://github.com/apache/lucene/pull/13635#discussion_r1712183885
##
lucene/core/src/java/org/apache/lucene/util/hnsw/HnswGraphSearcher.java:
##
@@ -70,6 +72,43 @@ public static void search(
search(scorer, knnCollector, graph,
cpoerschke commented on code in PR #13635:
URL: https://github.com/apache/lucene/pull/13635#discussion_r1707450580
##
lucene/core/src/java/org/apache/lucene/search/AbstractKnnVectorQuery.java:
##
@@ -156,6 +196,44 @@ private TopDocs getLeafResults(
}
}
+ private DocId
cpoerschke commented on code in PR #13635:
URL: https://github.com/apache/lucene/pull/13635#discussion_r1707422464
##
lucene/core/src/java/org/apache/lucene/search/AbstractKnnVectorQuery.java:
##
@@ -491,4 +580,44 @@ public int hashCode() {
classHash(), contextIdentit
cpoerschke commented on code in PR #13635:
URL: https://github.com/apache/lucene/pull/13635#discussion_r1707421233
##
lucene/core/src/java/org/apache/lucene/search/AbstractKnnVectorQuery.java:
##
@@ -156,6 +196,44 @@ private TopDocs getLeafResults(
}
}
+ private DocId
cpoerschke commented on code in PR #13635:
URL: https://github.com/apache/lucene/pull/13635#discussion_r1707372290
##
lucene/core/src/java/org/apache/lucene/util/hnsw/HnswGraphSearcher.java:
##
@@ -70,6 +72,43 @@ public static void search(
search(scorer, knnCollector, graph
cpoerschke commented on code in PR #13635:
URL: https://github.com/apache/lucene/pull/13635#discussion_r1707371348
##
lucene/core/src/java/org/apache/lucene/util/hnsw/HnswGraphSearcher.java:
##
@@ -70,6 +72,43 @@ public static void search(
search(scorer, knnCollector, graph
cpoerschke commented on code in PR #13635:
URL: https://github.com/apache/lucene/pull/13635#discussion_r1707338145
##
lucene/core/src/java/org/apache/lucene/search/KnnByteVectorQuery.java:
##
@@ -72,14 +72,30 @@ public KnnByteVectorQuery(String field, byte[] target, int k) {
cpoerschke commented on code in PR #13635:
URL: https://github.com/apache/lucene/pull/13635#discussion_r1707073496
##
lucene/core/src/java/org/apache/lucene/codecs/KnnVectorsReader.java:
##
@@ -82,9 +83,16 @@ protected KnnVectorsReader() {}
* @param knnCollector a KnnResults
cpoerschke commented on code in PR #13635:
URL: https://github.com/apache/lucene/pull/13635#discussion_r1707074017
##
lucene/core/src/java/org/apache/lucene/codecs/KnnVectorsReader.java:
##
@@ -110,9 +118,16 @@ public abstract void search(
* @param knnCollector a KnnResults
seanmacavaney commented on PR #13635:
URL: https://github.com/apache/lucene/pull/13635#issuecomment-2271353524
> I could see it being very nice, or behaving poorly depending on the seed
> query (which, I guess, is expected).
We could probably predict whether a seed set is good or bad bas
seanmacavaney commented on code in PR #13635:
URL: https://github.com/apache/lucene/pull/13635#discussion_r1705579214
##
lucene/core/src/java/org/apache/lucene/util/hnsw/HnswGraphSearcher.java:
##
@@ -70,6 +72,43 @@ public static void search(
search(scorer, knnCollector, gr
seanmacavaney commented on code in PR #13635:
URL: https://github.com/apache/lucene/pull/13635#discussion_r1705575607
##
lucene/core/src/java/org/apache/lucene/search/AbstractKnnVectorQuery.java:
##
@@ -156,6 +189,44 @@ private TopDocs getLeafResults(
}
}
+ private Do
benwtrent commented on code in PR #13635:
URL: https://github.com/apache/lucene/pull/13635#discussion_r1705418758
##
lucene/core/src/java/org/apache/lucene/search/AbstractKnnVectorQuery.java:
##
@@ -156,6 +189,44 @@ private TopDocs getLeafResults(
}
}
+ private DocIdS
benwtrent commented on code in PR #13635:
URL: https://github.com/apache/lucene/pull/13635#discussion_r1705409803
##
lucene/core/src/java/org/apache/lucene/util/hnsw/HnswGraphSearcher.java:
##
@@ -70,6 +72,43 @@ public static void search(
search(scorer, knnCollector, graph,
seanmacavaney opened a new pull request, #13635:
URL: https://github.com/apache/lucene/pull/13635
### Description
This PR addresses #13634.
The main changes are in:
- `AbstractKnnVectorQuery`, which adds a `seed` field. It scores this query
if provided, and passes these see