Re: [PR] Re-use information from graph traversal during exact search [lucene]

2024-02-28 Thread via GitHub
kaivalnp closed pull request #12820: Re-use information from graph traversal during exact search URL: https://github.com/apache/lucene/pull/12820

Re: [PR] Re-use information from graph traversal during exact search [lucene]

2024-02-22 Thread via GitHub
kaivalnp commented on PR #12820: URL: https://github.com/apache/lucene/pull/12820#issuecomment-1959568841 Thanks for checking @benwtrent! We primarily improve cases of using a high topK + a selective filter (good rate of fallback, large number of duplicate computations). I notice ~5%

Re: [PR] Re-use information from graph traversal during exact search [lucene]

2024-02-21 Thread via GitHub
benwtrent commented on PR #12820: URL: https://github.com/apache/lucene/pull/12820#issuecomment-1957923215 I have done some more benchmarking and there isn't really a significant improvement. This is over 500k, 1024 vectors. Getting the nearest 500 neighbors. Baseline ``` late

Re: [PR] Re-use information from graph traversal during exact search [lucene]

2024-01-08 Thread via GitHub
github-actions[bot] commented on PR #12820: URL: https://github.com/apache/lucene/pull/12820#issuecomment-1880899839 This PR has not had activity in the past 2 weeks, labeling it as stale. If the PR is waiting for review, notify the d...@lucene.apache.org list. Thank you for your contributi

Re: [PR] Re-use information from graph traversal during exact search [lucene]

2023-12-07 Thread via GitHub
benwtrent commented on code in PR #12820: URL: https://github.com/apache/lucene/pull/12820#discussion_r1419374932 ## lucene/core/src/java/org/apache/lucene/search/AbstractKnnCollector.java: ## @@ -66,4 +69,19 @@ public final int k() { @Override public abstract TopDocs to

Re: [PR] Re-use information from graph traversal during exact search [lucene]

2023-12-07 Thread via GitHub
benwtrent commented on code in PR #12820: URL: https://github.com/apache/lucene/pull/12820#discussion_r1419363698 ## lucene/core/src/java/org/apache/lucene/search/AbstractKnnCollector.java: ## @@ -66,4 +69,19 @@ public final int k() { @Override public abstract TopDocs to

Re: [PR] Re-use information from graph traversal during exact search [lucene]

2023-11-20 Thread via GitHub
kaivalnp commented on PR #12820: URL: https://github.com/apache/lucene/pull/12820#issuecomment-1819930817 Yes, the restrictive filter will cause more fallbacks to `#exactSearch`, and the high `topK` will mean more visitation = saving more on duplicate work > So we see a 5-10% improvem

Re: [PR] Re-use information from graph traversal during exact search [lucene]

2023-11-20 Thread via GitHub
kaivalnp commented on code in PR #12820: URL: https://github.com/apache/lucene/pull/12820#discussion_r1399830085 ## lucene/join/src/java/org/apache/lucene/search/join/DiversifyingChildrenByteKnnVectorQuery.java: ## @@ -158,59 +95,4 @@ public int hashCode() { result = 31 * r

Re: [PR] Re-use information from graph traversal during exact search [lucene]

2023-11-20 Thread via GitHub
kaivalnp commented on code in PR #12820: URL: https://github.com/apache/lucene/pull/12820#discussion_r1399829891 ## lucene/core/src/java/org/apache/lucene/search/KnnFloatVectorQuery.java: ## @@ -76,11 +73,11 @@ public KnnFloatVectorQuery(String field, float[] target, int k, Que

Re: [PR] Re-use information from graph traversal during exact search [lucene]

2023-11-20 Thread via GitHub
kaivalnp commented on code in PR #12820: URL: https://github.com/apache/lucene/pull/12820#discussion_r1399829461 ## lucene/core/src/java/org/apache/lucene/search/AbstractKnnVectorQuery.java: ## @@ -171,33 +181,23 @@ protected TopDocs exactSearch(LeafReaderContext context, DocId

Re: [PR] Re-use information from graph traversal during exact search [lucene]

2023-11-20 Thread via GitHub
kaivalnp commented on code in PR #12820: URL: https://github.com/apache/lucene/pull/12820#discussion_r1399829328 ## lucene/core/src/java/org/apache/lucene/search/AbstractKnnVectorQuery.java: ## @@ -155,14 +159,20 @@ protected boolean match(int doc) { } } - protected a

Re: [PR] Re-use information from graph traversal during exact search [lucene]

2023-11-20 Thread via GitHub
kaivalnp commented on code in PR #12820: URL: https://github.com/apache/lucene/pull/12820#discussion_r1399829104 ## lucene/core/src/java/org/apache/lucene/search/AbstractKnnVectorQuery.java: ## @@ -109,32 +109,36 @@ private TopDocs getLeafResults(LeafReaderContext ctx, Weight f

Re: [PR] Re-use information from graph traversal during exact search [lucene]

2023-11-20 Thread via GitHub
vigyasharma commented on code in PR #12820: URL: https://github.com/apache/lucene/pull/12820#discussion_r1399533915 ## lucene/core/src/java/org/apache/lucene/search/AbstractKnnVectorQuery.java: ## @@ -155,14 +159,20 @@ protected boolean match(int doc) { } } - protecte

Re: [PR] Re-use information from graph traversal during exact search [lucene]

2023-11-17 Thread via GitHub
kaivalnp commented on PR #12820: URL: https://github.com/apache/lucene/pull/12820#issuecomment-1816720340 Thanks @jpountz! I realised something from your comment: My current implementation has a flaw, because it cannot handle the [`OrdinalTranslatedKnnCollector`](https://github.com/ka

Re: [PR] Re-use information from graph traversal during exact search [lucene]

2023-11-16 Thread via GitHub
jpountz commented on PR #12820: URL: https://github.com/apache/lucene/pull/12820#issuecomment-1815358559 This is an interesting idea. Ideally we would figure out up-front whether it's best to use the graph or not, but I can also imagine that we can't always make the right decision there, so

[PR] Re-use information from graph traversal during exact search [lucene]

2023-11-16 Thread via GitHub
kaivalnp opened a new pull request, #12820: URL: https://github.com/apache/lucene/pull/12820 ### Description In KNN queries with a pre-filter, we first perform an approximate graph search and then [fallback](https://github.com/apache/lucene/blob/main/lucene/core/src/java/org/apache/l
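
The idea discussed in this thread (keeping the scores computed during graph traversal so that a later fallback to exact search does not recompute them) can be illustrated with a minimal, self-contained sketch. This is not the code from the PR and does not use Lucene's API: the class `ScoreReuseSketch`, its methods, the `Hit` holder, and the `computations` counter are all hypothetical names, and the graph traversal is replaced by a plain list of visited doc ids.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.PriorityQueue;

// Hypothetical illustration only; none of these names come from Lucene or the PR.
public class ScoreReuseSketch {
    /** Counts similarity computations so the saving is observable. */
    static int computations = 0;

    /** Simple (doc, score) pair. */
    static final class Hit {
        final int doc;
        final float score;
        Hit(int doc, float score) { this.doc = doc; this.score = score; }
    }

    static float dotProduct(float[] a, float[] b) {
        computations++;
        float sum = 0f;
        for (int i = 0; i < a.length; i++) {
            sum += a[i] * b[i];
        }
        return sum;
    }

    /** Stand-in for the approximate phase: scores the visited docs and remembers the results. */
    static Map<Integer, Float> traverse(float[][] vectors, float[] query, int[] visited) {
        Map<Integer, Float> cache = new HashMap<>();
        for (int doc : visited) {
            cache.put(doc, dotProduct(vectors[doc], query));
        }
        return cache;
    }

    /** Exact search over the filtered docs; a score is computed only if it is not cached. */
    static int[] exactSearch(float[][] vectors, float[] query, int[] accepted, int k,
                             Map<Integer, Float> cache) {
        // Min-heap on score: the weakest of the current top-k sits at the head.
        PriorityQueue<Hit> heap = new PriorityQueue<>((x, y) -> Float.compare(x.score, y.score));
        for (int doc : accepted) {
            Float cached = cache.get(doc);
            float score = (cached != null) ? cached : dotProduct(vectors[doc], query);
            heap.offer(new Hit(doc, score));
            if (heap.size() > k) {
                heap.poll(); // drop the current weakest hit
            }
        }
        // Drain the heap from weakest to strongest, filling the result back to front.
        int[] top = new int[heap.size()];
        for (int i = top.length - 1; i >= 0; i--) {
            top[i] = heap.poll().doc;
        }
        return top;
    }
}
```

In this sketch the saving grows with the overlap between the docs visited during traversal and the docs accepted by the filter, which matches the observation in the thread that a selective filter combined with a high `topK` is the case that benefits most.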