Re: [I] Multi-value Support for KnnVectorField [lucene]

2024-05-13 Thread via GitHub
benwtrent commented on issue #12313: URL: https://github.com/apache/lucene/issues/12313#issuecomment-2108168850 > Not sure I understand option 3. Are you thinking that graph has different types of edges b/w documents based on diff. similarity functions? So if you were using max similarity y

Re: [I] Multi-value Support for KnnVectorField [lucene]

2024-05-13 Thread via GitHub
vigyasharma commented on issue #12313: URL: https://github.com/apache/lucene/issues/12313#issuecomment-2108154488 @benwtrent I like the idea of having documents be vertices in the graph, with an API that lets you iterate/access the different vectors per doc. It would have an indexing time

Re: [I] Multi-value Support for KnnVectorField [lucene]

2024-05-13 Thread via GitHub
benwtrent commented on issue #12313: URL: https://github.com/apache/lucene/issues/12313#issuecomment-2107660631 @vigyasharma @krickert There are a couple of ways to implement this natively in Lucene. 1. Have each individual vector be a connection in the graph with some resolution ba

Re: [I] Multi-value Support for KnnVectorField [lucene]

2024-05-13 Thread via GitHub
vigyasharma commented on issue #12313: URL: https://github.com/apache/lucene/issues/12313#issuecomment-2107568883 > In another scenario, the results would just return the top doc and not repeat it. I believe this is what the parent-block join implementation for vector values does cur

Re: [I] Multi-value Support for KnnVectorField [lucene]

2024-05-11 Thread via GitHub
krickert commented on issue #12313: URL: https://github.com/apache/lucene/issues/12313#issuecomment-2105745361 I was thinking about this and thought this would be cool with a few different use cases for a multi-valued vector: 1. The multi-values are treated the same as the single valu

Re: [I] Multi-value Support for KnnVectorField [lucene]

2024-04-09 Thread via GitHub
benwtrent commented on issue #12313: URL: https://github.com/apache/lucene/issues/12313#issuecomment-2045722371 > if the aggregation is max, would we need to compute distance between n x n vectors and then take the max? Correct, I would even want flexibility between what was used to
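
To make the n x n cost in the exchange above concrete, here is a minimal sketch of max aggregation over two sets of vectors. It assumes dot product as the similarity; the names `dot` and `max_pairwise_similarity` are hypothetical and are not part of any Lucene API.

```python
from typing import Sequence

Vector = Sequence[float]


def dot(a: Vector, b: Vector) -> float:
    """Dot product, used here as an assumed similarity function."""
    return sum(x * y for x, y in zip(a, b))


def max_pairwise_similarity(
    query_vectors: Sequence[Vector], doc_vectors: Sequence[Vector]
) -> float:
    """Score every (query vector, document vector) pair and keep the best.

    With n query vectors and m document vectors this evaluates n * m
    similarities, which is the cost raised in the question.
    """
    return max(dot(q, d) for q in query_vectors for d in doc_vectors)


# Hypothetical data: two query vectors against two document vectors.
query = [[1.0, 0.0], [0.0, 1.0]]
doc = [[0.5, 0.5], [2.0, 0.0]]
best = max_pairwise_similarity(query, doc)
```

Swapping `max` for a sum or mean would give a different aggregation over the same set of pairwise scores.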

Re: [I] Multi-value Support for KnnVectorField [lucene]

2024-04-09 Thread via GitHub
vigyasharma commented on issue #12313: URL: https://github.com/apache/lucene/issues/12313#issuecomment-2045706297 @benwtrent : ++, I've been thinking on similar lines in the context of e-commerce type applications where different vectors represent different aspects of a document. The scorer

Re: [I] Multi-value Support for KnnVectorField [lucene]

2024-04-09 Thread via GitHub
benwtrent commented on issue #12313: URL: https://github.com/apache/lucene/issues/12313#issuecomment-2045604262 I do think things like `ColBERT` would benefit from having multiple vectors for a single document field. One crazy idea I had (others have probably already thought of this,

Re: [I] Multi-value Support for KnnVectorField [lucene]

2024-04-09 Thread via GitHub
alessandrobenedetti commented on issue #12313: URL: https://github.com/apache/lucene/issues/12313#issuecomment-2044685997 Hi, I gave a talk about this at Berlin Buzzwords where I touched on the motivations: https://www.youtube.com/watch?v=KhL0NrGj0uE In short: - multi-valued vectors

Re: [I] Multi-value Support for KnnVectorField [lucene]

2024-04-08 Thread via GitHub
vigyasharma commented on issue #12313: URL: https://github.com/apache/lucene/issues/12313#issuecomment-2044076948 What are some use-cases for multi-valued vectors that are not easily supported using parent-child block joins? I'd like to contribute here, trying to understand our main

Re: [I] Multi-value Support for KnnVectorField [lucene]

2023-11-29 Thread via GitHub
alessandrobenedetti commented on issue #12313: URL: https://github.com/apache/lucene/issues/12313#issuecomment-1831534483 Hi @david-sitsky, the multi-valued vectors contribution to Lucene is now paused for lack of funding. I'll resume it from my side if and when I get some sponsors :)

Re: [I] Multi-value Support for KnnVectorField [lucene]

2023-11-28 Thread via GitHub
david-sitsky commented on issue #12313: URL: https://github.com/apache/lucene/issues/12313#issuecomment-1831197772 > The key issue is document collection. Right now, the `topK` is limited to only `topK` children documents. Really, what you want is the `topK` parent documents based on childr
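
The difference between `topK` children and `topK` parents can be illustrated with a small sketch. It assumes child hits arrive as `(parent_id, score)` pairs and that a parent's score is its best child score; the function name `top_k_parents` and the data are hypothetical and do not reflect Lucene's actual collector code.

```python
from typing import Dict, Hashable, Iterable, List, Tuple


def top_k_parents(
    child_hits: Iterable[Tuple[Hashable, float]], k: int
) -> List[Tuple[Hashable, float]]:
    """Collapse child hits to parents, keeping each parent's best child score."""
    best: Dict[Hashable, float] = {}
    for parent_id, score in child_hits:
        if parent_id not in best or score > best[parent_id]:
            best[parent_id] = score
    # Rank parents by their best child score, highest first.
    ranked = sorted(best.items(), key=lambda item: item[1], reverse=True)
    return ranked[:k]


# Five child hits that belong to only three distinct parents.
hits = [("a", 0.9), ("a", 0.8), ("b", 0.7), ("c", 0.95), ("b", 0.85)]
parents = top_k_parents(hits, k=2)
```

Grouping after the fact like this only yields `k` parents if enough child hits were collected in the first place, which is why the quoted comment treats document collection as the key issue.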