jimczi commented on PR #13525: URL: https://github.com/apache/lucene/pull/13525#issuecomment-2478490713
> One reason to not add this would be if it makes the single vector setup hard to evolve. I'd like to understand if (and how) this is happening, and think on how we can address those concerns.

I believe we should carefully consider the approach of adding multi-vector support through an aggregate function. From the outset, we assume that multi-vectors should be scored together, which is an important principle. Moreover, the default aggregate function proposed in the PR relies on brute force, which is not practical for any indexing setup.

My concern is that this proposal doesn’t truly add support for independent multi-vectors. Instead, it introduces a block of vectors that must be scored together, which feels like a workaround rather than a comprehensive solution. This approach doesn’t address the key challenges of implementing true multi-vector support in the codec.

The root issue is that the current KNN codec assumes the number of vectors is bounded by a single integer, a limitation that needs to be addressed first. Removing this constraint is a complex task, but it is essential for properly supporting multi-vectors. Once that foundation is in place, adding support for setups like ColBERT should become relatively straightforward.

Finally, while the max-sim function proposed in this PR may work as a ranking function, it isn’t suitable for indexing documents. A true solution should allow independent multi-vectors to be queried and scored flexibly, without these constraints.
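For readers unfamiliar with the term, here is a minimal sketch of a ColBERT-style max-sim aggregate computed by brute force. It is an illustration only: the class and method names are hypothetical and not taken from the PR, and it assumes dot-product similarity. For each query vector it scans every document vector, which is the exhaustive comparison the comment refers to.

```java
// Hypothetical illustration; not the PR's implementation.
final class MaxSimSketch {

    // Dot product of two vectors of equal length.
    static float dot(float[] a, float[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("dimension mismatch");
        }
        float sum = 0f;
        for (int i = 0; i < a.length; i++) {
            sum += a[i] * b[i];
        }
        return sum;
    }

    // Sum over query vectors of the best-matching document vector's similarity.
    // Cost is O(|query| * |doc| * dim) per document: every pair is compared.
    static float maxSim(float[][] query, float[][] doc) {
        if (doc.length == 0) {
            throw new IllegalArgumentException("document has no vectors");
        }
        float total = 0f;
        for (float[] q : query) {
            float best = Float.NEGATIVE_INFINITY;
            for (float[] d : doc) {
                best = Math.max(best, dot(q, d));
            }
            total += best;
        }
        return total;
    }
}
```

Because the score is only defined over the whole block of document vectors, a function of this shape can rank candidate documents but gives no per-vector distance that a graph index could use on its own.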