alessandrobenedetti commented on PR #12314:
URL: https://github.com/apache/lucene/pull/12314#issuecomment-1556882171
> @alessandrobenedetti thank you for kick-starting this!
>
> You are absolutely correct, this is a large, but pivotal and necessary change for vector search in Lucene. I have not yet reviewed or looked over the entire design, but two things stuck out to me immediately.
>
> 1. To simplify the review, design, testing, etc., could you reduce the scope? Meaning, for the first iteration of this, Lucene only supports `max`, and it is used by default for multi-vector fields. This should:
>
>    * remove the need for the additional search parameter (it can be added as part 2 of this feature in the future).
>    * reduce testing requirements
>    * reduce user overhead, as the closest document based on the closest vector is returned. This behavior parallels nicely with a single vector field.
> 2. Declaring a vector field as "multi-vector" seems...weird and has a very large blast radius on the entire code base. It seems to me that the vector codec should seamlessly handle multiple vectors per field, just like it seamlessly handles sparse vector encoding. Though, thinking about it more, it seems difficult to support both sparse & multi-vector values in this way. All that said, I am curious about your reasoning here.

Hi @benwtrent, thanks for the initial feedback!
Let me take a look. Now that "it works", I'll proceed with a first iterative pass of simplification.
I'll reassess whether we really need the multi-valued-vector flag for the field (when I started the work, it seemed necessary to me, but maybe I can find a better way).
I'll also check how much simpler it becomes with just 'MAX'.

Cheers

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org
---------------------------------------------------------------------
For additional commands, e-mail: issues-h...@lucene.apache.org