benwtrent commented on PR #12314:
URL: https://github.com/apache/lucene/pull/12314#issuecomment-1554999300

   @alessandrobenedetti thank you for kick starting this!
   
   You are absolutely correct, this is a large, but pivotal and necessary 
change for vector search in Lucene. I have not yet reviewed or looked over the 
entire design, but two things stuck out to me immediately.
   
   1. To simplify the review, design, testing, etc., could you reduce the 
scope? That is, for the first iteration, Lucene supports only `max`, which is 
used by default for multi-vector fields. This should:
       - remove the need for the additional search parameter (can be added as a 
part 2 of this feature in the future).
       - reduce testing requirements
       - reduce user overhead as the closest document based on the closest 
vector is returned. This behavior parallels nicely with a single vector field.
   2. Declaring a vector field as "multi-vector" seems...weird and has a very 
large blast radius on the entire code base. It seems to me that the vector codec 
should seamlessly handle multiple vectors per field, just like it seamlessly 
handles sparse vector encoding. Though, thinking about it more, it seems 
difficult to support both sparse & multi-vector values in this way. All that 
said, I am curious about your reasoning here.
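
To make the `max` behavior in point 1 concrete, here is a minimal sketch of what I mean, not a proposal for the actual implementation. The class and method names are hypothetical and it uses no Lucene APIs; it also assumes dot product as the similarity. A document's score is simply the best similarity among its vectors, so a document with one vector scores exactly as it does today.

```java
/** Hypothetical illustration of `max` aggregation; not part of Lucene. */
final class MaxAggregationSketch {

    /** Dot-product similarity between two vectors of equal length. */
    static float dot(float[] a, float[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("vector dimensions differ");
        }
        float sum = 0f;
        for (int i = 0; i < a.length; i++) {
            sum += a[i] * b[i];
        }
        return sum;
    }

    /**
     * Scores a document as the highest similarity between the query and
     * any one of the document's vectors (the "closest vector wins" rule).
     */
    static float maxScore(float[] query, float[][] docVectors) {
        if (docVectors.length == 0) {
            throw new IllegalArgumentException("document has no vectors");
        }
        float best = Float.NEGATIVE_INFINITY;
        for (float[] v : docVectors) {
            best = Math.max(best, dot(query, v));
        }
        return best;
    }
}
```

With this rule, no extra search parameter is needed: the document ranking follows directly from the per-vector similarities.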


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

