vigyasharma commented on PR #14173:
URL: https://github.com/apache/lucene/pull/14173#issuecomment-2751872315

   > Another option I was pondering is adding a new field type dedicated to 
multi-valued vectors.
   
   I tried this in my first stab at this issue 
(https://github.com/apache/lucene/pull/13525). IIRC, one concern with a 
separate field was that it keeps users from later converting their 
previously single-valued fields to multi-valued vectors if they need to. And 
since single-valued is a base case of multi-valued, why would anyone even use 
the single-valued fields?
   The idea in this PR was to treat single-valued as an optimization over 
multi-valued vectors that can be turned on/off by a flag in stored metadata.
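   
   To make the flag idea concrete, here is a minimal sketch of how a reader 
could branch on such a flag. Every name in it (`VectorFieldLayout`, 
`firstOrd`, `vectorCount`) is hypothetical and not taken from Lucene or from 
this PR, and it assumes every document in the field has at least one vector.

```java
/** Hypothetical sketch only; these names do not exist in Lucene or in the PR. */
final class VectorFieldLayout {
    // Assumed flag, imagined as being read from the field's stored metadata.
    private final boolean multiValued;
    // Prefix sums of vectors per document; only allocated when multiValued is true.
    private final int[] docToFirstOrd;

    private VectorFieldLayout(boolean multiValued, int[] docToFirstOrd) {
        this.multiValued = multiValued;
        this.docToFirstOrd = docToFirstOrd;
    }

    /** Single-valued fast path: no per-document offset table is kept. */
    static VectorFieldLayout singleValued() {
        return new VectorFieldLayout(false, null);
    }

    /** General path: builds start offsets from the number of vectors per document. */
    static VectorFieldLayout multiValued(int[] vectorsPerDoc) {
        int[] starts = new int[vectorsPerDoc.length + 1];
        for (int i = 0; i < vectorsPerDoc.length; i++) {
            starts[i + 1] = starts[i] + vectorsPerDoc[i];
        }
        return new VectorFieldLayout(true, starts);
    }

    /** Ordinal of the first vector belonging to the document at docIndex. */
    int firstOrd(int docIndex) {
        return multiValued ? docToFirstOrd[docIndex] : docIndex;
    }

    /** Number of vectors belonging to the document at docIndex. */
    int vectorCount(int docIndex) {
        return multiValued
            ? docToFirstOrd[docIndex + 1] - docToFirstOrd[docIndex]
            : 1;
    }
}
```

   In this sketch the single-valued case pays nothing for the offset table, 
which is the sense in which it is "an optimization over" the general case.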
   
   FWIW, that PR (#13525) has the pieces needed for a separate field, and 
shows the extent of duplication across classes (it's not very much). There I 
had only added support for ColBERT-style dependent multi-vectors, but it can 
be extended with the independent vector pieces from this PR.
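   
   For readers unfamiliar with ColBERT-style scoring, the sketch below shows 
the usual late-interaction "sum of max similarity" aggregation, where all of 
a document's vectors are scored together against all query vectors. The class 
and method names are hypothetical, dot product is an assumed similarity, and 
this is not the code from either PR.

```java
/** Hypothetical illustration of ColBERT-style late interaction; not Lucene code. */
final class LateInteractionScorer {

    /** Plain dot product between two vectors of equal dimension. */
    static float dot(float[] a, float[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("dimension mismatch");
        }
        float sum = 0f;
        for (int i = 0; i < a.length; i++) {
            sum += a[i] * b[i];
        }
        return sum;
    }

    /**
     * For each query vector, take the best-matching document vector,
     * then sum those per-query maxima into one document score.
     */
    static float sumMaxSim(float[][] queryVectors, float[][] docVectors) {
        if (docVectors.length == 0) {
            throw new IllegalArgumentException("document has no vectors");
        }
        float total = 0f;
        for (float[] q : queryVectors) {
            float best = Float.NEGATIVE_INFINITY;
            for (float[] d : docVectors) {
                best = Math.max(best, dot(q, d));
            }
            total += best;
        }
        return total;
    }
}
```

   Because the score depends on every vector of the document, it has to be 
computed over the full set of values at query time, unlike independent 
vectors where each value can be scored on its own.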
   
   ..
   
   > Before even exploring this, I want to better check the current parent join 
approach i.e. native multi-valued, needs to bring advantages (mostly being 
faster in retrieving top-K 'parent' documents), 
   
   Agreed. The next step for this PR is to benchmark against parent-join runs 
and look for an improvement, especially in cases where we need query-time 
scoring on top of all the vector values.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

