vigyasharma commented on PR #14173: URL: https://github.com/apache/lucene/pull/14173#issuecomment-2751872315
> Another option I was pondering is adding a new field type dedicated to multi-valued vectors.

I tried this in my first stab at this issue (https://github.com/apache/lucene/pull/13525). IIRC, one concern with a separate field was that it prevents users from later converting their existing single-valued fields to multi-valued vectors if they need to. And since single-valued is a base case of multi-valued, why would anyone use the single-valued fields at all? The idea in this PR was to treat single-valued as an optimization over multi-valued vectors that can be turned on or off by a flag in stored metadata.

FWIW, the PR (#13525) has the pieces needed to use a separate field, and shows the extent of duplication across classes (it's not very much). I had only added support for ColBERT-style dependent multi-vectors, but that can be extended with the independent vector pieces in this PR.

..

> Before even exploring this, I want to better check the current parent join approach i.e. native multi-valued, needs to bring advantages (mostly being faster in retrieving top-K 'parent' documents),

Agreed. The next step for this PR is to benchmark it against parent-join runs and confirm an improvement, esp. in cases where we need query-time scoring on top of all the vector values.
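
For readers unfamiliar with the dependent vs. independent distinction mentioned above, here is a minimal sketch of the two scoring styles. It is not taken from either PR and uses no Lucene APIs. The class and method names are hypothetical, and it assumes dot-product similarity, max aggregation for independent vectors, and the usual ColBERT-style sum-of-max (late interaction) for dependent ones.

```java
// Hypothetical illustration only; not code from PR #13525 or #14173.
final class MultiVectorScoringSketch {

    private MultiVectorScoringSketch() {}

    /** Dot product of two equal-length vectors. */
    static float dot(float[] a, float[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("dimension mismatch");
        }
        float sum = 0f;
        for (int i = 0; i < a.length; i++) {
            sum += a[i] * b[i];
        }
        return sum;
    }

    /**
     * Independent multi-vectors (assumed max aggregation): the document
     * score is the best similarity between the single query vector and
     * any one of the document's vectors.
     */
    static float maxSim(float[] query, float[][] docVectors) {
        if (docVectors.length == 0) {
            throw new IllegalArgumentException("document has no vectors");
        }
        float best = Float.NEGATIVE_INFINITY;
        for (float[] v : docVectors) {
            best = Math.max(best, dot(query, v));
        }
        return best;
    }

    /**
     * Dependent (ColBERT-style) multi-vectors: for each query vector take
     * its best match among the document's vectors, then sum those maxima.
     * This needs every document vector at query time.
     */
    static float sumMaxSim(float[][] queryVectors, float[][] docVectors) {
        float total = 0f;
        for (float[] q : queryVectors) {
            total += maxSim(q, docVectors);
        }
        return total;
    }
}
```

With document vectors `{{1, 0}, {0, 1}}`, `maxSim` for the query `{2, 3}` is 3.0, and `sumMaxSim` for the query vectors `{{1, 0}, {0, 2}, {1, 1}}` is 4.0. The second form is the case where query-time scoring has to touch all of a document's vector values.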