alessandrobenedetti commented on PR #14173:
URL: https://github.com/apache/lucene/pull/14173#issuecomment-2751006045

   > > do you confirm that, according to your knowledge, any relevant and 
active work toward multi-valued vectors in Lucene is effectively aggregated 
here?
   > 
   > @alessandrobenedetti I think so. This is the latest stab at it.
   > 
   > > Main concern is still related to ordinals to become long as far as I can 
see :)
   > 
   > Indeed, I just don't see how Lucene can actually support multi-value 
vectors without switching to long ordinals for the vectors. Otherwise, we 
enforce some limitation on the number of vectors per segment, or some 
limitation on the number of vectors per doc (e.g. every doc can only have 
256/65535 vectors).
   > 
   > Making HNSW indexing & merging ~2x (given other constants, it might not be 
exactly 2x, maybe a little less) more expensive for heap usage is a pretty 
steep cost. Especially for something I am not sure how many folks will actually 
use.
   
   I agree, I don't think it makes sense to degrade single-valued 
performance at all. I haven't investigated the int->long ordinal impact 
myself, but I trust your judgement; let me know if you want me to 
double-check.
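
   To make the ordinal trade-off concrete, here is a rough sketch. It is not Lucene code: the class `OrdinalPackingSketch`, its methods, and the bit-packing scheme are hypothetical and only illustrate the arithmetic. It assumes ordinals stay non-negative ints (31 usable bits) and that some of those bits are reserved for the index of a vector within its document, which is one way the "256/65535 vectors per doc" style of limit could come about.

```java
// Hypothetical illustration only; none of these names exist in Lucene.
final class OrdinalPackingSketch {
    // Assumption: ordinals are non-negative ints, so the sign bit is unused.
    static final int USABLE_BITS = 31;

    /** Number of vectors a single doc can hold when slotBits bits are reserved. */
    static long maxVectorsPerDoc(int slotBits) {
        return 1L << slotBits;
    }

    /** Number of docs a segment can address with the remaining bits. */
    static long maxDocs(int slotBits) {
        return 1L << (USABLE_BITS - slotBits);
    }

    /** Packs (doc, slot) into one int ordinal: doc in the high bits, slot in the low bits. */
    static int pack(int doc, int slot, int slotBits) {
        if (slot < 0 || slot >= maxVectorsPerDoc(slotBits)) {
            throw new IllegalArgumentException("slot out of range: " + slot);
        }
        if (doc < 0 || doc >= maxDocs(slotBits)) {
            throw new IllegalArgumentException("doc out of range: " + doc);
        }
        return (doc << slotBits) | slot;
    }

    static int docOf(int ordinal, int slotBits) {
        return ordinal >>> slotBits;
    }

    static int slotOf(int ordinal, int slotBits) {
        return ordinal & ((1 << slotBits) - 1);
    }

    public static void main(String[] args) {
        for (int slotBits : new int[] {8, 16}) {
            System.out.println(slotBits + " slot bits: up to " + maxVectorsPerDoc(slotBits)
                + " vectors per doc, up to " + maxDocs(slotBits) + " docs per segment");
        }
        // Every stored ordinal doubles in size when int becomes long.
        System.out.println("bytes per int ordinal: " + Integer.BYTES
            + ", bytes per long ordinal: " + Long.BYTES);
    }
}
```

   Under these assumptions, reserving 8 bits leaves room for 8,388,608 docs per segment and reserving 16 bits leaves 32,768. Switching to long ordinals removes the limit but doubles the size of every stored ordinal, which is where the roughly 2x heap estimate for ordinal-heavy structures comes from.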
   
   Another option I was pondering is adding a new field type dedicated to 
multi-valued vectors.
   Sure, there will be tons of classes to "duplicate" and make multi-valued 
compliant, but I believe we'll be able to re-use most of the code, so it 
would be a large number of classes but (hopefully) not a massive amount of 
new code.
   Before even exploring this, I want to check more carefully whether native 
multi-valued support actually brings advantages over the current parent join 
approach (mostly being faster at retrieving the top-K 'parent' documents). If 
not, it won't make much sense to do this huge amount of work.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

