krickert commented on issue #12313:
URL: https://github.com/apache/lucene/issues/12313#issuecomment-2105745361

   I was thinking about this, and a multi-valued vector would be cool for a 
few different use cases:
   
   1. The multi-values are treated the same as a single value, except that 
once a doc is found to be among the nearest K, it won't repeat.  For example: 
Doc1 has vectors A1, A2, and A3.  Doc2 has vectors B1 and B2.  Then we have a 
Doc3 with C1.  A vector search is performed, and the nearest vectors come 
back in this order:
   A1
   A2
   C1
   B2
   B1
   A3
   
   In one scenario, the search results would be the same as above, and the 
docs would repeat.  
   
   In another scenario, each doc would be returned only once, ranked by its 
best-matching vector.  So a KNN result would be:
   Doc1 (A1 won)
   Doc3 (C1 won)
   Doc2 (B2 won)
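
A minimal sketch of the non-repeating scenario, assuming the search has already produced a list of per-vector hits. The function name `collapse_by_doc`, the tuple layout, and the scores are all made up for illustration; this is not Lucene's API, just the grouping logic described above:

```python
from typing import Iterable, List, Tuple

Hit = Tuple[str, str, float]  # (doc_id, vector_id, score); higher score = nearer


def collapse_by_doc(hits: Iterable[Hit], k: int) -> List[Hit]:
    """Keep only the best-scoring vector per doc, then return the top k docs."""
    best = {}
    for doc_id, vec_id, score in hits:
        # Remember a hit only if it beats the doc's current best vector.
        if doc_id not in best or score > best[doc_id][2]:
            best[doc_id] = (doc_id, vec_id, score)
    return sorted(best.values(), key=lambda h: h[2], reverse=True)[:k]


# Hypothetical scores chosen to reproduce the ordering A1, A2, C1, B2, B1, A3.
hits = [
    ("Doc1", "A1", 0.95),
    ("Doc1", "A2", 0.90),
    ("Doc3", "C1", 0.85),
    ("Doc2", "B2", 0.80),
    ("Doc2", "B1", 0.75),
    ("Doc1", "A3", 0.70),
]

for doc_id, vec_id, _ in collapse_by_doc(hits, k=3):
    print(f"{doc_id} ({vec_id} won)")
```

Skipping the collapse step and returning `hits` as-is corresponds to the first scenario, where docs repeat.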
   
   ... 
   
   In another option, we could compute the average, min, or max of each 
dimension across a doc's vectors and index just that single avg, min, or max 
vector.  I think this might be a bit weird, though, since you can already do 
these calculations yourself at index time.  But just a thought...
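
As a rough illustration of that last option, here is a sketch of collapsing a doc's vectors into one vector per dimension before indexing. The helper name `aggregate_vectors` and the sample values are assumptions for the example, not anything Lucene provides:

```python
from typing import List, Sequence


def aggregate_vectors(vectors: Sequence[Sequence[float]], how: str = "avg") -> List[float]:
    """Combine equal-length vectors into one vector, dimension by dimension."""
    if not vectors:
        raise ValueError("need at least one vector")
    # zip(*vectors) groups the values of each dimension together.
    dims = list(zip(*vectors))
    if how == "avg":
        return [sum(d) / len(d) for d in dims]
    if how == "min":
        return [min(d) for d in dims]
    if how == "max":
        return [max(d) for d in dims]
    raise ValueError(f"unknown aggregation: {how}")


# Two hypothetical 2-dimensional vectors belonging to the same doc.
doc_vectors = [[1.0, 4.0], [3.0, 2.0]]
avg_vec = aggregate_vectors(doc_vectors, "avg")
min_vec = aggregate_vectors(doc_vectors, "min")
max_vec = aggregate_vectors(doc_vectors, "max")
```

Since this needs nothing from the index itself, it can run entirely on the client side before the doc is added, which is the reason the option may not need engine support.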
   
   Are any of the existing suggestions similar to what I'm proposing?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

