[GitHub] [lucene] alessandrobenedetti commented on pull request #12314: Multi-value support for KnnVectorField

via GitHub Thu, 25 May 2023 08:09:34 -0700


alessandrobenedetti commented on PR #12314:
URL: https://github.com/apache/lucene/pull/12314#issuecomment-1563074046


   > Would be nice to get your and others' ideas on them:
   > 
   > 1. Is the main usage for breaking a long text into several paragraphs? Or
   > is it also to search across several different fields (e.g. [embedding_of_title,
   > embedding_of_content]) where we create a single graph for several fields?
   I would say the first. I have seen examples of that, especially when the
   large language model chosen by the customer has an input token limit smaller
   than the length of the customer's documents.
   A single graph over several fields would be the equivalent of a catch-all
   field in lexical search. It was not my primary focus, but it should be doable
   once we have multi-valued fields.
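   To make the first use case concrete, here is a minimal sketch of splitting a
   long text into chunks that fit a model's input limit, so that each chunk can
   be embedded and stored as one value of a multi-valued vector field. It
   assumes whitespace-separated words as a stand-in for the model's real
   tokenizer; `split_into_chunks` and `max_tokens` are hypothetical names, not
   part of this PR or of Lucene.

```python
def split_into_chunks(text: str, max_tokens: int) -> list[str]:
    """Split text into consecutive chunks of at most max_tokens words.

    Assumption: whitespace-separated words approximate model tokens.
    A real pipeline would use the embedding model's own tokenizer.
    """
    if max_tokens <= 0:
        raise ValueError("max_tokens must be positive")
    tokens = text.split()
    return [
        " ".join(tokens[start:start + max_tokens])
        for start in range(0, len(tokens), max_tokens)
    ]
```

   Each returned chunk would then be embedded separately, giving one vector
   per chunk for the same document.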
   > 2. Would it be possible to retrieve which vector was the closest match?
   > For example, if we break a long text into paragraphs and want to highlight
   > which paragraph was the closest match. This could be crucial for some use cases.
   I agree, that could be a nice addition to the explain output!
   
   I guess attaching metadata to vectors is a different story, but I agree it 
could be a good idea!
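
   As an illustration of what "which vector was the closest match" could mean
   for a document with several vectors, here is a small sketch that returns the
   index of the best-matching vector together with its score. It assumes
   cosine similarity and assumes the closest vector is simply the one with the
   highest similarity to the query; both are assumptions for illustration, not
   a description of how this PR scores multi-valued fields. `cosine` and
   `closest_vector` are hypothetical helpers.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def closest_vector(query: list[float], doc_vectors: list[list[float]]) -> tuple[int, float]:
    """Return (index, score) of the document vector most similar to the query.

    Assumption: the best match is the vector with the maximum cosine similarity.
    """
    if not doc_vectors:
        raise ValueError("doc_vectors must not be empty")
    scores = [cosine(query, vector) for vector in doc_vectors]
    best = max(range(len(scores)), key=scores.__getitem__)
    return best, scores[best]
```

   If the vectors are paragraph embeddings in document order, the returned
   index identifies the paragraph to highlight.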


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

