kaivalnp opened a new issue, #14758:
URL: https://github.com/apache/lucene/issues/14758

   ### Description
   
   This is aimed at use cases that search different subsets of vectors in the 
index, where a non-trivial portion of the vectors is shared between subsets.
   
   This could be done today by:
   1. Indexing all vectors in a single field and using query-time 
pre-filtering, but it can become expensive if the filter is restrictive (and 
large parts of the graph have to be traversed to collect accepted docs).
   2. Creating separate fields for groups of vectors, but it could increase 
index size for a large number of overlapping vectors across subsets.
   
   Implementation-wise, we'd have multiple HNSW graphs per field (backed by the 
same raw vectors), each identifiable by an ID (perhaps an integer or string). 
Each document in this field would specify a set of IDs along with its raw 
vector, and a separate HNSW graph would be created for each ID. Similarly, the 
query would specify the ID of the graph to search in.
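
A minimal sketch of the shape this could take, in plain Java with no Lucene dependency. Everything here is hypothetical: `MultiGraphVectorField`, `add` and `search` are illustrative names, not existing or proposed Lucene APIs, and integer IDs are assumed. Each per-ID HNSW graph is replaced by a per-ID list of ordinals that is scanned by brute force, purely to show raw vectors being stored once while several ID-scoped structures reference them.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical sketch of a field with one search structure per ID; not a Lucene API. */
final class MultiGraphVectorField {
  // Raw vectors are stored exactly once, addressed by document ordinal.
  private final List<float[]> vectors = new ArrayList<>();
  // One membership list per ID. A real implementation would keep an HNSW
  // graph here; a plain list scanned by brute force stands in for it.
  private final Map<Integer, List<Integer>> graphs = new HashMap<>();

  /** Adds a vector that belongs to every ID in graphIds and returns its ordinal. */
  int add(float[] vector, Set<Integer> graphIds) {
    int ord = vectors.size();
    vectors.add(vector.clone());
    for (int id : graphIds) {
      graphs.computeIfAbsent(id, unused -> new ArrayList<>()).add(ord);
    }
    return ord;
  }

  /** Returns up to k ordinals registered under graphId, nearest first (squared Euclidean). */
  List<Integer> search(int graphId, float[] query, int k) {
    List<Integer> candidates = new ArrayList<>(graphs.getOrDefault(graphId, List.of()));
    candidates.sort(
        Comparator.comparingDouble((Integer ord) -> squaredDistance(vectors.get(ord), query)));
    return candidates.subList(0, Math.min(k, candidates.size()));
  }

  private static double squaredDistance(float[] a, float[] b) {
    double sum = 0;
    for (int i = 0; i < a.length; i++) {
      double d = a[i] - b[i];
      sum += d * d;
    }
    return sum;
  }
}
```

In this sketch a vector added with IDs `{1, 2}` is stored once but is reachable from searches on either ID, and a search on an ID never visits vectors outside that ID, which is the property the per-ID graphs are meant to provide without query-time filtering.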
   
   I wanted to check how common this use case is, and whether adding such a 
field would help.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

