Re: [I] [Feature] Add support for passing extra information with KNNVectorField [lucene]

via GitHub Tue, 18 Feb 2025 13:37:37 -0800


navneet1v commented on issue #14247:
URL: https://github.com/apache/lucene/issues/14247#issuecomment-2666981898


   > I remember (but I don't remember where) seeing someone doing multi-tenant 
vector search by using a flat vector index and enabling index sorting on the 
tenant ID. Then vector search can't take advantage of an advanced structure 
like HNSW, but the I/O access pattern is disk-friendly, so if each tenant isn't 
too large on its own the performance may be acceptable.
   > 
   > In general, I'm not a fan of this proposal of enabling creating multiple 
KNN indexes via some user-provided tenant/cluster ID. This looks like working 
around the fact that vector-search is currently not good at pre-filtering. I'd 
rather look into how we can make vector search better at pre-filtering (either 
with HNSW or something else).
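
To make the quoted flat-index idea concrete, here is a minimal sketch in plain Python. It is not Lucene code: `build_flat_index` and `search_tenant` are hypothetical helpers, and the sketch assumes squared Euclidean distance and vectors held in memory. Sorting by tenant ID makes each tenant's vectors one contiguous range, so a query only scans that range sequentially.

```python
import bisect
import heapq


def squared_l2(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def build_flat_index(docs):
    """Sort (tenant_id, vector) pairs by tenant so each tenant is contiguous.

    Hypothetical helper standing in for index sorting on the tenant ID.
    """
    ordered = sorted(docs, key=lambda d: d[0])
    tenant_ids = [t for t, _ in ordered]
    vectors = [v for _, v in ordered]
    return tenant_ids, vectors


def search_tenant(tenant_ids, vectors, tenant, query, k):
    """Brute-force top-k over the contiguous slice belonging to one tenant."""
    lo = bisect.bisect_left(tenant_ids, tenant)
    hi = bisect.bisect_right(tenant_ids, tenant)
    # Sequential scan of positions [lo, hi): no graph, but a predictable
    # access pattern over one tenant's vectors only.
    scored = ((squared_l2(vectors[i], query), i) for i in range(lo, hi))
    return heapq.nsmallest(k, scored)
```

The cost per query grows linearly with the size of the tenant, which is why this is only attractive when each tenant is small.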
   
   
   Thanks @jpountz for your thoughts. I agree that filtering has its 
limitations with HNSW. One option to solve the problem was to build separate 
graphs per tenant and then attach them as a single unit. I will think this 
over more to see if there is a better way, or if we can fuse the tenant 
information into the vector itself, so that vectors belonging to one tenant 
end up closer together.
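
One possible reading of "fusing the tenant information into the vector" is sketched below in plain Python. This is only an illustration under stated assumptions, not a worked-out proposal: `augment`, `tenant_slot`, and `scale` are hypothetical names, and the sketch assumes squared Euclidean distance and a small, fixed number of tenants. It appends a scaled one-hot tenant block to each vector, which adds a constant penalty of `2 * scale**2` to the squared distance between vectors of different tenants and leaves same-tenant distances unchanged.

```python
def squared_l2(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def augment(vector, tenant_slot, num_tenants, scale):
    """Append a scaled one-hot tenant block to a vector.

    Hypothetical helper: tenant_slot is the tenant's position in [0, num_tenants).
    """
    one_hot = [0.0] * num_tenants
    one_hot[tenant_slot] = scale
    return list(vector) + one_hot


# Same original vector, different tenants: distance is exactly 2 * scale**2.
a = augment([1.0, 2.0], 0, 2, 10.0)
b = augment([1.0, 2.0], 1, 2, 10.0)
# Different original vector, same tenant: distance is unchanged (9 + 16).
c = augment([4.0, 6.0], 0, 2, 10.0)
```

The dimension grows with the number of tenants, so this particular encoding would not scale to many tenants; it only shows how a penalty term can pull same-tenant vectors together.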
   
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


