msokolov opened a new issue, #13468:
URL: https://github.com/apache/lucene/issues/13468

   ### Description
   
   There are use cases where we want to store medium-dimensional vectors (i.e. 
embedding vectors from ML models), retrieve them, compute distances among 
them, and perform KNN search, but we don't want HNSW or any other special 
index-time support. If we search, we'll do it with an index scan. For example, 
this could happen if we partition the index by some key and then rank the 
resulting documents by their vector distance. Currently, if you create a 
`KnnFloatVectorField` or a `KnnByteVectorField`, you get an HNSW graph even if 
you don't want it. We have all the tools to support this use case, but the API 
doesn't allow it.
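
To make the scan concrete, here is a minimal sketch in plain Java of ranking a partition's documents by vector distance with no graph involved. It does not use any Lucene API: the class `FlatScanSketch`, the `topK` and `squaredDistance` helpers, and the in-memory `float[][]` layout are all hypothetical stand-ins for however the stored vectors would actually be read, and squared Euclidean distance is just one possible choice of metric.

```java
import java.util.PriorityQueue;

// Hypothetical illustration only; none of these names exist in Lucene.
class FlatScanSketch {

  /** Squared Euclidean distance between two vectors of equal length. */
  static float squaredDistance(float[] a, float[] b) {
    float sum = 0f;
    for (int i = 0; i < a.length; i++) {
      float d = a[i] - b[i];
      sum += d * d;
    }
    return sum;
  }

  /**
   * Scans only the candidate doc ids (e.g. the documents in one partition)
   * and returns up to k of them, nearest to the query first.
   */
  static int[] topK(float[][] vectors, int[] candidates, float[] query, int k) {
    // Score every candidate once; this is the "index scan".
    float[] dist = new float[candidates.length];
    for (int i = 0; i < candidates.length; i++) {
      dist[i] = squaredDistance(vectors[candidates[i]], query);
    }

    // Max-heap on distance, so the head is the farthest of the current best k.
    PriorityQueue<Integer> heap =
        new PriorityQueue<>((x, y) -> Float.compare(dist[y], dist[x]));
    for (int i = 0; i < candidates.length; i++) {
      heap.add(i);
      if (heap.size() > k) {
        heap.poll(); // evict the farthest candidate seen so far
      }
    }

    // Drain farthest-first into the array from the back, giving nearest-first order.
    int[] result = new int[heap.size()];
    for (int i = result.length - 1; i >= 0; i--) {
      result[i] = candidates[heap.poll()];
    }
    return result;
  }

  public static void main(String[] args) {
    float[][] vectors = {{0f, 0f}, {1f, 1f}, {5f, 5f}, {0.9f, 1.2f}};
    int[] partition = {0, 2, 3}; // doc ids that matched the partition key
    int[] top = topK(vectors, partition, new float[] {1f, 1f}, 2);
    for (int doc : top) {
      System.out.println(doc);
    }
  }
}
```

The cost is linear in the number of candidates, which is acceptable when the partition key has already narrowed the set; that is the case where building a graph at index time buys nothing.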
   
   My question is: how should the API look? I started to familiarize myself with 
the flat vectors support we now have, and I see it was done so that we now have 
KnnVectorsFormat and FlatVectorsFormat as separate formats that do not share 
any common ancestor. What would you all think about folding 
FlatVectorsFormat into KnnVectorsFormat? The only difference today is the 
search() method, which I would like to support over flat vectors. Otherwise, I 
guess we could add search to FlatVectorsFormat? But in that case, how would we 
select this format for a field? I'd rather avoid plumbing a whole new format 
through IndexWriter when it is effectively a flavor of a format we already 
have. But I may be missing the rationale behind this format forking. Was 
there some discussion about it you could point me to? I might have been 
sleeping, sorry!


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

