Re: [PR] [Draft] Support Multi-Vector HNSW Search via Flat Vector Storage [lucene]

via GitHub Fri, 21 Mar 2025 15:09:28 -0700


vigyasharma commented on PR #14173:
URL: https://github.com/apache/lucene/pull/14173#issuecomment-2744562872


   Thanks for looking into this PR, @alessandrobenedetti. This is the latest 
iteration on multi-vector support.
   
   It does build on the same central idea of assigning a unique ordinal to each 
vector and mapping multiple ordinals to a single doc. I tried a few other 
approaches, but this one seemed cleanest.
   
   I think the key difference from #12314 is the added metadata that lets us 
map multiple ordinals to a single doc. This is implemented in 
`MultiVectorOrdConfiguration` using `DirectMonotonicWriter/Reader`. For every 
doc, I maintain the ordinal of its first vector (`baseOrdinal`) along with the 
number of vectors in the doc, and use these to do the `ordToDoc` mapping for 
vectors. I didn't fully understand how this was done in your original PR, 
specifically how it mapped an ordinal back to its docId, given that we can have 
a variable number of vectors per doc. Maybe I missed something. If you had a 
simpler implementation, I'm happy to circle back to it.
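   
   To illustrate the base-ordinal idea, here is a minimal plain-Java sketch. 
It is not the code in this PR: the class name `OrdToDocSketch` is hypothetical, 
plain `int[]` arrays stand in for `DirectMonotonicWriter/Reader`, docs are 
assumed to be numbered densely from 0, every doc is assumed to have at least 
one vector, and the binary search is only one possible way to resolve an 
ordinal to its doc.

```java
import java.util.Arrays;

// Hypothetical illustration only; not the PR's MultiVectorOrdConfiguration.
final class OrdToDocSketch {
    // baseOrdinals[doc] = ordinal of the first vector of that doc.
    private final int[] baseOrdinals;
    // vectorCounts[doc] = number of vectors stored for that doc.
    private final int[] vectorCounts;
    private final int totalVectors;

    OrdToDocSketch(int[] vectorCounts) {
        this.vectorCounts = vectorCounts.clone();
        this.baseOrdinals = new int[vectorCounts.length];
        int next = 0;
        for (int doc = 0; doc < vectorCounts.length; doc++) {
            if (vectorCounts[doc] < 1) {
                // Assumption of this sketch: no doc is empty, so base
                // ordinals are strictly increasing and safe to binary search.
                throw new IllegalArgumentException("doc " + doc + " has no vectors");
            }
            baseOrdinals[doc] = next;
            next += vectorCounts[doc];
        }
        this.totalVectors = next;
    }

    int baseOrdinal(int doc) {
        return baseOrdinals[doc];
    }

    int vectorCount(int doc) {
        return vectorCounts[doc];
    }

    // Maps a vector ordinal back to the doc that owns it.
    int ordToDoc(int ord) {
        if (ord < 0 || ord >= totalVectors) {
            throw new IllegalArgumentException("ordinal out of range: " + ord);
        }
        int idx = Arrays.binarySearch(baseOrdinals, ord);
        if (idx >= 0) {
            return idx; // ord is the first vector of doc idx
        }
        // Not a base ordinal: the owner is the doc just before the insertion point.
        int insertionPoint = -idx - 1;
        return insertionPoint - 1;
    }

    public static void main(String[] args) {
        OrdToDocSketch sketch = new OrdToDocSketch(new int[] {2, 1, 3});
        for (int ord = 0; ord < 6; ord++) {
            System.out.println("ord " + ord + " -> doc " + sketch.ordToDoc(ord));
        }
    }
}
```

   With per-doc counts of 2, 1 and 3, the base ordinals come out as 0, 2 and 3, 
so ordinals 0-1 resolve to doc 0, ordinal 2 to doc 1, and ordinals 3-5 to doc 2.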
   
   I also added an `allVectorValues()` API to `Byte|FloatVectorValues`, which I 
think will help at query time. Other than this, the changes are mostly 
around integrating multi-vector support and will likely have a lot of overlap 
with your PR.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


