benwtrent commented on issue #14681:
URL: https://github.com/apache/lucene/issues/14681#issuecomment-2996401932

   > if the FP vectors are not stored in memory we have noticed that the graph 
structure is overall pretty lean and can fit on the JVM heap pretty easily, 
even on low heaps.
   
   This is true for quantized HNSW as well. 
   
   > Moreover, another important factor is that of efficient IO access for 
scoring FP vectors even for the non quantized use cases.
   
   What are the improvements there? Is it adjusting the IO access patterns for 
scoring, or is it because rescoring utilizes LVQ (i.e. scalar quantization 
centered on a centroid, which is a simpler version of the 
OptimizedScalarQuantizer currently in Lucene)?
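   For readers unfamiliar with LVQ, here is a minimal sketch of the idea of 
centroid-centered scalar quantization. The class and method names are 
illustrative only, not Lucene's or jVector's API: each vector's residual from 
the dataset centroid is scaled into an 8-bit range using per-vector min/max.

```java
import java.util.Arrays;

public class LvqSketch {
    // Quantize a vector to unsigned bytes after subtracting the dataset
    // centroid. Per-vector min/max scaling is the "locally adaptive" step.
    // meta[0] receives the scale, meta[1] the per-vector minimum, both
    // needed later to dequantize.
    static byte[] quantize(float[] v, float[] centroid, float[] meta) {
        int dim = v.length;
        float[] residual = new float[dim];
        float min = Float.POSITIVE_INFINITY;
        float max = Float.NEGATIVE_INFINITY;
        for (int i = 0; i < dim; i++) {
            residual[i] = v[i] - centroid[i];
            min = Math.min(min, residual[i]);
            max = Math.max(max, residual[i]);
        }
        float scale = (max - min) / 255f; // 8-bit range
        byte[] q = new byte[dim];
        for (int i = 0; i < dim; i++) {
            // Guard against a constant residual (scale == 0): everything maps to 0.
            q[i] = scale == 0f ? 0 : (byte) Math.round((residual[i] - min) / scale);
        }
        meta[0] = scale;
        meta[1] = min;
        return q;
    }

    // Reverse the mapping: rescale the byte codes and add the centroid back.
    static float[] dequantize(byte[] q, float[] centroid, float scale, float min) {
        float[] v = new float[q.length];
        for (int i = 0; i < q.length; i++) {
            v[i] = (q[i] & 0xFF) * scale + min + centroid[i];
        }
        return v;
    }

    public static void main(String[] args) {
        float[] centroid = {0.5f, 0.5f, 0.5f};
        float[] v = {0.9f, 0.1f, 0.6f};
        float[] meta = new float[2];
        byte[] q = quantize(v, centroid, meta);
        float[] back = dequantize(q, centroid, meta[0], meta[1]);
        System.out.println(Arrays.toString(back));
    }
}
```

   Because the quantization range is fit per vector rather than globally, the 
reconstruction error stays small, which is what makes cheap rescoring from the 
quantized form attractive.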
   
   > Concurrency - jVector allows for concurrent build of graph index
   
   So does Lucene.
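   For reference, a sketch of how concurrent graph build is enabled on the 
Lucene side (assuming Lucene 9.9+, where the HNSW format constructor takes a 
merge-worker count and an executor; the thread-pool sizing here is arbitrary):

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

import org.apache.lucene.codecs.KnnVectorsFormat;
import org.apache.lucene.codecs.lucene99.Lucene99HnswVectorsFormat;

// maxConn=16 and beamWidth=100 are the defaults; 8 merge workers
// build the merged HNSW graph concurrently on the shared executor.
ExecutorService mergeExec = Executors.newFixedThreadPool(8);
KnnVectorsFormat format = new Lucene99HnswVectorsFormat(16, 100, 8, mergeExec);
```

   The format is then wired into the codec per field in the usual way.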
   
   > A few months back I noticed significantly more IO access in the Lucene 
HNSW format than the jVector version at the time.
   
   I would be happy to see those numbers.
   
   I do think there are things that Lucene can learn from jVector. It would be 
far better for the Lucene community as a whole to "do the hard thing" and 
improve Lucene directly instead of adding yet another plugin with yet another 
graph-based vector search implementation. 
   
   If we see that jVector is simply better than quantized HNSW in every way, 
why not sunset HNSW and switch to Vamana?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org
