benwtrent commented on issue #14681: URL: https://github.com/apache/lucene/issues/14681#issuecomment-2996401932
> if the FP vectors are not stored in memory we have noticed that the graph structure is overall pretty lean and can fit on the JVM heap pretty easily, even on low heaps.

This is true for quantized HNSW as well.

> Moreover, another important factor is that of efficient IO access for scoring FP vectors even for the non quantized use cases.

What are the improvements there? Is it adjusting the IO patterns for scoring, or is it because rescoring utilizes LVQ (i.e., scalar quantization centered on a centroid, which is a simpler version of the OptimizedScalarQuantizer in Lucene right now)?

> Concurrency - jVector allows for concurrent build of graph index

So does Lucene.

> A few months back I noticed significantly more IO access in the Lucene HNSW format than the jVector version at the time.

I would be happy to see those numbers.

I do think there are things that Lucene can learn from JVector. It would be far better for the Lucene community as a whole to "do the hard thing" and improve Lucene directly instead of adding yet another plugin with yet another graph-based vector search implementation. If JVector really is better than quantized HNSW in every way, why not simply sunset HNSW and switch to Vamana?

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For queries about this service, please contact Infrastructure at: us...@infra.apache.org
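For readers following the LVQ point in the thread above: the core idea of centroid-centered scalar quantization is to subtract a centroid from each vector and linearly map the residual onto a small integer range, storing only the codes plus two correction constants. The sketch below is illustrative only (plain Python, hypothetical function names); it is not Lucene's or jVector's actual implementation, which add per-dimension optimization and error-correction terms.

```python
# Minimal sketch of centroid-centered scalar quantization (an LVQ-like,
# simplified cousin of Lucene's OptimizedScalarQuantizer).
# All names here are illustrative, not real library APIs.

def quantize(vector, centroid, bits=8):
    """Quantize `vector` to (2**bits)-level codes around `centroid`.

    Returns (codes, lo, scale): the integer codes plus the constants
    needed to approximately reconstruct the original floats.
    """
    # Residual relative to the centroid; residuals are typically much
    # smaller than raw values, so the quantization grid is finer.
    residual = [v - c for v, c in zip(vector, centroid)]
    lo, hi = min(residual), max(residual)
    # Map [lo, hi] onto [0, 2**bits - 1]; guard against a zero range.
    scale = (hi - lo) / (2 ** bits - 1) or 1.0
    codes = [round((r - lo) / scale) for r in residual]
    return codes, lo, scale

def dequantize(codes, centroid, lo, scale):
    """Approximately reconstruct the original vector from its codes."""
    return [c * scale + lo + cen for c, cen in zip(codes, centroid)]
```

The reconstruction error per dimension is bounded by half the grid step (`scale / 2`), which is why a rescoring pass over the quantized codes can stay close to full-precision scores when the centroid is well chosen.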