mikemccand commented on PR #14178: URL: https://github.com/apache/lucene/pull/14178#issuecomment-2845297912
> > slower the 11 segment case is (6.5 vs .9 msec) -- maybe the search is not concurrent across segments?
>
> Yes, it is sequential -- we're not passing an executor [here](https://github.com/mikemccand/luceneutil/blob/a75b8a4d3a8c146f3ff7e6695ab6bbf2e34e4b90/src/main/knn/KnnGraphTester.java#L851) for concurrency

Ahh ... hmm, we should add an option to use concurrency. I'll open a luceneutil issue!

> Also, would be great if you could post Lucene HNSW benchmarks for comparison!

+1, good idea! Here:

```
Results:
recall  latency(ms)    nDoc  topK  fanout  maxConn  beamWidth  quantized  index(s)  index_docs/s  force_merge(s)  num_segments  index_size(MB)  vec_disk(MB)  vec_RAM(MB)  indexType
 0.962        7.251  500000   100      50       32        200         no    142.24       3515.16            0.00             8         1496.86      1464.844     1464.844       HNSW
 0.891        1.644  500000   100      50       32        200         no    140.98       3546.68          179.58             1         1501.01      1464.844     1464.844       HNSW
```

Recall is quite close -- that's good (shows the HNSW impls are pretty close I think, when using the same hyperparameters).

The latency is quite a bit slower for Lucene single segment, but then multi-segment is only a bit slower (not proportionally so) ... it's as if there is a high one-time cost for running the search in Lucene, somehow?
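
A rough way to put numbers on the "one-time cost" hunch is a two-point linear fit, `latency = fixed + per_segment * num_segments`, applied to the two Lucene rows in the table and to the figures quoted at the top. This is only a sketch under stated assumptions: the function name is made up for illustration, it assumes the quoted ".9 msec" is a 1-segment run and "6.5" is the 11-segment run, and it treats per-segment cost as constant, which ignores that the force-merged segment is much larger and that the two Lucene rows differ in recall.

```python
def split_fixed_and_per_segment(seg_a, ms_a, seg_b, ms_b):
    """Two-point linear fit of latency = fixed + per_segment * num_segments.

    Hypothetical helper for illustration; not part of luceneutil.
    """
    per_segment = (ms_b - ms_a) / (seg_b - seg_a)
    fixed = ms_a - per_segment * seg_a
    return fixed, per_segment


# Lucene HNSW rows from the table: (num_segments, latency in ms).
lucene_fixed, lucene_per_seg = split_fixed_and_per_segment(1, 1.644, 8, 7.251)

# Figures from the quoted text; ASSUMED to mean 1 segment at .9 ms
# and 11 segments at 6.5 ms.
other_fixed, other_per_seg = split_fixed_and_per_segment(1, 0.9, 11, 6.5)

print(f"Lucene: fixed ~{lucene_fixed:.2f} ms, per segment ~{lucene_per_seg:.2f} ms")
print(f"Quoted: fixed ~{other_fixed:.2f} ms, per segment ~{other_per_seg:.2f} ms")
```

Under those assumptions the fitted fixed term comes out larger for Lucene than for the quoted run, which is at least consistent with a per-search overhead. Two points per line cannot separate that from other explanations, so more segment counts would be needed to say anything firm.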