kaivalnp commented on PR #14874: URL: https://github.com/apache/lucene/pull/14874#issuecomment-3033793168
Okay, I ran things _slightly_ differently for 300d vectors. All runs are without `-reindex`, but I'm deleting the index between the runs of `main` and this PR so that each starts from a fresh one.

`main` (run 1, no pre-existing index)
```
recall  latency(ms)  netCPU  avgCpuCount    nDoc  topK  fanout  maxConn  beamWidth  quantized  index(s)  index_docs/s  force_merge(s)  num_segments  index_size(MB)  vec_disk(MB)  vec_RAM(MB)  indexType
 0.808        0.635   0.635        1.000  200000   100      50       64        250         no     38.55       5188.07            0.00             1           64.00       228.882      228.882       HNSW
```

`main` (run 2, use same index as run 1)
```
recall  latency(ms)  netCPU  avgCpuCount    nDoc  topK  fanout  maxConn  beamWidth  quantized  index(s)  index_docs/s  force_merge(s)  num_segments  index_size(MB)  vec_disk(MB)  vec_RAM(MB)  indexType
 0.808        0.635   0.634        0.999  200000   100      50       64        250         no      0.00      Infinity            0.12             1           64.00       228.882      228.882       HNSW
```

This PR (run 3, no pre-existing index)
```
recall  latency(ms)  netCPU  avgCpuCount    nDoc  topK  fanout  maxConn  beamWidth  quantized  index(s)  index_docs/s  force_merge(s)  num_segments  index_size(MB)  vec_disk(MB)  vec_RAM(MB)  indexType
 0.809        0.977   0.978        1.001  200000   100      50       64        250         no     40.11       4985.91            0.00             1           63.95       228.882      228.882       HNSW
```

This PR (run 4, use same index as run 3)
```
recall  latency(ms)  netCPU  avgCpuCount    nDoc  topK  fanout  maxConn  beamWidth  quantized  index(s)  index_docs/s  force_merge(s)  num_segments  index_size(MB)  vec_disk(MB)  vec_RAM(MB)  indexType
 0.809        0.491   0.491        1.000  200000   100      50       64        250         no      0.00      Infinity            0.12             1           63.95       228.882      228.882       HNSW
```

The results seem consistent with what we see above: the changes in this PR cause a regression if used immediately after indexing (0.977 ms vs 0.635 ms on `main`), but a speedup if used on an existing index (0.491 ms vs 0.635 ms)? I'm confused :)

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
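For reference, the size of the two effects can be worked out from the `latency(ms)` column of the four runs. The snippet below is only a rough sketch of that arithmetic: the numbers are copied from the tables, while the `fresh` / `reused` labels and the `relative_change` helper are names made up here for illustration and are not part of luceneutil.

```python
# Per-query latency in ms, copied from the latency(ms) column of runs 1-4.
# "fresh" = no pre-existing index, "reused" = same index as the previous run
# (labels are illustrative assumptions, not luceneutil terminology).
latency_ms = {
    ("main", "fresh"): 0.635,   # run 1
    ("main", "reused"): 0.635,  # run 2
    ("pr", "fresh"): 0.977,     # run 3
    ("pr", "reused"): 0.491,    # run 4
}


def relative_change(baseline: float, candidate: float) -> float:
    """Percentage change of candidate relative to baseline (positive = slower)."""
    return (candidate - baseline) / baseline * 100.0


for state in ("fresh", "reused"):
    change = relative_change(latency_ms[("main", state)], latency_ms[("pr", state)])
    print(f"{state}: {change:+.1f}% latency vs main")
```

Under these numbers the PR comes out roughly 54% slower than `main` on a freshly built index and roughly 23% faster on a reused one. Each figure comes from a single run, so it says nothing about run-to-run noise.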
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For queries about this service, please contact Infrastructure at: us...@infra.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org