kaivalnp commented on PR #14178: URL: https://github.com/apache/lucene/pull/14178#issuecomment-2895246147
Rebased the PR to incorporate recent changes (including the optimistic collection based on pro-rating)

---

Single-segment search has no impact as expected:

Lucene:
```
recall  latency (ms)  nDoc    topK  fanout  maxConn  beamWidth  quantized  visited  index s  index docs/s  force merge s  num segments  index size (MB)  vec disk (MB)  vec RAM (MB)
0.812   0.731         200000  100   50      32       200        no         1392     153.38   1303.96       0.01           1             236.93           228.882        228.882
```

Faiss:
```
recall  latency (ms)  nDoc    topK  fanout  maxConn  beamWidth  quantized  visited  index s  index docs/s  force merge s  num segments  index size (MB)  vec disk (MB)  vec RAM (MB)
0.811   0.568         200000  100   50      32       200        no         0        133.12   1502.37       0.01           1             511.97           228.882        228.882
```

---

Multi-segment search is a bit tricky now, because we collect a different number of results from each segment based on its size -- but the `efSearch` parameter is set independently (from the index factory string)

Lucene:
```
recall  latency (ms)  nDoc    topK  fanout  maxConn  beamWidth  quantized  visited  index s  index docs/s  num segments  index size (MB)  vec disk (MB)  vec RAM (MB)
0.880   2.110         200000  100   50      32       200        no         9632     88.03    2271.85       6             235.10           228.882        228.882
```

Faiss:
```
recall  latency (ms)  nDoc    topK  fanout  maxConn  beamWidth  quantized  visited  index s  index docs/s  num segments  index size (MB)  vec disk (MB)  vec RAM (MB)
0.894   1.731         200000  100   50      32       200        no         0        83.34    2399.69       4             511.99           228.882        228.882
```

(speedup is <20% even with fewer segments)

---

In the future, we could try to expose a fixed set of parameters from Lucene and construct the index factory string programmatically to incorporate these caveats better.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
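
A rough sketch of what "constructing the index factory string programmatically" might look like, for illustration only. The class and method names below are hypothetical and not part of the PR. The `IDMap,HNSW<M>` description and the `efConstruction=` / `efSearch=` parameter strings are assumed to follow Faiss's index factory and parameter-space conventions. The per-segment `k` is a plain proportional split, which is a simplification: Lucene's actual pro-rating may add a safety margin.

```java
import java.util.Locale;

// Hypothetical helper, not part of the PR: derives Faiss strings from Lucene-style HNSW parameters.
final class FaissFactoryStrings {
  private FaissFactoryStrings() {}

  // Index factory description, e.g. "IDMap,HNSW32" for maxConn = 32 (assumed Faiss convention).
  static String description(int maxConn) {
    if (maxConn <= 0) {
      throw new IllegalArgumentException("maxConn must be positive, got " + maxConn);
    }
    return String.format(Locale.ROOT, "IDMap,HNSW%d", maxConn);
  }

  // Index-time parameters, e.g. "efConstruction=200" for beamWidth = 200 (assumed Faiss convention).
  static String indexParams(int beamWidth) {
    if (beamWidth <= 0) {
      throw new IllegalArgumentException("beamWidth must be positive, got " + beamWidth);
    }
    return String.format(Locale.ROOT, "efConstruction=%d", beamWidth);
  }

  // Simplified pro-rating: the share of topK a segment gets in proportion to its size, at least 1.
  // This is NOT Lucene's exact formula, only a proportional approximation.
  static int proRatedK(int topK, int segmentDocs, int totalDocs) {
    if (topK <= 0 || segmentDocs <= 0 || totalDocs < segmentDocs) {
      throw new IllegalArgumentException("invalid topK / segment sizes");
    }
    int k = (int) Math.ceil((double) topK * segmentDocs / totalDocs);
    return Math.max(1, k);
  }

  // Search-time parameters tied to the per-segment k, so efSearch never falls below it.
  static String searchParams(int perSegmentK, int minEfSearch) {
    return String.format(Locale.ROOT, "efSearch=%d", Math.max(perSegmentK, minEfSearch));
  }
}
```

With the benchmark settings above (`maxConn=32`, `beamWidth=200`), this would yield `IDMap,HNSW32` and `efConstruction=200`. Deriving `efSearch` from the per-segment `k` is one possible way to address the mismatch mentioned in the multi-segment case.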
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For queries about this service, please contact Infrastructure at: us...@infra.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org