benwtrent commented on PR #14304: URL: https://github.com/apache/lucene/pull/14304#issuecomment-2713553719
On GCP, there isn't much difference. I wouldn't expect a huge difference, as the dominant cost is the vector comparisons, not the quantization. I haven't tested with "flat" yet.

BASELINE
```
recall  latency (ms)  nDoc    topK  fanout  maxConn  beamWidth  quantized  visited  index s  index docs/s  force merge s  num segments  index size (MB)  vec disk (MB)  vec RAM (MB)
0.961   2.910         200000  100   50      64       250        7 bits     6677     111.44   1794.70       79.03          1             997.58           976.563        195.313
```

CANDIDATE
```
recall  latency (ms)  nDoc    topK  fanout  maxConn  beamWidth  quantized  visited  index s  index docs/s  force merge s  num segments  index size (MB)  vec disk (MB)  vec RAM (MB)
0.960   2.460         200000  100   50      64       250        7 bits     6527     110.99   1801.98       76.68          1             997.55           976.563        195.313
```
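
For anyone who wants to put numbers on the comparison, here is a minimal sketch that computes the percent change from BASELINE to CANDIDATE for a few of the reported metrics. The values are copied from the two result rows in this comment; the dictionary keys and the `relative_change` helper are illustrative names, not part of Lucene or its benchmark tooling. Each configuration appears to have been run once, so small deltas may be within run-to-run noise.

```python
# Values copied from the BASELINE and CANDIDATE rows in this comment.
# The key names are illustrative labels, not benchmark tool output.
baseline = {
    "recall": 0.961,
    "latency_ms": 2.910,
    "visited": 6677,
    "index_s": 111.44,
    "force_merge_s": 79.03,
}
candidate = {
    "recall": 0.960,
    "latency_ms": 2.460,
    "visited": 6527,
    "index_s": 110.99,
    "force_merge_s": 76.68,
}


def relative_change(base, cand):
    """Return the percent change from base to cand for each metric in base."""
    return {key: 100.0 * (cand[key] - base[key]) / base[key] for key in base}


changes = relative_change(baseline, candidate)
for name, pct in changes.items():
    # A negative value means the candidate number is lower than the baseline.
    print(f"{name:>14}: {pct:+.2f}%")
```

Lower is better for latency, visited, and the timing columns, while higher is better for recall, so the sign of each delta has to be read per metric.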