benwtrent commented on PR #13651: URL: https://github.com/apache/lucene/pull/13651#issuecomment-2427907718
Here is some Lucene Util benchmarking. Some of these numbers actually contradict some of my previous benchmarking for int4, which is frustrating. I wonder what I did wrong then or now, or if float32 got faster between then and now :)

Regardless, this shows that bit quantization is generally as fast as int4 search or faster, and you can get good recall with some oversampling. Combined with the 32x reduction in space, it's pretty nice.

The oversampling rates were `[1, 1.5, 2, 3, 4, 5]`. HNSW params: `m=16,efsearch=100`. Recall is reported as `Recall@100`.

## Cohere v2 1M

| quantization      | Index Time | Force Merge time | Mem Required |
|-------------------|------------|------------------|--------------|
| 1 bit             | 395.18     | 411.67           | 175.9MB      |
| 4 bit (compress)  | 1877.47    | 491.13           | 439.7MB      |
| 7 bit             | 500.59     | 820.53           | 833.9MB      |
| raw               | 493.44     | 792.04           | 3132.8MB     |

## Cohere v3 1M

1M Cohere v3 vectors, 1024 dimensions.

| quantization      | Index Time | Force Merge time | Mem Required |
|-------------------|------------|------------------|--------------|
| 1 bit             | 338.97     | 342.61           | 208MB        |
| 4 bit (compress)  | 1113.06    | 5490.36          | 578MB        |
| 7 bit             | 437.63     | 744.12           | 1094MB       |
| raw               | 408.75     | 798.11           | 4162MB       |

## e5Small

| quantization      | Index Time | Force Merge time | Mem Required |
|-------------------|------------|------------------|--------------|
| 1 bit             | 161.84     | 42.37            | 57.6MB       |
| 4 bit (compress)  | 665.54     | 660.33           | 123.2MB      |
| 7 bit             | 267.13     | 89.99            | 219.6MB      |
| raw               | 249.26     | 77.81            | 793.5MB      |
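As a rough sanity check on the 32x figure and the oversampling rates quoted above, here is a minimal Python sketch. It counts only the bytes for the vector components themselves, so it ignores the HNSW graph and any per-vector correction terms and will not reproduce the "Mem Required" columns exactly. The helper names `vector_bytes` and `candidates_to_gather` are hypothetical and are not Lucene APIs. The 1024-dimension value is taken from the Cohere v3 heading, and rounding the oversampled candidate count up is an assumption about how a searcher might apply the rate.

```python
import math


def vector_bytes(num_vectors: int, dims: int, bits_per_dim: int) -> int:
    """Bytes for the vector components alone, each vector padded to whole bytes."""
    bytes_per_vector = math.ceil(dims * bits_per_dim / 8)
    return num_vectors * bytes_per_vector


def candidates_to_gather(k: int, oversample: float) -> int:
    """Candidates to collect from the quantized index before rescoring.

    Assumption: the rate multiplies k and the result is rounded up.
    """
    return math.ceil(k * oversample)


NUM_VECTORS = 1_000_000  # "1M" in the section headings
DIMS = 1024              # Cohere v3 dimension from the heading

# Bits stored per dimension for three of the rows in the tables.
# "4 bit (compress)" is taken to mean two 4-bit values packed per byte.
BITS = {"raw (float32)": 32, "4 bit (compress)": 4, "1 bit": 1}

sizes = {name: vector_bytes(NUM_VECTORS, DIMS, bits) for name, bits in BITS.items()}
raw = sizes["raw (float32)"]
one_bit = sizes["1 bit"]
ratio = raw / one_bit

OVERSAMPLING = [1, 1.5, 2, 3, 4, 5]  # rates listed in the comment
K = 100                              # Recall@100
gathered = [candidates_to_gather(K, rate) for rate in OVERSAMPLING]

for name, size in sizes.items():
    print(f"{name:>18}: {size / 1e6:8.1f} MB, {raw / size:4.1f}x smaller than raw")
print("candidates per oversampling rate:", dict(zip(OVERSAMPLING, gathered)))
```

The gap between these component-only sizes and the reported memory figures is expected, since the benchmark numbers include more than the packed vector components.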