benwtrent commented on PR #12582: URL: https://github.com/apache/lucene/pull/12582#issuecomment-1750806182
I did some benchmarks with the current change, using 400k Cohere embeddings and searching over 1k vectors.

This was on GCP `c3-standard-8` (Intel Sapphire Rapids). On this machine, `byte[]` comparisons run in similar time to `float[]`.

Index buffer: 128MB
Force merged to one segment.

I noticed that force-merging was about 35% faster.

For query latency:

| query         | recall | latency |
|---------------|--------|---------|
| raw           | 0.845  | 0.31ms  |
| quantized @10 | 0.800  | 0.23ms  |
| quantized @20 | 0.862  | 0.29ms  |

Disk usage:
- raw: 1172MB
- quantized: 295MB
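As a rough sanity check on the figures above, here is a small sketch that derives the relative disk and latency changes. It only re-uses the numbers reported in this comment; the variable names are made up for illustration, and the `quantized @20` latency is assumed to be in milliseconds like the other rows.

```python
# Figures copied from the benchmark in this comment; nothing here is newly measured.
raw_mb, quantized_mb = 1172, 295

# Latencies in ms. The 0.29 value is assumed to be ms, matching the other rows.
latency_ms = {"raw": 0.31, "quantized @10": 0.23, "quantized @20": 0.29}

# How many times smaller the quantized index is on disk.
disk_ratio = raw_mb / quantized_mb
print(f"disk: {disk_ratio:.2f}x smaller")

# Relative latency change of each configuration against the raw baseline.
latency_change_pct = {
    name: (ms - latency_ms["raw"]) / latency_ms["raw"] * 100
    for name, ms in latency_ms.items()
}
for name, pct in latency_change_pct.items():
    print(f"{name}: {pct:+.1f}% latency vs raw")
```

The disk ratio comes out close to 4x, which is about what one would expect when each 4-byte float component is stored as a single byte.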