kaivalnp commented on PR #15285:
URL: https://github.com/apache/lucene/pull/15285#issuecomment-3372217030
Ran some luceneutil benchmarks on 768d Cohere vectors, for various vector
similarities x quantization bits:
#### `dot_product`
`main`
```
recall  latency(ms)  netCPU  avgCpuCount    nDoc  topK  fanout  maxConn  beamWidth  quantized  visited  index(s)  index_docs/s  force_merge(s)  num_segments  index_size(MB)  vec_disk(MB)  vec_RAM(MB)  indexType
 0.641        0.675   0.666        0.987  200000   100      50       32        250     1 bits     5101     10.74      18627.18           20.85             1          624.45       606.918       20.981       HNSW
 0.878        1.170   1.161        0.992  200000   100      50       32        250     4 bits     4662     12.20      16398.82           23.07             1          678.09       662.231       76.294       HNSW
 0.915        1.517   1.505        0.992  200000   100      50       32        250     7 bits     4605     12.58      15896.99           31.01             1          751.27       735.474      149.536       HNSW
 0.915        1.523   1.515        0.995  200000   100      50       32        250     8 bits     4570     11.64      17180.65           18.18             1          751.17       735.474      149.536       HNSW
```
This PR
```
recall  latency(ms)  netCPU  avgCpuCount    nDoc  topK  fanout  maxConn  beamWidth  quantized  visited  index(s)  index_docs/s  force_merge(s)  num_segments  index_size(MB)  vec_disk(MB)  vec_RAM(MB)  indexType
 0.641        0.678   0.668        0.985  200000   100      50       32        250     1 bits     5064     10.83      18467.22           21.32             1          624.43       606.918       20.981       HNSW
 0.876        1.140   1.131        0.992  200000   100      50       32        250     4 bits     4660     11.67      17132.09           23.35             1          678.10       662.231       76.294       HNSW
 0.914        1.514   1.504        0.993  200000   100      50       32        250     7 bits     4575     12.34      16208.77           18.19             1          751.21       735.474      149.536       HNSW
 0.916        1.576   1.566        0.994  200000   100      50       32        250     8 bits     4580     12.32      16229.81           18.29             1          751.23       735.474      149.536       HNSW
```
#### `mip`
`main`
```
recall  latency(ms)  netCPU  avgCpuCount    nDoc  topK  fanout  maxConn  beamWidth  quantized  visited  index(s)  index_docs/s  force_merge(s)  num_segments  index_size(MB)  vec_disk(MB)  vec_RAM(MB)  indexType
 0.640        0.754   0.745        0.988  200000   100      50       32        250     1 bits     5076     11.12      17987.23           20.55             1          624.43       606.918       20.981       HNSW
 0.877        1.174   1.165        0.992  200000   100      50       32        250     4 bits     4645     11.95      16737.80           24.10             1          678.11       662.231       76.294       HNSW
 0.912        1.566   1.557        0.994  200000   100      50       32        250     7 bits     4573     11.96      16723.81           18.21             1          751.21       735.474      149.536       HNSW
 0.916        1.509   1.500        0.994  200000   100      50       32        250     8 bits     4578     12.18      16416.32           18.29             1          751.19       735.474      149.536       HNSW
```
This PR
```
recall  latency(ms)  netCPU  avgCpuCount    nDoc  topK  fanout  maxConn  beamWidth  quantized  visited  index(s)  index_docs/s  force_merge(s)  num_segments  index_size(MB)  vec_disk(MB)  vec_RAM(MB)  indexType
 0.641        0.709   0.700        0.987  200000   100      50       32        250     1 bits     5080     11.68      17120.36           20.85             1          624.44       606.918       20.981       HNSW
 0.877        1.191   1.182        0.992  200000   100      50       32        250     4 bits     4654     11.61      17232.47           22.12             1          678.11       662.231       76.294       HNSW
 0.914        1.527   1.518        0.994  200000   100      50       32        250     7 bits     4585     12.27      16306.56           18.17             1          751.22       735.474      149.536       HNSW
 0.915        1.541   1.532        0.994  200000   100      50       32        250     8 bits     4582     11.70      17091.10           18.30             1          751.22       735.474      149.536       HNSW
```
#### `euclidean`
`main`
```
recall  latency(ms)  netCPU  avgCpuCount    nDoc  topK  fanout  maxConn  beamWidth  quantized  visited  index(s)  index_docs/s  force_merge(s)  num_segments  index_size(MB)  vec_disk(MB)  vec_RAM(MB)  indexType
 0.691        0.625   0.615        0.984  200000   100      50       32        250     1 bits     4723      9.64      20751.19           17.36             1          615.12       606.918       20.981       HNSW
 0.906        0.993   0.979        0.986  200000   100      50       32        250     4 bits     4413     10.70      18698.58           21.10             1          669.73       662.231       76.294       HNSW
 0.948        1.361   1.353        0.994  200000   100      50       32        250     7 bits     4389     12.22      16369.29           25.86             1          743.24       735.474      149.536       HNSW
 0.950        1.335   1.326        0.993  200000   100      50       32        250     8 bits     4387     11.31      17691.29           25.83             1          743.26       735.474      149.536       HNSW
```
This PR
```
recall  latency(ms)  netCPU  avgCpuCount    nDoc  topK  fanout  maxConn  beamWidth  quantized  visited  index(s)  index_docs/s  force_merge(s)  num_segments  index_size(MB)  vec_disk(MB)  vec_RAM(MB)  indexType
 0.692        0.628   0.618        0.984  200000   100      50       32        250     1 bits     4741     10.19      19627.09           17.71             1          615.11       606.918       20.981       HNSW
 0.905        0.987   0.977        0.990  200000   100      50       32        250     4 bits     4416     10.46      19118.63           20.92             1          669.72       662.231       76.294       HNSW
 0.949        1.396   1.387        0.994  200000   100      50       32        250     7 bits     4395     12.06      16579.62           25.65             1          743.22       735.474      149.536       HNSW
 0.951        1.332   1.316        0.988  200000   100      50       32        250     8 bits     4382     12.03      16629.25           25.74             1          743.24       735.474      149.536       HNSW
```
#### `cosine`
`main`
```
recall  latency(ms)  netCPU  avgCpuCount    nDoc  topK  fanout  maxConn  beamWidth  quantized  visited  index(s)  index_docs/s  force_merge(s)  num_segments  index_size(MB)  vec_disk(MB)  vec_RAM(MB)  indexType
 0.656        0.641   0.632        0.986  200000   100      50       32        250     1 bits     4996     10.17      19663.75           17.60             1          616.88       606.918       20.981       HNSW
 0.889        1.078   1.069        0.992  200000   100      50       32        250     4 bits     4603     10.64      18793.46           23.01             1          671.76       662.231       76.294       HNSW
 0.944        1.438   1.429        0.994  200000   100      50       32        250     7 bits     4537     12.14      16477.18           27.64             1          745.81       735.474      149.536       HNSW
 0.948        1.459   1.450        0.994  200000   100      50       32        250     8 bits     4524     11.83      16913.32           27.53             1          745.93       735.474      149.536       HNSW
```
This PR
```
recall  latency(ms)  netCPU  avgCpuCount    nDoc  topK  fanout  maxConn  beamWidth  quantized  visited  index(s)  index_docs/s  force_merge(s)  num_segments  index_size(MB)  vec_disk(MB)  vec_RAM(MB)  indexType
 0.657        0.644   0.635        0.986  200000   100      50       32        250     1 bits     5006     10.30      19411.82           17.96             1          616.85       606.918       20.981       HNSW
 0.888        0.994   0.985        0.991  200000   100      50       32        250     4 bits     4565     11.39      17556.18           22.29             1          671.74       662.231       76.294       HNSW
 0.945        1.422   1.413        0.994  200000   100      50       32        250     7 bits     4522     11.72      17064.85           27.42             1          745.81       735.474      149.536       HNSW
 0.948        1.442   1.433        0.994  200000   100      50       32        250     8 bits     4514     11.94      16746.21           26.94             1          745.94       735.474      149.536       HNSW
```
Except for one outlier (`dot_product`, `main`, 7 bits, `force_merge(s)`:
31.01 s vs. 18.19 s on this PR), all values appear to be within \~5% of each other.
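
A minimal sketch of how such a comparison could be spot-checked, assuming the intended measure is the change of the PR value relative to `main`. The names `relative_change`, `samples`, and `TOLERANCE` are illustrative only and are not part of luceneutil; the numbers are copied from the tables above.

```python
def relative_change(main: float, pr: float) -> float:
    """Signed change of the PR value relative to main, as a fraction."""
    return (pr - main) / main


# (similarity, bits, metric) -> (value on main, value on this PR)
samples = {
    ("dot_product", 7, "force_merge(s)"): (31.01, 18.19),
    ("dot_product", 4, "latency(ms)"): (1.170, 1.140),
    ("euclidean", 8, "recall"): (0.950, 0.951),
}

TOLERANCE = 0.05  # the ~5% band mentioned above (an assumed threshold)

for key, (main, pr) in samples.items():
    change = relative_change(main, pr)
    flag = "outlier" if abs(change) > TOLERANCE else "ok"
    print(key, f"{change:+.1%}", flag)
```

Only three cells are sampled here; running the same check over every metric column would be needed to confirm the band for the full tables.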