benwtrent commented on PR #14160: URL: https://github.com/apache/lucene/pull/14160#issuecomment-2640589065
OK, the current implementation is about as good as I can figure it.

- We explore beyond neighbors-of-neighbors if we gathered fewer than maxConn/4 vectors to score
- We explore at most maxConn*maxConn vectors in total

However, one thing that bothers me is that increasing `k` doesn't guarantee better results. This indicates to me that we take erroneous paths when the score threshold is low (e.g. we haven't gathered enough results).

```
1M cohere 16 maxConn 100 efConstruction

recall  latency(ms)     nDoc  topK  fanout  visited  selectivity
 0.717        1.340  1000000   100       0     1385        0.050
 0.755        1.790  1000000   100      20     1680        0.050
 0.775        1.950  1000000   100      40     1854        0.050
 0.786        2.270  1000000   100      60     2023        0.050
 0.825        2.560  1000000   100      80     2283        0.050
 0.841        3.160  1000000   100     100     2523        0.050
 0.859        4.030  1000000   100     120     2795        0.050
 0.859        3.810  1000000   100     140     2989        0.050
 0.896        4.220  1000000   100     160     3270        0.050
 0.880        4.550  1000000   100     180     3561        0.050
 0.906        4.670  1000000   100     200     3705        0.050
 0.888        4.810  1000000   100     220     3981        0.050
 0.921        4.820  1000000   100     240     4157        0.050
 0.896        5.700  1000000   100     260     4364        0.050
 0.925        6.140  1000000   100     280     4672        0.050
 0.920        5.380  1000000   100     300     4870        0.050
 0.906        7.050  1000000   100     320     5190        0.050
 0.914        7.390  1000000   100     340     5303        0.050
 0.920        7.570  1000000   100     360     5450        0.050
 0.919        7.420  1000000   100     380     5784        0.050
 0.923        7.550  1000000   100     400     5997        0.050
 0.939        8.080  1000000   100     420     6217        0.050
 0.936        7.170  1000000   100     440     6202        0.050
 0.936        8.550  1000000   100     460     6627        0.050
 0.921       11.800  1000000   100     480     6949        0.050
 0.916        9.630  1000000   100     500     6957        0.050
```
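
To make the non-monotonic recall concrete, here is a minimal sketch that scans the (fanout, recall) pairs copied from the benchmark output above and lists every step where a larger fanout produced lower recall. The function name `recall_regressions` is hypothetical and not part of Lucene or its benchmark tooling; it only illustrates the observation.

```python
# (fanout, recall) pairs copied from the benchmark output above
# (1M cohere, maxConn 16, efConstruction 100, topK 100, selectivity 0.05).
RESULTS = [
    (0, 0.717), (20, 0.755), (40, 0.775), (60, 0.786), (80, 0.825),
    (100, 0.841), (120, 0.859), (140, 0.859), (160, 0.896), (180, 0.880),
    (200, 0.906), (220, 0.888), (240, 0.921), (260, 0.896), (280, 0.925),
    (300, 0.920), (320, 0.906), (340, 0.914), (360, 0.920), (380, 0.919),
    (400, 0.923), (420, 0.939), (440, 0.936), (460, 0.936), (480, 0.921),
    (500, 0.916),
]


def recall_regressions(results):
    """Return (prev_fanout, fanout, prev_recall, recall) for each step
    where recall strictly dropped although fanout increased.

    Hypothetical helper for illustration only.
    """
    drops = []
    # Compare each row with the next one; rows are ordered by fanout.
    for (f0, r0), (f1, r1) in zip(results, results[1:]):
        if r1 < r0:
            drops.append((f0, f1, r0, r1))
    return drops


if __name__ == "__main__":
    for f0, f1, r0, r1 in recall_regressions(RESULTS):
        print(f"fanout {f0} -> {f1}: recall {r0:.3f} -> {r1:.3f}")
```

On this data it reports 9 drops across 25 steps, the first at fanout 160 -> 180 (0.896 -> 0.880). Recall peaks at 0.939 with fanout 420, while fanout 500 reaches only 0.916.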
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For queries about this service, please contact Infrastructure at: us...@infra.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org