benwtrent commented on PR #14160:
URL: https://github.com/apache/lucene/pull/14160#issuecomment-2640589065

   OK, the current implementation is about as good as I can figure it.
   
    - We explore beyond neighbors-of-neighbors if we have gathered fewer than maxConn/4 vectors to score
    - We explore at most maxConn*maxConn vectors in total
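
The two rules above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in the PR: `gather_filtered_candidates`, the `graph` adjacency dict, and the `accepted` filter set are hypothetical names, and a plain breadth-first walk stands in for whatever traversal order the real implementation uses.

```python
from collections import deque


def gather_filtered_candidates(graph, start, accepted, max_conn):
    """Collect filter-passing vectors around `start` that should be scored.

    Hypothetical sketch: `graph` maps a node id to its neighbor ids and
    `accepted` is the set of ids that pass the filter.
    """
    min_to_score = max_conn // 4        # below this, keep going past two hops
    max_explored = max_conn * max_conn  # hard cap on vectors examined

    to_score = []
    seen = {start}
    explored = 0
    queue = deque((n, 1) for n in graph[start])

    while queue and explored < max_explored:
        node, depth = queue.popleft()
        if node in seen:
            continue
        seen.add(node)
        explored += 1

        if node in accepted:
            to_score.append(node)
            continue

        # The node is filtered out, so look through it to its neighbors.
        # Two hops are always allowed; deeper hops only while too few
        # vectors have been gathered for scoring.
        if depth < 2 or len(to_score) < min_to_score:
            queue.extend((n, depth + 1) for n in graph[node] if n not in seen)

    return to_score, explored
```

In this sketch an accepted node is not expanded further, on the assumption that the outer search would visit it as a candidate in its own right.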
   
   However, one thing that bothers me is that increasing `k` (varied here via `fanout`) doesn't guarantee better results. This suggests that we take erroneous paths while the score threshold is still low (e.g. before we have gathered enough results).
   
   ```
   1M cohere 16 maxConn 100 efConstruction
   recall  latency(ms)     nDoc  topK  fanout  visited  selectivity
    0.717        1.340  1000000   100       0     1385        0.050
    0.755        1.790  1000000   100      20     1680        0.050
    0.775        1.950  1000000   100      40     1854        0.050
    0.786        2.270  1000000   100      60     2023        0.050
    0.825        2.560  1000000   100      80     2283        0.050
    0.841        3.160  1000000   100     100     2523        0.050
    0.859        4.030  1000000   100     120     2795        0.050
    0.859        3.810  1000000   100     140     2989        0.050
    0.896        4.220  1000000   100     160     3270        0.050
    0.880        4.550  1000000   100     180     3561        0.050
    0.906        4.670  1000000   100     200     3705        0.050
    0.888        4.810  1000000   100     220     3981        0.050
    0.921        4.820  1000000   100     240     4157        0.050
    0.896        5.700  1000000   100     260     4364        0.050
    0.925        6.140  1000000   100     280     4672        0.050
    0.920        5.380  1000000   100     300     4870        0.050
    0.906        7.050  1000000   100     320     5190        0.050
    0.914        7.390  1000000   100     340     5303        0.050
    0.920        7.570  1000000   100     360     5450        0.050
    0.919        7.420  1000000   100     380     5784        0.050
    0.923        7.550  1000000   100     400     5997        0.050
    0.939        8.080  1000000   100     420     6217        0.050
    0.936        7.170  1000000   100     440     6202        0.050
    0.936        8.550  1000000   100     460     6627        0.050
    0.921       11.800  1000000   100     480     6949        0.050
    0.916        9.630  1000000   100     500     6957        0.050
   ``` 
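
To make the non-monotonic behaviour easier to see, here is a small check over the numbers above. It is only a convenience sketch: the `(fanout, recall)` pairs are copied from the table, and `recall_regressions` is a hypothetical helper name, not part of the benchmark tooling.

```python
# (fanout, recall) pairs copied from the benchmark table above.
RUNS = [
    (0, 0.717), (20, 0.755), (40, 0.775), (60, 0.786), (80, 0.825),
    (100, 0.841), (120, 0.859), (140, 0.859), (160, 0.896), (180, 0.880),
    (200, 0.906), (220, 0.888), (240, 0.921), (260, 0.896), (280, 0.925),
    (300, 0.920), (320, 0.906), (340, 0.914), (360, 0.920), (380, 0.919),
    (400, 0.923), (420, 0.939), (440, 0.936), (460, 0.936), (480, 0.921),
    (500, 0.916),
]


def recall_regressions(runs):
    """Return (prev_fanout, next_fanout, delta) for each step where recall drops."""
    return [
        (f0, f1, round(r1 - r0, 3))
        for (f0, r0), (f1, r1) in zip(runs, runs[1:])
        if r1 < r0
    ]


if __name__ == "__main__":
    for prev_fanout, next_fanout, delta in recall_regressions(RUNS):
        print(f"fanout {prev_fanout} -> {next_fanout}: recall {delta:+.3f}")
```

Each printed line is a step where a larger fanout produced lower recall than the previous run, despite visiting more vectors.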


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

