benwtrent commented on PR #12050: URL: https://github.com/apache/lucene/pull/12050#issuecomment-1428716712
@jmazanec15 did his due diligence; I'm just being paranoid :).

I have confirmed that for the following ann-benchmarks datasets, recall before and after this change matches 1-1: `mnist-784-euclidean`, `sift-128-euclidean`, `glove-100-angular`.

However, all these datasets are pretty small and may not kick off many segment merges, etc. So I also tested with `deep-image-96-angular`, which took some time. Here are the results:

| parameters                                  | test recall | control recall |
|---------------------------------------------|-------------|----------------|
| {'M': 48, 'efConstruction': 100} fanout=100 | 0.995       | 0.994          |
| {'M': 16, 'efConstruction': 100} fanout=100 | 0.986       | 0.986          |
| {'M': 16, 'efConstruction': 100} fanout=50  | 0.969       | 0.969          |
| {'M': 16, 'efConstruction': 100} fanout=500 | 0.998       | 0.998          |
| {'M': 48, 'efConstruction': 100} fanout=500 | 0.999       | 0.999          |
| {'M': 16, 'efConstruction': 100} fanout=10  | 0.892       | 0.892          |
| {'M': 48, 'efConstruction': 100} fanout=50  | 0.986       | 0.986          |
| {'M': 48, 'efConstruction': 100} fanout=10  | 0.941       | 0.940          |

So there are no significant changes in recall. I think this change is good and we should update the test.

@jpountz
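For anyone who wants to sanity-check the "no significant changes" reading, here is a minimal sketch that computes the per-configuration difference between test and control recall from the `deep-image-96-angular` numbers reported above. The names `results` and `recall_deltas` are made up for illustration; they are not part of ann-benchmarks or Lucene, and the values are simply copied from the reported results.

```python
# Recall values copied from the reported results: (M, fanout) -> (test, control).
# efConstruction is 100 for every row. Names here are illustrative only.
results = {
    (48, 100): (0.995, 0.994),
    (16, 100): (0.986, 0.986),
    (16, 50): (0.969, 0.969),
    (16, 500): (0.998, 0.998),
    (48, 500): (0.999, 0.999),
    (16, 10): (0.892, 0.892),
    (48, 50): (0.986, 0.986),
    (48, 10): (0.941, 0.940),
}


def recall_deltas(rows):
    """Return {config: test - control}, rounded to the 3 decimals reported."""
    return {cfg: round(test - control, 3) for cfg, (test, control) in rows.items()}


deltas = recall_deltas(results)

# Largest absolute difference across all configurations.
max_abs_delta = max(abs(d) for d in deltas.values())

# Only the configurations where test and control differ at all.
changed = {cfg: d for cfg, d in deltas.items() if d != 0}

print(f"max |test - control| = {max_abs_delta:.3f}")
for (m, fanout), d in sorted(changed.items()):
    print(f"M={m} fanout={fanout}: {d:+.3f}")
```

Only two of the eight configurations differ at all, both with `M=48` and both by 0.001 in favor of the test run, which is at the rounding precision of the reported numbers.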