benwtrent opened a new pull request, #14215: URL: https://github.com/apache/lucene/pull/14215
Previously related PR: https://github.com/apache/lucene/pull/12770

While my original change helped move us towards saner HNSW search behavior, the search will still explore a candidate if its score is `==` the min accepted score. This degrades badly in the degenerate case where all vectors are the same.

Here are some test runs. One test indexes the same vector many times. The other indexes the same 16 vectors many times.

From what I can tell, there isn't much difference in the "few unique vectors" case. However, in the super degenerate case where all scores are exactly the same, this is orders of magnitude faster.

Logically, it makes sense for the condition to skip a candidate to be exactly the same as the condition for adding a candidate.

Also note that this degenerate case with uniform vector scores got WAY worse with the connected components change.

[Archive.zip](https://github.com/user-attachments/files/18710710/Archive.zip)

Related to (but doesn't fully solve): https://github.com/apache/lucene/issues/14214
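To illustrate the skip-versus-add mismatch described in this PR, here is a minimal toy sketch. It is not Lucene's `HnswGraphSearcher`: the class `TieSkipSketch`, the method `expandedCount`, the ring-shaped graph, and the `skipOnTie` flag are all hypothetical and exist only for illustration. It assumes a candidate is added only when its score is strictly greater than the min accepted score, and compares a search loop that still expands candidates whose score ties the min accepted score against one that skips them.

```java
import java.util.Arrays;
import java.util.PriorityQueue;

public class TieSkipSketch {
  /**
   * Toy best-first search over a ring graph (hypothetical, not Lucene code).
   * Returns how many candidates were expanded, i.e. had their neighbors scored.
   */
  static int expandedCount(float[] scores, int degree, int topK, boolean skipOnTie) {
    int n = scores.length;
    boolean[] visited = new boolean[n];
    // Candidates are ordered best score first.
    PriorityQueue<Integer> candidates =
        new PriorityQueue<>((a, b) -> Float.compare(scores[b], scores[a]));
    // Min-heap holding the topK best scores collected so far.
    PriorityQueue<Float> results = new PriorityQueue<>();
    float minAccepted = Float.NEGATIVE_INFINITY;

    // Node 0 is the assumed entry point.
    visited[0] = true;
    candidates.add(0);
    results.add(scores[0]);
    if (results.size() >= topK) minAccepted = results.peek();

    int expanded = 0;
    while (!candidates.isEmpty()) {
      int node = candidates.poll();
      float score = scores[node];
      // Skip condition: with skipOnTie it mirrors the strict ">" add condition below.
      boolean skip = skipOnTie ? score <= minAccepted : score < minAccepted;
      if (skip) break;
      expanded++;
      for (int i = 1; i <= degree; i++) {
        int neighbor = (node + i) % n; // each node links to the next `degree` nodes
        if (visited[neighbor]) continue;
        visited[neighbor] = true;
        float neighborScore = scores[neighbor];
        // Add condition: only strictly better than the current min accepted score.
        if (neighborScore > minAccepted) {
          candidates.add(neighbor);
          results.add(neighborScore);
          if (results.size() > topK) results.poll();
          if (results.size() >= topK) minAccepted = results.peek();
        }
      }
    }
    return expanded;
  }

  public static void main(String[] args) {
    float[] scores = new float[1000];
    Arrays.fill(scores, 1.0f); // every vector scores the same, as in the degenerate case
    System.out.println("explore ties: " + expandedCount(scores, 16, 10, false));
    System.out.println("skip ties:    " + expandedCount(scores, 16, 10, true));
  }
}
```

In this toy setup with uniform scores, the variant that explores ties keeps expanding queued candidates whose neighbors can never be added, while the variant that skips ties stops as soon as the result set is full. The absolute numbers say nothing about real Lucene performance; the sketch only shows why aligning the two conditions avoids wasted scoring work.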