benwtrent opened a new pull request, #14215:
URL: https://github.com/apache/lucene/pull/14215

   Previous related PR: https://github.com/apache/lucene/pull/12770
   
   While my original change helped move us towards saner HNSW search 
behavior, it will still actually explore a candidate if its score is `==` 
min accepted. This degrades badly in the degenerate case where all vectors are 
the same, since every candidate then ties with the min accepted score.
   
   
   Here are some test runs. One test indexes the same vector many times. The 
other indexes the same 16 vectors many times. 
   
   There isn't much difference in the "few unique vectors" case from what I 
can tell. However, in the super degenerate case where all scores are exactly the 
same, this is orders of magnitude faster.
   
   Logically, it makes sense for the condition to skip a candidate to be the 
exact same as the condition for adding a candidate.
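
   To illustrate the point about ties, here is a minimal toy sketch of a best-first graph search. It is not Lucene's `HnswGraphSearcher`; the class `TieSkipSketch`, the methods `countExpanded` and `completeGraph`, and the `skipTies` flag are all hypothetical names made up for this example. It assumes a neighbor is only added when it scores strictly better than the current min accepted score, and compares stopping on `<` against stopping on `<=`.

```java
import java.util.PriorityQueue;

// Toy model only: NOT Lucene code. All names here are hypothetical.
class TieSkipSketch {

    /** Builds a graph in which every node is a neighbor of every other node. */
    static int[][] completeGraph(int n) {
        int[][] g = new int[n][n - 1];
        for (int i = 0; i < n; i++) {
            int pos = 0;
            for (int j = 0; j < n; j++) {
                if (j != i) g[i][pos++] = j;
            }
        }
        return g;
    }

    /**
     * Runs a simplified best-first search and returns how many candidates
     * were expanded (i.e. had their neighbor lists walked).
     * skipTies=false stops only when the candidate is strictly worse than min accepted;
     * skipTies=true also stops when the candidate merely ties with it.
     */
    static int countExpanded(int[][] neighbors, float[] scores, int entry, int k, boolean skipTies) {
        boolean[] visited = new boolean[scores.length];
        // Candidates ordered best score first.
        PriorityQueue<Integer> candidates =
            new PriorityQueue<>((a, b) -> Float.compare(scores[b], scores[a]));
        // Results ordered worst score first, holding at most k entries.
        PriorityQueue<Float> results = new PriorityQueue<>();

        visited[entry] = true;
        candidates.add(entry);
        results.add(scores[entry]);

        int expanded = 0;
        while (!candidates.isEmpty()) {
            int c = candidates.poll();
            if (results.size() >= k) {
                float minAccepted = results.peek();
                boolean skip = skipTies ? scores[c] <= minAccepted : scores[c] < minAccepted;
                if (skip) break;
            }
            expanded++;
            for (int n : neighbors[c]) {
                if (visited[n]) continue;
                visited[n] = true;
                if (results.size() < k) {
                    // Results not full yet: accept unconditionally.
                    results.add(scores[n]);
                    candidates.add(n);
                } else if (scores[n] > results.peek()) {
                    // Results full: accept only strictly better scores.
                    results.poll();
                    results.add(scores[n]);
                    candidates.add(n);
                }
            }
        }
        return expanded;
    }

    public static void main(String[] args) {
        int[][] graph = completeGraph(100);
        float[] scores = new float[100];
        java.util.Arrays.fill(scores, 1.0f); // every vector scores the same
        System.out.println(countExpanded(graph, scores, 0, 10, false)); // prints 10
        System.out.println(countExpanded(graph, scores, 0, 10, true));  // prints 1
    }
}
```

   In this toy, with 100 identical scores and `k = 10`, the `<` variant expands every queued candidate even though none of them can improve the results, while the `<=` variant stops after the entry point. A real HNSW graph and the actual Lucene code differ in many details, so the numbers here only show the direction of the effect.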
   
   Also note that this degenerate case with uniform vector scores got WAY worse 
with the connected components change.
   
   [Archive.zip](https://github.com/user-attachments/files/18710710/Archive.zip)
   
   related to (but doesn't fully solve): 
https://github.com/apache/lucene/issues/14214


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

