benwtrent commented on issue #14214:
URL: https://github.com/apache/lucene/issues/14214#issuecomment-2671796667

   So, these adverse scenarios, where connect components has to do a ton of 
work, all stem from us keeping the graph very sparse (i.e. only connecting 
diverse nodes). 
   
   I wonder if we can augment our building algorithm per node: when we detect 
a highly clustered area for a node (e.g. the majority or all of the scores of 
the beam_width candidates are within 1e-7 of each other, or something like 
that), we force MORE connections by bypassing the typical diversity 
enforcement on forward connections.
   
   I am not sure we need to change the backlink behavior, but maybe we need to 
do that as well?
   
   The idea I have in mind is basically similar to the "delaunay" type things here: 
https://github.com/nmslib/nmslib/blob/2ae5378027a107474a952edae1e1c2dc2df941d2/similarity_search/src/method/hnsw.cc
   
   However, we wouldn't allow it to be configured; we would just pick the 
right one (per node) based on the score distribution.
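   
   A rough sketch of what the per-node decision described above could look 
like. Everything here is hypothetical: `ClusterAwareSelection`, `Similarity`, 
`epsilon` and `minFraction` are illustration-only names and are not part of 
Lucene. It assumes higher scores mean closer, that candidates arrive sorted by 
descending score, and that "clustered" means at least `minFraction` of the 
candidates score within `epsilon` of the best one. The diversity check is a 
simplified HNSW-style check, not a copy of the real builder.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch only; none of these names exist in Lucene. */
final class ClusterAwareSelection {

  /** Stand-in for a vector scorer: similarity between two node ids (higher = closer). */
  interface Similarity {
    float score(int a, int b);
  }

  /** True when at least minFraction of the scores lie within epsilon of the best score. */
  static boolean isHighlyClustered(float[] scores, float epsilon, float minFraction) {
    if (scores.length == 0) {
      return false;
    }
    float best = Float.NEGATIVE_INFINITY;
    for (float s : scores) {
      best = Math.max(best, s);
    }
    int close = 0;
    for (float s : scores) {
      if (best - s <= epsilon) {
        close++;
      }
    }
    return close >= minFraction * scores.length;
  }

  /**
   * Picks up to maxConn forward neighbors for the node being inserted.
   * nodes[i] is a candidate id and scores[i] its similarity to the new node,
   * sorted by descending score. In a clustered area the diversity check is skipped.
   */
  static List<Integer> selectForwardNeighbors(
      int[] nodes, float[] scores, int maxConn, Similarity sim,
      float epsilon, float minFraction) {
    boolean skipDiversity = isHighlyClustered(scores, epsilon, minFraction);
    List<Integer> selected = new ArrayList<>();
    for (int i = 0; i < nodes.length && selected.size() < maxConn; i++) {
      if (skipDiversity || isDiverse(nodes[i], scores[i], selected, sim)) {
        selected.add(nodes[i]);
      }
    }
    return selected;
  }

  /** Rejects a candidate that is at least as close to an already selected neighbor as to the new node. */
  private static boolean isDiverse(
      int candidate, float scoreToNewNode, List<Integer> selected, Similarity sim) {
    for (int neighbor : selected) {
      if (sim.score(candidate, neighbor) >= scoreToNewNode) {
        return false;
      }
    }
    return true;
  }
}
```

   With identical (or near-identical) vectors, the plain diversity check keeps 
only the first candidate, since every later candidate is as close to it as to 
the new node. With the clustered-area detection, the same input fills all 
`maxConn` forward slots, which is the "force MORE connections" behavior. 
Backlinks are deliberately left out of the sketch.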


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

