benwtrent commented on issue #14214: URL: https://github.com/apache/lucene/issues/14214#issuecomment-2671796667
So, these adverse scenarios where connect components has to do a ton of work all stem from us keeping the graph very sparse (e.g. only connecting diverse nodes).

I wonder if we can adapt our building algorithm per node: when we detect a highly clustered area for a node (e.g. most or all of the scores of the beam_width candidates are within 1e-7 of each other, or something like that), we force MORE connections by bypassing the usual diversity check on forward connections. I am not sure we need to change the backlink behavior, but maybe we need to do that as well?

The idea I have in mind is basically similar to the "delaunay" type things here: https://github.com/nmslib/nmslib/blob/2ae5378027a107474a952edae1e1c2dc2df941d2/similarity_search/src/method/hnsw.cc

However, we wouldn't allow it to be configured; we would just pick the right one (per node) based on the score distributions.
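To make the per-node idea concrete, here is a rough sketch of what "detect a clustered area, then skip the diversity check on forward connections" could look like. This is not Lucene's `HnswGraphBuilder` code: the class, the `Similarity` interface, the method names, and the 0.5 "majority" fraction are all hypothetical, and the `1e-7` threshold is just the placeholder value floated in the comment. It also measures the spread against the best candidate score, which is only one possible reading of "within 1e-7".

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch only; none of these names exist in Lucene. */
final class ClusteredNeighborSelection {

  /** Assumed threshold, taken from the "1e-7 or something" in the comment. */
  static final float SCORE_EPSILON = 1e-7f;

  /** Assumed meaning of "majority": at least half of the candidates. */
  static final double MIN_CLUSTERED_FRACTION = 0.5;

  /** Stand-in for whatever vector similarity the graph builder uses. */
  interface Similarity {
    float score(int nodeA, int nodeB);
  }

  /**
   * Returns true when at least minFraction of the candidate scores lie within
   * epsilon of the best score. Scores are expected best-first (descending).
   */
  static boolean isClustered(float[] scoresDescending, float epsilon, double minFraction) {
    if (scoresDescending.length < 2) {
      return false; // a single candidate says nothing about clustering
    }
    float best = scoresDescending[0];
    int close = 0;
    for (float score : scoresDescending) {
      if (best - score <= epsilon) {
        close++;
      }
    }
    return close >= minFraction * scoresDescending.length;
  }

  /**
   * Picks up to maxConn forward neighbors from best-first candidates. In a
   * clustered neighborhood the diversity check is bypassed, so the node ends
   * up with more connections than the diversity heuristic would allow.
   */
  static List<Integer> selectNeighbors(
      int[] candidates, float[] scoresDescending, int maxConn, Similarity sim) {
    boolean clustered = isClustered(scoresDescending, SCORE_EPSILON, MIN_CLUSTERED_FRACTION);
    List<Integer> selected = new ArrayList<>();
    for (int i = 0; i < candidates.length && selected.size() < maxConn; i++) {
      if (clustered || isDiverse(candidates[i], scoresDescending[i], selected, sim)) {
        selected.add(candidates[i]);
      }
    }
    return selected;
  }

  /**
   * Diversity heuristic as sketched here: reject a candidate that is at least
   * as similar to an already selected neighbor as it is to the node itself.
   */
  static boolean isDiverse(
      int candidate, float scoreToNode, List<Integer> selected, Similarity sim) {
    for (int neighbor : selected) {
      if (sim.score(candidate, neighbor) >= scoreToNode) {
        return false;
      }
    }
    return true;
  }
}
```

With near-duplicate vectors, the plain diversity check tends to keep a single neighbor, which is the kind of sparsity that leaves connect components with so much work; in this sketch the clustered branch fills the node up to `maxConn` instead. Backlinks are left untouched here, since whether they should change too is still an open question.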