benwtrent commented on issue #12627: URL: https://github.com/apache/lucene/issues/12627#issuecomment-1812605864
@msokolov good point. It seems to me we would fully disconnect a sub-graph only if it's very clustered. Is there a way to detect this in the diversity selection?

One other thing I just found that confuses me: https://github.com/apache/lucene/pull/12235

This slightly changed the diversity connection in an attempt to improve performance. The key issue is here: https://github.com/apache/lucene/blob/main/lucene/core/src/java/org/apache/lucene/util/hnsw/HnswGraphBuilder.java#L386-L407

If we have "checked the indices", we always just remove the furthest one, which doesn't seem correct to me. Shouldn't we check for diversity starting at the furthest one, rather than always removing it?

@nitirajrathore for your connection checker, are you testing on a Lucene version > 9.7.0?
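To make the suggestion above concrete (check for diversity starting at the furthest neighbor instead of always removing it), here is a minimal standalone sketch. It is not the code in `HnswGraphBuilder`. The class name, the `similarity` helper, and the use of plain `float[]` vectors with negative squared Euclidean distance are all assumptions made for illustration, and the diversity test is the usual HNSW-style heuristic: a neighbor counts as non-diverse if some closer neighbor is at least as similar to it as the base node is.

```java
import java.util.List;

/** Hypothetical illustration only; not Lucene's HnswGraphBuilder or NeighborArray. */
public class DiversityPruneSketch {

  /** Assumed similarity: negative squared Euclidean distance (higher means closer). */
  static float similarity(float[] a, float[] b) {
    float sum = 0f;
    for (int i = 0; i < a.length; i++) {
      float d = a[i] - b[i];
      sum += d * d;
    }
    return -sum;
  }

  /**
   * Picks the neighbor to drop when the base node has one connection too many.
   * The list must be ordered closest-first relative to the base node.
   * Scans from the furthest neighbor toward the closest and returns the first
   * index whose neighbor is non-diverse. If every neighbor is diverse, falls
   * back to the furthest one (the last index).
   */
  static int findWorstNonDiverse(float[] base, List<float[]> neighborsClosestFirst) {
    for (int i = neighborsClosestFirst.size() - 1; i > 0; i--) {
      float[] candidate = neighborsClosestFirst.get(i);
      float scoreToBase = similarity(base, candidate);
      // Compare only against neighbors that are closer to the base node.
      for (int j = 0; j < i; j++) {
        if (similarity(candidate, neighborsClosestFirst.get(j)) >= scoreToBase) {
          return i; // a closer neighbor already covers this one
        }
      }
    }
    return neighborsClosestFirst.size() - 1;
  }

  public static void main(String[] args) {
    float[] base = {0f, 0f};
    // Two clustered neighbors to the right, one isolated neighbor further away below.
    List<float[]> neighbors =
        List.of(new float[] {1f, 0f}, new float[] {1.1f, 0.1f}, new float[] {0f, -3f});
    System.out.println("drop index: " + findWorstNonDiverse(base, neighbors));
  }
}
```

In this toy input the furthest neighbor is the only link in its direction, so the scan skips it and selects the second of the two clustered neighbors instead. Always removing the furthest entry would have cut that isolated link, which is the kind of behavior that could contribute to disconnected sub-graphs.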