mayya-sharipova opened a new pull request, #862: URL: https://github.com/apache/lucene/pull/862
Sort HNSW graph neighbors when applying diversity criterion

During HNSW graph construction, when a node already has more connections than the maximum allowed (maxConn), we prune its connections using a diversity criterion to bring the count back down to maxConn.

Currently, when we add reverse connections to existing nodes, we don't keep them sorted. As a result, when we later apply the diversity criterion, we may prune a node other than the worst (most distant) non-diverse one.

This patch makes sure that neighbour connections are always sorted from best (closest) to worst (most distant), and that the diversity criterion processes nodes from worst to best.

This patch does the following:
- enhance NeighborArray to always keep neighbour nodes sorted by score (in descending or ascending order), and make NeighborArray aware of which order applies
- make OnHeapHnswGraph aware of the order of the similarity function
- make HnswGraphBuilder apply the diversity criterion from worst to best nodes
- create Lucene90NeighborArray to keep the previous NeighborArray logic for Lucene90Codec
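To illustrate the idea described above, here is a minimal sketch of a neighbour list that stays sorted on insert and is pruned from worst to best. It is not the code in this patch: `SortedNeighbors`, `Similarity`, `insertSorted`, and `pruneOne` are hypothetical names chosen for illustration, and the sketch assumes a similarity function where a higher score means a closer node (descending order).

```java
import java.util.Arrays;

/** Hypothetical sketch; not Lucene's NeighborArray. Assumes higher score = closer. */
final class SortedNeighbors {
  /** Hypothetical similarity between two nodes identified by ordinal. */
  interface Similarity {
    float score(int a, int b);
  }

  private int[] nodes;
  private float[] scores; // scores[i] = similarity between the owning node and nodes[i]
  private int size;

  SortedNeighbors(int capacity) {
    nodes = new int[capacity];
    scores = new float[capacity];
  }

  int size() { return size; }

  int node(int i) { return nodes[i]; }

  /** Inserts a neighbour so that scores stay in descending order (best first). */
  void insertSorted(int node, float score) {
    if (size == nodes.length) {
      nodes = Arrays.copyOf(nodes, Math.max(1, size * 2));
      scores = Arrays.copyOf(scores, nodes.length);
    }
    int pos = size;
    // Shift worse-scoring entries one slot to the right.
    while (pos > 0 && scores[pos - 1] < score) {
      nodes[pos] = nodes[pos - 1];
      scores[pos] = scores[pos - 1];
      pos--;
    }
    nodes[pos] = node;
    scores[pos] = score;
    size++;
  }

  private void removeAt(int idx) {
    System.arraycopy(nodes, idx + 1, nodes, idx, size - idx - 1);
    System.arraycopy(scores, idx + 1, scores, idx, size - idx - 1);
    size--;
  }

  /**
   * Removes one neighbour, scanning from worst to best. A neighbour counts as
   * non-diverse when it is at least as similar to some better-ranked neighbour
   * as it is to the owning node. If every neighbour is diverse, the worst one goes.
   */
  void pruneOne(Similarity sim) {
    if (size == 0) {
      return;
    }
    for (int i = size - 1; i > 0; i--) {
      for (int j = i - 1; j >= 0; j--) {
        if (sim.score(nodes[i], nodes[j]) >= scores[i]) {
          removeAt(i);
          return;
        }
      }
    }
    removeAt(size - 1);
  }
}
```

Because the list is sorted, the scan visits the most distant candidates first, so the entry removed is the worst non-diverse one rather than whichever non-diverse entry happened to be appended last.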