mayya-sharipova opened a new pull request, #862:
URL: https://github.com/apache/lucene/pull/862

   Sort HNSW graph neighbors when applying diversity criterion
   
   During HNSW graph construction, when a node already has more
   connections than the maximum allowed (maxConn), we need to prune
   its connections using a diversity criterion to limit the number of
   connections to maxConn.
   
   Currently, when we add reverse connections to already existing nodes,
   we don't keep them sorted. As a result, when we later apply the
   diversity criterion, the nodes we prune may not be the worst (most
   distant) non-diverse nodes.
   
   This patch makes sure that neighbour connections are always sorted
   from best (closest) to worst (most distant), and that the diversity
   criterion is applied by processing nodes from worst to best.
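
To illustrate the ordering invariant described above, here is a minimal sketch of a neighbor list that stays sorted from best to worst on every insert. It is not the NeighborArray code from this patch: the class name `SortedNeighborsSketch`, the method `insertSorted`, and the `scoresDescOrder` flag are hypothetical names chosen for the example.

```java
// Hypothetical sketch, not Lucene's NeighborArray: all names here are assumptions.
final class SortedNeighborsSketch {
  // true: a higher score means a closer node, so scores are kept in descending order;
  // false: a lower score means a closer node, so scores are kept in ascending order.
  private final boolean scoresDescOrder;
  private final int[] nodes;
  private final float[] scores;
  private int size;

  SortedNeighborsSketch(int maxSize, boolean scoresDescOrder) {
    this.nodes = new int[maxSize];
    this.scores = new float[maxSize];
    this.scoresDescOrder = scoresDescOrder;
  }

  /** Inserts a neighbor so that the arrays stay ordered from best to worst. */
  void insertSorted(int node, float score) {
    if (size == nodes.length) {
      throw new IllegalStateException("neighbor list is full");
    }
    int pos = size;
    // Shift worse entries one slot to the right until the insertion point is found.
    while (pos > 0 && isBetter(score, scores[pos - 1])) {
      nodes[pos] = nodes[pos - 1];
      scores[pos] = scores[pos - 1];
      pos--;
    }
    nodes[pos] = node;
    scores[pos] = score;
    size++;
  }

  private boolean isBetter(float a, float b) {
    return scoresDescOrder ? a > b : a < b;
  }

  int size() { return size; }

  int node(int i) { return nodes[i]; }

  float score(int i) { return scores[i]; }
}
```

With this invariant, the worst neighbor is always at index `size() - 1`, so a later pruning pass can walk the list backwards without re-sorting.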
   
   This patch does the following:
   - enhance NeighborArray to always keep neighbour nodes sorted according
     to their scores (in desc or asc order). Make NeighborArray aware of
     the order in which the nodes should be sorted.
   - make OnHeapHnswGraph aware of the order of the similarity function
   - make HnswGraphBuilder apply the diversity criterion from worst to
     best nodes
   - create Lucene90NeighborArray to keep the previous logic of
     NeighborArray for Lucene90Codec
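
The worst-to-best pruning pass can be sketched as follows. This is an illustration under stated assumptions, not the HnswGraphBuilder code from this patch: it assumes a neighbor counts as non-diverse when it is at least as similar to some better neighbor as it is to the base node (the description above does not spell the criterion out), and it assumes higher scores mean closer nodes. `DiversityPruneSketch`, `Similarity`, `findWorstNonDiverse` and `prune` are hypothetical names.

```java
// Hypothetical sketch of worst-to-best diversity pruning; not Lucene's actual code.
final class DiversityPruneSketch {
  /** Pairwise similarity between two node ids; higher means closer (an assumption). */
  interface Similarity {
    float between(int a, int b);
  }

  /**
   * nodes/scores are sorted best first; scores[i] is the similarity of nodes[i] to the
   * base node. Scans from the worst entry towards the best and returns the index of the
   * first entry that is at least as similar to a better neighbor as to the base node.
   * Falls back to the worst entry when every neighbor is diverse.
   */
  static int findWorstNonDiverse(int[] nodes, float[] scores, int size, Similarity sim) {
    for (int i = size - 1; i > 0; i--) {
      for (int j = i - 1; j >= 0; j--) {
        if (sim.between(nodes[i], nodes[j]) >= scores[i]) {
          return i;
        }
      }
    }
    return size - 1;
  }

  /** Removes entries in place until at most maxConn remain; returns the new size. */
  static int prune(int[] nodes, float[] scores, int size, int maxConn, Similarity sim) {
    while (size > maxConn) {
      int drop = findWorstNonDiverse(nodes, scores, size, sim);
      // Close the gap so the remaining entries stay sorted best to worst.
      System.arraycopy(nodes, drop + 1, nodes, drop, size - drop - 1);
      System.arraycopy(scores, drop + 1, scores, drop, size - drop - 1);
      size--;
    }
    return size;
  }
}
```

Because the scan starts at the end of a sorted list, the first non-diverse entry it finds is also the most distant one, which is the property an unsorted list cannot guarantee.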


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

