msokolov opened a new pull request #2157: URL: https://github.com/apache/lucene-solr/pull/2157
This replaces the simple nearest-neighbor selection with a criterion that takes into account the distance of the neighbors from each other. It is seen to provide dramatically improved recall on at least two datasets, and it is what our reference implementation, hnswlib, uses.

Also:
* Split Neighbors into NeighborArray and NeighborQueue; use the queue for gathering results; store graph arcs in arrays, since diversity selection does not use a queue
* Add InfoStream progress messages to HnswGraphBuilder
* Add options to KnnGraphTester to support testing using ann-benchmarks (no warmup, write output to file)
* Improve memory usage; eliminate more object allocations; replace iterator objects with single-use iterator "views"
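
For readers unfamiliar with diversity-based neighbor selection, here is a minimal standalone sketch of the general idea. It is not the code in this pull request: the class `DiverseNeighborSketch`, the method `selectDiverse`, and its parameters are hypothetical names chosen for illustration, and squared Euclidean distance is assumed as the metric. Candidates are visited from nearest to farthest, and a candidate is kept only if it is closer to the new node than to every neighbor already kept.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Illustrative only: hypothetical names, not Lucene's HnswGraphBuilder API.
class DiverseNeighborSketch {

  /** Squared Euclidean distance between two vectors of equal length (assumed metric). */
  static float squaredDistance(float[] a, float[] b) {
    float sum = 0f;
    for (int i = 0; i < a.length; i++) {
      float d = a[i] - b[i];
      sum += d * d;
    }
    return sum;
  }

  /**
   * Picks up to maxConn neighbors for the node at position `query`.
   * `candidates` holds indices into `vectors`.
   */
  static List<Integer> selectDiverse(
      float[][] vectors, float[] query, int[] candidates, int maxConn) {
    // Sort candidate indices by ascending distance to the new node.
    Integer[] sorted = new Integer[candidates.length];
    for (int i = 0; i < candidates.length; i++) {
      sorted[i] = candidates[i];
    }
    Arrays.sort(
        sorted,
        Comparator.comparingDouble(
            (Integer c) -> squaredDistance(vectors[c.intValue()], query)));

    List<Integer> selected = new ArrayList<>();
    for (Integer boxed : sorted) {
      if (selected.size() >= maxConn) {
        break;
      }
      int c = boxed.intValue();
      float distToQuery = squaredDistance(vectors[c], query);
      boolean diverse = true;
      for (int s : selected) {
        // Reject c if an already-selected neighbor is closer to c than the new node is.
        if (squaredDistance(vectors[c], vectors[s]) < distToQuery) {
          diverse = false;
          break;
        }
      }
      if (diverse) {
        selected.add(c);
      }
    }
    return selected;
  }
}
```

With a new node at the origin and candidates at (1, 0), (1.1, 0.1), (0, 2) and (-3, 0), plain nearest-neighbor selection with two slots would keep the first two points, which lie almost on top of each other. The sketch skips the second point because it is much closer to the first point than to the origin, and keeps the points in other directions instead.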