Adrien Grand created LUCENE-10383: ------------------------------------- Summary: Explore moving HNSW's NeighborQueue to a radix heap Key: LUCENE-10383 URL: https://issues.apache.org/jira/browse/LUCENE-10383 Project: Lucene - Core Issue Type: Task Reporter: Adrien Grand
Now that [~julietibs] improved merging via LUCENE-10375, nightly benchmarks report the following two top CPU consumers: {noformat} 18.45% 548638 org.apache.lucene.util.VectorUtil#dotProduct() at org.apache.lucene.index.VectorSimilarityFunction$2#compare() at org.apache.lucene.util.hnsw.HnswGraph#search() at org.apache.lucene.util.hnsw.HnswGraphBuilder#addGraphNode() 13.23% 393609 org.apache.lucene.util.LongHeap#upHeap() at org.apache.lucene.util.LongHeap#push() at org.apache.lucene.util.hnsw.NeighborQueue#add() at org.apache.lucene.util.hnsw.HnswGraph#search() {noformat} Exploration at LUCENE-7979 suggested that radix heaps perform better than binary heaps if the heap is large: queries with 16 clauses or more got faster while queries with fewer clauses would get slower. With a default beam width of 100, I'd be interested in seeing how HNSW indexing performs if we replace the current binary LongHeap with a radix heap. -- This message was sent by Atlassian Jira (v8.20.1#820001) --------------------------------------------------------------------- To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org