Adrien Grand created LUCENE-10383:
-------------------------------------

             Summary: Explore moving HNSW's NeighborQueue to a radix heap
                 Key: LUCENE-10383
                 URL: https://issues.apache.org/jira/browse/LUCENE-10383
             Project: Lucene - Core
          Issue Type: Task
            Reporter: Adrien Grand


Now that [~julietibs] improved merging via LUCENE-10375, nightly benchmarks 
report the following two top CPU consumers:

{noformat}
18.45%        548638        org.apache.lucene.util.VectorUtil#dotProduct()
                              at 
org.apache.lucene.index.VectorSimilarityFunction$2#compare()
                              at org.apache.lucene.util.hnsw.HnswGraph#search()
                              at 
org.apache.lucene.util.hnsw.HnswGraphBuilder#addGraphNode()
13.23%        393609        org.apache.lucene.util.LongHeap#upHeap()
                              at org.apache.lucene.util.LongHeap#push()
                              at org.apache.lucene.util.hnsw.NeighborQueue#add()
                              at org.apache.lucene.util.hnsw.HnswGraph#search()
{noformat}

Exploration at LUCENE-7979 suggested that radix heaps perform better than 
binary heaps if the heap is large: queries with 16 clauses or more got faster 
while queries with fewer clauses would get slower. With a default beam width of 
100, I'd be interested in seeing how HNSW indexing performs if we replace the 
current binary LongHeap with a radix heap.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org

Reply via email to