mikemccand opened a new issue, #14763: URL: https://github.com/apache/lucene/issues/14763
### Description

Peeking at the last nightly benchmark data point, I see the top CPU hotspots during indexing.

That 2nd one (13.12% in `ramBytesUsed`) is concerning ... I know we need to account properly for RAM so IW can flush when RAM exceeds its allowance ... but maybe we can optimize how we do that for HNSW?

8.63% spent clearing bitsets for HNSW searching is also scary -- that likely impacts search performance too (since building an HNSW graph is done by doing a search for each inserted vector)?

Also what exactly is `reduceLanesTemplate`? I find this name very non-intuitive :) Is it essentially a cast (like `long` -> `int`) for a vector?

```
Profiler for cpu:
WARNING: Using incubator modules: jdk.incubator.vector
PROFILE SUMMARY from 4433882 events (total: 4M)
  tests.profile.mode=cpu
  tests.profile.count=50
  tests.profile.stacksize=4
  tests.profile.linenumbers=false
PERCENT       CPU SAMPLES   STACK
23.57%        1M            jdk.incubator.vector.FloatVector#reduceLanesTemplate() [Inlined code]
                              at jdk.incubator.vector.Float256Vector#reduceLanes() [Inlined code]
                              at org.apache.lucene.internal.vectorization.PanamaVectorUtilSupport#dotProductBody() [JIT compiled code]
                              at org.apache.lucene.internal.vectorization.PanamaVectorUtilSupport#dotProduct() [Inlined code]
13.12%        581719        org.apache.lucene.util.hnsw.NeighborArray#ramBytesUsed() [Inlined code]
                              at org.apache.lucene.util.hnsw.OnHeapHnswGraph#updateGraphRamBytesUsed() [JIT compiled code]
                              at org.apache.lucene.util.hnsw.OnHeapHnswGraph#addNode() [JIT compiled code]
                              at org.apache.lucene.util.hnsw.HnswGraphBuilder#addGraphNodeInternal() [JIT compiled code]
9.51%         421729        org.apache.lucene.index.FloatVectorValues$1#vectorValue() [Inlined code]
                              at org.apache.lucene.codecs.hnsw.DefaultFlatVectorScorer$FloatScoringSupplier$1#score() [Inlined code]
                              at org.apache.lucene.util.hnsw.HnswGraphSearcher#searchLevel() [JIT compiled code]
                              at org.apache.lucene.util.hnsw.HnswGraphBuilder#addGraphNodeInternal() [JIT compiled code]
8.63%         382600        java.util.Arrays#fill() [Inlined code]
                              at org.apache.lucene.util.FixedBitSet#clear() [Inlined code]
                              at org.apache.lucene.util.hnsw.HnswGraphSearcher#prepareScratchState() [Inlined code]
                              at org.apache.lucene.util.hnsw.HnswGraphSearcher#searchLevel() [JIT compiled code]
4.55%         201882        org.apache.lucene.util.FixedBitSet#getAndSet() [Inlined code]
                              at org.apache.lucene.util.hnsw.HnswGraphSearcher#searchLevel() [JIT compiled code]
                              at org.apache.lucene.util.hnsw.HnswGraphBuilder#addGraphNodeInternal() [JIT compiled code]
                              at org.apache.lucene.util.hnsw.HnswGraphBuilder#addGraphNode() [Inlined code]
3.12%         138341        org.apache.lucene.util.RamUsageEstimator#sizeOf() [Inlined code]
                              at org.apache.lucene.internal.hppc.MaxSizedIntArrayList#ramBytesUsed() [Inlined code]
                              at org.apache.lucene.util.hnsw.NeighborArray#ramBytesUsed() [Inlined code]
                              at org.apache.lucene.util.hnsw.OnHeapHnswGraph#updateGraphRamBytesUsed() [JIT compiled code]
2.60%         115254        org.apache.lucene.util.hnsw.HnswConcurrentMergeBuilder$MergeSearcher#graphSeek() [JIT compiled code]
                              at org.apache.lucene.util.hnsw.HnswGraphSearcher#searchLevel() [JIT compiled code]
                              at org.apache.lucene.util.hnsw.HnswGraphBuilder#addGraphNodeInternal() [JIT compiled code]
                              at org.apache.lucene.util.hnsw.HnswGraphBuilder#addGraphNode() [Inlined code]
2.42%         107338        org.apache.lucene.internal.hppc.MaxSizedIntArrayList#ramBytesUsed() [Inlined code]
                              at org.apache.lucene.util.hnsw.NeighborArray#ramBytesUsed() [Inlined code]
                              at org.apache.lucene.util.hnsw.OnHeapHnswGraph#updateGraphRamBytesUsed() [JIT compiled code]
                              at org.apache.lucene.util.hnsw.OnHeapHnswGraph#addNode() [JIT compiled code]
2.36%         104842        org.apache.lucene.internal.vectorization.PanamaVectorUtilSupport#dotProductBody() [JIT compiled code]
                              at org.apache.lucene.internal.vectorization.PanamaVectorUtilSupport#dotProduct() [Inlined code]
                              at org.apache.lucene.util.VectorUtil#dotProduct() [Inlined code]
                              at org.apache.lucene.index.VectorSimilarityFunction$2#compare() [Inlined code]
2.36%         104836        org.apache.lucene.util.hnsw.OnHeapHnswGraph#updateGraphRamBytesUsed() [JIT compiled code]
                              at org.apache.lucene.util.hnsw.OnHeapHnswGraph#addNode() [JIT compiled code]
                              at org.apache.lucene.util.hnsw.HnswGraphBuilder#addGraphNodeInternal() [JIT compiled code]
                              at org.apache.lucene.util.hnsw.HnswGraphBuilder#addGraphNode() [Inlined code]
2.21%         97814         org.apache.lucene.util.hnsw.OnHeapHnswGraph#getNeighbors() [Inlined code]
                              at org.apache.lucene.util.hnsw.HnswConcurrentMergeBuilder$MergeSearcher#graphSeek() [JIT compiled code]
                              at org.apache.lucene.util.hnsw.HnswGraphSearcher#searchLevel() [JIT compiled code]
                              at org.apache.lucene.util.hnsw.HnswGraphBuilder#addGraphNodeInternal() [JIT compiled code]
1.55%         68929         java.util.concurrent.locks.AbstractQueuedSynchronizer#apparentlyFirstQueuedIsExclusive() [Inlined code]
                              at java.util.concurrent.locks.ReentrantReadWriteLock$NonfairSync#readerShouldBlock() [Inlined code]
                              at java.util.concurrent.locks.ReentrantReadWriteLock$Sync#tryAcquireShared() [Inlined code]
                              at java.util.concurrent.locks.AbstractQueuedSynchronizer#acquireShared() [Inlined code]
```
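
On the `reduceLanesTemplate` question: as far as the JDK Vector API documents it, `reduceLanes` is not a cast but a cross-lane ("horizontal") reduction. For example, `reduceLanes(VectorOperators.ADD)` folds all lanes of a vector into a single scalar. The plain-Java sketch below mimics that shape with an array standing in for the vector register. The class name `LaneReductionSketch`, the `LANES` constant, and both method names are hypothetical and chosen only for illustration; this is not Lucene's `PanamaVectorUtilSupport` code, and it says nothing about why the profiler attributes so many samples to that frame.

```java
// Illustration only: a float[] of LANES elements stands in for a SIMD register.
final class LaneReductionSketch {
  // Assumed lane count: a 256-bit vector holds 8 floats.
  static final int LANES = 8;

  // Horizontal reduction: collapse every lane into one scalar sum.
  // This is the role reduceLanes(ADD) plays at the end of a vectorized loop.
  public static float reduceLanesAdd(float[] lanes) {
    float sum = 0f;
    for (float lane : lanes) {
      sum += lane;
    }
    return sum;
  }

  // Dot product with per-lane accumulators, reduced once at the end.
  public static float dotProduct(float[] a, float[] b) {
    if (a.length != b.length) {
      throw new IllegalArgumentException("vector dimensions differ");
    }
    float[] acc = new float[LANES];
    int bound = a.length - (a.length % LANES);
    int i = 0;
    // Main loop: each lane accumulates its own partial sum independently.
    for (; i < bound; i += LANES) {
      for (int lane = 0; lane < LANES; lane++) {
        acc[lane] += a[i + lane] * b[i + lane];
      }
    }
    // Single cross-lane reduction after the loop.
    float result = reduceLanesAdd(acc);
    // Scalar tail for dimensions that do not fill a whole vector.
    for (; i < a.length; i++) {
      result += a[i] * b[i];
    }
    return result;
  }
}
```

Calling `LaneReductionSketch.dotProduct(a, b)` on two equal-length arrays runs the lane-wise loop and then performs exactly one reduction, which is the step the `reduceLanes` name refers to.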