zhaih commented on code in PR #12844: URL: https://github.com/apache/lucene/pull/12844#discussion_r1412533096
##########
lucene/core/src/java/org/apache/lucene/util/hnsw/NeighborArray.java:
##########
@@ -32,17 +32,20 @@
  * @lucene.internal
  */
 public class NeighborArray {
+  private static final int INITIAL_CAPACITY = 10;
   private final boolean scoresDescOrder;
+  private final int maxSize;
   private int size;
   float[] score;
   int[] node;
   private int sortedNodeSize;
   public final ReadWriteLock rwlock = new ReentrantReadWriteLock(true);
   public NeighborArray(int maxSize, boolean descOrder) {
-    node = new int[maxSize];
-    score = new float[maxSize];
+    node = new int[INITIAL_CAPACITY];

Review Comment:
   I think it's simply that consuming more memory means more GC cycles, which means higher latency.

   I rechecked the code: when we insert a node, we first collect `beamWidth` candidates and then try to add them to the `NeighborArray`, subject to the diversity check. So I think:

   * If `beamWidth > maxSize`, we can just initialize this with `maxSize` and be done. In a larger graph the first fill will likely fill the `NeighborArray` completely, so there is no point in resizing up from any smaller initial size.
   * If `beamWidth < maxSize`, we can initialize this with `beamWidth`, so that the first fill likely leaves the array nearly full?
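
   The two cases above collapse into a single `min(beamWidth, maxSize)` rule. Below is a minimal sketch of that idea, assuming `beamWidth` could be passed to the constructor (the constructor in this PR takes only `maxSize` and `descOrder`). The class `InitialCapacitySketch`, the helper `initialCapacity`, and the doubling growth policy are hypothetical names and choices for illustration, not Lucene code.

```java
import java.util.Arrays;

/** Hypothetical sketch of the proposed sizing rule; NOT the actual Lucene NeighborArray. */
class InitialCapacitySketch {
  private final int maxSize;
  private int size;
  private int[] node;
  private float[] score;

  /** Assumed rule: min(beamWidth, maxSize) covers both cases from the review comment. */
  static int initialCapacity(int beamWidth, int maxSize) {
    return Math.min(beamWidth, maxSize);
  }

  InitialCapacitySketch(int beamWidth, int maxSize) {
    this.maxSize = maxSize;
    int capacity = initialCapacity(beamWidth, maxSize);
    node = new int[capacity];
    score = new float[capacity];
  }

  /** Appends a neighbor; grows the backing arrays when full, never past maxSize. */
  void add(int newNode, float newScore) {
    if (size == maxSize) {
      throw new IllegalStateException("neighbor array is full");
    }
    if (size == node.length) {
      // Doubling is an assumption made for this sketch only.
      int newCapacity = Math.min(maxSize, Math.max(size + 1, size * 2));
      node = Arrays.copyOf(node, newCapacity);
      score = Arrays.copyOf(score, newCapacity);
    }
    node[size] = newNode;
    score[size] = newScore;
    size++;
  }

  int capacity() {
    return node.length;
  }

  int size() {
    return size;
  }
}
```

   With `beamWidth = 100` and `maxSize = 32`, the arrays start at 32 and never resize. With `beamWidth = 16` and `maxSize = 64`, they start at 16 and grow only if a later insertion pushes the neighbor count past the first fill.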