zhaih commented on code in PR #12844:
URL: https://github.com/apache/lucene/pull/12844#discussion_r1412533096


##########
lucene/core/src/java/org/apache/lucene/util/hnsw/NeighborArray.java:
##########
@@ -32,17 +32,20 @@
  * @lucene.internal
  */
 public class NeighborArray {
+  private static final int INITIAL_CAPACITY = 10;
   private final boolean scoresDescOrder;
+  private final int maxSize;
   private int size;
   float[] score;
   int[] node;
   private int sortedNodeSize;
   public final ReadWriteLock rwlock = new ReentrantReadWriteLock(true);
 
   public NeighborArray(int maxSize, boolean descOrder) {
-    node = new int[maxSize];
-    score = new float[maxSize];
+    node = new int[INITIAL_CAPACITY];

Review Comment:
   I think it's just due to consuming more memory -> more GC cycles 
needed -> higher latency.
   
   So I rechecked the code: when we insert a node, we first collect 
`beamWidth` candidates, and then try to add those candidates to the 
`NeighborArray`, subject to the diversity check. So I think:
    * in case `beamWidth > maxSize`, we can just init this with `maxSize` 
and be done, because in a larger graph the first fill will likely fill the 
`NeighborArray` completely, so there's no point in starting smaller and 
resizing.
    * in case `beamWidth < maxSize`, we can init this with `beamWidth`, 
so that the initial fill will likely leave the array nearly full?
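
   To make the two cases concrete, here is a minimal sketch of that sizing 
rule. It is not the actual `NeighborArray` code: the class 
`GrowableNeighbors`, the helper `initialCapacity`, and the 1.5x growth step 
are hypothetical names and choices made up for illustration, and it assumes 
`beamWidth` is passed to the constructor. Both cases collapse into 
`min(maxSize, beamWidth)`, which also covers `beamWidth == maxSize`.

```java
import java.util.Arrays;

/** Hypothetical sketch of the sizing idea; NOT Lucene's NeighborArray. */
final class GrowableNeighbors {
  private final int maxSize;
  private int size;
  private int[] node;
  private float[] score;

  GrowableNeighbors(int maxSize, int beamWidth) {
    if (maxSize <= 0 || beamWidth <= 0) {
      throw new IllegalArgumentException("maxSize and beamWidth must be positive");
    }
    this.maxSize = maxSize;
    int initial = initialCapacity(maxSize, beamWidth);
    node = new int[initial];
    score = new float[initial];
  }

  /** beamWidth > maxSize -> maxSize; beamWidth < maxSize -> beamWidth. */
  static int initialCapacity(int maxSize, int beamWidth) {
    return Math.min(maxSize, beamWidth);
  }

  /** Appends a neighbor, growing the backing arrays up to maxSize if needed. */
  void add(int newNode, float newScore) {
    if (size == maxSize) {
      throw new IllegalStateException("neighbor array is full");
    }
    if (size == node.length) {
      // Assumed growth policy: roughly 1.5x, never beyond maxSize.
      int grown = Math.max(size + 1, size + (size >> 1));
      int newCapacity = Math.min(maxSize, grown);
      node = Arrays.copyOf(node, newCapacity);
      score = Arrays.copyOf(score, newCapacity);
    }
    node[size] = newNode;
    score[size] = newScore;
    size++;
  }

  int size() {
    return size;
  }

  int capacity() {
    return node.length;
  }
}
```

   With `maxSize = 64` and `beamWidth = 16`, the arrays start at 16 slots 
and only grow if a node ends up with more than 16 neighbors; with 
`beamWidth = 100` and `maxSize = 32`, they are allocated at full size up 
front and never resized.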



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

