tang-hi commented on PR #12255:
URL: https://github.com/apache/lucene/pull/12255#issuecomment-1553239095

   > I've noticed that we create a NeighborQueue with an initialSize set to 
topK. For instance, if topK is 100, the maximum size of LongHeap is also 100. 
However, when we execute the searchLayer function for any layer other than 0, 
the maximum size of LongHeap in the original version is only 1.
   > 
   > In the `searchLevel` method, we attempt to insert qualifying neighbors. 
LongHeap will push an element only if its size is less than maxSize.
   > 
   > ```java
   >  if (results.insertWithOverflow(friendOrd, friendSimilarity) && 
results.size() >= topK) {
   > ```
   > 
   > ```java
   > if (size >= maxSize) {
   >   if (value < heap[1]) {
   >     return false;
   >   }
   >   updateTop(value);
   >   return true;
   > }
   > push(value);
   > return true;
   > ```
   > 
   > So, if maxSize is 100, it will store up to 100 points. However, if maxSize 
is 1, it will only store 1 point.
   > 
   > When we try to return the result, we pop elements from the heap. If 
maxSize is 100, it will pop 99 elements when the layer is not 0. This could 
potentially be a time-consuming operation. In contrast, the original version 
does not pop any elements; it simply returns the result.
   > 
   > Here is the relevant code from the [Apache Lucene 
repository](https://github.com/apache/lucene/blob/04ef6de826eea6c1eaeffba6f6f3752b64250406/lucene/core/src/java/org/apache/lucene/util/hnsw/HnswGraphSearcher.java#LL279C4-L281C6):
   > 
   > ```java
   > while (results.size() > topK) {
   >   results.pop();
   > }
   > ```
   > 
   > In summary, the performance degradation could be due to the increased 
number of pop operations when the maximum size of the heap is larger than 1.
   
   But I'm still trying to run lucenutil bechmark to test my hypothesis.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org

Reply via email to