jtibshirani commented on a change in pull request #287:
URL: https://github.com/apache/lucene/pull/287#discussion_r703752457



##########
File path: lucene/core/src/java/org/apache/lucene/util/hnsw/HnswGraph.java
##########
@@ -230,36 +244,50 @@ static NeighborQueue searchLevel(
    * Returns the {@link NeighborQueue} connected to the given node.
    *
    * @param level level of the graph
-   * @param node the node whose neighbors are returned
+   * @param node the node whose neighbors are returned, represented as an 
ordinal on the level 0.
    */
   public NeighborArray getNeighbors(int level, int node) {
-    NeighborArray result = graph.get(level).get(node);
-    assert result != null;
-    return result;
+    if (level == 0) {
+      return graph.get(level).get(node);
+    }
+    int nodeOrd = Arrays.binarySearch(nodesByLevel.get(level), 0, 
graph.get(level).size(), node);

Review comment:
       Would a clearer name for this be `nodeIndex`? I think of `nodeOrd` as 
representing the ordinal on level 0.

##########
File path: lucene/core/src/java/org/apache/lucene/util/hnsw/HnswGraph.java
##########
@@ -56,21 +58,27 @@
 public final class HnswGraph extends KnnGraphValues {
 
   private final int maxConn;
+  private int curMaxLevel; // the current max graph level
+  private int entryNode; // the current graph entry node on the top level
+
+  // Nodes by level expressed as the level 0's nodes' ordinals.
+  // As level 0 contains all nodes, nodesByLevel.get(0) is null.
+  private final List<int[]> nodesByLevel;
+
   // graph is a list of graph levels.
   // Each level is represented as List<NeighborArray> – nodes' connections on 
this level.
   // Each entry in the list has the top maxConn neighbors of a node. The nodes 
correspond to vectors
   // added to HnswBuilder, and the node values are the ordinals of those 
vectors.
+  // Thus, on all levels, neighbors expressed as the level 0's nodes' ordinals.
   private final List<List<NeighborArray>> graph;
-  private int curMaxLevel; // the current max graph level
-  private int entryNode; // the current graph entry node on the top level
 
   // KnnGraphValues iterator members
   private int upto;
   private NeighborArray cur;
 
   // used for iterating over graph values
   private int curLevel = -1;
-  private int curNode = -1;
+  private int curNodeOrd = -1;

Review comment:
       This is not directly related to this PR, but I am finding the seeking 
behavior hard to follow. We have two pieces of state that seem similar: `cur` 
for the current node's NeighborArray, and `curNodeOrd` which is a node index 
within a level. These can get out of sync depending on what methods are called.
   
   I wonder if we could remove `curNodeOrd` -- I think it is only used to 
support `nextNodeOnLevel()`, which in turn is only used in tests. Maybe we 
could replace it with a package-private method for testing like 
`getAllNodesOnLevel(level)?

##########
File path: lucene/core/src/java/org/apache/lucene/util/hnsw/HnswGraph.java
##########
@@ -56,21 +58,27 @@
 public final class HnswGraph extends KnnGraphValues {
 
   private final int maxConn;
+  private int curMaxLevel; // the current max graph level
+  private int entryNode; // the current graph entry node on the top level
+
+  // Nodes by level expressed as the level 0's nodes' ordinals.
+  // As level 0 contains all nodes, nodesByLevel.get(0) is null.
+  private final List<int[]> nodesByLevel;
+
   // graph is a list of graph levels.
   // Each level is represented as List<NeighborArray> – nodes' connections on 
this level.
   // Each entry in the list has the top maxConn neighbors of a node. The nodes 
correspond to vectors
   // added to HnswBuilder, and the node values are the ordinals of those 
vectors.
+  // Thus, on all levels, neighbors expressed as the level 0's nodes' ordinals.
   private final List<List<NeighborArray>> graph;
-  private int curMaxLevel; // the current max graph level
-  private int entryNode; // the current graph entry node on the top level
 
   // KnnGraphValues iterator members
   private int upto;
   private NeighborArray cur;
 
   // used for iterating over graph values
   private int curLevel = -1;
-  private int curNode = -1;
+  private int curNodeOrd = -1;

Review comment:
       This is not directly related to this PR, but I am finding the seeking 
behavior hard to follow. We have two pieces of state that seem similar: `cur` 
for the current node's NeighborArray, and `curNodeOrd` which is a node index 
within a level. These can get out of sync depending on what methods are called.
   
   I wonder if we could remove `curNodeOrd` -- I think it is only used to 
support `nextNodeOnLevel()`, which in turn is only used in tests. Maybe we 
could replace it with a package-private method for testing like 
`getAllNodesOnLevel(level)`?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]



---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to