benwtrent commented on code in PR #13566: URL: https://github.com/apache/lucene/pull/13566#discussion_r1676046779
########## lucene/core/src/java/org/apache/lucene/util/hnsw/HnswGraphBuilder.java: ########## @@ -408,7 +410,23 @@ private void finish() throws IOException { } private void connectComponents() throws IOException { - List<Component> components = HnswUtil.components(hnsw); + long start = System.nanoTime(); + for (int level = 0; level < hnsw.numLevels(); level++) { + if (connectComponents(level) == false) { Review Comment: I like the logging, we should do this for sure. But I also think this should be an `assert` as it should never happen in tests. ########## lucene/core/src/java/org/apache/lucene/util/hnsw/HnswUtil.java: ########## @@ -30,46 +30,83 @@ import org.apache.lucene.index.IndexReader; import org.apache.lucene.index.LeafReaderContext; import org.apache.lucene.util.FixedBitSet; -import org.apache.lucene.util.hnsw.HnswGraph; /** Utilities for use in tests involving HNSW graphs */ -public class HnswTestUtil { +public class HnswUtil { /** * Returns true iff level 0 of the graph is fully connected - that is every node is reachable from * any entry point. */ public static boolean isFullyConnected(HnswGraph knnValues) throws IOException { - return componentSizes(knnValues).size() < 2; + for (int level = 0; level < knnValues.numLevels(); level++) { + if (components(knnValues, level).size() > 1) { + return false; + } + } + return true; } /** * Returns the sizes of the distinct graph components on level 0. If the graph is fully-connected * there will only be a single component. If the graph is empty, the returned list will be empty. */ public static List<Integer> componentSizes(HnswGraph hnsw) throws IOException { - List<Integer> sizes = new ArrayList<>(); + return componentSizes(hnsw, 0); + } + + /** + * Returns the sizes of the distinct graph components on the given level. If the graph is + * fully-connected there will only be a single component. If the graph is empty, the returned list + * will be empty. + */ + public static List<Integer> componentSizes(HnswGraph hnsw, int level) throws IOException { + return components(hnsw, level).stream().map(Component::size).toList(); + } + + // Finds all connected components of the graph + static List<Component> components(HnswGraph hnsw, int level) throws IOException { Review Comment: Do you mind providing a bit more color to this java doc. It took me a minute to understand that a component is really "A selection of nodes that are traversable to each other". -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org --------------------------------------------------------------------- To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org