Re: [PR] Optimize OnHeapHnswGraph's data structure [lucene]

2023-10-16 Thread via GitHub
zhaih merged PR #12651: URL: https://github.com/apache/lucene/pull/12651 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apach

Re: [PR] Optimize OnHeapHnswGraph's data structure [lucene]

2023-10-15 Thread via GitHub
zhaih commented on PR #12651: URL: https://github.com/apache/lucene/pull/12651#issuecomment-1763700781 I reran the benchmark and still get similar perf and the same recall. (Just to make sure the later edits have not messed things up.)

Re: [PR] Optimize OnHeapHnswGraph's data structure [lucene]

2023-10-15 Thread via GitHub
zhaih commented on code in PR #12651: URL: https://github.com/apache/lucene/pull/12651#discussion_r1360085712 ## lucene/core/src/test/org/apache/lucene/util/hnsw/HnswGraphTestCase.java: ## @@ -454,77 +454,6 @@ public void testSearchWithSelectiveAcceptOrds() throws IOException {

Re: [PR] Optimize OnHeapHnswGraph's data structure [lucene]

2023-10-15 Thread via GitHub
zhaih commented on code in PR #12651: URL: https://github.com/apache/lucene/pull/12651#discussion_r1360085215 ## lucene/core/src/java/org/apache/lucene/util/hnsw/OnHeapHnswGraph.java: ## @@ -158,50 +185,82 @@ public int entryNode() { return entryNode; } + /** + * WA

Re: [PR] Optimize OnHeapHnswGraph's data structure [lucene]

2023-10-15 Thread via GitHub
zhaih commented on code in PR #12651: URL: https://github.com/apache/lucene/pull/12651#discussion_r1360084970 ## lucene/core/src/java/org/apache/lucene/util/hnsw/HnswGraphSearcher.java: ## @@ -284,6 +285,13 @@ int graphNextNeighbor(HnswGraph graph) throws IOException { retu

Re: [PR] Optimize OnHeapHnswGraph's data structure [lucene]

2023-10-15 Thread via GitHub
zhaih commented on code in PR #12651: URL: https://github.com/apache/lucene/pull/12651#discussion_r1359985681 ## lucene/core/src/java/org/apache/lucene/util/hnsw/OnHeapHnswGraph.java: ## @@ -158,50 +185,82 @@ public int entryNode() { return entryNode; } + /** + * WA

Re: [PR] Optimize OnHeapHnswGraph's data structure [lucene]

2023-10-15 Thread via GitHub
zhaih commented on code in PR #12651: URL: https://github.com/apache/lucene/pull/12651#discussion_r1359984597 ## lucene/core/src/java/org/apache/lucene/util/hnsw/OnHeapHnswGraph.java: ## @@ -99,27 +122,31 @@ public void addNode(int level, int node) { entryNode = node;

Re: [PR] Optimize OnHeapHnswGraph's data structure [lucene]

2023-10-15 Thread via GitHub
zhaih commented on code in PR #12651: URL: https://github.com/apache/lucene/pull/12651#discussion_r1359984155 ## lucene/core/src/java/org/apache/lucene/util/hnsw/OnHeapHnswGraph.java: ## @@ -74,23 +88,32 @@ public final class OnHeapHnswGraph extends HnswGraph implements Account

Re: [PR] Optimize OnHeapHnswGraph's data structure [lucene]

2023-10-15 Thread via GitHub
msokolov commented on code in PR #12651: URL: https://github.com/apache/lucene/pull/12651#discussion_r1359884135 ## lucene/core/src/java/org/apache/lucene/util/hnsw/HnswGraphSearcher.java: ## @@ -284,6 +285,13 @@ int graphNextNeighbor(HnswGraph graph) throws IOException { r

Re: [PR] Optimize OnHeapHnswGraph's data structure [lucene]

2023-10-14 Thread via GitHub
zhaih commented on code in PR #12651: URL: https://github.com/apache/lucene/pull/12651#discussion_r1359572233 ## lucene/core/src/java/org/apache/lucene/util/hnsw/OnHeapHnswGraph.java: ## @@ -163,45 +185,66 @@ public NodesIterator getNodesOnLevel(int level) { if (level == 0)

Re: [PR] Optimize OnHeapHnswGraph's data structure [lucene]

2023-10-14 Thread via GitHub
zhaih commented on code in PR #12651: URL: https://github.com/apache/lucene/pull/12651#discussion_r1359568778 ## lucene/core/src/java/org/apache/lucene/util/hnsw/OnHeapHnswGraph.java: ## @@ -40,31 +41,39 @@ public final class OnHeapHnswGraph extends HnswGraph implements Account

Re: [PR] Optimize OnHeapHnswGraph's data structure [lucene]

2023-10-14 Thread via GitHub
benwtrent commented on code in PR #12651: URL: https://github.com/apache/lucene/pull/12651#discussion_r1359458954 ## lucene/core/src/java/org/apache/lucene/util/hnsw/OnHeapHnswGraph.java: ## @@ -163,45 +185,66 @@ public NodesIterator getNodesOnLevel(int level) { if (level =

Re: [PR] Optimize OnHeapHnswGraph's data structure [lucene]

2023-10-12 Thread via GitHub
zhaih commented on code in PR #12651: URL: https://github.com/apache/lucene/pull/12651#discussion_r1357264961 ## lucene/core/src/java/org/apache/lucene/util/hnsw/OnHeapHnswGraph.java: ## @@ -40,31 +41,29 @@ public final class OnHeapHnswGraph extends HnswGraph implements Account

Re: [PR] Optimize OnHeapHnswGraph's data structure [lucene]

2023-10-12 Thread via GitHub
zhaih commented on PR #12651: URL: https://github.com/apache/lucene/pull/12651#issuecomment-1760235738 OK, I have incorporated everything I learned from #12660 and added several more assertions to make it safer. Please take a look again when you have time, @msokolov. Thanks!

Re: [PR] Optimize OnHeapHnswGraph's data structure [lucene]

2023-10-12 Thread via GitHub
zhaih commented on code in PR #12651: URL: https://github.com/apache/lucene/pull/12651#discussion_r1357261063 ## lucene/core/src/java/org/apache/lucene/util/hnsw/HnswGraphSearcher.java: ## @@ -232,7 +231,6 @@ void searchLevel( graphSeek(graph, level, topCandidateNode);

Re: [PR] Optimize OnHeapHnswGraph's data structure [lucene]

2023-10-12 Thread via GitHub
zhaih commented on code in PR #12651: URL: https://github.com/apache/lucene/pull/12651#discussion_r1357260551 ## lucene/core/src/java/org/apache/lucene/util/hnsw/HnswGraphBuilder.java: ## @@ -288,7 +294,6 @@ private void selectAndLinkDiverse( // only adding it if it is cl

Re: [PR] Optimize OnHeapHnswGraph's data structure [lucene]

2023-10-11 Thread via GitHub
msokolov commented on code in PR #12651: URL: https://github.com/apache/lucene/pull/12651#discussion_r1356041142 ## lucene/core/src/java/org/apache/lucene/util/hnsw/OnHeapHnswGraph.java: ## @@ -40,31 +41,29 @@ public final class OnHeapHnswGraph extends HnswGraph implements Acco

Re: [PR] Optimize OnHeapHnswGraph's data structure [lucene]

2023-10-11 Thread via GitHub
zhaih commented on PR #12651: URL: https://github.com/apache/lucene/pull/12651#issuecomment-1758887879 Yes, that's the idea, although I actually made some mistakes here, so the merging is not entirely pre-allocated. Also, something in searching might be broken due to the size() behavior ch

Re: [PR] Optimize OnHeapHnswGraph's data structure [lucene]

2023-10-11 Thread via GitHub
msokolov commented on PR #12651: URL: https://github.com/apache/lucene/pull/12651#issuecomment-1758885797 I like this! Actually, I think when we are merging we can preallocate the entire array so we don't need to resize at all, which should greatly simplify making this beast thread-safe (sinc
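
The preallocation idea in the comment above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code from PR #12651 or Lucene's actual OnHeapHnswGraph: the class PreallocatedGraphSketch, its fields, and its methods are all hypothetical names. It shows an array indexed by node id that either grows on demand or, when the final node count is known up front (as it may be during a merge), is sized once so that no resize ever happens.

```java
import java.util.Arrays;

/**
 * Hypothetical illustration only; NOT Lucene's OnHeapHnswGraph.
 * Stores one neighbor list per node in an array indexed by node id.
 */
class PreallocatedGraphSketch {
  // neighbors[node] holds the neighbor ids of that node (single level, for brevity)
  private int[][] neighbors;
  // counts how many times the backing array had to be reallocated
  private int resizeCount = 0;

  /** Pass the final node count here to avoid any later resize. */
  PreallocatedGraphSketch(int initialCapacity) {
    neighbors = new int[initialCapacity][];
  }

  void addNode(int node, int[] nodeNeighbors) {
    if (node >= neighbors.length) {
      // Grow path: allocates a new array and copies the old references over.
      int newLength = Math.max(node + 1, neighbors.length * 2);
      neighbors = Arrays.copyOf(neighbors, newLength);
      resizeCount++;
    }
    neighbors[node] = nodeNeighbors;
  }

  int[] neighborsOf(int node) {
    return neighbors[node];
  }

  int resizeCount() {
    return resizeCount;
  }

  public static void main(String[] args) {
    int numNodes = 1000; // assumed to be known in advance, e.g. when merging

    PreallocatedGraphSketch preallocated = new PreallocatedGraphSketch(numNodes);
    PreallocatedGraphSketch growing = new PreallocatedGraphSketch(1);
    for (int node = 0; node < numNodes; node++) {
      int[] nodeNeighbors = new int[] {(node + 1) % numNodes};
      preallocated.addNode(node, nodeNeighbors);
      growing.addNode(node, nodeNeighbors);
    }

    System.out.println("resizes with preallocation: " + preallocated.resizeCount());
    System.out.println("resizes when growing on demand: " + growing.resizeCount());
  }
}
```

In the preallocated case the backing array reference never changes after construction, which is the property the comment points to as simplifying thread safety. How the real graph handles multiple levels, and how it behaves when the size is not known in advance, is not covered by this sketch.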