jbellis opened a new pull request, #12254: URL: https://github.com/apache/lucene/pull/12254
### Motivation 1. It is faster to query a single on-heap hnsw index, than to query multiple such indexes and combine the result. 2. Even with some contention necessarily occurring during building of the index, we still come out way ahead in terms of total efficiency vs creating per-thread indexes and combining them, since combining such indexes boils down to "pick the largest and then add all the other nodes normally," you don't really benefit from having computed the others previously. ### Performance On my i9-12900 (8 performance cores and 8 efficiency) I get a bit less than 10x speedup of wall clock time for building and querying the "siftsmall" and "sift" datasets from http://corpus-texmex.irisa.fr/. The small dataset is 10k vectors while the large is 1M. This speedup feels pretty good for a data structure that isn't completely parallelizable, and it's good to see that it's consistent as the dataset gets larger. The concurrent classes achieve identical recall compared to the non-concurrent versions within my ability to test it, and are drop-in replacements for OnHeapHnswGraph and HnswGraphBuilder; I use threadlocals to work around the places where the existing API assumes no concurrency. The timing harness (which includes recall checking) is here: https://github.com/jbellis/hnswdemo/tree/concurrent ### Design ConcurrentOnHeapHnswGraph is basically the same design as OnHeapHnswGraph but with the neighbor lists changed to use ConcurrentHashMap at both level 0 and higher levels. The internal guarantees change a bit along with this, e.g., addNode only initializes a neighbor list for node N when N is added, rather than all nodes <= N. ConcurrentHnswGraphBuilder changes more, because it’s designed around “all the logic is in Builder and it calls primitive operations on the Graph and the Neighbors as necessary.” Unfortunately you can’t build threadsafe concurrent structures this way, not without massive synchronized blocks and a ton of contention. So I’ve moved a lot of the logic around adding neighbors into ConcurrentNeighborSet. The key to achieving good concurrency without synchronization is, we track in-progress node additions in a ConcurrentSkipListSet. After adding ourselves, we take a snapshot of this set, and consider all other in-progress updates as neighbor candidates (subject to normal level constraints). -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org --------------------------------------------------------------------- To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org