jbellis opened a new pull request, #12254:
URL: https://github.com/apache/lucene/pull/12254

   ### Motivation
   
   1. It is faster to query a single on-heap hnsw index, than to query multiple 
such indexes and combine the result.
   2. Even with some contention necessarily occurring during building of the 
index, we still come out way ahead in terms of total efficiency vs creating 
per-thread indexes and combining them, since combining such indexes boils down 
to "pick the largest and then add all the other nodes normally," you don't 
really benefit from having computed the others previously.
   
   ### Performance
   
   On my i9-12900 (8 performance cores and 8 efficiency) I get a bit less than 
10x speedup of wall clock time for building and querying the "siftsmall" and 
"sift" datasets from http://corpus-texmex.irisa.fr/.  The small dataset is 10k 
vectors while the large is 1M.  This speedup feels pretty good for a data 
structure that isn't completely parallelizable, and it's good to see that it's 
consistent as the dataset gets larger. 
   
   The concurrent classes achieve identical recall compared to the 
non-concurrent versions within my ability to test it, and are drop-in 
replacements for OnHeapHnswGraph and HnswGraphBuilder; I use threadlocals to 
work around the places where the existing API assumes no concurrency.
   
   The timing harness (which includes recall checking) is here: 
https://github.com/jbellis/hnswdemo/tree/concurrent
   
   ### Design
   
   ConcurrentOnHeapHnswGraph is basically the same design as OnHeapHnswGraph 
but with the neighbor lists changed to use ConcurrentHashMap at both level 0 
and higher levels.  The internal guarantees change a bit along with this, e.g., 
addNode only initializes a neighbor list for node N when N is added, rather 
than all nodes <= N.
   
   ConcurrentHnswGraphBuilder changes more, because it’s designed around “all 
the logic is in Builder and it calls primitive operations on the Graph and the 
Neighbors as necessary.”  Unfortunately you can’t build threadsafe concurrent 
structures this way, not without massive synchronized blocks and a ton of 
contention.  So I’ve moved a lot of the logic around adding neighbors into 
ConcurrentNeighborSet.
   
   The key to achieving good concurrency without synchronization is, we track 
in-progress node additions in a ConcurrentSkipListSet. After adding ourselves, 
we take a snapshot of this set, and consider all other in-progress updates as 
neighbor candidates (subject to normal level constraints).
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org

Reply via email to