jmazanec15 commented on issue #11354: URL: https://github.com/apache/lucene/issues/11354#issuecomment-1369963611
Hi @msokolov,

> First, it looks to me as if we see some very nice improvement for the larger graphs, preserve the same recall, and changes to QPS are probably noise. I guess the assumption is we are producing similar results with less work?

Right. Basically, instead of adding the first 0-X ordinals to the graph one by one, we directly insert the nodes and their neighbors from the initializer graph into the merge graph, avoiding the search-for-neighbors step. I think the QPS differences are mostly noise. Recall is roughly the same, though not always exactly, because in the PR the random number generation sequence gets a bit mixed up.

> Just so we can understand these results a little better, could you document how you arrived at them? What dataset did you use? How did you measure the times and recall (was it using KnnGraphTester? luceneutil? some other benchmarking tool?).

Sure, I used the same procedure for the latest results as outlined here: https://github.com/apache/lucene/issues/11354#issuecomment-1239961308. I used the SIFT 1M 128-dimensional L2 dataset. This was using KnnGraphTester, controlling the number of initial segments and then force-merging to 1 segment.

> I'd also be curious to see the numbers and sizes of the segments in the results: I assume they would be unchanged from Control to Test, but it would be nice to be able to verify.

I would assume so too. Let me get these numbers as well; I will post them soon.

--
This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
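The initialization approach described above can be sketched roughly like this. This is a simplified, single-layer toy model, not Lucene's actual `HnswGraphBuilder` API; the class, method, and field names here are hypothetical, and a real merge would also handle multiple graph levels and deleted documents:

```java
import java.util.HashMap;
import java.util.Map;

/** Toy model of seeding a merged HNSW graph from an existing segment's graph. */
class GraphMergeSketch {
    // adjacency list: node ordinal -> neighbor ordinals (one layer, for simplicity)
    final Map<Integer, int[]> neighbors = new HashMap<>();

    /**
     * Copy nodes and their neighbor lists from an initializer graph into this
     * graph, remapping old segment ordinals to new merged-segment ordinals.
     * This skips the expensive search-for-neighbors step that inserting each
     * node from scratch would require.
     */
    void initializeFrom(Map<Integer, int[]> initGraph, Map<Integer, Integer> oldToNew) {
        for (Map.Entry<Integer, int[]> e : initGraph.entrySet()) {
            int newOrd = oldToNew.get(e.getKey());
            int[] oldNeighbors = e.getValue();
            int[] remapped = new int[oldNeighbors.length];
            for (int i = 0; i < oldNeighbors.length; i++) {
                remapped[i] = oldToNew.get(oldNeighbors[i]);
            }
            neighbors.put(newOrd, remapped);
        }
        // Nodes from the other segments would then be inserted normally,
        // using the usual HNSW search-and-connect procedure.
    }
}
```

Control would insert every node via graph search; Test only pays that cost for nodes outside the initializer segment, which is where the merge-time savings come from.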