krickert commented on PR #15676: URL: https://github.com/apache/lucene/pull/15676#issuecomment-4010691268
Thanks for the suggestions, @vigyasharma. You're right: I used the current 100-visit warm-up as a static safeguard against the "entry point trap" at local bridge nodes. My next round of tests will retain the 100-visit warm-up to establish a baseline, then introduce the variance trigger. This lets me isolate the recall safety of the core logic before adding more variables.

The current test results show that collaborative search produces results identical to a standard distributed search, achieving recall parity with the current Lucene HNSW implementation, while not regressing on high-performance local hardware. I've seen significant success on resource-constrained clusters (Raspberry Pis), where the pruning yielded a ~50% reduction in CPU cycles and latency with no recall loss. On high-end localhost setups with small shards, the gains are understandably masked by the raw speed of the traversal, but the recall floor remains solid.

Regarding the heuristics:

1. I agree that a static visit count is a blunt instrument. I'm currently preparing a 250GB index benchmark (court law cases), which will provide much more realistic graph depth than my initial tests.
2. Once that larger-scale data is ready, I plan to use it to test your suggestion of a variance-based trigger, which would make the pruning topology-aware rather than reliant on a static visit counter.

I'll share the Pareto frontier results once the large-scale runs are complete; they should make the benefits clear even on high-performance hardware.
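For concreteness, the two heuristics could be combined into a single gate along the lines of the sketch below. This is a hedged illustration only: `PruningGate`, its fields, and its thresholds are hypothetical names chosen for this comment, not part of the PR or of Lucene's API. It uses Welford's online algorithm to track the running variance of visited-node scores, and allows pruning only after the static warm-up window has passed *and* the frontier variance has dropped below a floor.

```java
// Hypothetical sketch of a topology-aware pruning gate for HNSW traversal.
// Names and thresholds are illustrative, not taken from the PR or Lucene.
final class PruningGate {
  private final int warmupVisits;      // e.g. 100: static safeguard against the "entry point trap"
  private final double varianceFloor;  // assumed heuristic threshold for the variance trigger

  private int visits = 0;
  private double mean = 0.0;
  private double m2 = 0.0;             // Welford's running sum of squared deviations

  PruningGate(int warmupVisits, double varianceFloor) {
    this.warmupVisits = warmupVisits;
    this.varianceFloor = varianceFloor;
  }

  /** Records one visited node's score; returns true once pruning may begin. */
  boolean observe(double score) {
    visits++;
    double delta = score - mean;
    mean += delta / visits;
    m2 += delta * (score - mean);
    if (visits < warmupVisits) {
      return false; // still inside the static warm-up window
    }
    double variance = m2 / (visits - 1);
    // Low variance suggests the frontier has stabilized: remaining candidates
    // score alike, so further exploration is unlikely to improve recall.
    return variance < varianceFloor;
  }
}
```

The design intent is that the static warm-up preserves the recall floor (the safeguard in the current patch), while the variance check adapts the actual prune point to graph topology rather than a fixed visit count.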
