Re: [PR] DiversifyingChildren speedup - siblings expansion [lucene]

via GitHub Mon, 11 May 2026 07:32:57 -0700


aruggero commented on PR #16034:
URL: https://github.com/apache/lucene/pull/16034#issuecomment-4421659206


   Here are the new benchmark results, now with scratch space reuse:
   
   | children  | k | correlation | main (ms/op) | sibling (ms/op) | overhead |
   | -------- | --- | --------- | ------------- | --------------- | ---------- |
   |    4     | 10  |    best     | 0.070 ± 0.003 |  0.072 ± 0.003  | +2.9%   |
   |    4     | 10  |  standard   | 0.053 ± 0.003 |  0.058 ± 0.004  | +9.4%  |
   |    4     | 10  |    worst    | 0.050 ± 0.005 |  0.057 ± 0.003  |  +14.0%  |
   |    4     | 100 |    best     | 0.400 ± 0.013 |  0.452 ± 0.016  |  +13.0%  |
   |    4     | 100 |  standard   | 0.251 ± 0.012 |  0.309 ± 0.012  |  +23.1%  |
   |    4     | 100 |    worst    | 0.270 ± 0.026 |  0.305 ± 0.009 |  +13.0%  |
   |    8     | 10  |    best     | 0.101 ± 0.005 |  0.109 ± 0.006  |  +7.9%   |
   |    8     | 10  |  standard   | 0.065 ± 0.003 |  0.078 ± 0.003  |  +20.0%  |
   |    8     | 10  |    worst    | 0.064 ± 0.003 |  0.080 ± 0.003  |  +25.0%  |
   |    8     | 100 |    best     | 0.642 ± 0.019 |  0.716 ± 0.017  |  +11.5%  |
   |    8     | 100 |  standard   | 0.330 ± 0.027 |  0.486 ± 0.027  |  +47.3%  |
   |    8     | 100 |    worst    | 0.307 ± 0.016 |  0.488 ± 0.028  |  +59.0%  |
   |    16    | 10  |    best     | 0.147 ± 0.004 |  0.151 ± 0.008  |  +2.7%  |
   |    16    | 10  |  standard   | 0.080 ± 0.004 |  0.109 ± 0.007  |  +36.3%  |
   |    16    | 10  |    worst    | 0.075 ± 0.005 |  0.107 ± 0.002  |  +42.7%  |
   |    16    | 100 |    best     | 0.985 ± 0.053 |  1.144 ± 0.040  |  +16.1%  |
   |    16    | 100 |  standard   | 0.568 ± 0.022 |  0.858 ± 0.071  |  +51.1%  |
   |    16    | 100 |    worst    | 0.496 ± 0.021 |  0.880 ± 0.052  |  +77.4%  |
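
For reference, the overhead column appears to be the relative slowdown of the sibling branch against main, i.e. (sibling - main) / main. Below is a minimal Java sketch that recomputes a few rows from the table above. The class and method names are hypothetical and not part of the PR, and only the mean values are used (the ± terms are ignored).

```java
import java.util.Locale;

/** Hypothetical helper, not part of the PR: recomputes the "overhead" column. */
final class OverheadCheck {

  /** Relative slowdown of sibling vs. main, in percent. */
  static double overheadPercent(double mainMsPerOp, double siblingMsPerOp) {
    return (siblingMsPerOp - mainMsPerOp) / mainMsPerOp * 100.0;
  }

  /** Formats a percentage the way the table does, with a sign and one decimal. */
  static String format(double percent) {
    return String.format(Locale.ROOT, "%+.1f%%", percent);
  }

  public static void main(String[] args) {
    // Mean ms/op values copied from the table (children, k, correlation).
    System.out.println(format(overheadPercent(0.400, 0.452))); // 4, 100, best
    System.out.println(format(overheadPercent(0.330, 0.486))); // 8, 100, standard
    System.out.println(format(overheadPercent(0.496, 0.880))); // 16, 100, worst
  }
}
```

These three rows come out as +13.0%, +47.3% and +77.4%, matching the table.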
   
   | Scenario | Siblings score | Threshold rises | HNSW early exit | Previous overhead | Current overhead |
   |----------|----------------|-----------------|-----------------|-------------------|------------------|
   | best | High (nearly identical) | Fast | Yes | ~5–12% | ~3–16% |
   | standard | Moderate | Moderate | Partial | ~13–60% | ~9–51% |
   | worst | Random/low | Barely | No | ~12–74% | ~13–77% |
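
The "Current overhead" ranges look like the minimum and maximum of the per-configuration overheads within each scenario. A small Java sketch of that aggregation, under that assumption, is shown below. The class name is hypothetical and the numbers are copied from the per-configuration table.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

/** Hypothetical helper, not part of the PR: derives per-scenario overhead ranges. */
final class OverheadRanges {

  /** Returns {min, max} of the given overhead percentages. */
  static double[] range(double[] overheads) {
    return new double[] {
      Arrays.stream(overheads).min().getAsDouble(),
      Arrays.stream(overheads).max().getAsDouble()
    };
  }

  public static void main(String[] args) {
    // Overhead percentages from the benchmark table, grouped by correlation scenario.
    Map<String, double[]> byScenario = new LinkedHashMap<>();
    byScenario.put("best", new double[] {2.9, 13.0, 7.9, 11.5, 2.7, 16.1});
    byScenario.put("standard", new double[] {9.4, 23.1, 20.0, 47.3, 36.3, 51.1});
    byScenario.put("worst", new double[] {14.0, 13.0, 25.0, 59.0, 42.7, 77.4});

    for (Map.Entry<String, double[]> e : byScenario.entrySet()) {
      double[] r = range(e.getValue());
      System.out.printf(Locale.ROOT, "%s: %.1f%% to %.1f%%%n", e.getKey(), r[0], r[1]);
    }
  }
}
```

Rounded to whole percentages, the output agrees with the ranges in the summary table.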
   
   The main changes worth calling out:
   - standard improved meaningfully at the top end (60% → 51%) thanks to scratch space reuse; that's the case most representative of real-world data.
   - The best lower bound dropped to 3% (nearly free for well-correlated siblings with small k).
   - The worst upper bound nudged up slightly (74% → 77%), but that's within benchmark noise at children=16, k=100.
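
One rough way to sanity-check the noise claim is to propagate the reported ± values into the overhead figure. The Java sketch below assumes the ± values can be treated as symmetric worst-case bounds on each mean, which the benchmark output does not guarantee. The class name is hypothetical.

```java
import java.util.Locale;

/** Hypothetical helper, not part of the PR: crude overhead interval from ± bounds. */
final class OverheadInterval {

  /**
   * Returns {low, high} overhead in percent, pairing the slowest plausible main
   * with the fastest plausible sibling (low) and the reverse (high).
   */
  static double[] bounds(double main, double mainErr, double sibling, double siblingErr) {
    double low = ((sibling - siblingErr) / (main + mainErr) - 1.0) * 100.0;
    double high = ((sibling + siblingErr) / (main - mainErr) - 1.0) * 100.0;
    return new double[] {low, high};
  }

  public static void main(String[] args) {
    // children=16, k=100, worst: main 0.496 ± 0.021, sibling 0.880 ± 0.052 (ms/op).
    double[] b = bounds(0.496, 0.021, 0.880, 0.052);
    System.out.printf(Locale.ROOT, "overhead between %.1f%% and %.1f%%%n", b[0], b[1]);
  }
}
```

Under that assumption the interval spans roughly 60% to 96%, so the previous 74% figure falls inside it.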
   
   Overall, the overhead is still significant: up to ~51% in the standard scenario and ~77% in the worst one.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

