Re: [PR] Consistent KNN query results with multiple leafs [lucene]

via GitHub Tue, 11 Feb 2025 15:25:13 -0800


benwtrent commented on PR #14191:
URL: https://github.com/apache/lucene/pull/14191#issuecomment-2652285114


   OK, I ran a slightly modified version of this: 
https://github.com/apache/lucene/compare/main...benwtrent:lucene:feature/consistent-sharing-knn?expand=1
   
   I indexed 8M docs with Elasticsearch's default merging. This resulted in 
multiple different tiers and many segments. 
   
   The results show it's better than no sharing at all (roughly 2x), but not 
nearly as good (from a visited standpoint) as the current sharing model 
(which shares between all segments).
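
To make the comparison concrete, here is a minimal sketch of the arithmetic behind those ratios. It uses only the single-threaded `visited` counts from the tables in this comment; the dictionary keys and the `ratio` helper are illustrative names, not part of Lucene or the benchmark tooling.

```python
# Single-threaded "visited" counts copied from the benchmark tables in this comment.
visited = {
    "no_sharing": 412280,
    "current_sharing": 87299,
    "new_sharing": 193402,
}


def ratio(a: str, b: str) -> float:
    """Return how many times more vectors mode `a` visits than mode `b`."""
    return visited[a] / visited[b]


if __name__ == "__main__":
    print(f"no sharing / new sharing:      {ratio('no_sharing', 'new_sharing'):.2f}x")
    print(f"new sharing / current sharing: {ratio('new_sharing', 'current_sharing'):.2f}x")
    print(f"no sharing / current sharing:  {ratio('no_sharing', 'current_sharing'):.2f}x")
```

By this measure the new sharing visits about 2.1x fewer vectors than no sharing, and about 2.2x more than the current sharing model.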
   
   ## No sharing:
   ```
   recall  latency(ms)     nDoc  topK  fanout  visited  num segments  selectivity
    1.000      213.400  8000000   100     100   412280           128        1.000
   ```
   
   ## Current Sharing:
   
   ### Single threaded:
   
   ```
   recall  latency(ms)     nDoc  topK  fanout  visited  num segments  selectivity
    0.949       59.900  8000000   100     100    87299           128        1.000
   ```
   
   
   ### 8 Threads (multiple runs)
   ```
   recall  latency(ms)     nDoc  topK  fanout  visited  num segments  selectivity
    0.967       12.000  8000000   100     100    84477           128        1.000
    0.967       10.100  8000000   100     100    84568           128        1.000
    0.949       12.300  8000000   100     100    85034           128        1.000
    0.949       13.300  8000000   100     100    84267           128        1.000
    0.967       11.700  8000000   100     100    86035           128        1.000
    0.949       12.000  8000000   100     100    84993           128        1.000
    0.949       12.700  8000000   100     100    85139           128        1.000
    0.967       10.000  8000000   100     100    85590           128        1.000
    0.949       10.200  8000000   100     100    85106           128        1.000
    0.967       12.600  8000000   100     100    85817           128        1.000
   ```
   
   ## New Sharing:
   
   ### Single thread:
   
   ```
   recall  latency(ms)     nDoc  topK  fanout  visited  num segments  selectivity
    0.957      119.500  8000000   100     100   193402           128        1.000
   ```
   
   ### 8 Threads (multiple runs)
   
   ```
   recall  latency(ms)     nDoc  topK  fanout  visited  num segments  selectivity
    0.957       16.400  8000000   100     100   193402           128        1.000
    0.957       14.600  8000000   100     100   193402           128        1.000
    0.957       16.800  8000000   100     100   193402           128        1.000
    0.957       16.400  8000000   100     100   193402           128        1.000
    0.957       16.700  8000000   100     100   193402           128        1.000
    0.957       16.000  8000000   100     100   193402           128        1.000
    0.957       16.200  8000000   100     100   193402           128        1.000
    0.957       14.400  8000000   100     100   193402           128        1.000
    0.957       19.500  8000000   100     100   193402           128        1.000
    0.957       18.600  8000000   100     100   193402           128        1.000
   ```
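
The two 8-thread tables also differ in run-to-run stability, which a short check makes visible. This is only a sketch over the `(recall, visited)` pairs reported in the tables of this comment; `current_runs`, `new_runs`, and `distinct_outcomes` are hypothetical names chosen for illustration.

```python
# (recall, visited) pairs copied from the two 8-thread tables in this comment.
current_runs = [
    (0.967, 84477), (0.967, 84568), (0.949, 85034), (0.949, 84267),
    (0.967, 86035), (0.949, 84993), (0.949, 85139), (0.967, 85590),
    (0.949, 85106), (0.967, 85817),
]
new_runs = [(0.957, 193402)] * 10


def distinct_outcomes(runs):
    """Count distinct recall values and distinct visited counts across runs."""
    recalls = {recall for recall, _ in runs}
    visits = {visited for _, visited in runs}
    return len(recalls), len(visits)


if __name__ == "__main__":
    print("current sharing (recall, visited):", distinct_outcomes(current_runs))
    print("new sharing (recall, visited):    ", distinct_outcomes(new_runs))
```

In these ten runs the current sharing model yields two distinct recall values and a different visited count every time, while the new sharing reports the same recall and visited count on every run.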


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org
