Re: [PR] Fix global score update bug in MultiLeafKnnCollector [lucene]

via GitHub Tue, 11 Jun 2024 08:16:30 -0700


gsmiller commented on PR #13463:
URL: https://github.com/apache/lucene/pull/13463#issuecomment-2161023588


   @benwtrent ah, you're right. I only had a single segment. I played with 
making the write buffer really small but couldn't get more than one segment 
with that 100d enwiki dataset. I ran with Cohere data along with a 12MB write 
buffer to try to reproduce your results. I'm probably doing something wrong 
still, but I at least confirmed I had more than one segment in my index (ended 
up producing 16 in my run). I'll post the results I got with that dataset here, 
but I'm not sure I trust them at this point given the low recall being reported 
(I suspect I just have something wrong with my setup):
   
   ```
   BASELINE
   recall  latency  nDoc     fanout  maxConn  beamWidth  visited  index ms
   0.385   13.05    1000000  0       16       100        22696    255473    1.00  post-filter
   
   CANDIDATE
   recall  latency  nDoc     fanout  maxConn  beamWidth  visited  index ms
   0.383   13.60    1000000  0       16       100        23645    249901    1.00  post-filter
   ```
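
For reading the two rows side by side, here is a minimal sketch that computes the percent change from BASELINE to CANDIDATE for the metrics that differ. It is plain arithmetic on the figures quoted above; the function and dictionary names are made up for illustration and are not part of luceneutil. It says nothing about whether the differences are within run-to-run noise, and the caveat above about the low recall still applies.

```python
# Figures copied from the BASELINE and CANDIDATE rows above.
# The dictionary keys are illustrative labels, not luceneutil identifiers.
baseline = {"recall": 0.385, "latency": 13.05, "visited": 22696, "index_ms": 255473}
candidate = {"recall": 0.383, "latency": 13.60, "visited": 23645, "index_ms": 249901}


def relative_change(base, cand):
    """Return the candidate's change relative to the baseline, in percent."""
    return (cand - base) / base * 100.0


def compare(base_row, cand_row):
    """Map each metric name to its percent change from baseline to candidate."""
    return {name: relative_change(base_row[name], cand_row[name]) for name in base_row}


if __name__ == "__main__":
    for name, pct in compare(baseline, candidate).items():
        # Signed percentage, e.g. a positive latency value means the candidate was slower.
        print(f"{name:>8}: {pct:+.2f}%")
```

A single run per configuration gives no estimate of variance, so these percentages are only a way to eyeball the table, not evidence for or against the change.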


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


