gsmiller commented on PR #13463: URL: https://github.com/apache/lucene/pull/13463#issuecomment-2161023588
@benwtrent ah, you're right: I only had a single segment. I tried making the write buffer really small, but couldn't get more than one segment with the 100d enwiki dataset. So I ran with the Cohere data and a 12MB write buffer to try to reproduce your results. I'm probably still doing something wrong, but I did at least confirm that the index had more than one segment (my run produced 16). I'll post the results I got with that dataset here, but I'm not sure I trust them given the low recall being reported (I suspect something is wrong with my setup):

```
BASELINE
recall  latency  nDoc     fanout  maxConn  beamWidth  visited  index ms
0.385   13.05    1000000  0       16       100        22696    255473    1.00  post-filter

CANDIDATE
recall  latency  nDoc     fanout  maxConn  beamWidth  visited  index ms
0.383   13.60    1000000  0       16       100        23645    249901    1.00  post-filter
```