tveasey commented on PR #12962:
URL: https://github.com/apache/lucene/pull/12962#issuecomment-1895920963

   > This makes an interval of 255 a reasonable choice.
   
   I agree. This looks better to me. One thing I would be intrigued to try is 
the slight change in schedule suggested 
[here](https://github.com/apache/lucene/pull/12962#discussion_r1453323532). 
In particular, what happens if we delay using the information in 
`minCompetitiveSimilarity`? However, these results are already very good and we 
could push that out to a follow-up PR.
   
   One last thing I think we should consider is exploring the variance we get 
in recall as a result of this change. Specifically, if we were to run with 
random waits inserted into the different segment searches, how much variation 
do we see in recall?
   
   The danger is that we get unlucky in the ordering of searches and sometimes 
prune segment searches which contain the true nearest neighbours too 
aggressively. On average this shouldn't happen, but it would be reassuring to 
also see low variation in recall for the 1M test case under such a test.
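   
   As a rough sketch of how that variation could be measured (this is not code 
from the PR, and every name in it is an assumption for illustration): `search` 
below is a hypothetical callable that takes one wait per segment, delays each 
segment search by that amount, and returns the merged top-k doc ids. The harness 
only draws the random waits and summarises the recall it observes.

```python
import random
import statistics
from typing import Callable, Dict, Sequence, Set


def recall_at_k(found: Sequence[int], truth: Set[int]) -> float:
    """Fraction of the true nearest neighbours present in the results."""
    if not truth:
        return 1.0
    return len(truth.intersection(found)) / len(truth)


def recall_variation(
    search: Callable[[Sequence[float]], Sequence[int]],
    truth: Set[int],
    num_segments: int,
    trials: int = 100,
    max_wait_ms: float = 5.0,
    seed: int = 0,
) -> Dict[str, float]:
    """Run `search` repeatedly, each time with fresh random per-segment waits.

    `search` is hypothetical: it stands in for whatever runs the concurrent
    multi-segment query with the given start delays applied.
    """
    rng = random.Random(seed)  # fixed seed so a run can be reproduced
    recalls = []
    for _ in range(trials):
        # One independent random wait per segment search.
        waits = [rng.uniform(0.0, max_wait_ms) for _ in range(num_segments)]
        recalls.append(recall_at_k(search(waits), truth))
    return {
        "mean": statistics.mean(recalls),
        "stdev": statistics.pstdev(recalls),
        "min": min(recalls),
        "max": max(recalls),
    }


def toy_search(waits: Sequence[float]) -> Sequence[int]:
    """Stand-in only: one true neighbour (doc id == segment index) per segment.

    Pretends that a segment starting very late (wait > 4 ms) is pruned too
    aggressively and loses its neighbour. Real behaviour will differ.
    """
    return [seg for seg, wait in enumerate(waits) if wait <= 4.0]


if __name__ == "__main__":
    print(recall_variation(toy_search, truth={0, 1, 2, 3}, num_segments=4))
```

   A small `stdev` and a `min` close to the `mean` across trials would suggest 
that the ordering of segment searches has little effect on recall.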


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


