jimczi opened a new pull request, #12551: URL: https://github.com/apache/lucene/pull/12551
This PR introduces a new parameter, 'efSearch', to the knn vector query. 'efSearch' sets the maximum size of the priority queue used for the nearest neighbor search. Because each segment may contain a different number of elements, 'efSearch' is adjusted per segment according to the segment's size. This helps improve the precision of a knn query while accommodating differences in segment sizes within a single search.

For instance, if an index comprises 2 segments, one holding 90% of the total documents and the other 10%, setting 'efSearch' to 100 and 'k' to 10 results in a priority queue size of 90 for the first segment and 10 for the other.

I have opened this PR to solicit feedback on the heuristic used to determine the 'ef' size for each segment. Meanwhile, I will be running tests to assess the recall and performance differences between a single segment and multiple segments using different 'k' and 'efSearch' values.
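To make the 90/10 example concrete, here is a minimal sketch of a purely proportional split. It is not the code in this PR: the class `PerSegmentEf`, the method `efForSegment`, the rounding, and the floor of 1 for non-empty segments are all assumptions made for illustration. The description does not say how the per-segment value interacts with 'k', so the sketch leaves 'k' out.

```java
// Hypothetical helper, not part of Lucene and not the PR's implementation.
// Assumes 'efSearch' is split in proportion to each segment's share of documents.
final class PerSegmentEf {

    // Returns the priority queue size for one segment under a proportional split.
    static int efForSegment(int efSearch, long segmentDocs, long totalDocs) {
        if (efSearch <= 0 || totalDocs <= 0 || segmentDocs < 0 || segmentDocs > totalDocs) {
            throw new IllegalArgumentException("invalid arguments");
        }
        // Scale efSearch by the segment's share and round to the nearest integer.
        long scaled = Math.round((double) efSearch * segmentDocs / totalDocs);
        // Assumption: a non-empty segment still gets a queue of at least 1.
        long floor = segmentDocs > 0 ? 1 : 0;
        return (int) Math.max(floor, scaled);
    }

    public static void main(String[] args) {
        long total = 1_000_000;
        long[] segmentSizes = {900_000, 100_000}; // 90% and 10% of the index
        for (long size : segmentSizes) {
            System.out.println(size + " docs -> ef " + efForSegment(100, size, total));
        }
    }
}
```

With 'efSearch' set to 100, the two segments above get queue sizes of 90 and 10, matching the example. Whether small segments should be clamped to at least 'k' is the kind of question the heuristic feedback would need to settle.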