Re: [PR] [WIP] Concurrent merging with join set algorithm [lucene]

via GitHub Mon, 24 Nov 2025 10:59:36 -0800


msokolov commented on PR #15208:
URL: https://github.com/apache/lucene/pull/15208#issuecomment-3572268374


   Looks like good progress!  Excellent speedup in merge times. I do worry a 
bit about the recall drop, and I wonder if we can somehow have a benchmark that 
lets us compare merge times at similar recall? E.g., maybe we increase M 
slightly for the candidate?  
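
One rough way to make such a comparison, sketched here as an assumption rather than anything the existing benchmark does: sweep M for both baseline and candidate, record (recall, merge time) pairs, and interpolate each curve's merge time at a shared target recall. The function name and all numbers below are hypothetical placeholders, not measurements from this PR.

```python
from bisect import bisect_left


def merge_time_at_recall(points, target_recall):
    """Linearly interpolate merge time at target_recall.

    points: iterable of (recall, merge_time_sec) pairs, one per M setting.
    Returns None if target_recall lies outside the measured recall range.
    """
    pts = sorted(points)
    recalls = [r for r, _ in pts]
    if not pts or target_recall < recalls[0] or target_recall > recalls[-1]:
        return None
    i = bisect_left(recalls, target_recall)
    if recalls[i] == target_recall:
        return pts[i][1]
    (r0, t0), (r1, t1) = pts[i - 1], pts[i]
    return t0 + (t1 - t0) * (target_recall - r0) / (r1 - r0)


# Made-up sweep results, purely to show the call shape.
baseline = [(0.90, 100.0), (0.94, 140.0)]
candidate = [(0.88, 40.0), (0.92, 60.0), (0.96, 80.0)]

target = 0.92
base_time = merge_time_at_recall(baseline, target)
cand_time = merge_time_at_recall(candidate, target)
```

Comparing `base_time` and `cand_time` at the same `target` would give a merge-time ratio that is not confounded by the recall difference, as long as the target falls inside both measured ranges.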
   
   Also "I modified the algorithm a bit so that it does not depend on previous 
node being added to the graph when adding the rest of nodes that are outside of 
join set." makes me wonder if this might be part of the problem with recall?  
I'm not sure exactly what you meant (haven't reviewed the code), but it sounds 
as if we don't wait for all of the "covering set" to be indexed before 
beginning to add the remainder of the nodes?  If we're still relying on seeding 
the query with nodes in the existing graph, but some of them are not in the 
existing graph ... not sure what that will do!
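
To make the ordering question concrete, here is a toy model (not the PR's code, which I haven't reviewed) of the "wait for the whole join set first" variant. The function name, the `neighbors_of` mapping, and the set standing in for the graph are all hypothetical; the only point is the barrier between the two phases.

```python
from concurrent.futures import ThreadPoolExecutor, wait
import threading


def build_in_two_phases(join_set, remainder, neighbors_of, workers=4):
    """Toy model: index join_set fully, then add remainder concurrently.

    neighbors_of maps each remainder node to its candidate seed nodes.
    Returns (graph, seeds_used), where seeds_used[node] lists the seeds
    that were already in the graph when node was added.
    """
    graph = set()
    lock = threading.Lock()
    seeds_used = {}

    def add_join_node(node):
        with lock:
            graph.add(node)

    def add_other_node(node):
        with lock:
            # Only seeds that are already indexed can start a search.
            seeds_used[node] = [s for s in neighbors_of[node] if s in graph]
            graph.add(node)

    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Phase 1: block until every join-set node is in the graph.
        wait([pool.submit(add_join_node, n) for n in join_set])
        # Phase 2: remainder nodes can now rely on join-set seeds.
        wait([pool.submit(add_other_node, n) for n in remainder])
    return graph, seeds_used


graph, seeds_used = build_in_two_phases(
    join_set=[0, 1, 2],
    remainder=[3, 4],
    neighbors_of={3: [0, 1], 4: [2]},
)
```

Without the first `wait`, some entries in `seeds_used` could come back shorter or empty, which is the situation I'm asking about.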


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


