Pulkitg64 commented on PR #15003:
URL: https://github.com/apache/lucene/pull/15003#issuecomment-3245140534

   Thanks @benwtrent for the suggestion. For now, I think we can keep a
threshold of 10% deletes, i.e. a segment is considered for merging without
rebuilding its graph from scratch only if its delete % is less than or equal
to 10%.
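
A minimal sketch of that eligibility rule, for illustration only. The function name and parameters are hypothetical and are not taken from the PR; returning `False` for an empty segment is also an assumption.

```python
def is_eligible_for_graph_reuse(deleted_docs: int, max_doc: int,
                                max_delete_pct: int = 10) -> bool:
    """Return True if the segment's delete % is at or below the threshold.

    Hypothetical helper: names and signature are illustrative, not Lucene API.
    """
    if max_doc <= 0:
        # Assumption: an empty segment has no graph worth reusing.
        return False
    # Integer comparison avoids floating-point edge cases at exactly 10%.
    return deleted_docs * 100 <= max_delete_pct * max_doc


if __name__ == "__main__":
    print(is_eligible_for_graph_reuse(10, 100))  # exactly 10% deleted
    print(is_eligible_for_graph_reuse(11, 100))  # 11% deleted
```

Segments that fail this check would fall back to building the graph from scratch during the merge.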
   
   I can create a separate issue/PR for fixing the graph (reconnecting nodes)
and then try to raise the delete threshold above 10%.
   
   
   I re-ran the benchmark with varying delete % up to 15%, and the results
are similar.
   
   
   | Experiment | Experiment | Baseline |                      | Candidate |                      | Change |
   |------------|------------|----------|----------------------|-----------|----------------------|--------|
   | Delete Pct | Delete Pct | Recall   | Force Merge Time (s) | Recall    | Force Merge Time (s) | Recall |
   | 50% delete | 0% delete  | 0.872    | 0                    | 0.873     | 0                    |        |
   | 40% delete | 2% delete  | 0.871    | 831                  | 0.866     | 13                   | -1%    |
   | 30% delete | 5% delete  | 0.873    | 810                  | 0.863     | 13                   | -1%    |
   | 20% delete | 8% delete  | 0.874    | 783                  | 0.861     | 13                   | -1%    |
   | 10% delete | 10% delete | 0.874    | 773                  | 0.857     | 13                   | -2%    |
   
   
   I also ran with different max-conn values, keeping the delete % threshold at 10%:
   
   | Experiment |            | Baseline |                      | Candidate |                      | Change |                  |
   |------------|------------|----------|----------------------|-----------|----------------------|--------|------------------|
   | Max Con    | Delete Pct | Recall   | Force Merge Time (s) | Recall    | Force Merge Time (s) | Recall | Force Merge Time |
   | 32         | 10% delete | 0.874    | 773                  | 0.857     | 13                   | -2%    | 60x              |
   | 16         | 10% delete | 0.811    | 550                  | 0.793     | 12                   | -2%    | 45x              |
   | 8          | 10% delete | 0.696    | 360                  | 0.675     | 12                   | -3%    | 30x              |
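
For reference, the Change columns can be approximately reproduced from the Baseline and Candidate columns. The sketch below assumes the recall change is relative to the baseline recall and the force-merge change is the ratio of baseline to candidate time; both are assumptions that appear consistent with the rounded values in the table. The helper names are hypothetical.

```python
def recall_change_pct(baseline: float, candidate: float) -> float:
    """Relative recall change in percent (negative means recall dropped)."""
    return (candidate - baseline) / baseline * 100.0


def merge_speedup(baseline_s: float, candidate_s: float) -> float:
    """How many times faster the candidate force merge is than the baseline."""
    return baseline_s / candidate_s


# (max-conn, baseline recall, baseline merge time in s,
#  candidate recall, candidate merge time in s), copied from the table.
ROWS = [
    (32, 0.874, 773, 0.857, 13),
    (16, 0.811, 550, 0.793, 12),
    (8, 0.696, 360, 0.675, 12),
]

if __name__ == "__main__":
    for max_conn, b_recall, b_time, c_recall, c_time in ROWS:
        change = recall_change_pct(b_recall, c_recall)
        speedup = merge_speedup(b_time, c_time)
        print(f"max-conn={max_conn}: recall {change:+.1f}%, merge {speedup:.1f}x faster")
```

The unrounded ratios differ slightly from the table, which reports whole percentages and approximate speedups.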
   
   Raising a new revision with the threshold limit.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

