Pulkitg64 commented on PR #15003:
URL: https://github.com/apache/lucene/pull/15003#issuecomment-3213582208

   Based on @msokolov's suggestion, I ran the benchmarks by simulating singleton 
merging. For this I indexed 1M docs, force merged the segments, deleted 
documents, and then force merged the segment again.
   
   I am seeing a consistent improvement (about 50x speedup) in force merge time 
after deletes, but also a degradation in recall (between 2% and 14%, depending 
on the delete percentage). This is probably because of a disconnectedness 
issue. (Let me try to find the connectedness numbers for these graphs as well.)
   
   
   | Delete Pct | Baseline Recall | Baseline Force Merge Time (s) | Candidate Recall | Candidate Force Merge Time (s) | Recall Change | Force Merge Speedup |
   |------------|-----------------|-------------------------------|------------------|--------------------------------|---------------|---------------------|
   | 50% delete | 0.892           | 417.52                        | 0.763            | 8.43                           | -14%          | 50x                 |
   | 40% delete | 0.887           | 505.74                        | 0.799            | 9.91                           | -10%          | 50x                 |
   | 30% delete | 0.88            | 585                           | 0.822            | 10.98                          | -7%           | 53x                 |
   | 20% delete | 0.878           | 677                           | 0.802            | 12.4                           | -9%           | 54x                 |
   | 10% delete | 0.874           | 772.42                        | 0.856            | 13.5                           | -2%           | 59x                 |
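
   For anyone who wants to re-derive the change columns, here is a minimal sketch. It assumes (this is not stated in the comment) that the recall change is the relative difference against the baseline and that the speedup is the ratio of baseline to candidate force merge time. The helper names are hypothetical. Recomputing from the displayed values can differ by a point or two from the figures in the table, which look rounded.

```python
# Rows copied from the results table:
# (delete pct, baseline recall, baseline merge time s,
#  candidate recall, candidate merge time s)
ROWS = [
    ("50%", 0.892, 417.52, 0.763, 8.43),
    ("40%", 0.887, 505.74, 0.799, 9.91),
    ("30%", 0.880, 585.0, 0.822, 10.98),
    ("20%", 0.878, 677.0, 0.802, 12.4),
    ("10%", 0.874, 772.42, 0.856, 13.5),
]


def recall_change_pct(baseline: float, candidate: float) -> float:
    """Relative recall change in percent (assumed definition)."""
    return (candidate - baseline) / baseline * 100.0


def speedup(baseline_s: float, candidate_s: float) -> float:
    """How many times faster the candidate force merge is (assumed definition)."""
    return baseline_s / candidate_s


for pct, base_recall, base_s, cand_recall, cand_s in ROWS:
    change = recall_change_pct(base_recall, cand_recall)
    ratio = speedup(base_s, cand_s)
    print(f"{pct} delete: recall {change:+.1f}%, force merge {ratio:.1f}x faster")
```

   The relative definition is chosen because it reproduces the -14% in the 50% delete row; an absolute difference in recall points would give about -13 there.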
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


