Pulkitg64 commented on PR #15003:
URL: https://github.com/apache/lucene/pull/15003#issuecomment-3245140534
Thanks @benwtrent for the suggestion. For now, I think we can keep a
threshold of 10% deletes, i.e. only segments whose delete % is less than or
equal to 10% will be considered for merging without building the graph from
scratch.
I can create a separate issue/PR for fixing the graph (reconnecting nodes)
and then try to raise the delete threshold above 10%.
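A minimal sketch of the threshold check described above, assuming the delete percentage is computed as deleted docs over the segment's total doc count. The class, method, and constant names are hypothetical and are not taken from the PR.

```java
/** Hypothetical helper; names are illustrative and not from the PR. */
final class GraphReuseThreshold {
  /** Threshold proposed in this comment: at most 10% deleted docs. */
  static final double MAX_DELETE_PCT = 10.0;

  private GraphReuseThreshold() {}

  /** Percentage of deleted docs in a segment; 0 for an empty segment. */
  static double deletePct(int maxDoc, int delCount) {
    if (maxDoc <= 0) {
      return 0.0;
    }
    return 100.0 * delCount / maxDoc;
  }

  /** True if the segment may be merged without rebuilding the graph from scratch. */
  static boolean canReuseGraph(int maxDoc, int delCount) {
    return deletePct(maxDoc, delCount) <= MAX_DELETE_PCT;
  }

  public static void main(String[] args) {
    System.out.println(canReuseGraph(1_000_000, 100_000)); // exactly 10% -> true
    System.out.println(canReuseGraph(1_000_000, 150_000)); // 15% -> false
  }
}
```

The comparison is inclusive, matching "less than or equal to 10%" above.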
I re-ran the benchmark with delete percentages up to 15%, and the results
are similar.
| Experiment | Experiment | Baseline |                      | Candidate |                      | Change |
|------------|------------|----------|----------------------|-----------|----------------------|--------|
| Delete Pct | Delete Pct | Recall   | Force Merge Time (s) | Recall    | Force Merge Time (s) | Recall |
| 50% delete | 0% delete  | 0.872    | 0                    | 0.873     | 0                    |        |
| 40% delete | 2% delete  | 0.871    | 831                  | 0.866     | 13                   | -1%    |
| 30% delete | 5% delete  | 0.873    | 810                  | 0.863     | 13                   | -1%    |
| 20% delete | 8% delete  | 0.874    | 783                  | 0.861     | 13                   | -1%    |
| 10% delete | 10% delete | 0.874    | 773                  | 0.857     | 13                   | -2%    |
I also ran with different max-conn values, keeping the delete % threshold at 10%:
| Experiment |            | Baseline |                      | Candidate |                      | Change |                  |
|------------|------------|----------|----------------------|-----------|----------------------|--------|------------------|
| Max Con    | Delete Pct | Recall   | Force Merge Time (s) | Recall    | Force Merge Time (s) | Recall | Force Merge Time |
| 32         | 10% delete | 0.874    | 773                  | 0.857     | 13                   | -2%    | 60x              |
| 16         | 10% delete | 0.811    | 550                  | 0.793     | 12                   | -2%    | 45x              |
| 8          | 10% delete | 0.696    | 360                  | 0.675     | 12                   | -3%    | 30x              |
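For readers checking the Change columns, here is a small sketch of how they can be reproduced. It assumes the recall change is relative to the baseline and rounded to the nearest percent, and that the force merge change is the ratio of baseline time to candidate time; the comment does not state this explicitly, and the class and method names are hypothetical.

```java
/** Hypothetical helper for reproducing the Change columns; not part of the PR. */
final class BenchmarkDelta {
  private BenchmarkDelta() {}

  /** Relative recall change in percent, rounded to the nearest integer. */
  static long recallChangePct(double baselineRecall, double candidateRecall) {
    return Math.round(100.0 * (candidateRecall - baselineRecall) / baselineRecall);
  }

  /** How many times faster the candidate force merge is than the baseline. */
  static double speedup(double baselineSeconds, double candidateSeconds) {
    return baselineSeconds / candidateSeconds;
  }

  public static void main(String[] args) {
    // max-conn 32 row: recall 0.874 -> 0.857, merge time 773 s -> 13 s
    System.out.println(recallChangePct(0.874, 0.857) + "%");
    System.out.printf("%.1fx%n", speedup(773, 13));
  }
}
```

Under these assumptions the max-conn 32 row gives -2% and roughly 59.5x, in line with the -2% and 60x reported in the table.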
Raising a new revision with the threshold limit.