dnhatn opened a new pull request #134: URL: https://github.com/apache/lucene/pull/134
This commit enables bulk merges (i.e., raw chunk copying) for stored fields when index sorting is enabled. I benchmarked this change with Elasticsearch using Rally and found that it reduced merge time by about 20% (from 14.9 min to 11.9 min).

| Metric | Task | Baseline | Contender | Diff | Unit |
|---:|---:|---:|---:|---:|---:|
| Cumulative indexing time of primary shards | | 27.8208 | 26.4527 | -1.36817 | min |
| Min cumulative indexing time across primary shard | | 4.21395 | 3.90437 | -0.30958 | min |
| Median cumulative indexing time across primary shard | | 4.68754 | 4.43102 | -0.25652 | min |
| Max cumulative indexing time across primary shard | | 5.00328 | 5.11322 | 0.10993 | min |
| Cumulative indexing throttle time of primary shards | | 47.5573 | 43.7704 | -3.7869 | min |
| Min cumulative indexing throttle time across primary shard | | 5.82492 | 6.24037 | 0.41545 | min |
| Median cumulative indexing throttle time across primary shard | | 7.86763 | 6.94697 | -0.92066 | min |
| Max cumulative indexing throttle time across primary shard | | 9.2532 | 8.78763 | -0.46557 | min |
| Cumulative merge time of primary shards | | 14.9712 | 11.9894 | -2.9818 | min |
| Cumulative merge count of primary shards | | 52 | 49 | -3 | |
| Min cumulative merge time across primary shard | | 1.54965 | 1.2836 | -0.26605 | min |
| Median cumulative merge time across primary shard | | 2.7143 | 1.99087 | -0.72343 | min |
| Max cumulative merge time across primary shard | | 2.96187 | 3.013 | 0.05113 | min |
| Cumulative merge throttle time of primary shards | | 0.0308 | 0.118767 | 0.08797 | min |
| Min cumulative merge throttle time across primary shard | | 0.00116667 | 0.00716667 | 0.006 | min |
| Median cumulative merge throttle time across primary shard | | 0.00191667 | 0.0116167 | 0.0097 | min |
| Max cumulative merge throttle time across primary shard | | 0.0175333 | 0.0651 | 0.04757 | min |
| Cumulative refresh time of primary shards | | 15.7809 | 15.123 | -0.65797 | min |
| Cumulative refresh count of primary shards | | 1786 | 1843 | 57 | |
| Min cumulative refresh time across primary shard | | 2.15032 | 2.12483 | -0.02548 | min |
| Median cumulative refresh time across primary shard | | 2.6072 | 2.55063 | -0.05657 | min |
| Max cumulative refresh time across primary shard | | 3.07643 | 2.76125 | -0.31518 | min |
| Cumulative flush time of primary shards | | 0 | 0 | 0 | min |
| Cumulative flush count of primary shards | | 0 | 0 | 0 | |
| Min cumulative flush time across primary shard | | 0 | 0 | 0 | min |
| Median cumulative flush time across primary shard | | 0 | 0 | 0 | min |
| Max cumulative flush time across primary shard | | 0 | 0 | 0 | min |
| Total Young Gen GC time | | 37.456 | 38.081 | 0.625 | s |
| Total Young Gen GC count | | 1407 | 1379 | -28 | |
| Total Old Gen GC time | | 0 | 0 | 0 | s |
| Total Old Gen GC count | | 0 | 0 | 0 | |
| Store size | | 2.77534 | 2.89203 | 0.11669 | GB |
| Translog size | | 3.07336e-07 | 3.07336e-07 | 0 | GB |
| Heap used for segments | | 0.898968 | 0.821392 | -0.07758 | MB |
| Heap used for doc values | | 0.0717201 | 0.0275154 | -0.0442 | MB |
| Heap used for terms | | 0.676544 | 0.648956 | -0.02759 | MB |
| Heap used for norms | | 0.09198 | 0.0881958 | -0.00378 | MB |
| Heap used for points | | 0 | 0 | 0 | MB |
| Heap used for stored fields | | 0.0587234 | 0.0567245 | -0.002 | MB |
| Segment count | | 119 | 115 | -4 | |
| Min Throughput | index-append | 11692.2 | 12325.1 | 632.907 | docs/s |
| Mean Throughput | index-append | 12580.3 | 12918.5 | 338.218 | docs/s |
| Median Throughput | index-append | 12602.7 | 12950.9 | 348.173 | docs/s |
| Max Throughput | index-append | 13131.5 | 13787.7 | 656.203 | docs/s |
| 50th percentile latency | index-append | 2917.01 | 2920.49 | 3.48737 | ms |
| 90th percentile latency | index-append | 4142.28 | 3969.75 | -172.527 | ms |
| 99th percentile latency | index-append | 15025.7 | 12156.8 | -2868.93 | ms |
| 99.9th percentile latency | index-append | 17450.6 | 12870.4 | -4580.23 | ms |
| 100th percentile latency | index-append | 18089.8 | 13110.4 | -4979.42 | ms |
| 50th percentile service time | index-append | 2917.01 | 2920.49 | 3.48737 | ms |
| 90th percentile service time | index-append | 4142.28 | 3969.75 | -172.527 | ms |
| 99th percentile service time | index-append | 15025.7 | 12156.8 | -2868.93 | ms |
| 99.9th percentile service time | index-append | 17450.6 | 12870.4 | -4580.23 | ms |
| 100th percentile service time | index-append | 18089.8 | 13110.4 | -4979.42 | ms |

I also tried enabling bulk merges with deletions, but the results weren't great. I will explore an idea that iterates over the start pointers of chunks sequentially to avoid the cost of searching.
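
For readers who want to sanity-check the headline numbers, here is a minimal sketch that recomputes relative changes from the Baseline and Contender columns of the table. The helper name `relative_change` is made up for illustration and is not part of Rally or Lucene; the input figures are copied from the table.

```python
def relative_change(baseline: float, contender: float) -> float:
    """Return the signed change from baseline to contender, in percent."""
    return (contender - baseline) / baseline * 100.0


# Figures copied from the benchmark table in this message.
merge_time = relative_change(14.9712, 11.9894)       # cumulative merge time, min
mean_throughput = relative_change(12580.3, 12918.5)  # mean index-append throughput, docs/s
p99_latency = relative_change(15025.7, 12156.8)      # 99th percentile latency, ms

print(f"cumulative merge time: {merge_time:+.1f}%")
print(f"mean throughput:       {mean_throughput:+.1f}%")
print(f"p99 latency:           {p99_latency:+.1f}%")
```

A negative value means the contender is lower than the baseline, which is an improvement for times and latencies and a regression for throughput.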