aokolnychyi opened a new pull request, #11086: URL: https://github.com/apache/iceberg/pull/11086
This PR optimizes writing metadata for many new files, which is helpful during initial table creation and row-level operations that modify many files.

Benchmark results prior to the changes:

```
Benchmark                    (fast)  (numFiles)  Mode  Cnt   Score   Error  Units
AppendBenchmark.appendFiles    true       50000    ss    5   1.111 ± 0.035   s/op
AppendBenchmark.appendFiles    true      100000    ss    5   2.114 ± 0.049   s/op
AppendBenchmark.appendFiles    true      500000    ss    5  10.144 ± 0.182   s/op
AppendBenchmark.appendFiles    true     1000000    ss    5  20.205 ± 0.388   s/op
AppendBenchmark.appendFiles    true     2500000    ss    5  51.280 ± 3.610   s/op
AppendBenchmark.appendFiles   false       50000    ss    5   1.125 ± 0.084   s/op
AppendBenchmark.appendFiles   false      100000    ss    5   2.107 ± 0.095   s/op
AppendBenchmark.appendFiles   false      500000    ss    5  10.117 ± 0.398   s/op
AppendBenchmark.appendFiles   false     1000000    ss    5  20.350 ± 1.046   s/op
AppendBenchmark.appendFiles   false     2500000    ss    5  48.823 ± 3.604   s/op
```

Benchmark results after the changes:

```
Benchmark                    (fast)  (numFiles)  Mode  Cnt   Score   Error  Units
AppendBenchmark.appendFiles    true       50000    ss    5   0.383 ± 0.072   s/op
AppendBenchmark.appendFiles    true      100000    ss    5   0.435 ± 0.031   s/op
AppendBenchmark.appendFiles    true      500000    ss    5   1.283 ± 0.152   s/op
AppendBenchmark.appendFiles    true     1000000    ss    5   2.325 ± 0.411   s/op
AppendBenchmark.appendFiles    true     2500000    ss    5   5.615 ± 1.203   s/op
AppendBenchmark.appendFiles   false       50000    ss    5   0.403 ± 0.023   s/op
AppendBenchmark.appendFiles   false      100000    ss    5   0.484 ± 0.058   s/op
AppendBenchmark.appendFiles   false      500000    ss    5   1.357 ± 0.190   s/op
AppendBenchmark.appendFiles   false     1000000    ss    5   2.596 ± 0.829   s/op
AppendBenchmark.appendFiles   false     2500000    ss    5   5.993 ± 1.447   s/op
```
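To compare the two result sets directly, one can divide each "before" score by the matching "after" score. The snippet below is a minimal sketch in Python and is not part of the PR: the names `BEFORE`, `AFTER`, and `speedups` are illustrative, the scores are copied from the tables above, and the reported error margins are ignored.

```python
# Scores in seconds per operation (s/op), copied from the benchmark tables.
# Keys are (fast, numFiles). All names here are illustrative, not from the PR.
BEFORE = {
    (True, 50_000): 1.111,
    (True, 100_000): 2.114,
    (True, 500_000): 10.144,
    (True, 1_000_000): 20.205,
    (True, 2_500_000): 51.280,
    (False, 50_000): 1.125,
    (False, 100_000): 2.107,
    (False, 500_000): 10.117,
    (False, 1_000_000): 20.350,
    (False, 2_500_000): 48.823,
}

AFTER = {
    (True, 50_000): 0.383,
    (True, 100_000): 0.435,
    (True, 500_000): 1.283,
    (True, 1_000_000): 2.325,
    (True, 2_500_000): 5.615,
    (False, 50_000): 0.403,
    (False, 100_000): 0.484,
    (False, 500_000): 1.357,
    (False, 1_000_000): 2.596,
    (False, 2_500_000): 5.993,
}


def speedups(before, after):
    """Return before/after ratios for every configuration present in both runs."""
    return {key: before[key] / after[key] for key in before if key in after}


SPEEDUPS = speedups(BEFORE, AFTER)

# Print one line per configuration, grouped by the `fast` flag.
for (fast, num_files), ratio in sorted(SPEEDUPS.items()):
    print(f"fast={fast!s:<5} numFiles={num_files:>9,} speedup={ratio:.2f}x")
```

On these point estimates the ratio grows with the file count, from roughly 3x at 50,000 files to roughly 8x to 9x at 2,500,000 files, for both values of `fast`.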