aokolnychyi opened a new pull request, #8278: URL: https://github.com/apache/iceberg/pull/8278
This PR adds a table option to disable column stats filtering in `DeleteFileIndex`. Such filtering rarely yields any benefit on top of partition and sequence number filtering but can consume a lot of time. This is not a problem for targeted scans or small tables, but it can cause a noticeable performance degradation for full table scans with millions of files. This change keeps the filtering on by default to preserve the existing behavior.

```
Benchmark                                                                 Mode  Cnt   Score   Error  Units
DeleteFileIndexBenchmark.buildIndexAndLookupWithColumnStatsFiltering        ss   10  13.447 ± 0.147   s/op
DeleteFileIndexBenchmark.buildIndexAndLookupWithoutColumnStatsFiltering     ss   10   0.188 ± 0.001   s/op
```
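To illustrate the trade-off described above, here is a minimal, self-contained sketch of how such a toggle might behave. It is not the code from this PR and does not use Iceberg's API: `DeleteFilterSketch`, `DataFile`, `DeleteFile`, `rangesOverlap`, and `deletesFor` are hypothetical names, column stats are reduced to a single min/max range on one `long` column, and the sequence number rule is simplified to "the delete is not older than the data file", ignoring the difference between position and equality deletes.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch; these are simplified stand-ins, not Iceberg classes.
public class DeleteFilterSketch {

  static final class DataFile {
    final String partition;
    final long sequenceNumber;
    final long minId;
    final long maxId;

    DataFile(String partition, long sequenceNumber, long minId, long maxId) {
      this.partition = partition;
      this.sequenceNumber = sequenceNumber;
      this.minId = minId;
      this.maxId = maxId;
    }
  }

  static final class DeleteFile {
    final String partition;
    final long sequenceNumber;
    final long minId;
    final long maxId;

    DeleteFile(String partition, long sequenceNumber, long minId, long maxId) {
      this.partition = partition;
      this.sequenceNumber = sequenceNumber;
      this.minId = minId;
      this.maxId = maxId;
    }
  }

  // Column stats check: do the [min, max] ranges of the two files intersect?
  static boolean rangesOverlap(DataFile data, DeleteFile delete) {
    return delete.minId <= data.maxId && data.minId <= delete.maxId;
  }

  // Returns the delete files that may apply to the given data file.
  static List<DeleteFile> deletesFor(
      DataFile data, List<DeleteFile> deletes, boolean columnStatsFiltering) {
    List<DeleteFile> result = new ArrayList<>();
    for (DeleteFile delete : deletes) {
      // Partition filtering: always applied.
      if (!delete.partition.equals(data.partition)) {
        continue;
      }
      // Sequence number filtering (simplified): skip deletes older than the data file.
      if (delete.sequenceNumber < data.sequenceNumber) {
        continue;
      }
      // Column stats filtering: optional extra pruning, controlled by the flag.
      if (columnStatsFiltering && !rangesOverlap(data, delete)) {
        continue;
      }
      result.add(delete);
    }
    return result;
  }

  public static void main(String[] args) {
    DataFile data = new DataFile("p1", 5, 100, 200);
    List<DeleteFile> deletes = Arrays.asList(
        new DeleteFile("p1", 6, 150, 160),  // matches on every check
        new DeleteFile("p1", 7, 500, 600),  // pruned only by column stats
        new DeleteFile("p2", 8, 100, 200),  // pruned by partition
        new DeleteFile("p1", 3, 100, 200)); // pruned by sequence number

    System.out.println("with stats filtering: " + deletesFor(data, deletes, true).size());
    System.out.println("without stats filtering: " + deletesFor(data, deletes, false).size());
  }
}
```

In this sketch, turning the flag off only widens the candidate set (two delete files instead of one for the sample data). The extra candidate contains no rows that match the data file, so skipping the stats check trades a per-pair comparison during planning for some potentially unnecessary delete files at read time.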
