shanielh commented on PR #11895: URL: https://github.com/apache/iceberg/pull/11895#issuecomment-2566977078
> LGTM as well ! Thank you for the fix !
>
> > have a JFR dump that shows this method uses 35% CPU utilization, this
> > is why I think this commit is important
>
> interesting queue must really be huge, do you know what the manifest size / count we are looking at or more details of the table state ?

Actually, I was using `ParallelIterable` both to read multiple Parquet files for compaction and to scan manifest files. The table had 180 manifest files referencing a large number of data files:

```sql
select count(*),
       sum(added_data_files_count),
       sum(existing_data_files_count),
       sum(deleted_data_files_count)
from schema."table$manifests";
```

| count(*) | sum(added_data_files_count) | sum(existing_data_files_count) | sum(deleted_data_files_count) |
|----------|-----------------------------|--------------------------------|-------------------------------|
| 180      | 1826                        | 2703684                        | 6844                          |