Fokko commented on code in PR #10246: URL: https://github.com/apache/iceberg/pull/10246#discussion_r1584982185
##########
core/src/main/java/org/apache/iceberg/FastAppend.java:
##########
@@ -156,6 +156,8 @@ public List<ManifestFile> apply(TableMetadata base, Snapshot snapshot) {
       manifests.addAll(snapshot.allManifests(ops.io()));
     }
 
+    manifests.forEach(summaryBuilder::addedManifestStats);

Review Comment:
   This is also my main question. My train of thought:
   
   You will need to read the manifest-list in any situation. The number of manifests can vary widely:
   
   - If FastAppends are used frequently, there will be many small manifests that you want to bundle into batches.
   - If MergeAppends are used, the manifests are rather hefty (8 megabytes by default, set using `commit.manifest.target-size-bytes`).
   
   With the knowledge from the summary, you could spin up executors before reading the manifest-list, but this can be difficult since you would also need to know the sizes of the manifests to plan effectively.
   
   The downside is that we add extra information to the metadata-JSON, which can also grow in size when there are many commits.
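
   To illustrate the batching point: a rough sketch of bundling many small manifests into batches of roughly one target size each. This is not Iceberg code. `ManifestBatcher`, `batchBySize`, and the sample sizes are hypothetical, and it assumes the per-manifest sizes are already known, which a count-only summary would not provide.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration only; not part of the Iceberg API.
public class ManifestBatcher {

  // 8 MB, the default of commit.manifest.target-size-bytes mentioned above.
  static final long TARGET_BYTES = 8L * 1024 * 1024;

  /**
   * Greedily groups manifest sizes (in bytes) into batches of at most targetBytes,
   * preserving input order. A manifest larger than the target gets its own batch.
   */
  static List<List<Long>> batchBySize(List<Long> manifestSizes, long targetBytes) {
    List<List<Long>> batches = new ArrayList<>();
    List<Long> current = new ArrayList<>();
    long currentBytes = 0;
    for (long size : manifestSizes) {
      // Close the current batch if adding this manifest would exceed the target.
      if (!current.isEmpty() && currentBytes + size > targetBytes) {
        batches.add(current);
        current = new ArrayList<>();
        currentBytes = 0;
      }
      current.add(size);
      currentBytes += size;
    }
    if (!current.isEmpty()) {
      batches.add(current);
    }
    return batches;
  }

  public static void main(String[] args) {
    // Made-up sizes: six small FastAppend-style manifests and one MergeAppend-sized manifest.
    List<Long> sizes = List.of(
        2_000_000L, 3_000_000L, 3_000_000L, 1_000_000L, 4_000_000L, 500_000L, 8_388_608L);
    List<List<Long>> batches = batchBySize(sizes, TARGET_BYTES);
    System.out.println(batches.size() + " batches: " + batches);
  }
}
```

   The sketch needs each manifest's length, which is only available after reading the manifest-list. A manifest count in the summary alone would not be enough to plan batches like this.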