nk1506 commented on code in PR #10246:
URL: https://github.com/apache/iceberg/pull/10246#discussion_r1585139953
##########
core/src/main/java/org/apache/iceberg/FastAppend.java:
##########
@@ -156,6 +156,8 @@ public List<ManifestFile> apply(TableMetadata base, Snapshot snapshot) {
       manifests.addAll(snapshot.allManifests(ops.io()));
     }

+    manifests.forEach(summaryBuilder::addedManifestStats);

Review Comment:
   Thanks @Fokko for the feedback.

   > With the knowledge from the summary, you could spin up executors before reading the manifest-list, but this can be difficult since you would also need to know the sizes of the manifest to do some effective planning.

   _Manifest stats in the summary can help with better planning of manifest file scans. Without these stats in the metadata JSON, a planner either has to guess or do one more IO to read the manifest list._

   > The downside is that we add extra information to the metadata-JSON, which can also grow in size when there are many commits.

   _I agree that it adds some extra bytes to the metadata JSON, but I think we have the option to limit that by expiring snapshots. I don't see any other way to avoid the extra IO for planning._

   WDYT?
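
   To make the planning argument concrete, here is a minimal, self-contained sketch of the idea. It does not use the Iceberg API and is not what `addedManifestStats` in this PR does: `ManifestInfo`, the two summary keys, and `plannedScanTasks` are hypothetical names made up for illustration. It only shows that a manifest count and total size, once recorded in the summary, are enough to size the manifest scan without reading the manifest list.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class ManifestStatsSketch {
  // Hypothetical stand-in for the two manifest-list fields this sketch needs.
  static final class ManifestInfo {
    final String path;
    final long lengthInBytes;

    ManifestInfo(String path, long lengthInBytes) {
      this.path = path;
      this.lengthInBytes = lengthInBytes;
    }
  }

  // Hypothetical summary keys, chosen for this sketch only.
  static final String COUNT_KEY = "sketch.manifests-count";
  static final String SIZE_KEY = "sketch.manifests-total-size";

  // Commit side: fold per-manifest stats into string-valued summary properties.
  static Map<String, String> summarize(List<ManifestInfo> manifests) {
    long totalSize = 0L;
    for (ManifestInfo manifest : manifests) {
      totalSize += manifest.lengthInBytes;
    }
    Map<String, String> summary = new LinkedHashMap<>();
    summary.put(COUNT_KEY, Integer.toString(manifests.size()));
    summary.put(SIZE_KEY, Long.toString(totalSize));
    return summary;
  }

  // Planning side: estimate how many manifest-scan tasks to start,
  // using only the summary (no read of the manifest list).
  static int plannedScanTasks(Map<String, String> summary, long targetBytesPerTask) {
    if (targetBytesPerTask <= 0) {
      throw new IllegalArgumentException("targetBytesPerTask must be positive");
    }
    long count = Long.parseLong(summary.get(COUNT_KEY));
    long totalSize = Long.parseLong(summary.get(SIZE_KEY));
    if (count == 0) {
      return 0;
    }
    // Ceiling division: tasks needed to cover the total size.
    long bySize = (totalSize + targetBytesPerTask - 1) / targetBytesPerTask;
    // Never more tasks than manifests, never fewer than one.
    return (int) Math.max(1L, Math.min(count, bySize));
  }

  public static void main(String[] args) {
    List<ManifestInfo> manifests = List.of(
        new ManifestInfo("m1.avro", 8L << 20),
        new ManifestInfo("m2.avro", 8L << 20),
        new ManifestInfo("m3.avro", 4L << 20));
    Map<String, String> summary = summarize(manifests);
    System.out.println(summary);
    System.out.println(plannedScanTasks(summary, 8L << 20));
  }
}
```

   With three manifests totalling 20 MiB and a target of 8 MiB per task, the sketch plans 3 tasks. The cost side of the trade-off is also visible: each snapshot carries two extra summary entries in the metadata JSON until it is expired.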