aokolnychyi commented on code in PR #8803:
URL: https://github.com/apache/iceberg/pull/8803#discussion_r1388806220
##########
api/src/main/java/org/apache/iceberg/ContentFile.java:
##########
@@ -165,6 +166,20 @@ default Long fileSequenceNumber() {
    */
   F copyWithoutStats();

+  /**
+   * Copies this file with only specific column stats. Manifest readers can reuse file instances;
+   * use this method to copy data and only copy specific stats when collecting files.
+   *
+   * @param requestedColumnIds column ids for which to keep stats. If <code>null</code> then every
+   *     column stat is kept.
+   * @return a copy of this data file, with stats lower bounds, upper bounds, value counts, null
+   *     value counts, and nan value counts for only specific columns.
+   */
+  default F copyWithStats(Set<Integer> requestedColumnIds) {

Review Comment:
   I spent a bit more time on this one. I do think we either drop support for `null` as a valid value here or drop the utility to copy. It is up to you which approach to pick.
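To make the first option concrete, here is a minimal sketch of what dropping `null` as a valid value could look like. It is only an illustration under stated assumptions: `FileStatsSketch` is a hypothetical stand-in that tracks value counts alone, not the real `ContentFile` interface, and the precondition style is a guess rather than what the PR ended up doing.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;
import java.util.Set;

// Hypothetical stand-in for a content file; it models only per-column value counts.
// This is NOT the Iceberg ContentFile API.
final class FileStatsSketch {
  private final Map<Integer, Long> valueCounts;

  FileStatsSketch(Map<Integer, Long> valueCounts) {
    // Defensive copy, since the source map may be reused by the caller.
    this.valueCounts = new HashMap<>(valueCounts);
  }

  Map<Integer, Long> valueCounts() {
    return Collections.unmodifiableMap(valueCounts);
  }

  // Keeps every stat: callers who want "all columns" use this instead of passing null.
  FileStatsSketch copy() {
    return new FileStatsSketch(valueCounts);
  }

  // Keeps no stats.
  FileStatsSketch copyWithoutStats() {
    return new FileStatsSketch(Collections.<Integer, Long>emptyMap());
  }

  // Keeps stats only for the requested column ids; null is rejected rather than
  // being treated as "keep everything".
  FileStatsSketch copyWithStats(Set<Integer> requestedColumnIds) {
    Objects.requireNonNull(requestedColumnIds, "requestedColumnIds must not be null");
    Map<Integer, Long> filtered = new HashMap<>();
    for (Map.Entry<Integer, Long> entry : valueCounts.entrySet()) {
      if (requestedColumnIds.contains(entry.getKey())) {
        filtered.put(entry.getKey(), entry.getValue());
      }
    }
    return new FileStatsSketch(filtered);
  }
}
```

With this shape, each of the three copy methods has a single meaning, and a `null` argument fails fast instead of silently falling back to a full copy.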