pvary commented on code in PR #8803:
URL: https://github.com/apache/iceberg/pull/8803#discussion_r1389220119


##########
api/src/main/java/org/apache/iceberg/ContentFile.java:
##########
@@ -165,6 +166,20 @@ default Long fileSequenceNumber() {
    */
   F copyWithoutStats();
 
+  /**
+   * Copies this file with only specific column stats. Manifest readers can 
reuse file instances;
+   * use this method to copy data and only copy specific stats when collecting 
files.
+   *
+   * @param requestedColumnIds column ids for which to keep stats. If 
<code>null</code> then every
+   *     column stat is kept.
+   * @return a copy of this data file, with stats lower bounds, upper bounds, 
value counts, null
+   *     value counts, and nan value counts for only specific columns.
+   */
+  default F copyWithStats(Set<Integer> requestedColumnIds) {

Review Comment:
   I am not fan of the special value settings, where `null`, and `empty` have 
specific meaning, so I ended up not defining what we are doing  with `null` on 
the API.
   
   Still behind the scenes in the constructors for `BaseFile`, 
`GenericDataFile`, etc, I need to keep the `null` as a possible value, but at 
least that is not a public API.
   
   Will talk about this one more time, before finalising the solution... 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org

Reply via email to