aokolnychyi commented on code in PR #8803:
URL: https://github.com/apache/iceberg/pull/8803#discussion_r1382325507


##########
api/src/main/java/org/apache/iceberg/Scan.java:
##########
@@ -77,6 +78,21 @@ public interface Scan<ThisT, T extends ScanTask, G extends ScanTaskGroup<T>> {
    */
   ThisT includeColumnStats();
 
+  /**
+   * Create a new scan from this that loads the column stats for the specific columns with each data
+   * file. If the columns set is empty or <code>null</code> then all column stats will be kept, if
+   * {@link #includeColumnStats()} is set.
+   *
+   * <p>Column stats include: value count, null value count, lower bounds, and upper bounds.
+   *
+   * @param columnsToKeepStats column ids from the table's schema

Review Comment:
   +1.
   
   We have `select` that accepts `Collection<String> columns` below. I 
understand using IDs is straightforward, but someone has to do the conversion 
from name to ID, and we'd better do it ourselves behind this API. If an external 
long-running process needs to ensure the ID did not change, it can perform that 
check outside.
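
To illustrate the suggestion, here is a minimal sketch of resolving names to IDs behind the API. `StatsColumnResolver`, `resolveIds`, and the plain `Map` that stands in for the table schema are all hypothetical and not part of Iceberg. A real implementation would more likely look fields up on the table's `Schema` (for example via `findField(name)` and `fieldId()`), and how a missing column should be handled is an assumption here.

```java
import java.util.Collection;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical helper, not part of Iceberg: resolves column names to field IDs
// so that callers of the scan API never have to handle IDs themselves.
final class StatsColumnResolver {

  private StatsColumnResolver() {}

  // nameToId stands in for the schema's name-to-ID lookup (an assumption of this sketch).
  static Set<Integer> resolveIds(Map<String, Integer> nameToId, Collection<String> columns) {
    Set<Integer> ids = new LinkedHashSet<>();
    if (columns == null) {
      // Empty result; per the Javadoc under review, empty or null means "keep all stats".
      return ids;
    }
    for (String column : columns) {
      Integer id = nameToId.get(column);
      if (id == null) {
        // Failing fast on unknown names is a design choice of this sketch.
        throw new IllegalArgumentException("Cannot find column: " + column);
      }
      ids.add(id);
    }
    return ids;
  }
}
```

With this shape, the public method could accept `Collection<String>` like `select` does, and a long-running caller that cares about ID stability could still compare the resolved IDs against its own records.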



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

