findepi commented on code in PR #6582:
URL: https://github.com/apache/iceberg/pull/6582#discussion_r1085085232


##########
core/src/main/java/org/apache/iceberg/puffin/StandardBlobTypes.java:
##########
@@ -26,4 +26,6 @@ private StandardBlobTypes() {}
    * href="https://datasketches.apache.org/">Apache DataSketches</a> library
    */
   public static final String APACHE_DATASKETCHES_THETA_V1 = "apache-datasketches-theta-v1";
+
+  public static final String NDV_BLOB = "ndv-blob";

Review Comment:
   > Does Trino update the NDV sketch every time a write happens?
   
   Not yet, but there is a WIP PR for that: 
https://github.com/trinodb/trino/pull/15441
   
   > What if a table is wrote both by Trino and Spark? I believe the update 
from Spark side will be missing in that case.
   
   That's unfortunately true, but we hope this is just a temporary limitation.
   I would feel uncomfortable advising users not to use Spark because it cannot 
update Iceberg stats properly.
   
   > A-synchronized operation like this procedure.
   
   You need this anyway, since not all writes will update stats. For example, 
it's quite hard to update NDV stats for a deletion (was this the _only_ 
appearance of a value, or one of many?)
   Trino provides an ANALYZE statement to analyze stats for a table, and it 
currently computes a Theta sketch and puts it in the Iceberg table's Puffin 
stats file.
   
   Automatic stats updates will work well for append-only tables that do not 
undergo deletions or updates (or have deletions and updates in negligible 
quantity).
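
   To make the deletion point concrete, here is a minimal sketch. It is an 
illustration only: an exact set of distinct values stands in for the Theta 
sketch, this is not the DataSketches or Puffin API, and the class and method 
names are hypothetical. The append path only needs the two summaries, while 
the delete path needs per-value row counts that a distinct-value summary does 
not keep.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Toy illustration: an exact Set<String> stands in for an NDV sketch.
// All names here are hypothetical and not part of Iceberg, Trino, or DataSketches.
final class NdvMaintenanceDemo {

    // Append path: the summary of (old data + new data) is the union of the
    // two summaries, so no rescan of the old data is needed.
    static Set<String> mergeOnAppend(Set<String> existing, Set<String> appended) {
        Set<String> merged = new HashSet<>(existing);
        merged.addAll(appended);
        return merged;
    }

    // Delete path: the true NDV after removing one row holding `value`.
    // This needs the rows themselves (or per-value counts), which a
    // distinct-value summary does not retain.
    static int ndvAfterDeletingOneRow(List<String> rows, String value) {
        Map<String, Integer> counts = new HashMap<>();
        for (String row : rows) {
            counts.merge(row, 1, Integer::sum);
        }
        Integer remaining = counts.get(value);
        if (remaining != null) {
            if (remaining == 1) {
                counts.remove(value);   // that was the only appearance
            } else {
                counts.put(value, remaining - 1);   // one of many
            }
        }
        return counts.size();
    }

    public static void main(String[] args) {
        // Both tables have the same distinct-value summary {x, y} ...
        List<String> tableA = List.of("x", "y", "y");
        List<String> tableB = List.of("x", "x", "y");
        System.out.println(new HashSet<>(tableA).equals(new HashSet<>(tableB)));

        // ... yet deleting one "x" row leaves them with different NDVs,
        // so the summary alone cannot be updated for a deletion.
        System.out.println(ndvAfterDeletingOneRow(tableA, "x"));
        System.out.println(ndvAfterDeletingOneRow(tableB, "x"));

        // Appends, by contrast, merge cleanly from the summaries alone.
        System.out.println(mergeOnAppend(Set.of("x", "y"), Set.of("y", "z")).size());
    }
}
```

   The two tables are indistinguishable from their summaries, which is why a 
separate recomputation step (such as ANALYZE) is still needed after deletes.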



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

