EremenkoValentin opened a new issue, #11475:
URL: https://github.com/apache/iceberg/issues/11475

   ### Query engine
   
   Iceberg API
   
   ### Question
   
   Does Iceberg support incremental statistics calculation? How can this be 
done for columns? How do you calculate changes between two snapshots?
   
   Hello everyone. I want to collect column statistics without rereading the 
whole table every time. After examining the manifest files, I found that they 
store only basic statistics (value count, null count, NaN count, lower and 
upper bounds) for the changes made to a partition.
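
To illustrate what I mean by incremental calculation, here is a minimal sketch, assuming only appended data and numeric columns. It does not use the Iceberg API: `ColumnStats` and `merge_stats` are hypothetical names, and the numbers are made up. It folds the statistics of newly added data into a running total instead of rescanning the table.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ColumnStats:
    """Hypothetical container mirroring the counters seen in manifests."""
    value_count: int = 0
    null_count: int = 0
    nan_count: int = 0
    lower: Optional[float] = None
    upper: Optional[float] = None


def merge_stats(a: ColumnStats, b: ColumnStats) -> ColumnStats:
    """Combine two sets of column statistics (valid for appends only)."""
    lowers = [x for x in (a.lower, b.lower) if x is not None]
    uppers = [x for x in (a.upper, b.upper) if x is not None]
    return ColumnStats(
        value_count=a.value_count + b.value_count,  # counts are additive
        null_count=a.null_count + b.null_count,
        nan_count=a.nan_count + b.nan_count,
        lower=min(lowers) if lowers else None,      # bounds can only widen
        upper=max(uppers) if uppers else None,
    )


# Running statistics as of the previous snapshot (made-up values).
running = ColumnStats(value_count=100, null_count=5, nan_count=0, lower=1.0, upper=50.0)
# Statistics of the data added by the next snapshot (made-up values).
added = ColumnStats(value_count=20, null_count=1, nan_count=0, lower=-3.0, upper=42.0)

running = merge_stats(running, added)
print(running)
```

This kind of merge breaks down for deletes (bounds cannot be tightened without rereading data) and for NDV, which is not additive, hence my question about Puffin below.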
   
   As far as I understand, Puffin files can store NDV (number of distinct 
values), but I couldn’t find information on how to use them. Could someone 
provide guidance or a link to the relevant documentation? Thanks, all.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org
