jeesou commented on PR #10659: URL: https://github.com/apache/iceberg/pull/10659#issuecomment-2220881056
Hi @huaxingao, @karuppayya, just to clarify a few points. This PR would be a continuation of https://github.com/apache/iceberg/pull/10288. For that PR, we would need to create a Procedure to trigger the Analyze action.

Would the changes in this PR mean that we would be able to see other column statistics, such as min and max, along with NDV in the .stat file? Could you please help me understand how "Iceberg can report column stats to Spark engine for CBO"? Even with the changes in this PR together with https://github.com/apache/iceberg/pull/10288, the statistics section looks like this:

    "blob-metadata" : [ {
      "type" : "apache-datasketches-theta-v1",
      "snapshot-id" : 6563878496533098178,
      "sequence-number" : 1,
      "fields" : [ 3 ],
      "properties" : {
        "ndv" : "3"
      }
    } ]

The other statistics are not populated right now, which is expected given the code. Please help me understand how the other statistics will be generated and how Spark will access them.
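
For reference, here is a minimal sketch of reading the NDV values out of the statistics section quoted in this comment. It only parses the JSON; it does not use any Iceberg or Spark API. The function name `ndv_by_field` and the wrapping of the fragment in an outer object are assumptions made for illustration.

```python
import json

# Fragment copied from the comment, wrapped in an outer object (an assumption)
# so that it parses as standalone JSON.
STATS_FRAGMENT = """
{
  "blob-metadata": [
    {
      "type": "apache-datasketches-theta-v1",
      "snapshot-id": 6563878496533098178,
      "sequence-number": 1,
      "fields": [3],
      "properties": {"ndv": "3"}
    }
  ]
}
"""


def ndv_by_field(stats):
    """Map each field ID to its NDV, for blobs that carry an "ndv" property.

    Hypothetical helper: it just walks the parsed JSON structure.
    """
    result = {}
    for blob in stats.get("blob-metadata", []):
        ndv = blob.get("properties", {}).get("ndv")
        if ndv is None:
            continue  # blob has no NDV property, skip it
        for field_id in blob.get("fields", []):
            result[field_id] = int(ndv)  # "ndv" is stored as a string
    return result


stats = json.loads(STATS_FRAGMENT)
print(ndv_by_field(stats))
```

With the fragment above, only field ID 3 has an entry, which matches the observation that NDV is the only statistic present.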