asp437 commented on issue #8229:
URL: https://github.com/apache/iceberg/issues/8229#issuecomment-1667465369

   Thank you for the reply. I will take a look at the changelog view.
   
   Yeah, I've taken a look at [this code](https://github.com/apache/iceberg/blob/4f401a7779a56c25e9371cc3a95a13a51012cd6c/spark/v3.3/spark-extensions/src/main/scala/org/apache/spark/sql/execution/datasources/v2/MergeRowsExec.scala#L131) and it looks like it is the only place where such stats can be calculated. Outside `MergeRowsExec`, all rows of the affected partitions will already be in the iterator.
   
   Do you think it is possible to add collecting such stats to Iceberg? I'm not familiar with Spark internals and not sure about the details, so I'm asking about the idea in general. Maybe I will try to dive deeper if it turns out to be possible and useful to others.



