asp437 commented on issue #8229:
URL: https://github.com/apache/iceberg/issues/8229#issuecomment-1667465369

   Thank you for the reply. I will take a look at the changelog view.
   
   Yeah, I've taken a look at [this code](https://github.com/apache/iceberg/blob/4f401a7779a56c25e9371cc3a95a13a51012cd6c/spark/v3.3/spark-extensions/src/main/scala/org/apache/spark/sql/execution/datasources/v2/MergeRowsExec.scala#L131) and it looks like it is the only place where such stats can be calculated. Outside `MergeRowsExec`, all rows of the affected partitions will already be in the iterator.
   
   Do you think it is possible to add collecting such stats to Iceberg? I'm not familiar with Spark internals and not sure about the details, so I'm asking about the idea in general. Maybe I will try to dive deeper if it turns out to be possible and useful to others.



