leeoren opened a new pull request, #14138:
URL: https://github.com/apache/iceberg/pull/14138

   Summary of Proposed Changes
   
   1. Improve readability of DeltaWrite in Spark event logs
   Currently, when an event log contains a DeltaWrite operation (shown as 
WriteDelta in the plan), its arguments fall back to the default 
Object.toString() output of class name and hash code:
   ```
   (1) WriteDelta
   Input [1]: [_col#1]
   Arguments: org.apache.iceberg.spark.source.SparkPositionDeltaWrite@5234f6c5
   ```
   
   By comparison, ReplaceData (implemented in 
[SparkWrite](https://github.com/apache/iceberg/blob/8353ac8f80799495cfdc32dd37222ed1b8d8070f/spark/v3.5/spark/src/main/java/org/apache/iceberg/spark/source/SparkWrite.java#L263-L266))
   is rendered more informatively:
   ```
   (1) ReplaceData
   Input [1]: [col#1]
   Arguments: IcebergWrite(table=iceberg_table, format=PARQUET)
   ```
   
   This change introduces a toString implementation for SparkPositionDeltaWrite 
to produce more descriptive and user-friendly plan output.
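
A minimal sketch of the idea, not the code in this PR: `PositionDeltaWriteSketch` is a hypothetical stand-in for SparkPositionDeltaWrite, and its two fields and the output label are assumptions chosen to mirror the ReplaceData output shown above. The actual string produced by the PR may differ.

```java
// Hypothetical stand-in for SparkPositionDeltaWrite; not the actual Iceberg class.
final class PositionDeltaWriteSketch {
    private final String tableName;   // assumed field, e.g. "iceberg_table"
    private final String fileFormat;  // assumed field, e.g. "PARQUET"

    PositionDeltaWriteSketch(String tableName, String fileFormat) {
        this.tableName = tableName;
        this.fileFormat = fileFormat;
    }

    // Without this override, Object.toString() returns ClassName@hexHash,
    // which is the form seen in the WriteDelta arguments above.
    @Override
    public String toString() {
        // The label and layout are assumptions that mirror the ReplaceData example.
        return String.format("IcebergWrite(table=%s, format=%s)", tableName, fileFormat);
    }
}
```

With this override, `new PositionDeltaWriteSketch("iceberg_table", "PARQUET").toString()` yields `IcebergWrite(table=iceberg_table, format=PARQUET)` instead of a class name followed by a hash code.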
   
   2. Expose IcebergScan details in physical plans
   In Spark event logs, IcebergScan does not currently appear in the physical 
plan description. Instead, only the generic BatchScan is shown:
   ```
   == Physical Plan ==
   AppendData ...
   +- *(1) ColumnarToRow
      +- BatchScan glue_catalog.namespace.table_name[#col1] glue_catalog.namespace.table_name (branch=null) [filters=, groupedBy=] RuntimeFilters: []
   ```
   
   This change adds a description method to SparkBatchQueryScan, allowing query 
plans to include Iceberg-specific scan information and making event logs more 
informative.
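
As a rough illustration of the second change, the sketch below overrides a `description()` method on a scan object. `ScanSketch` and `BatchQueryScanSketch` are hypothetical, simplified stand-ins (not Spark's or Iceberg's real types), and the `IcebergScan(...)` layout is an assumption rather than the format this PR emits.

```java
import java.util.List;

// Simplified, hypothetical stand-in for a scan interface with an overridable description.
interface ScanSketch {
    // Generic fallback: only the class name, with no table-specific detail.
    default String description() {
        return getClass().toString();
    }
}

// Hypothetical stand-in for SparkBatchQueryScan; field names are assumptions.
final class BatchQueryScanSketch implements ScanSketch {
    private final String tableName;
    private final String branch;         // may be null, as in "branch=null" above
    private final List<String> filters;  // pushed-down filter expressions, as text

    BatchQueryScanSketch(String tableName, String branch, List<String> filters) {
        this.tableName = tableName;
        this.branch = branch;
        this.filters = List.copyOf(filters);
    }

    // Overriding description() lets the plan text carry Iceberg-specific details.
    @Override
    public String description() {
        return String.format(
            "IcebergScan(table=%s, branch=%s, filters=%s)", tableName, branch, filters);
    }
}
```

For example, `new BatchQueryScanSketch("glue_catalog.namespace.table_name", null, List.of()).description()` returns a string that names the scan as Iceberg-specific, with the table, branch, and filters included.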


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

