anandnalya commented on issue #9960:
URL: https://github.com/apache/iceberg/issues/9960#issuecomment-4657859861

   This still reproduces on the latest Iceberg release, and I can isolate it to 
the `write.spark.accept-any-schema` table property (confirming @voducdan's and 
@siddiquebagwan's observations above).
   
   **Environment**
   - Spark 3.5.7
   - Hadoop 3.3.6
   - Iceberg `iceberg-spark-runtime-3.5_2.13:1.10.0`
   - Catalog: `org.apache.iceberg.spark.SparkSessionCatalog`, `type=hive`
   - `spark.sql.extensions=org.apache.iceberg.spark.extensions.IcebergSparkSessionExtensions` (confirmed set)
   - No `spark.sql.optimizer.excludedRules` / `analyzer.excludedRules` configured
   
   **Minimal repro — the property is the only variable**
   
   Fails:
   ```sql
   CREATE TABLE scratch.t_accept (id BIGINT, val STRING) USING iceberg
     TBLPROPERTIES ('format-version'='2', 'write.spark.accept-any-schema'='true');
   
   EXPLAIN UPDATE scratch.t_accept SET val = NULL WHERE id = 1;
   -- Error occurred during query planning:
   -- UPDATE TABLE is not supported temporarily.
   ```
   
   Works (identical table, property absent):
   ```sql
   CREATE TABLE scratch.t_plain (id BIGINT, val STRING) USING iceberg
     TBLPROPERTIES ('format-version'='2');
   
   EXPLAIN UPDATE scratch.t_plain SET val = NULL WHERE id = 1;
   -- == Physical Plan ==
   -- ReplaceData IcebergWrite(table=spark_catalog.scratch.t_plain, format=PARQUET)
   ```
   
   **Only UPDATE is affected — the copy-on-write rewrite path itself is fine.** 
A DELETE that forces the same copy-on-write rewrite (`ReplaceData`) on the 
*same* `accept-any-schema=true` table succeeds:
   ```sql
   EXPLAIN DELETE FROM scratch.t_accept WHERE id IN (SELECT id FROM scratch.t_plain);
   -- == Physical Plan ==
   -- ReplaceData IcebergWrite(table=spark_catalog.scratch.t_accept, format=PARQUET)   ✅
   ```
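
The three EXPLAIN results above can also be compared mechanically. A minimal sketch, assuming the EXPLAIN output is already available as a string; `classify_plan` is a hypothetical helper name, and the markers it looks for are simply the strings quoted in the outputs above, not any Spark or Iceberg API:

```python
def classify_plan(explain_text: str) -> str:
    """Classify EXPLAIN output for a row-level command.

    Hypothetical helper: the markers below are copied from the outputs
    quoted in this comment, not taken from a Spark or Iceberg API.
    """
    if "is not supported temporarily" in explain_text:
        # Planning failed: the command was never rewritten into a write plan.
        return "unplanned"
    if "ReplaceData" in explain_text:
        # The command was rewritten into a copy-on-write ReplaceData write.
        return "copy-on-write"
    return "other"
```

With PySpark, the text could presumably be obtained with something like `spark.sql("EXPLAIN UPDATE ...").collect()[0][0]` and passed to the helper; that call pattern is an assumption about the session setup rather than part of the repro.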
   
   This lines up with @nastra's note that 
[SPARK-43324](https://github.com/apache/spark/pull/41028) moved UPDATE handling 
out of the Iceberg extensions into Spark 3.5's native `RewriteUpdateTable` 
rule. That rule appears not to handle tables advertising the 
`ACCEPT_ANY_SCHEMA` capability (set via `write.spark.accept-any-schema=true`), 
so the `UpdateTable` node is never rewritten and falls through to the V1 
`BasicOperators` path (`SparkStrategies.scala`), which throws 
`ddlUnsupportedTemporarilyError("UPDATE TABLE")`. The DELETE copy-on-write 
rewrite above doesn't have to align `SET` assignments against the table schema, 
which is consistent with only UPDATE breaking.
   
   **Workaround:** clear the property, run the UPDATE, then restore it:
   ```sql
   ALTER TABLE <t> UNSET TBLPROPERTIES ('write.spark.accept-any-schema');
   UPDATE <t> SET ... WHERE ...;
   ALTER TABLE <t> SET TBLPROPERTIES ('write.spark.accept-any-schema'='true');
   ```
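
The same sequence can be wrapped so the property is restored even if the UPDATE fails. A minimal sketch, assuming a caller-supplied `run_sql` callable that executes one SQL string (for example `spark.sql` from an existing PySpark session); the helper name `update_without_accept_any_schema` is hypothetical and not part of Iceberg or Spark:

```python
from typing import Callable

PROP = "write.spark.accept-any-schema"


def update_without_accept_any_schema(
    run_sql: Callable[[str], object], table: str, update_sql: str
) -> None:
    """Temporarily unset the property, run the UPDATE, then restore it.

    `run_sql` is an assumption: any callable that executes a single SQL
    statement, e.g. `spark.sql`. This helper itself is hypothetical.
    """
    run_sql(f"ALTER TABLE {table} UNSET TBLPROPERTIES ('{PROP}')")
    try:
        run_sql(update_sql)
    finally:
        # Restore the property even if the UPDATE raised.
        run_sql(f"ALTER TABLE {table} SET TBLPROPERTIES ('{PROP}'='true')")
```

For the repro table this would be called as `update_without_accept_any_schema(spark.sql, "scratch.t_accept", "UPDATE scratch.t_accept SET val = NULL WHERE id = 1")`. Other writers that rely on the property are not protected while it is unset.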
   
   Reproducible on Iceberg 1.10.0 / Spark 3.5.7. Since the root cause now looks 
like it lives in Spark's `RewriteUpdateTable` (post SPARK-43324) rather than 
the Iceberg extensions, could this issue be reopened? Alternatively, is there 
an existing Spark JIRA tracking the `ACCEPT_ANY_SCHEMA` + row-level UPDATE 
interaction that it should be redirected to?
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

