malexer commented on issue #12558:
URL: https://github.com/apache/iceberg/issues/12558#issuecomment-2821532361

   We’re encountering the same issue with Iceberg 1.0.0 (currently using AWS 
Glue 4.0).
   
   We’re trying to run a conditional MERGE operation like this:
   ```sql
   MERGE INTO target_table t
   USING source_table s
   ON t.unique_id = s.unique_id
   WHEN MATCHED AND t.name != s.name THEN UPDATE SET *
   WHEN NOT MATCHED THEN INSERT *;
   ```
   Our intention is to update only when the name column has changed, and to 
skip the update if there are no changes.
   
   However, when we execute this statement multiple times with identical 
`target_table` and `source_table` (i.e., no actual differences), we still see a 
new snapshot created each time. Moreover, new data files are written to S3 with 
similar content and size.
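   
   One way to confirm this is to inspect the table's `snapshots` metadata table after each run. The query below is only a sketch: `my_catalog.my_db` is a placeholder for whatever catalog and namespace the table actually lives in, and the summary keys shown are the standard Iceberg snapshot summary fields, which may be absent on some commits.
   ```sql
   -- Hypothetical catalog/namespace; replace with the real identifiers.
   -- Each row is one committed snapshot; a no-op MERGE would ideally add none.
   SELECT committed_at,
          snapshot_id,
          operation,
          summary['added-data-files']   AS added_data_files,
          summary['deleted-data-files'] AS deleted_data_files,
          summary['added-records']      AS added_records,
          summary['deleted-records']    AS deleted_records
   FROM my_catalog.my_db.target_table.snapshots
   ORDER BY committed_at;
   ```
   If each repeated MERGE adds a row with non-zero added and deleted file counts, files are being rewritten even though no row satisfied the `WHEN MATCHED` condition.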
   
   It appears that the UPDATE is being executed even when the 
`t.name != s.name` condition is not met, resulting in unnecessary file rewrites.
   
   This behavior also undermines the ability to rely on Iceberg’s snapshot-based 
history for accurately tracking meaningful data changes.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

