BsoBird commented on issue #11765: URL: https://github.com/apache/iceberg/issues/11765#issuecomment-2541708695
@RussellSpitzer Sir, I am using Spark 3.5.1 with Iceberg 1.7.1/1.6.1.

SQL:

```
merge into target_iceberg_table t
using (
  select primary_key, modified_time
  from (
    select primary_key, modified_time,
           row_number() over (partition by primary_key order by modified_time desc) as rank
    from ods_table
  ) s1
  where rank = 1
) s
on t.primary_key = s.primary_key
when matched and s.modified_time > t.modified_time then update set t.modified_time = s.modified_time
when not matched then insert *
```

In this SQL, target_iceberg_table is a COW (Copy-On-Write) table.

However, I may have found an issue. When executing a COW MERGE INTO, Spark reads ods_table twice: the first time to find the data files that match the incoming records, and the second time to perform the actual merge. If the data in ods_table changes between the first and second read, could this lead to incorrect results? What would happen if rows are added to ods_table in between? What if rows are removed?
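
The two-read concern can be illustrated with a toy model. This is only a sketch and does not reproduce Spark or Iceberg internals: the names `files_to_rewrite` and `cow_merge` are hypothetical, a data file is modelled as a dict from `primary_key` to `modified_time`, and the model assumes that the second pass joins the source only against the files selected in the first pass. Whether the real engine behaves this way is exactly the open question.

```python
from typing import Dict, List

# Toy types (assumptions for illustration only).
Source = Dict[str, int]    # primary_key -> modified_time, already deduplicated
DataFile = Dict[str, int]  # one target data file: primary_key -> modified_time


def files_to_rewrite(target: List[DataFile], source: Source) -> List[int]:
    """Pass 1: indexes of target files that hold at least one source key."""
    return [i for i, f in enumerate(target) if any(k in source for k in f)]


def cow_merge(target: List[DataFile],
              source_pass1: Source,
              source_pass2: Source) -> List[DataFile]:
    """Pass 2: rewrite the files chosen in pass 1 using a second source read.

    Assumption: pass 2 only sees target rows from the chosen files, so a
    source key living in an unchosen file looks like "not matched".
    """
    chosen = files_to_rewrite(target, source_pass1)
    visible = {k for i in chosen for k in target[i]}

    result: List[DataFile] = []
    for i, f in enumerate(target):
        rewritten = dict(f)
        if i in chosen:
            for k, ts in source_pass2.items():
                # WHEN MATCHED AND s.modified_time > t.modified_time THEN UPDATE
                if k in rewritten and ts > rewritten[k]:
                    rewritten[k] = ts
        result.append(rewritten)

    # WHEN NOT MATCHED THEN INSERT, judged against the visible rows only
    inserts = {k: ts for k, ts in source_pass2.items() if k not in visible}
    if inserts:
        result.append(inserts)
    return result


target = [{"a": 1}, {"b": 1}]

# Source identical in both passes: only "a" is updated.
stable = cow_merge(target, {"a": 5}, {"a": 5})
# -> [{"a": 5}, {"b": 1}]

# Source grows between passes: "b" is inserted again instead of updated.
grown = cow_merge(target, {"a": 5}, {"a": 5, "b": 7})
# -> [{"a": 5}, {"b": 1}, {"b": 7}]

# Source shrinks between passes: the chosen file is rewritten unchanged.
shrunk = cow_merge(target, {"a": 5}, {})
# -> [{"a": 1}, {"b": 1}]
```

Under these assumptions, a source that grows between the two reads yields a duplicate `primary_key` in the target, while a source that shrinks only causes an unnecessary rewrite.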