willjanning opened a new issue, #14851: URL: https://github.com/apache/iceberg/issues/14851
### Apache Iceberg version

1.10.0 (latest release)

### Query engine

Spark

### Please describe the bug 🐞

In Spark SQL, with the following conf:

`--conf=spark.sql.defaultCatalog=iceberg --conf=spark.sql.catalogImplementation=in-memory --conf=spark.sql.catalog.iceberg=org.apache.iceberg.spark.SparkCatalog --conf=spark.sql.catalog.iceberg.type=hadoop --conf=spark.sql.catalog.iceberg.warehouse=<insert_warehouse_path>`

Run the following Spark SQL commands:

```
step 1: create table example (id INT, name VARCHAR(10));
step 2: merge into example using (select * from values (1, 'Alice'), (2, 'Bob'), (3, 'Charley') as v(id, name)) on example.id = v.id when matched then update set * when not matched then insert *;
step 3: merge into example using (select * from values (1, 'David'), (5, 'Eve'), (6, 'Frank') as v(id, name)) on example.id = v.id when matched then update set * when not matched then insert *;
step 4: call iceberg.system.create_changelog_view(changelog_view => 'example_net_changes', table => 'example', net_changes => true);
step 5: select * from example_net_changes;

step 5 output:
1 David INSERT 1 <snapshot_id>
2 Bob INSERT 1 <snapshot_id>
3 Charley INSERT 1 <snapshot_id>
5 Eve INSERT 1 <snapshot_id>
6 Frank INSERT 1 <snapshot_id>
```

Note the incorrect ordinal for Bob and Charley: it is 1 but should be 0, since both rows were inserted in step 2 and not changed by step 3.

If step 2 is instead the following (with all other steps unchanged):

`step 2: insert into example values (1, 'Alice'), (2, 'Bob'), (3, 'Charley');`

then the changelog output shows the correct ordinal, 0, for Bob and Charley.

Logically, I don't think the initial `merge into` and `insert into` commands should differ here, which is why I suspect this is a bug.

### Willingness to contribute

- [ ] I can contribute a fix for this bug independently
- [x] I would be willing to contribute a fix for this bug with guidance from the Iceberg community
- [ ] I cannot contribute a fix for this bug at this time
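One possible way to narrow this down, sketched here as a suggestion and not something run as part of the original report, is to create a second changelog view over the same table without `net_changes` and compare the per-snapshot ordinals with the net view. The view name `example_all_changes` is a hypothetical name chosen for illustration. The metadata column names (`_change_type`, `_change_ordinal`, `_commit_snapshot_id`) are assumed to be the ones the changelog view exposes.

```sql
-- Hypothetical diagnostic: same table as in the steps above, but without net_changes,
-- so each snapshot's row-level changes are listed individually.
call iceberg.system.create_changelog_view(
  changelog_view => 'example_all_changes',  -- illustrative view name, not from the report
  table => 'example'
);

-- List the changes in commit order to see at which ordinal Bob and Charley appear.
select id, name, _change_type, _change_ordinal, _commit_snapshot_id
from example_all_changes
order by _change_ordinal, id;
```

If Bob and Charley appear only at ordinal 0 in this view but at ordinal 1 in `example_net_changes`, that would suggest the discrepancy arises in the net-changes computation and not in the underlying changelog scan.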
