[I] Changelog table incorrect ordinals when net_changes => true, initial merge into write [iceberg]

via GitHub Mon, 15 Dec 2025 08:22:44 -0800


willjanning opened a new issue, #14851:
URL: https://github.com/apache/iceberg/issues/14851


   ### Apache Iceberg version
   
   1.10.0 (latest release)
   
   ### Query engine
   
   Spark
   
   ### Please describe the bug 🐞
   
   Start Spark SQL with the following configuration:
   
   `--conf=spark.sql.defaultCatalog=iceberg
   --conf=spark.sql.catalogImplementation=in-memory
   --conf=spark.sql.catalog.iceberg=org.apache.iceberg.spark.SparkCatalog
   --conf=spark.sql.catalog.iceberg.type=hadoop
   --conf=spark.sql.catalog.iceberg.warehouse=<insert_warehouse_path>`
   
   Then run the following Spark SQL commands:
   
   ```
   -- step 1
   create table example (id INT, name VARCHAR(10));

   -- step 2
   merge into example
   using (select * from values (1, 'Alice'), (2, 'Bob'), (3, 'Charley') as v(id, name))
   on example.id = v.id
   when matched then update set *
   when not matched then insert *;

   -- step 3
   merge into example
   using (select * from values (1, 'David'), (5, 'Eve'), (6, 'Frank') as v(id, name))
   on example.id = v.id
   when matched then update set *
   when not matched then insert *;

   -- step 4
   call iceberg.system.create_changelog_view(
     changelog_view => 'example_net_changes',
     table => 'example',
     net_changes => true);

   -- step 5
   select * from example_net_changes;

   -- step 5 output (columns: id, name, _change_type, _change_ordinal, _commit_snapshot_id):
   -- 1    David   INSERT  1       <snapshot_id>
   -- 2    Bob     INSERT  1       <snapshot_id>
   -- 3    Charley INSERT  1       <snapshot_id>
   -- 5    Eve     INSERT  1       <snapshot_id>
   -- 6    Frank   INSERT  1       <snapshot_id>
   ```
   
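   Note the incorrect ordinal for Bob and Charley: it is 1 but should be 0, because both rows were written by the first merge (step 2) and were not changed by the second merge (step 3).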
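
To make the expected result concrete, here is a minimal Python sketch of the expectation described above. It is a hypothetical in-memory model, not Iceberg code: the function name `expected_net_changes` and the rule "a row's ordinal is the index of the last commit that changed its value" are assumptions that restate this report, not a description of how Iceberg computes net changes.

```python
# Hypothetical model of the expectation above; this is not Iceberg code.
# Each commit is a list of (id, name) rows applied as an upsert, mirroring
# `when matched then update set * when not matched then insert *`.

def expected_net_changes(commits):
    """Return {id: (name, ordinal)} for the final table state.

    The ordinal is the index of the last commit that changed the row's value.
    """
    state = {}
    for ordinal, rows in enumerate(commits):
        for row_id, name in rows:
            previous = state.get(row_id)
            # Only a real change of value moves the row to a later ordinal.
            if previous is None or previous[0] != name:
                state[row_id] = (name, ordinal)
    return state


commits = [
    [(1, "Alice"), (2, "Bob"), (3, "Charley")],  # step 2
    [(1, "David"), (5, "Eve"), (6, "Frank")],    # step 3
]

# The table starts empty, so every net change is an INSERT.
for row_id, (name, ordinal) in sorted(expected_net_changes(commits).items()):
    print(row_id, name, "INSERT", ordinal)
```

Under this model Bob and Charley keep ordinal 0, while David, Eve and Frank get ordinal 1, which is the output the report expects from step 5.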
   
   If step 2 is instead (with all other steps being the same):
   
   `step 2: insert into example values (1, 'Alice'), (2, 'Bob'), (3, 'Charley');`
   
   then you get the correct ordinal for Bob and Charley, 0, in the changelog 
output.
   
   Logically, I don't think the initial `merge into` on an empty table should behave any differently from the `insert into`, which is why I think this may be a bug.
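
The equivalence claimed above can be illustrated with a small Python sketch. It is a hypothetical in-memory stand-in for the two statements, not Iceberg or Spark code, and `merge_into` is an illustrative name. It shows only the logical outcome: against an empty target, every source row takes the `when not matched then insert *` branch.

```python
# Hypothetical in-memory stand-in for MERGE INTO; not Iceberg or Spark code.

def merge_into(target, source):
    """Upsert `source` rows into a copy of `target`.

    Returns the new table contents and the branch taken for each source row.
    Rows are matched against the original target only, as in SQL MERGE.
    """
    result = dict(target)
    actions = []
    for row_id, name in source:
        # matched -> update set *, not matched -> insert *
        actions.append("update" if row_id in target else "insert")
        result[row_id] = name
    return result, actions


rows = [(1, "Alice"), (2, "Bob"), (3, "Charley")]

merged, actions = merge_into({}, rows)  # step 2 as a merge into an empty table
inserted = dict(rows)                   # step 2 as a plain insert

print(actions)
print(merged == inserted)
```

Both variants leave the table with the same three rows, so the model gives no reason for the changelog ordinals to differ between them.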
   
   ### Willingness to contribute
   
   - [ ] I can contribute a fix for this bug independently
   - [x] I would be willing to contribute a fix for this bug with guidance from the Iceberg community
   - [ ] I cannot contribute a fix for this bug at this time


