[I] Concurrent MERGE INTO on different partitions still conflicts? [iceberg]

via GitHub Mon, 19 May 2025 00:48:02 -0700


madeirak opened a new issue, #13091:
URL: https://github.com/apache/iceberg/issues/13091


   ### Query engine
   
   Iceberg 1.5
   Spark 3.5
   
   
   ### Question
   
   I’m running two concurrent MERGE INTO statements against an Iceberg table,
   each targeting a disjoint set of partitions:
   
   ```
   CREATE TABLE
     iceberg_catalog.xxx.test02 (id INT COMMENT '', b STRING COMMENT '') 
   USING iceberg 
   PARTITIONED BY (id) 
   ```
   ```
   -- Task A
   MERGE INTO xxx.test02 t
   USING (VALUES (1, '3'), (5, '5')) AS s(id, b)
     ON t.id = s.id
   WHEN MATCHED THEN UPDATE SET t.b = t.b + 1
   WHEN NOT MATCHED THEN INSERT *;
   ```
   ```
   -- Task B
   MERGE INTO xxx.test02 t
   USING (VALUES (2, '3'), (6, '5')) AS s(id, b)
     ON t.id = s.id
   WHEN MATCHED THEN UPDATE SET t.b = t.b + 1
   WHEN NOT MATCHED THEN INSERT *;
   ```
   
   Although Task A touches only partitions id=1 and id=5 and Task B only
   id=2 and id=6, the second transaction to commit fails with:
   
   ValidationException: Found conflicting files that can contain records matching true:
     […/data/id=1/00001.parquet, …/data/id=5/00002.parquet]
   
   
   Expected Behavior:
   Since the two MERGE INTO operations touch disjoint partitions, each should
   only validate conflicts within its own partitions and succeed independently.
   
   
   Questions:
   
   Why does Iceberg’s default Serializable isolation validate all added data 
files, rather than limiting validation to only the affected partitions?
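   
   For context on this question, and hedged accordingly: as far as I can tell, the check for concurrently added data files (the one that raises "Found conflicting files") is what serializable isolation adds on top of snapshot isolation. If snapshot isolation is acceptable for this table, a minimal sketch is to lower the MERGE isolation level through the `write.merge.isolation-level` table property. This weakens the guarantee rather than narrowing the check: a row inserted concurrently that the MERGE would otherwise have matched is no longer detected.
   
   ```sql
   -- Sketch only: assumes snapshot isolation is an acceptable guarantee for
   -- MERGE on this table. The table name is the one from the example above.
   ALTER TABLE iceberg_catalog.xxx.test02
   SET TBLPROPERTIES ('write.merge.isolation-level' = 'snapshot');
   ```
   
   Conflicts with files that a concurrent commit deleted or rewrote should still be validated under this setting.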
   
   Is there any configuration that I can use to restrict conflict checks to a 
given partition or row filter?
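   
   One commonly suggested approach, sketched here under assumptions: the `matching true` in the error suggests the conflict-detection filter was the always-true expression, meaning no partition predicate reached the target scan, presumably because `t.id = s.id` alone gives no static bound on `t.id`. Adding a literal predicate on the partition column to the ON clause may let Spark push it down, so that validation covers only those partitions. Whether the predicate is actually pushed down depends on the Spark and Iceberg versions, so treat this as something to test rather than a confirmed fix.
   
   ```sql
   -- Task A with an explicit partition predicate (sketch, not verified here).
   -- Assumption: the literal list covers every id in the source; if it does
   -- not, source rows outside the list are treated as NOT MATCHED and inserted.
   MERGE INTO xxx.test02 t
   USING (VALUES (1, '3'), (5, '5')) AS s(id, b)
     ON t.id = s.id AND t.id IN (1, 5)
   WHEN MATCHED THEN UPDATE SET t.b = t.b + 1
   WHEN NOT MATCHED THEN INSERT *;
   ```
   
   Task B would use `t.id IN (2, 6)` in the same position.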
   
   Any guidance or pointers to relevant Iceberg JIRA issues / docs would be
   greatly appreciated.


