madeirak opened a new issue, #13091: URL: https://github.com/apache/iceberg/issues/13091
### Query engine

Iceberg 1.5, Spark 3.5

### Question

I’m running two concurrent MERGE INTO statements against an Iceberg table, each targeting completely disjoint partitions:

```
CREATE TABLE iceberg_catalog.xxx.test02 (
  id INT COMMENT '',
  b STRING COMMENT ''
)
USING iceberg
PARTITIONED BY (id)
```

-- Task A

```
MERGE INTO xxx.test02 t
USING (VALUES (1, '3'), (5, '5')) AS s(id, b)
ON t.id = s.id
WHEN MATCHED THEN UPDATE SET t.b = t.b + 1
WHEN NOT MATCHED THEN INSERT *;
```

-- Task B

```
MERGE INTO xxx.test02 t
USING (VALUES (2, '3'), (6, '5')) AS s(id, b)
ON t.id = s.id
WHEN MATCHED THEN UPDATE SET t.b = t.b + 1
WHEN NOT MATCHED THEN INSERT *;
```

Despite only touching partitions id=1,5 in Task A and id=2,6 in Task B, the second transaction fails with:

ValidationException: Found conflicting files that can contain records matching true: […/data/id=1/00001.parquet, …/data/id=5/00002.parquet]

Expected Behavior:

Since the two MERGE INTO operations touch disjoint partitions, each should only validate conflicts within its own partition and succeed independently.

Questions:

- Why does Iceberg’s default Serializable isolation validate all added data files, rather than limiting validation to only the affected partitions?
- Is there any configuration that I can use to restrict conflict checks to a given partition or row filter?

Any guidance or pointer to relevant Iceberg JIRA issues / docs would be greatly appreciated.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
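
For reference, here is a hedged sketch of two adjustments that may be worth trying for the scenario above. It is not a confirmed fix for this issue. The `matching true` part of the error suggests that the conflict check ran with no filter on the target table. The sketch assumes that (a) relaxing the table property `write.merge.isolation-level` from its default `serializable` to `snapshot` is acceptable for the workload, and (b) a static predicate on the partition column in the `ON` clause is pushed down to the target scan and then used to narrow conflict detection. Assumption (b) depends on the engine and Iceberg version and should be verified. The `IN (1, 5)` list is written by hand to mirror the ids in Task A's source rows.

```sql
-- Option 1 (assumption: snapshot isolation is acceptable here).
-- Under snapshot isolation, a MERGE is not expected to fail merely because
-- another writer concurrently added data files.
ALTER TABLE iceberg_catalog.xxx.test02 SET TBLPROPERTIES (
  'write.merge.isolation-level' = 'snapshot'
);

-- Option 2 (assumption: the static predicate is pushed down and narrows the
-- conflict check). Task A with an explicit partition predicate on the target;
-- Task B would use t.id IN (2, 6) instead.
MERGE INTO xxx.test02 t
USING (VALUES (1, '3'), (5, '5')) AS s(id, b)
ON t.id = s.id AND t.id IN (1, 5)
WHEN MATCHED THEN UPDATE SET t.b = t.b + 1
WHEN NOT MATCHED THEN INSERT *;
```

Because every source row in Task A already has an id of 1 or 5, the extra predicate does not change which rows match. It only gives the planner a static filter on the target table. If the error message still reports `matching true` after this change, the predicate was not used for conflict detection.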