aokolnychyi commented on code in PR #9050:
URL: https://github.com/apache/iceberg/pull/9050#discussion_r1401275467


##########
spark/v3.4/spark/src/main/java/org/apache/iceberg/spark/source/SparkWrite.java:
##########
@@ -390,7 +390,7 @@ private Expression conflictDetectionFilter() {
         filter = Expressions.and(filter, expr);
       }
 
-      return filter;
+      return SERIALIZABLE == isolationLevel ? Expressions.alwaysTrue() : filter;

Review Comment:
   I spent a bit more time thinking about this, and I am not fully convinced that using `true` as the conflict detection filter is the right approach, although I admit it works in the given example. The whole idea of the conflict detection filter is to control what is considered a serializable action against the target table (not the transaction as a whole) when there are concurrent changes to that table. In the case of MERGE, we are dealing with a transaction that reads from multiple tables, so that transaction can only be serializable if we have control over both sides and can validate concurrent changes. Unfortunately, we don't have that in Spark right now, as there is no transaction API.
   
   In other words, MERGE breaks serializability not because the conflict detection condition is wrong but because we have no way to validate concurrent changes to the other side.
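   
   To make the distinction concrete, here is a toy sketch of the validation semantics being discussed. This is not the Iceberg API: `targetCommitSucceeds`, the integer "rows", and `ALWAYS_TRUE` are invented stand-ins for a commit's conflict validation, concurrently added data, and `Expressions.alwaysTrue()` respectively.

```java
import java.util.List;
import java.util.function.Predicate;

// Toy model, NOT the Iceberg API: a commit against the target table fails
// if any concurrently added row matches the conflict detection filter.
public class ConflictDetectionSketch {

  // Stand-in for Expressions.alwaysTrue(): every concurrent row conflicts.
  static final Predicate<Integer> ALWAYS_TRUE = row -> true;

  static boolean targetCommitSucceeds(List<Integer> concurrentTargetRows,
                                      Predicate<Integer> conflictFilter) {
    return concurrentTargetRows.stream().noneMatch(conflictFilter);
  }

  public static void main(String[] args) {
    // Filter derived from the command's condition (hypothetical): row == 1.
    Predicate<Integer> rowFilter = row -> row == 1;

    // A concurrent insert of row 2 does not match the row filter,
    // so the commit passes validation.
    System.out.println(targetCommitSucceeds(List.of(2), rowFilter));   // true

    // Widening the filter to "always true" (the proposed SERIALIZABLE
    // behavior) rejects ANY concurrent change to the target table.
    System.out.println(targetCommitSucceeds(List.of(2), ALWAYS_TRUE)); // false

    // The reviewer's point: MERGE also reads a *source* table, and no
    // validation runs against it at all, so a concurrent change on that
    // side can still break serializability regardless of this filter.
  }
}
```

   That is, widening the filter only strengthens validation on the target side; it cannot see the source side at all.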



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

