sshivampeta commented on PR #16289:
URL: https://github.com/apache/iceberg/pull/16289#issuecomment-4504322499

   > The contract of REPLACE TABLE is that the table's data, schema, and 
partitioning are replaced. There are no assumptions to validate, so checking 
for conflicts between concurrent changes and those changes is not required. If 
this is what you're trying to change, then I'm -1 for this commit because it is 
changing behavior that is already considered and chosen. It looks like this is 
the intent of this PR, but please reply if I'm wrong.
   > 
   > That said, REPLACE TABLE is also intended to keep table history (it is not 
a drop/re-create). So there are, potentially, valid problems with concurrency. 
For example, if a table's schema is updated from schema-id 1 to schema-id 2 and 
then the REPLACE TABLE operation removes the new schema-id 2 -- as though the 
concurrent commit never happened -- then that's a bug that we should fix.
   
   Thanks for the detailed feedback, @rdblue. You're right that REPLACE TABLE 
by contract replaces the table's data, schema, and partitioning — and that's 
intentional. This PR is not trying to change that contract.
   
   The concern is specifically about table history preservation. As you noted, 
REPLACE TABLE should keep table history (it is not a drop/re-create). The bug 
is that commitReplaceTransaction refreshes base to detect version conflicts but 
then commits the original stale current metadata wholesale — silently erasing 
concurrently-committed schemas, expired snapshots, and other history changes. 
For example:
   
   A concurrent expireSnapshots successfully removes snapshot-1, but the 
replace brings it back because its stale metadata still references it — this is 
table corruption (the snapshot files may have been physically deleted).
   A concurrent updateSchema adds schema-id 2, but the replace drops it — as 
you described, "as though the concurrent commit never happened."
   The current approach (fail-fast on any concurrent metadata change) is the 
simplest safe starting point to prevent this class of bugs. It's conservative — 
it will reject even non-conflicting concurrent changes — but it ensures no 
committed work is silently lost. A more refined approach that allows 
non-conflicting changes to survive (e.g., rebasing replace metadata on top of 
the refreshed base) could follow as a separate PR once the immediate safety 
issue is addressed.
   
   Would this framing address your concern? Happy to discuss further or adjust 
the approach.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to