HonahX commented on PR #1246: URL: https://github.com/apache/iceberg-python/pull/1246#issuecomment-2439772946
@stevie9868 @kevinjqliu Thanks for the great PR and discussions! I agree that there is some issue with the current Transaction mechanism: the `commit_transaction` can be incorrectly called when we should just abandon everything The following pattern has been the most common practice of updating tables in pyiceberg since the beginning ```python with tbl.transaction() as txn: txn.overwrite(...) .... ``` The "with" statement will ensure that the context manager---Transaction object's `__exit__()`(`commit_transaction`[) will always be called (even there is an exception)](https://docs.python.org/3/reference/compound_stmts.html#with) as long as the transaction object is successfully initialized. However, we should only call `commit_transaction` when there is no exception along the way. A simpler example would be: ```python pa_table_with_column = pa.Table.from_pydict( { "foo": ["a", None, "z"], "bar": [19, None, 25], }, schema=pa.schema([ pa.field("foo", pa.large_string(), nullable=True), pa.field("bar", pa.int32(), nullable=True), ]), ) tbl = catalog.create_table(identifier=identifier, schema=pa_table_with_column.schema) with pytest.raises(ValueError): with tbl.transaction() as txn: txn.append(pa_table_with_column) raise ValueError txn.append(pa_table_with_column) assert len(tbl.scan().to_pandas()) == 0 ``` Since I explicitly raise an error during the transaction, the whole transaction should be abandoned. But this code block still insert 3 rows (first append) to the table. Please let me know if these make sense. Would love to hear your thoughts on this! -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org --------------------------------------------------------------------- To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org For additional commands, e-mail: issues-h...@iceberg.apache.org