HonahX commented on code in PR #265:
URL: https://github.com/apache/iceberg-python/pull/265#discussion_r1451680767
##########
pyiceberg/catalog/sql.py:
##########

@@ -268,16 +269,32 @@ def drop_table(self, identifier: Union[str, Identifier]) -> None:
         identifier_tuple = self.identifier_to_tuple_without_catalog(identifier)
         database_name, table_name = self.identifier_to_database_and_table(identifier_tuple, NoSuchTableError)
         with Session(self.engine) as session:
-            res = session.execute(
-                delete(IcebergTables).where(
-                    IcebergTables.catalog_name == self.name,
-                    IcebergTables.table_namespace == database_name,
-                    IcebergTables.table_name == table_name,
+            if self.engine.dialect.supports_sane_rowcount:
+                res = session.execute(
+                    delete(IcebergTables).where(
+                        IcebergTables.catalog_name == self.name,
+                        IcebergTables.table_namespace == database_name,
+                        IcebergTables.table_name == table_name,
+                    )
                 )
-            )
+                if res.rowcount < 1:
+                    raise NoSuchTableError(f"Table does not exist: {database_name}.{table_name}")
+            else:
+                try:
+                    tbl = (
+                        session.query(IcebergTables)
+                        .with_for_update(of=IcebergTables, nowait=True)

Review Comment:
   Thanks for the explanation! Reflecting further, I think that using neither NOWAIT nor SKIP_LOCKED might be better for maintaining consistency. Currently, when the engine supports accurate rowcounts, we employ UPDATE or DELETE statements, which inherently wait for locks. If the engine lacks rowcount support, sticking with SELECT FOR UPDATE ensures behavior consistent with the UPDATE path, as it also waits for locks. This consistency eliminates the need to handle additional exceptions, since any concurrent transaction will naturally lead to a `NoResultFound` scenario once the table is dropped, renamed, or updated.

   If we later decide that avoiding lock waits is preferable in fallback situations, we could consider the following modifications:

   1. For `drop_table` and `rename_table`, I think you're right that we can catch `sqlalchemy.exc.OperationalError` and re-raise it as a new exception indicating that another process is set to delete the table.
      Using `CommitFailedException` doesn't seem appropriate here, as it's primarily meant for failed commits on Iceberg tables.
   2. For `_commit_table`, `skip_locked` might still be useful. At the point when the SQL command is executed, we're assured of the table's existence. Therefore, encountering `NoResultFound` would directly imply a concurrent commit attempt.

   Do you think the concern about maintaining consistency between the cases that do and do not support rowcount is valid? If not, what are your thoughts on adopting NOWAIT for `drop_table` and `rename_table`, and SKIP_LOCKED for `_commit_table`?

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org
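[Editor's note] For readers following the rowcount discussion above: the "sane rowcount" branch deletes the catalog row and raises `NoSuchTableError` only if no row was affected. A minimal sketch of that idea using the stdlib `sqlite3` module (not the actual pyiceberg code; the `iceberg_tables` schema and `NoSuchTableError` class here are illustrative stand-ins for the SQLAlchemy-based implementation):

```python
import sqlite3


class NoSuchTableError(Exception):
    """Illustrative stand-in for pyiceberg's NoSuchTableError."""


def drop_table(conn: sqlite3.Connection, namespace: str, name: str) -> None:
    # Issue the DELETE unconditionally, then inspect rowcount: if no row
    # matched, the table entry did not exist, so raise instead of
    # silently succeeding. This only works when the driver reports an
    # accurate rowcount (cf. SQLAlchemy's supports_sane_rowcount).
    cur = conn.execute(
        "DELETE FROM iceberg_tables WHERE table_namespace = ? AND table_name = ?",
        (namespace, name),
    )
    if cur.rowcount < 1:
        raise NoSuchTableError(f"Table does not exist: {namespace}.{name}")


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE iceberg_tables (table_namespace TEXT, table_name TEXT)")
conn.execute("INSERT INTO iceberg_tables VALUES ('db', 'tbl')")

drop_table(conn, "db", "tbl")       # one row deleted, no error
try:
    drop_table(conn, "db", "tbl")   # second call: rowcount == 0
except NoSuchTableError as e:
    print(e)                        # Table does not exist: db.tbl
```

The fallback path debated in the comment exists precisely because some drivers cannot report a trustworthy rowcount, forcing a SELECT (optionally FOR UPDATE) to confirm existence before the DELETE.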