Fokko commented on code in PR #6323:
URL: https://github.com/apache/iceberg/pull/6323#discussion_r1218658583
##########
python/pyiceberg/table/__init__.py:
##########
@@ -69,21 +72,313 @@
import ray
from duckdb import DuckDBPyConnection
+ from pyiceberg.catalog import Catalog
ALWAYS_TRUE = AlwaysTrue()
+class TableUpdates:
+ _table: Table
+ _updates: Tuple[TableUpdate, ...]
+ _requirements: Tuple[TableRequirement, ...]
+
+ def __init__(
+ self,
+ table: Table,
+ actions: Optional[Tuple[TableUpdate, ...]] = None,
+ requirements: Optional[Tuple[TableRequirement, ...]] = None,
+ ):
+ self._table = table
+ self._updates = actions or ()
+ self._requirements = requirements or ()
+
+ def _append_updates(self, *new_updates: TableUpdate) -> TableUpdates:
+ """Appends updates to the set of staged updates
+
+ Args:
+ *new_updates: Any new updates
+
+ Raises:
+ ValueError: When the type of update is not unique.
+
+ Returns:
+ A new AlterTable object with the new updates appended
+ """
+ for new_update in new_updates:
+ type_new_update = type(new_update)
+ if any(type(update) == type_new_update for update in self._updates):
+ raise ValueError(f"Updates in a single commit need to be unique, duplicate: {type_new_update}")
Review Comment:
The whole idea of this check is to avoid multiple similar operations in a
single commit. I agree that when you change a schema, all the updates to the
schema are accumulated into one `AddSchemaUpdate`. If you then try to add
another update of the same type to the transaction, it will raise the
`ValueError` that we see above.
The whole public API is currently:
```python
table.new_transaction.set_table_version(2).commit()
table.new_transaction.set_properties(**{
"lifecycle": "true"
}).commit()
table.new_transaction.remove_properties("lifecycle").commit()
table.new_transaction.update_location("s3://...").commit()
```
And you can combine them:
```python
table.new_transaction.set_table_version(2).update_location("s3://...").commit()
```
Combining multiple updates of the same type will raise a `ValueError`:
```python
table.new_transaction.set_table_version(2).set_table_version(2).commit()
```
I think this will guard us from getting into nasty situations. We can always
relax this in the future to allow multiple snapshots, but then the requirements
should be in order as well.
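
To make the check concrete, here is a minimal, self-contained sketch of the
idea. It does not use the actual PyIceberg classes: `StagedUpdates`,
`UpgradeFormatVersion`, `SetLocation`, and the example location are
hypothetical stand-ins for illustration only. Unlike the diff above, the sketch
also compares the new updates against each other, which is an assumption about
the desired behavior rather than something the PR states.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Tuple


class TableUpdate:
    """Hypothetical stand-in for the base update type used in the PR."""


@dataclass(frozen=True)
class UpgradeFormatVersion(TableUpdate):  # hypothetical update type
    format_version: int


@dataclass(frozen=True)
class SetLocation(TableUpdate):  # hypothetical update type
    location: str


class StagedUpdates:
    """Immutable collection of staged updates, at most one per update type."""

    def __init__(self, updates: Tuple[TableUpdate, ...] = ()) -> None:
        self._updates = updates

    def append(self, *new_updates: TableUpdate) -> StagedUpdates:
        """Return a new StagedUpdates with the given updates appended.

        Raises:
            ValueError: When an update of the same type is already staged.
        """
        staged = self._updates
        for new_update in new_updates:
            # Compare against everything staged so far, including updates
            # added earlier in this same call (an assumption of this sketch).
            if any(type(update) is type(new_update) for update in staged):
                raise ValueError(
                    f"Updates in a single commit need to be unique, duplicate: {type(new_update)}"
                )
            staged = staged + (new_update,)
        return StagedUpdates(staged)


# Different update types can be combined in one commit.
staged = StagedUpdates().append(
    UpgradeFormatVersion(2),
    SetLocation("s3://example-bucket/table"),  # hypothetical location
)

# A second update of an already staged type is rejected.
try:
    staged.append(UpgradeFormatVersion(2))
except ValueError as exc:
    print(f"rejected: {exc}")
```

Returning a new object from `append` instead of mutating in place keeps a
failed append from leaving a half-updated set of staged changes behind.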
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.