Fokko commented on code in PR #728:
URL: https://github.com/apache/iceberg-python/pull/728#discussion_r1632582253


##########
pyiceberg/table/__init__.py:
##########
@@ -1806,6 +1909,88 @@ def __enter__(self) -> U:
         return self  # type: ignore
 
 
+class ManageSnapshots(UpdateTableMetadata["ManageSnapshots"]):
+    """
+    Run snapshot management operations using APIs.
+
+    APIs include create branch, create tag, etc.
+
+    Use table.manage_snapshots().<operation>().commit() to run a specific 
operation.
+    Use table.manage_snapshots().<operation-one>().<operation-two>().commit() 
to run multiple operations.
+    Pending changes are applied on commit.
+
+    We can also use context managers to make more changes. For example,
+
+    with table.manage_snapshots() as ms:
+       ms.create_tag(snapshot_id1, "Tag_A").create_tag(snapshot_id2, "Tag_B")
+       ms.commit()

Review Comment:
   The nice thing about the context manager is that it will do the commit 
automatically once you leave the context:
   
   ```suggestion
       We can also use context managers to make more changes:
   
       with table.manage_snapshots() as ms:
          ms.create_tag(snapshot_id1, "Tag_A").create_tag(snapshot_id2, "Tag_B")
   ```



##########
tests/integration/test_writes/test_writes.py:
##########
@@ -869,3 +870,29 @@ def table_write_subset_of_schema(session_catalog: Catalog, 
arrow_table_with_null
     tbl.append(arrow_table_without_some_columns)
     # overwrite and then append should produce twice the data
     assert len(tbl.scan().to_arrow()) == len(arrow_table_without_some_columns) 
* 2
+
+
+@pytest.mark.integration
+@pytest.mark.parametrize("catalog", 
[pytest.lazy_fixture("session_catalog_hive"), 
pytest.lazy_fixture("session_catalog")])

Review Comment:
   What do you think of splitting this into a separate test file? For example 
`test_snapshots.py`



##########
mkdocs/docs/api.md:
##########
@@ -863,6 +863,29 @@ tbl.overwrite(df, snapshot_properties={"abc": "def"})
 assert tbl.metadata.snapshots[-1].summary["abc"] == "def"
 ```
 
+## Snapshot Management
+
+Manage snapshots with operations through the `Table` API:
+
+```python
+# To run a specific operation
+table.manage_snapshots().create_tag(snapshot_id, "tag123").commit()
+# To run multiple operations
+table.manage_snapshots()
+    .create_tag(snapshot_id1, "tag123")
+    .create_tag(snapshot_id2, "tag456")
+    .commit()
+# Operations are applied on commit.
+```
+
+You can also use context managers to make more changes. For example,

Review Comment:
   ```suggestion
   You can also use context managers to make more changes:
   ```



##########
mkdocs/docs/api.md:
##########
@@ -863,6 +863,29 @@ tbl.overwrite(df, snapshot_properties={"abc": "def"})
 assert tbl.metadata.snapshots[-1].summary["abc"] == "def"
 ```
 
+## Snapshot Management
+
+Manage snapshots with operations through the `Table` API:
+
+```python
+# To run a specific operation
+table.manage_snapshots().create_tag(snapshot_id, "tag123").commit()
+# To run multiple operations
+table.manage_snapshots()
+    .create_tag(snapshot_id1, "tag123")
+    .create_tag(snapshot_id2, "tag456")
+    .commit()
+# Operations are applied on commit.
+```
+
+You can also use context managers to make more changes. For example,
+
+```python
+with table.manage_snapshots() as ms:
+    ms.create_branch(snapshot_id1, "Branch_A").create_tag(snapshot_id2, 
"tag789")
+    ms.commit()

Review Comment:
   The nice thing about the context manager is that it will do the commit 
automatically once you leave the context:
   
https://github.com/apache/iceberg-python/blob/94e8a9835995e3b61f07f0dfb48d8a22a1e1d1b0/pyiceberg/table/__init__.py#L1826



##########
tests/integration/test_writes/test_writes.py:
##########
@@ -869,3 +870,29 @@ def table_write_subset_of_schema(session_catalog: Catalog, 
arrow_table_with_null
     tbl.append(arrow_table_without_some_columns)
     # overwrite and then append should produce twice the data
     assert len(tbl.scan().to_arrow()) == len(arrow_table_without_some_columns) 
* 2
+
+
+@pytest.mark.integration
+@pytest.mark.parametrize("catalog", 
[pytest.lazy_fixture("session_catalog_hive"), 
pytest.lazy_fixture("session_catalog")])
+def test_create_tag(catalog: Catalog) -> None:
+    identifier = "default.test_table_snapshot_operations"
+    tbl = catalog.load_table(identifier)
+    assert len(tbl.history()) > 3
+    tag_snapshot_id = tbl.history()[-3].snapshot_id
+    ms = tbl.manage_snapshots().create_tag(snapshot_id=tag_snapshot_id, 
tag_name="tag123")
+    ms.commit()
+    tbl.refresh()

Review Comment:
   This should happen automatically as part of the commit



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org

Reply via email to