HonahX commented on code in PR #956:
URL: https://github.com/apache/iceberg-python/pull/956#discussion_r1687497023


##########
tests/table/test_init.py:
##########
@@ -689,7 +689,7 @@ def test_update_metadata_add_snapshot(table_v2: Table) -> None:
         snapshot_id=25,
         parent_snapshot_id=19,
         sequence_number=200,
-        timestamp_ms=1602638573590,
+        timestamp_ms=1602638593590,

Review Comment:
   This seems like an unrelated change?



##########
pyiceberg/table/__init__.py:
##########
@@ -1178,10 +1178,15 @@ def update_table_metadata(
     """
     context = _TableMetadataUpdateContext()
     new_metadata = base_metadata
+    new_metadata = new_metadata.model_copy(update={"last_updated_ms": 0})
 
     for update in updates:
         new_metadata = _apply_table_update(update, new_metadata, context)

Review Comment:
   I think we could rely on the `context` to determine whether the metadata has
been updated. In `_apply_table_update`, we only call `add_update` on the context
when we actually change the metadata.
   Given that there are some cases where we update `last_updated_ms` within
`_apply_table_update`, I think we could do something like:
   ```python
   if context.has_changes() and base_metadata.last_updated_ms == new_metadata.last_updated_ms:
       new_metadata = new_metadata.model_copy(
           update={"last_updated_ms": datetime_to_millis(datetime.now().astimezone())}
       )
   ```
   We could add a `has_changes()` method to
[_TableMetadataUpdateContext](https://github.com/apache/iceberg-python/blob/d0bfb4a67bccd15c2d2e9baa481a5f33fe4c5220/pyiceberg/table/__init__.py#L886)
that checks whether the size of the underlying `_updates` is greater than 0.
 
   
   I think one benefit of this approach is code simplicity: we do not need to
update `last_updated_ms` in every `_apply_table_update`.
   
   Would love to hear thoughts on this!
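
To make the suggestion concrete, here is a minimal, self-contained sketch of the idea. It does not use pyiceberg: `Metadata`, `UpdateContext`, `apply_update`, and `now_ms` are hypothetical stand-ins for the real metadata model, `_TableMetadataUpdateContext`, `_apply_table_update`, and the timestamp helper, so treat it as an illustration under those assumptions rather than the actual implementation.

```python
from dataclasses import dataclass, replace
from datetime import datetime
from typing import List, Tuple


@dataclass(frozen=True)
class Metadata:
    """Hypothetical stand-in for the immutable table metadata model."""

    last_updated_ms: int
    properties: Tuple[Tuple[str, str], ...] = ()


class UpdateContext:
    """Hypothetical stand-in for _TableMetadataUpdateContext."""

    def __init__(self) -> None:
        self._updates: List[str] = []

    def add_update(self, update: str) -> None:
        self._updates.append(update)

    def has_changes(self) -> bool:
        # True once at least one update has actually changed the metadata.
        return len(self._updates) > 0


def now_ms() -> int:
    """Current time in epoch milliseconds (assumed helper)."""
    return int(datetime.now().astimezone().timestamp() * 1000)


def apply_update(update: Tuple[str, str], metadata: Metadata, context: UpdateContext) -> Metadata:
    """Apply one property update; record it only if it changes the metadata."""
    if update in metadata.properties:
        return metadata  # no-op: nothing is recorded in the context
    context.add_update(update[0])
    return replace(metadata, properties=metadata.properties + (update,))


def update_metadata(base: Metadata, updates: List[Tuple[str, str]]) -> Metadata:
    context = UpdateContext()
    new = base
    for update in updates:
        new = apply_update(update, new, context)
    # Bump the timestamp once, and only if no individual update already did so.
    if context.has_changes() and base.last_updated_ms == new.last_updated_ms:
        new = replace(new, last_updated_ms=now_ms())
    return new
```

With this shape, an empty or no-op list of updates returns the metadata unchanged, while any real change refreshes `last_updated_ms` in exactly one place.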


