HonahX commented on code in PR #956: URL: https://github.com/apache/iceberg-python/pull/956#discussion_r1687489596
##########
pyiceberg/table/__init__.py:
##########
@@ -1178,10 +1178,15 @@ def update_table_metadata(
     """
     context = _TableMetadataUpdateContext()
     new_metadata = base_metadata
+    new_metadata = new_metadata.model_copy(update={"last_updated_ms": 0})
     for update in updates:
         new_metadata = _apply_table_update(update, new_metadata, context)

Review Comment:
Hi @kevinjqliu. I think it totally makes sense to update the field in `_apply_table_update`. I would like to propose another approach that might be simpler and would avoid redundant updates to the `last_updated_ms` field.

I think we could rely on the `context` to determine whether the metadata has been updated. In `_apply_table_update`, we only call `add_update` on the context when we actually change the metadata. Given that there are some cases where we already update `last_updated_ms` within `_apply_table_update`, I think we could do something like:

```python
if context.has_changes() and base_metadata.last_updated_ms == new_metadata.last_updated_ms:
    new_metadata = new_metadata.model_copy(update={"last_updated_ms": datetime_to_millis(datetime.now().astimezone())})
```

We could add a `has_changes()` method that checks whether the size of the underlying `_updates` is greater than 0 to [_TableMetadataUpdateContext](https://github.com/apache/iceberg-python/blob/d0bfb4a67bccd15c2d2e9baa481a5f33fe4c5220/pyiceberg/table/__init__.py#L886).

Would love to hear thoughts on this!
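
To make the proposal concrete, here is a minimal, self-contained sketch of the idea. It does not use the real pyiceberg classes: `Metadata`, `UpdateContext`, `apply_update`, `update_metadata`, and `now_ms` are hypothetical stand-ins, and the "update" is reduced to setting a property. Only the `has_changes()` check and the final timestamp condition mirror the suggestion above.

```python
from dataclasses import dataclass, field, replace
from datetime import datetime
from typing import List, Tuple


@dataclass(frozen=True)
class Metadata:
    """Hypothetical stand-in for TableMetadata (immutable, copied on change)."""
    last_updated_ms: int
    properties: Tuple[Tuple[str, str], ...] = ()


@dataclass
class UpdateContext:
    """Hypothetical stand-in for _TableMetadataUpdateContext."""
    _updates: List[str] = field(default_factory=list)

    def add_update(self, update: str) -> None:
        self._updates.append(update)

    def has_changes(self) -> bool:
        # Proposed helper: true once at least one update was recorded.
        return len(self._updates) > 0


def now_ms() -> int:
    """Current time in epoch milliseconds (assumed helper for this sketch)."""
    return int(datetime.now().astimezone().timestamp() * 1000)


def apply_update(update: Tuple[str, str], metadata: Metadata, context: UpdateContext) -> Metadata:
    """Apply one property update; record it only if the metadata really changes."""
    if update in metadata.properties:
        return metadata  # no-op: nothing is added to the context
    context.add_update(f"set {update[0]}")
    return replace(metadata, properties=metadata.properties + (update,))


def update_metadata(base: Metadata, updates: List[Tuple[str, str]]) -> Metadata:
    context = UpdateContext()
    new = base
    for update in updates:
        new = apply_update(update, new, context)
    # Refresh the timestamp only if something changed and no individual
    # update has already set last_updated_ms itself.
    if context.has_changes() and base.last_updated_ms == new.last_updated_ms:
        new = replace(new, last_updated_ms=now_ms())
    return new
```

With this shape, a call whose updates are all no-ops returns metadata with the original `last_updated_ms`, while any effective update refreshes the timestamp exactly once.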