kevinjqliu commented on code in PR #1607: URL: https://github.com/apache/iceberg-python/pull/1607#discussion_r1947923652
########## mkdocs/docs/configuration.md: ########## @@ -63,6 +63,7 @@ Iceberg tables support table properties to configure table behavior. | `write.parquet.page-row-limit` | Number of rows | 20000 | Set a target threshold for the maximum number of rows within a column chunk | | `write.parquet.dict-size-bytes` | Size in bytes | 2MB | Set the dictionary page size limit per row group | | `write.metadata.previous-versions-max` | Integer | 100 | The max number of previous version metadata files to keep before deleting after commit. | +| `write.metadata.delete-after-commit.enabled` | Boolean | False | Whether to automatically delete old *tracked* metadata files after each table commit. It will retain a number of the most recent metadata files, which can be set using property `write.metadata.previous-versions-max`. | Review Comment: 👍 https://iceberg.apache.org/docs/1.6.0/maintenance/#remove-old-metadata-files ########## pyiceberg/catalog/__init__.py: ########## @@ -858,6 +860,12 @@ def _update_and_stage_table( enforce_validation=current_table is None, metadata_location=current_table.metadata_location if current_table else None, ) + io = self._load_file_io(properties=updated_metadata.properties, location=updated_metadata.location) + + # https://github.com/apache/iceberg/blob/f6faa58/core/src/main/java/org/apache/iceberg/CatalogUtil.java#L527 + # delete old metadata if METADATA_DELETE_AFTER_COMMIT_ENABLED is set to true + if current_table is not None: + self._delete_old_metadata(io, current_table.metadata, updated_metadata) Review Comment: is this the right place for this operation? looking at the java implementation, changes are committed to the table before old metadata files are deleted https://github.com/apache/iceberg/blob/d935460bbed5da0ac205a67912b70fa682ee84df/core/src/main/java/org/apache/iceberg/BaseMetastoreTableOperations.java#L125-L126 At this point, changes are only applied to the table metadata but is not yet committed ########## pyiceberg/catalog/__init__.py: ########## @@ -858,6 +860,12 @@ def _update_and_stage_table( enforce_validation=current_table is None, metadata_location=current_table.metadata_location if current_table else None, ) + io = self._load_file_io(properties=updated_metadata.properties, location=updated_metadata.location) + + # https://github.com/apache/iceberg/blob/f6faa58/core/src/main/java/org/apache/iceberg/CatalogUtil.java#L527 + # delete old metadata if METADATA_DELETE_AFTER_COMMIT_ENABLED is set to true + if current_table is not None: + self._delete_old_metadata(io, current_table.metadata, updated_metadata) Review Comment: the java implementation uses `deleteAfterCommit` https://github.com/apache/iceberg/blob/d935460bbed5da0ac205a67912b70fa682ee84df/core/src/main/java/org/apache/iceberg/CatalogUtil.java#L533 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org --------------------------------------------------------------------- To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org For additional commands, e-mail: issues-h...@iceberg.apache.org