kevinjqliu commented on code in PR #1607:
URL: https://github.com/apache/iceberg-python/pull/1607#discussion_r1947923652


##########
mkdocs/docs/configuration.md:
##########
@@ -63,6 +63,7 @@ Iceberg tables support table properties to configure table 
behavior.
 | `write.parquet.page-row-limit`           | Number of rows                    
 | 20000                      | Set a target threshold for the maximum number 
of rows within a column chunk                                                   
                       |
 | `write.parquet.dict-size-bytes`          | Size in bytes                     
 | 2MB                        | Set the dictionary page size limit per row 
group                                                                           
                          |
 | `write.metadata.previous-versions-max`   | Integer                           
 | 100                        | The max number of previous version metadata 
files to keep before deleting after commit.                                     
                         |
+| `write.metadata.delete-after-commit.enabled` | Boolean                       
 | False                      | Whether to automatically delete old *tracked* 
metadata files after each table commit. It will retain a number of the most 
recent metadata files, which can be set using property 
`write.metadata.previous-versions-max`. |

Review Comment:
   👍 
   https://iceberg.apache.org/docs/1.6.0/maintenance/#remove-old-metadata-files



##########
pyiceberg/catalog/__init__.py:
##########
@@ -858,6 +860,12 @@ def _update_and_stage_table(
             enforce_validation=current_table is None,
             metadata_location=current_table.metadata_location if current_table 
else None,
         )
+        io = self._load_file_io(properties=updated_metadata.properties, 
location=updated_metadata.location)
+
+        # 
https://github.com/apache/iceberg/blob/f6faa58/core/src/main/java/org/apache/iceberg/CatalogUtil.java#L527
+        # delete old metadata if METADATA_DELETE_AFTER_COMMIT_ENABLED is set 
to true
+        if current_table is not None:
+            self._delete_old_metadata(io, current_table.metadata, 
updated_metadata)

Review Comment:
   is this the right place for this operation? 
   looking at the java implementation, changes are committed to the table 
before old metadata files are deleted 
   
https://github.com/apache/iceberg/blob/d935460bbed5da0ac205a67912b70fa682ee84df/core/src/main/java/org/apache/iceberg/BaseMetastoreTableOperations.java#L125-L126
   
   At this point, changes are only applied to the table metadata but is not yet 
committed 



##########
pyiceberg/catalog/__init__.py:
##########
@@ -858,6 +860,12 @@ def _update_and_stage_table(
             enforce_validation=current_table is None,
             metadata_location=current_table.metadata_location if current_table 
else None,
         )
+        io = self._load_file_io(properties=updated_metadata.properties, 
location=updated_metadata.location)
+
+        # 
https://github.com/apache/iceberg/blob/f6faa58/core/src/main/java/org/apache/iceberg/CatalogUtil.java#L527
+        # delete old metadata if METADATA_DELETE_AFTER_COMMIT_ENABLED is set 
to true
+        if current_table is not None:
+            self._delete_old_metadata(io, current_table.metadata, 
updated_metadata)

Review Comment:
   the java implementation uses `deleteAfterCommit` 
   
   
https://github.com/apache/iceberg/blob/d935460bbed5da0ac205a67912b70fa682ee84df/core/src/main/java/org/apache/iceberg/CatalogUtil.java#L533



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org

Reply via email to