nicor88 commented on code in PR #140:
URL: https://github.com/apache/iceberg-python/pull/140#discussion_r1423647501


##########
pyiceberg/catalog/glue.py:
##########
@@ -177,6 +191,23 @@ def _create_glue_table(self, database_name: str, table_name: str, table_input: T
         except self.glue.exceptions.EntityNotFoundException as e:
             raise NoSuchNamespaceError(f"Database {database_name} does not exist") from e
 
+    def _update_glue_table(self, database_name: str, table_name: str, table_input: TableInputTypeDef, version_id: str) -> None:
+        try:
+            self.glue.update_table(DatabaseName=database_name, TableInput=table_input, VersionId=version_id)

Review Comment:
   Every time a Glue table is updated, a new version is created, and the 
previous versions are retained by default. The number of table versions per 
AWS account is limited, and I've seen that limit reached many times, 
specifically when using Iceberg. See also this issue: 
https://github.com/dbt-athena/dbt-athena/issues/524 and this PR: 
https://github.com/dbt-athena/dbt-athena/pull/522
   
   I'm wondering if you considered setting `SkipArchive` to True by default?
   Alternatively, you could give the end user control over this parameter.
   
   Previous table versions are only relevant for debugging, e.g. spotting what 
the old metadata location was. They are not really helpful for operations like 
snapshot rollback, for which you need to use Spark.
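   
   To make the suggestion concrete, here is a rough sketch of what I have in 
mind. It assumes `glue_client` is a boto3 Glue client; the standalone function 
and the `skip_archive` argument name are hypothetical and not part of this PR, 
so treat it as an illustration rather than a proposed implementation.

```python
from typing import Any, Dict


def update_glue_table(
    glue_client: Any,
    database_name: str,
    table_input: Dict[str, Any],
    version_id: str,
    skip_archive: bool = True,
) -> None:
    """Update a Glue table, optionally skipping the archived version.

    `skip_archive` is a hypothetical argument name. When it is True, Glue is
    asked not to archive the previous table version before applying the
    update, so updates stop adding to the table-version count.
    """
    glue_client.update_table(
        DatabaseName=database_name,
        TableInput=table_input,
        VersionId=version_id,
        # SkipArchive is an existing UpdateTable parameter; Glue archives
        # the previous version unless it is set to True.
        SkipArchive=skip_archive,
    )
```

   Defaulting to True avoids the version build-up, while callers who want to 
keep old versions for debugging can still pass `skip_archive=False`.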
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

