nastra commented on code in PR #11752:
URL: https://github.com/apache/iceberg/pull/11752#discussion_r1880395932


##########
spark/v3.5/spark/src/main/java/org/apache/iceberg/spark/SparkCatalog.java:
##########
@@ -361,32 +359,18 @@ public boolean purgeTable(Identifier ident) {
       ValidationException.check(
           PropertyUtil.propertyAsBoolean(table.properties(), GC_ENABLED, GC_ENABLED_DEFAULT),
           "Cannot purge table: GC is disabled (deleting files may corrupt other tables)");
-      String metadataFileLocation =
-          ((HasTableOperations) table).operations().current().metadataFileLocation();
 
-      boolean dropped = dropTableWithoutPurging(ident);
-
-      if (dropped) {
-        // check whether the metadata file exists because HadoopCatalog/HadoopTables
-        // will drop the warehouse directly and ignore the `purge` argument
-        boolean metadataFileExists = table.io().newInputFile(metadataFileLocation).exists();
-
-        if (metadataFileExists) {
-          SparkActions.get().deleteReachableFiles(metadataFileLocation).io(table.io()).execute();

Review Comment:
   For some historical context: this was mainly introduced because `CatalogUtil.dropTableData` (which is called by catalogs during purging) was too slow on giant tables.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


---------------------------------------------------------------------
For additional commands, e-mail: issues-h...@iceberg.apache.org