Re: [PR] Core: Fix cleanup of orphaned statistics files in dropTableData [iceberg]

2025-02-18 Thread via GitHub
lliangyu-lin commented on PR #12132: URL: https://github.com/apache/iceberg/pull/12132#issuecomment-2667399890 @ajantha-bhat has confirmed through the [dev email list](https://lists.apache.org/list.html?d...@iceberg.apache.org) that this is expected behavior for dropTableData and we shou

Re: [PR] Core: Fix cleanup of orphaned statistics files in dropTableData [iceberg]

2025-02-18 Thread via GitHub
lliangyu-lin closed pull request #12132: Core: Fix cleanup of orphaned statistics files in dropTableData URL: https://github.com/apache/iceberg/pull/12132

Re: [PR] Core: Fix cleanup of orphaned statistics files in dropTableData [iceberg]

2025-02-04 Thread via GitHub
gaborkaszab commented on PR #12132: URL: https://github.com/apache/iceberg/pull/12132#issuecomment-2634011059 > @gaborkaszab @ebyhr I did some more searching and found that Iceberg already seems to have a mechanism for cleaning up unreferenced statistics and partition statistics files as part of th

Re: [PR] Core: Fix cleanup of orphaned statistics files in dropTableData [iceberg]

2025-02-01 Thread via GitHub
lliangyu-lin commented on PR #12132: URL: https://github.com/apache/iceberg/pull/12132#issuecomment-2629089272 @gaborkaszab @ebyhr I did some more searching and found that Iceberg already seems to have a mechanism for cleaning up unreferenced statistics and partition statistics files as part of

Re: [PR] Core: Fix cleanup of orphaned statistics files in dropTableData [iceberg]

2025-01-31 Thread via GitHub
gaborkaszab commented on PR #12132: URL: https://github.com/apache/iceberg/pull/12132#issuecomment-2626634873 Another issue I found with the current approach is that if `write.metadata.delete-after-commit.enabled` is true, we keep only a certain number of metadata.json files for a table, controll

Re: [PR] Core: Fix cleanup of orphaned statistics files in dropTableData [iceberg]

2025-01-30 Thread via GitHub
lliangyu-lin commented on PR #12132: URL: https://github.com/apache/iceberg/pull/12132#issuecomment-2625421716 @gaborkaszab I totally agree with what you mentioned. I have my own doubts about the approach as well, and it's indeed very expensive, so I was hoping we could get some more eyes and opini

Re: [PR] Core: Fix cleanup of orphaned statistics files in dropTableData [iceberg]

2025-01-30 Thread via GitHub
gaborkaszab commented on code in PR #12132: URL: https://github.com/apache/iceberg/pull/12132#discussion_r1935324663 ## core/src/main/java/org/apache/iceberg/CatalogUtil.java: ## @@ -108,6 +112,31 @@ public static void dropTableData(FileIO io, TableMetadata metadata) { L

Re: [PR] Core: Fix cleanup of orphaned statistics files in dropTableData [iceberg]

2025-01-29 Thread via GitHub
ebyhr commented on code in PR #12132: URL: https://github.com/apache/iceberg/pull/12132#discussion_r1934928382 ## core/src/test/java/org/apache/iceberg/hadoop/TestCatalogUtilDropTable.java: ## @@ -129,6 +130,81 @@ public void dropTableDataDeletesExpectedFiles() throws IOExcepti

Re: [PR] Core: Fix cleanup of orphaned statistics files in dropTableData [iceberg]

2025-01-29 Thread via GitHub
ebyhr commented on code in PR #12132: URL: https://github.com/apache/iceberg/pull/12132#discussion_r1934925978 ## core/src/test/java/org/apache/iceberg/hadoop/TestCatalogUtilDropTable.java: ## @@ -129,6 +130,81 @@ public void dropTableDataDeletesExpectedFiles() throws IOExcepti

Re: [PR] Core: Fix cleanup of orphaned statistics files in dropTableData [iceberg]

2025-01-29 Thread via GitHub
lliangyu-lin commented on code in PR #12132: URL: https://github.com/apache/iceberg/pull/12132#discussion_r1934986266 ## core/src/test/java/org/apache/iceberg/hadoop/TestCatalogUtilDropTable.java: ## @@ -129,6 +130,81 @@ public void dropTableDataDeletesExpectedFiles() throws IO

[PR] Core: Fix cleanup of orphaned statistics files in dropTableData [iceberg]

2025-01-29 Thread via GitHub
lliangyu-lin opened a new pull request, #12132: URL: https://github.com/apache/iceberg/pull/12132 ### Description Currently, Iceberg `dropTableData()` does not properly delete statistics files (`.stats`) that are replaced by newer statistics files. When `updateStatistics()` i