Re: [PR] List data and metadata directories instead of table root [iceberg]

2025-02-26 Thread via GitHub
anuragmantri commented on PR #12278: URL: https://github.com/apache/iceberg/pull/12278#issuecomment-2686319047 Thanks for attempting to fix this @karuppayya. What happens if the `data` and `metadata` locations evolve (if that's possible)? At some point, users may decide to change the met

Re: [PR] List data and metadata directories instead of table root [iceberg]

2025-02-26 Thread via GitHub
karuppayya commented on code in PR #12278: URL: https://github.com/apache/iceberg/pull/12278#discussion_r1972303374 ## spark/v3.5/spark/src/main/java/org/apache/iceberg/spark/actions/DeleteOrphanFilesSparkAction.java: ## @@ -301,24 +303,34 @@ private Dataset actualFileIdentDS()

Re: [PR] List data and metadata directories instead of table root [iceberg]

2025-02-18 Thread via GitHub
ismailsimsek commented on code in PR #12278: URL: https://github.com/apache/iceberg/pull/12278#discussion_r1960165781 ## spark/v3.5/spark/src/main/java/org/apache/iceberg/spark/actions/DeleteOrphanFilesSparkAction.java: ## @@ -335,6 +347,21 @@ private Dataset listedFileDS() {

Re: [PR] List data and metadata directories instead of table root [iceberg]

2025-02-17 Thread via GitHub
dramaticlly commented on PR #12278: URL: https://github.com/apache/iceberg/pull/12278#issuecomment-2664203151 > How about introducing the UUID in the table path while creating the table? This way two tables cannot share the same location. I think that's what Trino did with `iceberg.unique-t

Re: [PR] List data and metadata directories instead of table root [iceberg]

2025-02-17 Thread via GitHub
dramaticlly commented on code in PR #12278: URL: https://github.com/apache/iceberg/pull/12278#discussion_r1958833411 ## spark/v3.5/spark/src/main/java/org/apache/iceberg/spark/actions/DeleteOrphanFilesSparkAction.java: ## @@ -84,11 +86,11 @@ * comparing the actual files in tha

Re: [PR] List data and metadata directories instead of table root [iceberg]

2025-02-17 Thread via GitHub
ajantha-bhat commented on code in PR #12278: URL: https://github.com/apache/iceberg/pull/12278#discussion_r1958839422 ## spark/v3.5/spark/src/main/java/org/apache/iceberg/spark/actions/DeleteOrphanFilesSparkAction.java: ## @@ -301,24 +303,34 @@ private Dataset actualFileIdentDS(

Re: [PR] List data and metadata directories instead of table root [iceberg]

2025-02-17 Thread via GitHub
karuppayya commented on code in PR #12278: URL: https://github.com/apache/iceberg/pull/12278#discussion_r1958822862 ## spark/v3.5/spark/src/main/java/org/apache/iceberg/spark/actions/DeleteOrphanFilesSparkAction.java: ## @@ -335,6 +347,21 @@ private Dataset listedFileDS() {

Re: [PR] List data and metadata directories instead of table root [iceberg]

2025-02-17 Thread via GitHub
Fokko commented on code in PR #12278: URL: https://github.com/apache/iceberg/pull/12278#discussion_r1957869157 ## spark/v3.5/spark/src/main/java/org/apache/iceberg/spark/actions/DeleteOrphanFilesSparkAction.java: ## @@ -335,6 +347,21 @@ private Dataset listedFileDS() { retu

Re: [PR] List data and metadata directories instead of table root [iceberg]

2025-02-16 Thread via GitHub
szehon-ho commented on PR #12278: URL: https://github.com/apache/iceberg/pull/12278#issuecomment-2662233970 I think it makes some sense to me, but maybe better to put it behind a flag as it's a behavior change? Something like -- listCurrentPaths? Curious what others think? -- This is

Re: [PR] List data and metadata directories instead of table root [iceberg]

2025-02-14 Thread via GitHub
karuppayya commented on PR #12278: URL: https://github.com/apache/iceberg/pull/12278#issuecomment-2660448261 cc: @aokolnychyi @RussellSpitzer @szehon-ho @anuragmantri @dramaticlly -- This is an automated message from the Apache Git Service. To respond to the message, please log on to

Re: [PR] List data and metadata directories instead of table root [iceberg]

2025-02-14 Thread via GitHub
karuppayya commented on PR #12278: URL: https://github.com/apache/iceberg/pull/12278#issuecomment-2660441981 This change doesn't solve the following cases: 1. Tables that share the same location root 2. Tables whose root is the data or metadata directory of a different table

[PR] List data and metadata directories instead of table root [iceberg]

2025-02-14 Thread via GitHub
karuppayya opened a new pull request, #12278: URL: https://github.com/apache/iceberg/pull/12278 ### Issue Tables t1 and t2 use a common prefix for their table location `/path/to/shared_root`. t1 has its data and metadata dir -> `/path/to/shared_root/t1` t2 has its data and metad