Re: [PR] Spark 3.5: Use rolling manifest writers when optimizing metadata [iceberg]

2023-11-03 Thread via GitHub
aokolnychyi commented on PR #8972: URL: https://github.com/apache/iceberg/pull/8972#issuecomment-1793306911 Thanks, @RussellSpitzer @singhpk234! -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to th

Re: [PR] Spark 3.5: Use rolling manifest writers when optimizing metadata [iceberg]

2023-11-03 Thread via GitHub
aokolnychyi merged PR #8972: URL: https://github.com/apache/iceberg/pull/8972

Re: [PR] Spark 3.5: Use rolling manifest writers when optimizing metadata [iceberg]

2023-11-03 Thread via GitHub
aokolnychyi commented on PR #8972: URL: https://github.com/apache/iceberg/pull/8972#issuecomment-1793027078 @RussellSpitzer @singhpk234, could you take another look? I fixed the test.

Re: [PR] Spark 3.5: Use rolling manifest writers when optimizing metadata [iceberg]

2023-11-03 Thread via GitHub
aokolnychyi commented on code in PR #8972: URL: https://github.com/apache/iceberg/pull/8972#discussion_r1382149201 ## spark/v3.5/spark/src/test/java/org/apache/iceberg/spark/actions/TestRewriteManifestsAction.java: ## @@ -406,12 +406,13 @@ public void testRewriteLargeManifestsPa

Re: [PR] Spark 3.5: Use rolling manifest writers when optimizing metadata [iceberg]

2023-11-02 Thread via GitHub
aokolnychyi commented on code in PR #8972: URL: https://github.com/apache/iceberg/pull/8972#discussion_r1380909387 ## spark/v3.5/spark/src/main/java/org/apache/iceberg/spark/actions/RewriteManifestsSparkAction.java: ## @@ -183,9 +170,7 @@ private RewriteManifests.Result doExecut

Re: [PR] Spark 3.5: Use rolling manifest writers when optimizing metadata [iceberg]

2023-11-02 Thread via GitHub
singhpk234 commented on code in PR #8972: URL: https://github.com/apache/iceberg/pull/8972#discussion_r1380655906 ## spark/v3.5/spark/src/main/java/org/apache/iceberg/spark/actions/RewriteManifestsSparkAction.java: ## @@ -183,9 +170,7 @@ private RewriteManifests.Result doExecute

Re: [PR] Spark 3.5: Use rolling manifest writers when optimizing metadata [iceberg]

2023-11-02 Thread via GitHub
aokolnychyi commented on code in PR #8972: URL: https://github.com/apache/iceberg/pull/8972#discussion_r1380715154 ## spark/v3.5/spark/src/main/java/org/apache/iceberg/spark/actions/RewriteManifestsSparkAction.java: ## @@ -307,8 +268,16 @@ private int targetNumManifests(long tot

Re: [PR] Spark 3.5: Use rolling manifest writers when optimizing metadata [iceberg]

2023-11-02 Thread via GitHub
aokolnychyi commented on code in PR #8972: URL: https://github.com/apache/iceberg/pull/8972#discussion_r1380712826 ## spark/v3.5/spark/src/test/java/org/apache/iceberg/spark/actions/TestRewriteManifestsAction.java: ## @@ -406,12 +406,13 @@ public void testRewriteLargeManifestsPa

Re: [PR] Spark 3.5: Use rolling manifest writers when optimizing metadata [iceberg]

2023-11-02 Thread via GitHub
aokolnychyi commented on code in PR #8972: URL: https://github.com/apache/iceberg/pull/8972#discussion_r1380713525 ## spark/v3.5/spark/src/main/java/org/apache/iceberg/spark/actions/RewriteManifestsSparkAction.java: ## @@ -221,41 +206,24 @@ private Dataset buildManifestEntryDF(

Re: [PR] Spark 3.5: Use rolling manifest writers when optimizing metadata [iceberg]

2023-11-02 Thread via GitHub
RussellSpitzer commented on code in PR #8972: URL: https://github.com/apache/iceberg/pull/8972#discussion_r1380671597 ## spark/v3.5/spark/src/main/java/org/apache/iceberg/spark/actions/RewriteManifestsSparkAction.java: ## @@ -221,41 +206,24 @@ private Dataset buildManifestEntry

Re: [PR] Spark 3.5: Use rolling manifest writers when optimizing metadata [iceberg]

2023-11-02 Thread via GitHub
RussellSpitzer commented on code in PR #8972: URL: https://github.com/apache/iceberg/pull/8972#discussion_r1380662798 ## spark/v3.5/spark/src/test/java/org/apache/iceberg/spark/actions/TestRewriteManifestsAction.java: ## @@ -406,12 +406,13 @@ public void testRewriteLargeManifest

Re: [PR] Spark 3.5: Use rolling manifest writers when optimizing metadata [iceberg]

2023-11-02 Thread via GitHub
RussellSpitzer commented on code in PR #8972: URL: https://github.com/apache/iceberg/pull/8972#discussion_r1380659942 ## spark/v3.5/spark/src/main/java/org/apache/iceberg/spark/actions/RewriteManifestsSparkAction.java: ## @@ -307,8 +268,16 @@ private int targetNumManifests(long

Re: [PR] Spark 3.5: Use rolling manifest writers when optimizing metadata [iceberg]

2023-11-01 Thread via GitHub
aokolnychyi commented on code in PR #8972: URL: https://github.com/apache/iceberg/pull/8972#discussion_r1379398550 ## spark/v3.5/spark/src/main/java/org/apache/iceberg/spark/actions/RewriteManifestsSparkAction.java: ## @@ -354,104 +323,90 @@ private void deleteFiles(Iterable loc

Re: [PR] Spark 3.5: Use rolling manifest writers when optimizing metadata [iceberg]

2023-11-01 Thread via GitHub
aokolnychyi commented on code in PR #8972: URL: https://github.com/apache/iceberg/pull/8972#discussion_r1379397775 ## spark/v3.5/spark/src/main/java/org/apache/iceberg/spark/actions/RewriteManifestsSparkAction.java: ## @@ -155,20 +155,7 @@ private RewriteManifests.Result doExecu

[PR] Spark 3.5: Use rolling manifest writers when optimizing metadata [iceberg]

2023-11-01 Thread via GitHub
aokolnychyi opened a new pull request, #8972: URL: https://github.com/apache/iceberg/pull/8972 This PR migrates the action for rewriting manifests to use rolling writers. Right now, we collect all entries in a Spark partition into a list to determine the number of entries that must be writt
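
To illustrate the idea behind the PR description above, here is a minimal sketch of a rolling writer: entries are streamed one at a time and the writer starts a new output once the current one reaches a target size, so the input never has to be collected into a list first. This is not the Iceberg implementation and does not use the Iceberg API. The names `RollingWriterSketch`, `Chunk`, `RollingWriter`, and `writeAll` are hypothetical, and using the string length as the entry size is an assumption made only to keep the example self-contained.

```java
import java.util.ArrayList;
import java.util.List;

public class RollingWriterSketch {

    /** Hypothetical stand-in for one output file: the entries it holds and their total size. */
    public static final class Chunk {
        public final List<String> entries = new ArrayList<>();
        public long sizeBytes = 0;
    }

    /** Streams entries into chunks, rolling to a new chunk once the target size is reached. */
    public static final class RollingWriter {
        private final long targetSizeBytes;
        private final List<Chunk> closed = new ArrayList<>();
        private Chunk current = null;

        public RollingWriter(long targetSizeBytes) {
            if (targetSizeBytes <= 0) {
                throw new IllegalArgumentException("targetSizeBytes must be positive");
            }
            this.targetSizeBytes = targetSizeBytes;
        }

        public void write(String entry) {
            // Open a chunk lazily so that no empty chunk is ever produced.
            if (current == null) {
                current = new Chunk();
            }
            current.entries.add(entry);
            // Assumption: the string length approximates the serialized entry size.
            current.sizeBytes += entry.length();
            if (current.sizeBytes >= targetSizeBytes) {
                roll();
            }
        }

        private void roll() {
            closed.add(current);
            current = null;
        }

        /** Closes the last open chunk, if any, and returns all chunks written. */
        public List<Chunk> close() {
            if (current != null) {
                roll();
            }
            return closed;
        }
    }

    /** Writes every entry through a rolling writer without buffering the input. */
    public static List<Chunk> writeAll(Iterable<String> entries, long targetSizeBytes) {
        RollingWriter writer = new RollingWriter(targetSizeBytes);
        for (String entry : entries) {
            writer.write(entry);
        }
        return writer.close();
    }

    public static void main(String[] args) {
        List<String> entries = List.of("aaaa", "bbbb", "cccc", "dddd", "eeee", "ffff", "gggg");
        List<Chunk> chunks = writeAll(entries, 10);
        System.out.println("chunks written: " + chunks.size());
    }
}
```

In this sketch the number of outputs falls out of the size threshold as entries arrive, rather than being computed up front from a full list of entries. Memory use is bounded by one open chunk instead of the whole partition.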