[PR] Spark 3.5: Fix testReplacePartitionField for Rewrite Manifests [iceberg]

2023-12-08 Thread via GitHub
bknbkn opened a new pull request, #9250: URL: https://github.com/apache/iceberg/pull/9250 Influenced by patch https://github.com/apache/iceberg/pull/6695/files, the rewrite manifests operation does not execute in testReplacePartitionField because the table contains only one record. This will cause the function

Re: [I] org.apache.iceberg.hive.RuntimeMetaException: Failed to connect to Hive Metastore at [iceberg]

2023-12-08 Thread via GitHub
ExplorData24 commented on issue #9030: URL: https://github.com/apache/iceberg/issues/9030#issuecomment-1846738796 @whymed Hi, thank you so much. Can you please explain this part to me in more detail? -- This is an automated message from the Apache Git Service. To respond to the message, p

Re: [PR] Spark SystemFunctions are not pushed down during JOIN [iceberg]

2023-12-08 Thread via GitHub
tmnd1991 commented on PR #9233: URL: https://github.com/apache/iceberg/pull/9233#issuecomment-1846764267 > How do you determine that the SystemFunctions are not pushed down? > > Spark will push down predicate(which includes predicates containing system functions) through join(except f

Re: [PR] Spark: IN clause on system function is not pushed down [iceberg]

2023-12-08 Thread via GitHub
tmnd1991 commented on code in PR #9192: URL: https://github.com/apache/iceberg/pull/9192#discussion_r1420130446 ## spark/v3.4/spark-extensions/src/test/java/org/apache/iceberg/spark/extensions/TestSystemFunctionPushDownDQL.java: ## @@ -224,6 +226,35 @@ private void testBucketLon

Re: [PR] Flink: switch to use SortKey for data statistics [iceberg]

2023-12-08 Thread via GitHub
pvary commented on code in PR #9212: URL: https://github.com/apache/iceberg/pull/9212#discussion_r1420131007 ## flink/v1.17/flink/src/test/java/org/apache/iceberg/flink/sink/shuffle/TestDataStatisticsOperator.java: ## @@ -119,9 +121,9 @@ public void testProcessElement() throws E

Re: [PR] Flink: Fix IcebergSource tableloader lifecycle management in batch mode [iceberg]

2023-12-08 Thread via GitHub
pvary commented on code in PR #9173: URL: https://github.com/apache/iceberg/pull/9173#discussion_r1420143541 ## flink/v1.17/flink/src/main/java/org/apache/iceberg/flink/source/IcebergSource.java: ## @@ -105,12 +107,12 @@ public class IcebergSource implements Sourcehttps://github

[I] Documentation [iceberg-rust]

2023-12-08 Thread via GitHub
Fokko opened a new issue, #114: URL: https://github.com/apache/iceberg-rust/issues/114 It would be great to have a minimal set of docs. I think we can get a lot of inspiration from the PyIceberg project where we use mkdocs: https://github.com/apache/iceberg-python/tree/main/mkdocs We

Re: [I] Documentation [iceberg-rust]

2023-12-08 Thread via GitHub
Xuanwo commented on issue #114: URL: https://github.com/apache/iceberg-rust/issues/114#issuecomment-1846826826 The name `https://rust.iceberg.apache.org/` seems appropriate given that our repository is named `iceberg-rust`.

Re: [PR] Spark 3.5: Rework DeleteFileIndexBenchmark [iceberg]

2023-12-08 Thread via GitHub
aokolnychyi commented on code in PR #9165: URL: https://github.com/apache/iceberg/pull/9165#discussion_r1420161446 ## core/src/test/java/org/apache/iceberg/FileGenerationUtil.java: ## @@ -0,0 +1,187 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or mor

Re: [PR] Spark 3.5: Rework DeleteFileIndexBenchmark [iceberg]

2023-12-08 Thread via GitHub
aokolnychyi commented on code in PR #9165: URL: https://github.com/apache/iceberg/pull/9165#discussion_r1420161870 ## core/src/test/java/org/apache/iceberg/FileGenerationUtil.java: ## @@ -0,0 +1,187 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or mor

Re: [PR] Spark 3.5: Rework DeleteFileIndexBenchmark [iceberg]

2023-12-08 Thread via GitHub
aokolnychyi merged PR #9165: URL: https://github.com/apache/iceberg/pull/9165

Re: [PR] Spark 3.5: Rework DeleteFileIndexBenchmark [iceberg]

2023-12-08 Thread via GitHub
aokolnychyi commented on PR #9165: URL: https://github.com/apache/iceberg/pull/9165#issuecomment-1846842066 Thanks for reviewing, @flyrain!

Re: [PR] Spark 3.5: Fix testReplacePartitionField for Rewrite Manifests [iceberg]

2023-12-08 Thread via GitHub
bknbkn commented on PR #9250: URL: https://github.com/apache/iceberg/pull/9250#issuecomment-1846867489 cc @rdblue @ajantha-bhat

Re: [PR] Switch to junit5 for mr [iceberg]

2023-12-08 Thread via GitHub
nastra commented on code in PR #9241: URL: https://github.com/apache/iceberg/pull/9241#discussion_r1420174672 ## mr/src/test/java/org/apache/iceberg/mr/TestCatalogs.java: ## @@ -176,8 +178,10 @@ public void testCreateDropTableToCatalog() throws IOException { HadoopCatalog

Re: [I] Type Promotion: Int/Long to String [iceberg]

2023-12-08 Thread via GitHub
zhongyujiang commented on issue #9064: URL: https://github.com/apache/iceberg/issues/9064#issuecomment-1846875009 Hi @danielcweeks, thank you for opening this. We used to have this need in real use cases, so I think this can be really helpful! In addition, we have also encountered cases

Re: [PR] Flink: Create JUnit5 version of TestFlinkScan [iceberg]

2023-12-08 Thread via GitHub
nastra commented on code in PR #9185: URL: https://github.com/apache/iceberg/pull/9185#discussion_r1420194979 ## flink/v1.17/flink/src/test/java/org/apache/iceberg/flink/source/TestFlinkScan.java: ## @@ -49,37 +51,28 @@ import org.apache.iceberg.types.Types; import org.apache.

Re: [PR] API, Core: Add sqlFor API to views to handle basic resolution of dialect [iceberg]

2023-12-08 Thread via GitHub
nastra commented on code in PR #9247: URL: https://github.com/apache/iceberg/pull/9247#discussion_r1420208010 ## core/src/test/java/org/apache/iceberg/view/ViewCatalogTests.java: ## @@ -200,6 +200,13 @@ public void completeCreateView() { .build())

Re: [PR] Nessie: Support views for NessieCatalog [iceberg]

2023-12-08 Thread via GitHub
ajantha-bhat commented on code in PR #8909: URL: https://github.com/apache/iceberg/pull/8909#discussion_r1420210944 ## nessie/src/test/java/org/apache/iceberg/nessie/TestNessieView.java: ## @@ -0,0 +1,337 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + *

Re: [PR] Nessie: Support views for NessieCatalog [iceberg]

2023-12-08 Thread via GitHub
nastra commented on code in PR #8909: URL: https://github.com/apache/iceberg/pull/8909#discussion_r1420212153 ## nessie/src/main/java/org/apache/iceberg/nessie/NessieIcebergClient.java: ## @@ -142,15 +148,27 @@ private UpdateableReference loadReference(String requestedRef, Stri

Re: [PR] Nessie: Support views for NessieCatalog [iceberg]

2023-12-08 Thread via GitHub
ajantha-bhat commented on PR #8909: URL: https://github.com/apache/iceberg/pull/8909#issuecomment-1846896443 @nastra: I have addressed the comments. Thanks for the review.

Re: [PR] Nessie: Support views for NessieCatalog [iceberg]

2023-12-08 Thread via GitHub
nastra commented on code in PR #8909: URL: https://github.com/apache/iceberg/pull/8909#discussion_r1420215532 ## nessie/src/main/java/org/apache/iceberg/nessie/NessieUtil.java: ## @@ -190,4 +203,118 @@ public static Optional extractSingleConflict( Conflict conflict = confli

Re: [PR] Nessie: Support views for NessieCatalog [iceberg]

2023-12-08 Thread via GitHub
ajantha-bhat commented on code in PR #8909: URL: https://github.com/apache/iceberg/pull/8909#discussion_r142062 ## nessie/src/main/java/org/apache/iceberg/nessie/NessieIcebergClient.java: ## @@ -142,15 +148,27 @@ private UpdateableReference loadReference(String requestedRef

Re: [PR] shutdown scheduler [iceberg]

2023-12-08 Thread via GitHub
nastra commented on PR #9150: URL: https://github.com/apache/iceberg/pull/9150#issuecomment-1846903866 @gabrywu can you fix CI failures please?

Re: [PR] shutdown scheduler [iceberg]

2023-12-08 Thread via GitHub
nastra commented on code in PR #9150: URL: https://github.com/apache/iceberg/pull/9150#discussion_r1420224873 ## core/src/main/java/org/apache/iceberg/util/LockManagers.java: ## @@ -153,6 +153,14 @@ public void initialize(Map properties) { CatalogProperties.LOCK_H

Re: [PR] Spark SystemFunctions are not pushed down during JOIN [iceberg]

2023-12-08 Thread via GitHub
advancedxy commented on PR #9233: URL: https://github.com/apache/iceberg/pull/9233#issuecomment-1846909628 > Adding this patch though helps pruning more partitions, this is because the batch scan on the target table cannot prune partitions because the file names (collected as a result of th

Re: [PR] Nessie: Support views for NessieCatalog [iceberg]

2023-12-08 Thread via GitHub
ajantha-bhat commented on code in PR #8909: URL: https://github.com/apache/iceberg/pull/8909#discussion_r1420226948 ## nessie/src/main/java/org/apache/iceberg/nessie/NessieUtil.java: ## @@ -190,4 +203,118 @@ public static Optional extractSingleConflict( Conflict conflict =

Re: [PR] Nessie: Support views for NessieCatalog [iceberg]

2023-12-08 Thread via GitHub
ajantha-bhat commented on code in PR #8909: URL: https://github.com/apache/iceberg/pull/8909#discussion_r1420229851 ## nessie/src/main/java/org/apache/iceberg/nessie/NessieIcebergClient.java: ## @@ -142,15 +148,27 @@ private UpdateableReference loadReference(String requestedRef

Re: [PR] Spark: IN clause on system function is not pushed down [iceberg]

2023-12-08 Thread via GitHub
advancedxy commented on code in PR #9192: URL: https://github.com/apache/iceberg/pull/9192#discussion_r1420230467 ## spark/v3.4/spark-extensions/src/test/java/org/apache/iceberg/spark/extensions/TestSystemFunctionPushDownDQL.java: ## @@ -224,6 +226,35 @@ private void testBucketL

Re: [PR] Nessie: Support views for NessieCatalog [iceberg]

2023-12-08 Thread via GitHub
nastra commented on code in PR #8909: URL: https://github.com/apache/iceberg/pull/8909#discussion_r1420230637 ## nessie/src/main/java/org/apache/iceberg/nessie/NessieIcebergClient.java: ## @@ -142,15 +148,27 @@ private UpdateableReference loadReference(String requestedRef, Stri

Re: [PR] Nessie: Support views for NessieCatalog [iceberg]

2023-12-08 Thread via GitHub
nastra commented on code in PR #8909: URL: https://github.com/apache/iceberg/pull/8909#discussion_r1420232303 ## nessie/src/main/java/org/apache/iceberg/nessie/NessieUtil.java: ## @@ -190,4 +203,118 @@ public static Optional extractSingleConflict( Conflict conflict = confli

Re: [PR] Spark: IN clause on system function is not pushed down [iceberg]

2023-12-08 Thread via GitHub
tmnd1991 commented on PR #9192: URL: https://github.com/apache/iceberg/pull/9192#issuecomment-1846982769 Thanks for the review, I addressed your concerns. If I get the green light, I will proceed to copy the changes over to 3.4.

Re: [PR] Spark: IN clause on system function is not pushed down [iceberg]

2023-12-08 Thread via GitHub
tmnd1991 commented on code in PR #9192: URL: https://github.com/apache/iceberg/pull/9192#discussion_r1420295367 ## spark/v3.4/spark-extensions/src/test/java/org/apache/iceberg/spark/extensions/TestSystemFunctionPushDownDQL.java: ## @@ -224,6 +226,35 @@ private void testBucketLon

Re: [PR] Spark: IN clause on system function is not pushed down [iceberg]

2023-12-08 Thread via GitHub
tmnd1991 commented on code in PR #9192: URL: https://github.com/apache/iceberg/pull/9192#discussion_r1420295733 ## spark/v3.5/spark-extensions/src/main/scala/org/apache/spark/sql/catalyst/optimizer/ReplaceStaticInvoke.scala: ## @@ -40,14 +37,20 @@ import org.apache.spark.sql.typ

Re: [PR] Nessie: Support views for NessieCatalog [iceberg]

2023-12-08 Thread via GitHub
nastra commented on code in PR #8909: URL: https://github.com/apache/iceberg/pull/8909#discussion_r1420317210 ## nessie/src/main/java/org/apache/iceberg/nessie/NessieUtil.java: ## @@ -190,4 +203,118 @@ public static Optional extractSingleConflict( Conflict conflict = confli

Re: [I] spark3 can't query iceberg: failed to connect to Hive Metastore [iceberg]

2023-12-08 Thread via GitHub
ExplorData24 commented on issue #2359: URL: https://github.com/apache/iceberg/issues/2359#issuecomment-1847038152 @coolderli @RussellSpitzer @hunter-cloud09 @dixingxing0 @pvary Hello everyone. I am using Hive Catalog to create Iceberg tables with Spark as the e

Re: [PR] Nessie: Support views for NessieCatalog [iceberg]

2023-12-08 Thread via GitHub
ajantha-bhat commented on code in PR #8909: URL: https://github.com/apache/iceberg/pull/8909#discussion_r1420339501 ## nessie/src/main/java/org/apache/iceberg/nessie/NessieCatalog.java: ## @@ -232,17 +232,32 @@ protected String defaultWarehouseLocation(TableIdentifier table) {

Re: [PR] Nessie: Support views for NessieCatalog [iceberg]

2023-12-08 Thread via GitHub
ajantha-bhat commented on code in PR #8909: URL: https://github.com/apache/iceberg/pull/8909#discussion_r142032 ## nessie/src/main/java/org/apache/iceberg/nessie/NessieCatalog.java: ## @@ -232,17 +232,32 @@ protected String defaultWarehouseLocation(TableIdentifier table) {

Re: [I] spark3 can't query iceberg: failed to connect to Hive Metastore [iceberg]

2023-12-08 Thread via GitHub
ExplorData24 commented on issue #2359: URL: https://github.com/apache/iceberg/issues/2359#issuecomment-1847109573 @RussellSpitzer First of all, thank you very much for your answer. As shown in the screenshots below: 1. I configured hive metastore with the address: thrift://hive-me

Re: [PR] Spark SystemFunctions are not pushed down during JOIN [iceberg]

2023-12-08 Thread via GitHub
tmnd1991 commented on PR #9233: URL: https://github.com/apache/iceberg/pull/9233#issuecomment-1847179494 Sure, let me add a bit of context: I have two tables with the exact same schema/layout, partitioned on 3 columns: - identity(MEAS_YM) - identity(MEAS_DD) - bucket(POD, 4) The

Re: [PR] Flink: switch to use SortKey for data statistics [iceberg]

2023-12-08 Thread via GitHub
stevenzwu commented on code in PR #9212: URL: https://github.com/apache/iceberg/pull/9212#discussion_r1420622934 ## flink/v1.17/flink/src/test/java/org/apache/iceberg/flink/sink/shuffle/TestDataStatisticsOperator.java: ## @@ -119,9 +121,9 @@ public void testProcessElement() thro

Re: [I] spark3 can't query iceberg: failed to connect to Hive Metastore [iceberg]

2023-12-08 Thread via GitHub
ExplorData24 commented on issue #2359: URL: https://github.com/apache/iceberg/issues/2359#issuecomment-1847344053 @pvary Thank you very much for your answer. In fact, I am using the "catalog_hive" catalog to create Iceberg tables with Spark as the execution engine: import pyspark

Re: [PR] [spark <3.5] [backport from spark 3.5] Support specifying spec_id in RewriteManifestProcedure [iceberg]

2023-12-08 Thread via GitHub
RussellSpitzer merged PR #9243: URL: https://github.com/apache/iceberg/pull/9243

Re: [PR] [spark <3.5] [backport from spark 3.5] Support specifying spec_id in RewriteManifestProcedure [iceberg]

2023-12-08 Thread via GitHub
RussellSpitzer commented on PR #9243: URL: https://github.com/apache/iceberg/pull/9243#issuecomment-1847363462 Thanks for back-porting. If you can, please follow up with a doc PR for the new arg.

Re: [PR] test: Add integration tests for rest catalog. [iceberg-rust]

2023-12-08 Thread via GitHub
liurenjie1024 commented on PR #109: URL: https://github.com/apache/iceberg-rust/pull/109#issuecomment-1847376669 cc @Fokko I've resolved conflicts.

Re: [I] Documentation [iceberg-rust]

2023-12-08 Thread via GitHub
liurenjie1024 commented on issue #114: URL: https://github.com/apache/iceberg-rust/issues/114#issuecomment-1847387741 Just curious, what would be the relationship with crates on docs.rs?

Re: [PR] test: Add integration tests for rest catalog. [iceberg-rust]

2023-12-08 Thread via GitHub
Fokko merged PR #109: URL: https://github.com/apache/iceberg-rust/pull/109

Re: [I] test: Rest catalog integration test. [iceberg-rust]

2023-12-08 Thread via GitHub
Fokko closed issue #100: test: Rest catalog integration test. URL: https://github.com/apache/iceberg-rust/issues/100

Re: [PR] Spark SystemFunctions are not pushed down during JOIN [iceberg]

2023-12-08 Thread via GitHub
advancedxy commented on PR #9233: URL: https://github.com/apache/iceberg/pull/9233#issuecomment-1847450030 > ``` == Physical Plan == ReplaceData (13) +- * Sort (12) +- * Project (11) +- MergeRows (10) +- SortMergeJoin FullOuter (9) < Full Outer here

Re: [PR] Spark SystemFunctions are not pushed down during JOIN [iceberg]

2023-12-08 Thread via GitHub
tmnd1991 commented on PR #9233: URL: https://github.com/apache/iceberg/pull/9233#issuecomment-1847513665 Yes, sorry, there's also a "when not matched" statement. I can't attach the plan, but I'll push a reproducer soon.

Re: [PR] Flink: Create JUnit5 version of TestFlinkScan [iceberg]

2023-12-08 Thread via GitHub
rodmeneses commented on code in PR #9185: URL: https://github.com/apache/iceberg/pull/9185#discussion_r1420794807 ## data/src/test/java/org/apache/iceberg/data/GenAppenderHelper.java: ## @@ -0,0 +1,144 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or

Re: [PR] Flink: Create JUnit5 version of TestFlinkScan [iceberg]

2023-12-08 Thread via GitHub
rodmeneses commented on code in PR #9185: URL: https://github.com/apache/iceberg/pull/9185#discussion_r1420797428 ## flink/v1.17/flink/src/test/java/org/apache/iceberg/flink/TestHelpers.java: ## @@ -193,109 +192,106 @@ private static void assertEquals( return; } -

Re: [PR] Core: Fix null partitions in PartitionSet [iceberg]

2023-12-08 Thread via GitHub
aokolnychyi closed pull request #9248: Core: Fix null partitions in PartitionSet URL: https://github.com/apache/iceberg/pull/9248

Re: [PR] Spark 3.5: Fix testReplacePartitionField for Rewrite Manifests [iceberg]

2023-12-08 Thread via GitHub
aokolnychyi commented on PR #9250: URL: https://github.com/apache/iceberg/pull/9250#issuecomment-1847552754 Thanks, @bknbkn!

Re: [PR] Spark 3.5: Fix testReplacePartitionField for Rewrite Manifests [iceberg]

2023-12-08 Thread via GitHub
aokolnychyi merged PR #9250: URL: https://github.com/apache/iceberg/pull/9250

Re: [PR] Flink: Create JUnit5 version of TestFlinkScan [iceberg]

2023-12-08 Thread via GitHub
rodmeneses commented on code in PR #9185: URL: https://github.com/apache/iceberg/pull/9185#discussion_r1420810926 ## flink/v1.17/flink/src/test/java/org/apache/iceberg/flink/TestHelpers.java: ## @@ -523,89 +510,103 @@ public static void assertEquals(ManifestFile expected, Manif

Re: [PR] Spec: Clarify time travel implementation in Iceberg [iceberg]

2023-12-08 Thread via GitHub
emkornfield commented on PR #8982: URL: https://github.com/apache/iceberg/pull/8982#issuecomment-1847572861 @aokolnychyi did my changes address your feedback properly?

Re: [PR] Spec: Clarify which columns can be used for equality delete files. [iceberg]

2023-12-08 Thread via GitHub
emkornfield commented on PR #8981: URL: https://github.com/apache/iceberg/pull/8981#issuecomment-1847573238 @Fokko or @rdblue would you mind taking a look?

Re: [PR] Switch to junit5 for mr [iceberg]

2023-12-08 Thread via GitHub
lschetanrao commented on code in PR #9241: URL: https://github.com/apache/iceberg/pull/9241#discussion_r1420849913 ## mr/src/test/java/org/apache/iceberg/mr/TestCatalogs.java: ## @@ -176,8 +178,10 @@ public void testCreateDropTableToCatalog() throws IOException { HadoopCat

Re: [PR] Switch to junit5 for mr [iceberg]

2023-12-08 Thread via GitHub
lschetanrao commented on code in PR #9241: URL: https://github.com/apache/iceberg/pull/9241#discussion_r1420849452 ## mr/src/test/java/org/apache/iceberg/mr/TestCatalogs.java: ## @@ -54,9 +53,9 @@ public class TestCatalogs { private Configuration conf; - @Rule public Tem

Re: [PR] Switch to junit5 for mr [iceberg]

2023-12-08 Thread via GitHub
lschetanrao commented on code in PR #9241: URL: https://github.com/apache/iceberg/pull/9241#discussion_r1420851007 ## mr/src/test/java/org/apache/iceberg/mr/TestCatalogs.java: ## @@ -212,11 +216,11 @@ public void testLoadCatalogHive() { InputFormatConfig.catalogProperty

Re: [PR] Switch to junit5 for mr [iceberg]

2023-12-08 Thread via GitHub
lschetanrao commented on code in PR #9241: URL: https://github.com/apache/iceberg/pull/9241#discussion_r1420851226 ## mr/src/test/java/org/apache/iceberg/mr/hive/TestDeserializer.java: ## @@ -35,9 +35,9 @@ import org.apache.iceberg.hive.HiveVersion; import org.apache.iceberg.m

Re: [PR] Switch to junit5 for mr [iceberg]

2023-12-08 Thread via GitHub
lschetanrao commented on code in PR #9241: URL: https://github.com/apache/iceberg/pull/9241#discussion_r1420851466 ## mr/src/test/java/org/apache/iceberg/mr/hive/TestHiveIcebergFilterFactory.java: ## @@ -82,10 +80,12 @@ public void testNotEqualsOperand() { UnboundPredicate

Re: [PR] Switch to junit5 for mr [iceberg]

2023-12-08 Thread via GitHub
lschetanrao commented on code in PR #9241: URL: https://github.com/apache/iceberg/pull/9241#discussion_r1420851699 ## mr/src/test/java/org/apache/iceberg/mr/hive/TestHiveIcebergOutputCommitter.java: ## @@ -80,7 +79,7 @@ public class TestHiveIcebergOutputCommitter { private st

Re: [PR] Switch to junit5 for mr [iceberg]

2023-12-08 Thread via GitHub
lschetanrao commented on code in PR #9241: URL: https://github.com/apache/iceberg/pull/9241#discussion_r1420850601 ## mr/src/test/java/org/apache/iceberg/mr/TestCatalogs.java: ## @@ -198,11 +202,11 @@ public void testCreateDropTableToCatalog() throws IOException { public voi

Re: [PR] Switch to junit5 for mr [iceberg]

2023-12-08 Thread via GitHub
lschetanrao commented on code in PR #9241: URL: https://github.com/apache/iceberg/pull/9241#discussion_r1420852196 ## mr/src/test/java/org/apache/iceberg/mr/hive/TestHiveIcebergOutputCommitter.java: ## @@ -201,22 +203,22 @@ public void writerIsClosedAfterTaskCommitFailure() thro

[PR] Add support for CreateScan and GetScanTasks in RESTCatalog [iceberg]

2023-12-08 Thread via GitHub
rahil-c opened a new pull request, #9252: URL: https://github.com/apache/iceberg/pull/9252 (no comment)

Re: [PR] Switch to junit5 for mr [iceberg]

2023-12-08 Thread via GitHub
lschetanrao commented on code in PR #9241: URL: https://github.com/apache/iceberg/pull/9241#discussion_r1420858188 ## mr/src/test/java/org/apache/iceberg/mr/hive/HiveIcebergTestUtils.java: ## @@ -63,7 +63,7 @@ import org.apache.iceberg.relocated.com.google.common.collect.Lists;

Re: [PR] Switch to junit5 for mr [iceberg]

2023-12-08 Thread via GitHub
lschetanrao commented on code in PR #9241: URL: https://github.com/apache/iceberg/pull/9241#discussion_r1420860637 ## mr/src/test/java/org/apache/iceberg/mr/hive/HiveIcebergTestUtils.java: ## @@ -288,9 +287,10 @@ public static void validateFiles(Table table, Configuration conf,

Re: [PR] Switch to junit5 for mr [iceberg]

2023-12-08 Thread via GitHub
lschetanrao commented on code in PR #9241: URL: https://github.com/apache/iceberg/pull/9241#discussion_r1420862455 ## mr/src/test/java/org/apache/iceberg/mr/hive/HiveIcebergTestUtils.java: ## @@ -265,7 +264,7 @@ public static void validateData(List expected, List actual, int

[PR] Add doc for rewriting manifest with spec id [iceberg]

2023-12-08 Thread via GitHub
puchengy opened a new pull request, #9253: URL: https://github.com/apache/iceberg/pull/9253 (no comment)

Re: [PR] Add doc for rewriting manifest with spec id [iceberg]

2023-12-08 Thread via GitHub
puchengy commented on PR #9253: URL: https://github.com/apache/iceberg/pull/9253#issuecomment-1847647392 @RussellSpitzer Do you know how to format the doc and how to test the doc locally? I can't find the doc. Thanks.

Re: [PR] Flink: Fix IcebergSource tableloader lifecycle management in batch mode [iceberg]

2023-12-08 Thread via GitHub
stevenzwu commented on code in PR #9173: URL: https://github.com/apache/iceberg/pull/9173#discussion_r1420907269 ## flink/v1.17/flink/src/main/java/org/apache/iceberg/flink/source/IcebergSource.java: ## @@ -105,12 +107,12 @@ public class IcebergSource implements Source

Re: [PR] Flink: Fix IcebergSource tableloader lifecycle management in batch mode [iceberg]

2023-12-08 Thread via GitHub
stevenzwu commented on code in PR #9173: URL: https://github.com/apache/iceberg/pull/9173#discussion_r1420908320 ## flink/v1.17/flink/src/main/java/org/apache/iceberg/flink/source/IcebergSource.java: ## @@ -105,12 +107,12 @@ public class IcebergSource implements Source

Re: [PR] Add support for CreateScan and GetScanTasks in RESTCatalog [iceberg]

2023-12-08 Thread via GitHub
jackye1995 commented on code in PR #9252: URL: https://github.com/apache/iceberg/pull/9252#discussion_r1420933770 ## open-api/rest-catalog-open-api.yaml: ## @@ -530,6 +530,111 @@ paths: 5XX: $ref: '#/components/responses/ServerErrorResponse' + /v1/{prefix}

Re: [PR] Add support for CreateScan and GetScanTasks in RESTCatalog [iceberg]

2023-12-08 Thread via GitHub
jackye1995 commented on code in PR #9252: URL: https://github.com/apache/iceberg/pull/9252#discussion_r1420934425 ## open-api/rest-catalog-open-api.yaml: ## @@ -2615,6 +2888,23 @@ components: additionalProperties: type: string +CreateScanRequest: +

Re: [PR] Add support for CreateScan and GetScanTasks in RESTCatalog [iceberg]

2023-12-08 Thread via GitHub
jackye1995 commented on code in PR #9252: URL: https://github.com/apache/iceberg/pull/9252#discussion_r1420934962 ## open-api/rest-catalog-open-api.yaml: ## @@ -2615,6 +2888,23 @@ components: additionalProperties: type: string +CreateScanRequest: +

Re: [PR] Core: Fix null partitions in PartitionSet [iceberg]

2023-12-08 Thread via GitHub
aokolnychyi merged PR #9248: URL: https://github.com/apache/iceberg/pull/9248

Re: [PR] Core: Fix null partitions in PartitionSet [iceberg]

2023-12-08 Thread via GitHub
aokolnychyi commented on PR #9248: URL: https://github.com/apache/iceberg/pull/9248#issuecomment-1847830279 Thanks, @RussellSpitzer!

Re: [PR] Flink: switch to use SortKey for data statistics [iceberg]

2023-12-08 Thread via GitHub
stevenzwu merged PR #9212: URL: https://github.com/apache/iceberg/pull/9212

Re: [PR] Flink: switch to use SortKey for data statistics [iceberg]

2023-12-08 Thread via GitHub
stevenzwu commented on PR #9212: URL: https://github.com/apache/iceberg/pull/9212#issuecomment-1847835239 thanks @pvary for the review

Re: [I] `GlueTableOperations` retries on Access Denied exceptions from S3, and does not support configuration of exception retry logic [iceberg]

2023-12-08 Thread via GitHub
ExplorData24 commented on issue #9124: URL: https://github.com/apache/iceberg/issues/9124#issuecomment-1847839526 @singhpk234 @lognoel @dacort @electrum @martint Hello. I am using Hive Catalog to create Iceberg tables with Spark as the execution engine: impo

Re: [I] `GlueTableOperations` retries on Access Denied exceptions from S3, and does not support configuration of exception retry logic [iceberg]

2023-12-08 Thread via GitHub
lognoel commented on issue #9124: URL: https://github.com/apache/iceberg/issues/9124#issuecomment-1847845419 > = you're leaking your AWS creds in the comment, I suggest you redact them immediately

Re: [I] `GlueTableOperations` retries on Access Denied exceptions from S3, and does not support configuration of exception retry logic [iceberg]

2023-12-08 Thread via GitHub
ExplorData24 commented on issue #9124: URL: https://github.com/apache/iceberg/issues/9124#issuecomment-1847855484 @lognoel I deleted them. Thank you very much.

Re: [I] Unable to write to iceberg table using spark [iceberg]

2023-12-08 Thread via GitHub
ExplorData24 commented on issue #8419: URL: https://github.com/apache/iceberg/issues/8419#issuecomment-1847856969 @palanik1 @di2mot @maulanaady @RussellSpitzer @dacort Hello. I am using Hive Catalog to create Iceberg tables with Spark as the execution engine:

Re: [I] How to realize Write Iceberg Tables via Hive? (Ideas share) [iceberg]

2023-12-08 Thread via GitHub
ExplorData24 commented on issue #2685: URL: https://github.com/apache/iceberg/issues/2685#issuecomment-1847860494 @aimenglin @bitsondatadev @marton-bod @dacort @electrum Hello. I am using Hive Catalog to create Iceberg tables with Spark as the execution engine:

Re: [I] Running iceberg with spark 3 in local mode [iceberg]

2023-12-08 Thread via GitHub
ExplorData24 commented on issue #2176: URL: https://github.com/apache/iceberg/issues/2176#issuecomment-1847862051 @adnanhb @jackye1995 Hello. I am using Hive Catalog to create Iceberg tables with Spark as the execution engine: import pyspark from pyspark.sql import S

Re: [I] How to realize Write Iceberg Tables via Hive? (Ideas share) [iceberg]

2023-12-08 Thread via GitHub
aimenglin commented on issue #2685: URL: https://github.com/apache/iceberg/issues/2685#issuecomment-1847892200 Hi Rym, I understand you're inquiring about writing Iceberg tables using Hive. Based on my experience and experiments, it appears that directly writing to Iceberg tables

Re: [I] How to realize Write Iceberg Tables via Hive? (Ideas share) [iceberg]

2023-12-08 Thread via GitHub
ExplorData24 commented on issue #2685: URL: https://github.com/apache/iceberg/issues/2685#issuecomment-1847923590 Hi @aimenglin. First of all, thank you very much for your availability. I actually followed the part **Custom Iceberg catalogs** of this link: https://iceberg.incubato

Re: [PR] Build: Bump mkdocs-material from 9.4.14 to 9.5.0 [iceberg-python]

2023-12-08 Thread via GitHub
dependabot[bot] commented on PR #196: URL: https://github.com/apache/iceberg-python/pull/196#issuecomment-1847941081 Superseded by #197.

[PR] Build: Bump mkdocs-material from 9.4.14 to 9.5.1 [iceberg-python]

2023-12-08 Thread via GitHub
dependabot[bot] opened a new pull request, #197: URL: https://github.com/apache/iceberg-python/pull/197 Bumps [mkdocs-material](https://github.com/squidfunk/mkdocs-material) from 9.4.14 to 9.5.1. Release notes Sourced from https://github.com/squidfunk/mkdocs-material/releases mkd

Re: [PR] Build: Bump mkdocs-material from 9.4.14 to 9.5.0 [iceberg-python]

2023-12-08 Thread via GitHub
dependabot[bot] closed pull request #196: Build: Bump mkdocs-material from 9.4.14 to 9.5.0 URL: https://github.com/apache/iceberg-python/pull/196

Re: [PR] Spark SystemFunctions are not pushed down during JOIN [iceberg]

2023-12-08 Thread via GitHub
tmnd1991 commented on PR #9233: URL: https://github.com/apache/iceberg/pull/9233#issuecomment-1847976194 Finally I got a reproducer inside the codebase, you can find it at `TestSPJWithBucketing`. Spark 3.4 (same as my app) with the condition on the partitions will actually prune the unaf

Re: [I] Build: rebasing a fork triggers redundant CI runs against master commits [iceberg]

2023-12-08 Thread via GitHub
github-actions[bot] commented on issue #7819: URL: https://github.com/apache/iceberg/issues/7819#issuecomment-1847993683 This issue has been automatically marked as stale because it has been open for 180 days with no activity. It will be closed in next 14 days if no further activity occurs.

Re: [I] Docs: Give Metadata Tables their Own Page [iceberg]

2023-12-08 Thread via GitHub
github-actions[bot] commented on issue #7793: URL: https://github.com/apache/iceberg/issues/7793#issuecomment-1847993704 This issue has been automatically marked as stale because it has been open for 180 days with no activity. It will be closed in next 14 days if no further activity occurs.

Re: [I] Docs: Include references GCP libraries [iceberg]

2023-12-08 Thread via GitHub
github-actions[bot] commented on issue #7787: URL: https://github.com/apache/iceberg/issues/7787#issuecomment-1847993726 This issue has been automatically marked as stale because it has been open for 180 days with no activity. It will be closed in next 14 days if no further activity occurs.

Re: [I] [Spark] Few package dependency issues of Iceberg and AWS [iceberg]

2023-12-08 Thread via GitHub
github-actions[bot] commented on issue #7737: URL: https://github.com/apache/iceberg/issues/7737#issuecomment-1847993762 This issue has been automatically marked as stale because it has been open for 180 days with no activity. It will be closed in next 14 days if no further activity occurs.

Re: [PR] feat(docs): example of multiple catalogs defined in .pyiceberg.yaml [iceberg-python]

2023-12-08 Thread via GitHub
HonahX commented on code in PR #194: URL: https://github.com/apache/iceberg-python/pull/194#discussion_r1421197076 ## mkdocs/docs/api.md: ## @@ -33,6 +33,20 @@ catalog: credential: t-1234:secret ``` +Note that multiple catalogs can be defined in the same `.pyiceberg.yaml

[PR] Add UnboundSortOrder [iceberg-rust]

2023-12-08 Thread via GitHub
fqaiser94 opened a new pull request, #115: URL: https://github.com/apache/iceberg-rust/pull/115 (no comment)

Re: [I] Parquet file overwritten by spark streaming job in subsequent execution with same spark streaming checkpoint location [iceberg]

2023-12-08 Thread via GitHub
amogh-jahagirdar commented on issue #9172: URL: https://github.com/apache/iceberg/issues/9172#issuecomment-1848220358 Thanks for the details, one key thing stands out to me: ``` I also tested with latest version, iceberg-spark-runtime-3.4_2.12-1.4.2.jar as well, I could see that th

Re: [PR] Spark Streaming: Fix clobbering of files across streaming epochs [iceberg]

2023-12-08 Thread via GitHub
amogh-jahagirdar commented on code in PR #9255: URL: https://github.com/apache/iceberg/pull/9255#discussion_r1421271945 ## spark/v3.5/spark/src/main/java/org/apache/iceberg/spark/source/SparkWrite.java: ## @@ -673,11 +673,11 @@ public DataWriter createWriter(int partitionId, lo
