Re: [I] to_pandas() API which converts iceberg table scan to a pd.DataFrame will lost datetime data type and row order [iceberg-python]

2023-11-10 Thread via GitHub
zeddit commented on issue #132: URL: https://github.com/apache/iceberg-python/issues/132#issuecomment-1806675152 @Fokko thanks a lot for your reply. I have read the bullets carefully, but it is still not clear to me whether `pyiceberg` can return data in order, i.e.

Re: [I] Flink: Add support for Flink 1.18 [iceberg]

2023-11-10 Thread via GitHub
PrabhuJoseph commented on issue #8930: URL: https://github.com/apache/iceberg/issues/8930#issuecomment-1806668299 The failing test TestFlinkMetaDataTable#testAllFilesPartitioned is due to a Flink-side issue which I reported here: https://issues.apache.org/jira/browse/FLINK-33523.

Re: [I] Substitue in memory data struct's timestamp type for DataTime rather i64 to simplify usage. [iceberg-rust]

2023-11-10 Thread via GitHub
fqaiser94 commented on issue #90: URL: https://github.com/apache/iceberg-rust/issues/90#issuecomment-1806616364 Welp, I started looking at this ticket before the most recent comments :S I went with `chrono::DateTime` as well but had to wrap it in a NewType to implement some traits. N

[I] org.apache.iceberg.hive.RuntimeMetaException: Failed to connect to Hive Metastore at [iceberg]

2023-11-10 Thread via GitHub
whymed opened a new issue, #9030: URL: https://github.com/apache/iceberg/issues/9030 ### Apache Iceberg version 1.4.2 (latest release) ### Query engine Hive ### Please describe the bug 🐞 My stack: Hadoop 3.3.6, Hive 3.1.3 and Iceberg hive runtime 1.4.2

Re: [I] insert to hive table with icberg table format is failing [iceberg]

2023-11-10 Thread via GitHub
whymed commented on issue #7840: URL: https://github.com/apache/iceberg/issues/7840#issuecomment-1806591748 My stack: Hadoop 3.3.6, Hive 3.1.3 and Iceberg hive runtime 1.4.2 I want to use the same default catalog from hive, so I have not configured any of `iceberg.catalog` mentioned on th

Re: [I] to_pandas() API which converts iceberg table scan to a pd.DataFrame will lost datetime data type and row order [iceberg-python]

2023-11-10 Thread via GitHub
Fokko commented on issue #132: URL: https://github.com/apache/iceberg-python/issues/132#issuecomment-1806447469 @zeddit there is always a trade-off. Fast-append are quicker and are less likely to suffer from conflicts. Please check [this](https://iceberg.apache.org/spec/#snapshots) link, an

[PR] Override useCommitCoordinator to false in Spark3.4 [iceberg]

2023-11-10 Thread via GitHub
huaxingao opened a new pull request, #9028: URL: https://github.com/apache/iceberg/pull/9028 Override `useCommitCoordinator` to false in Spark 3.4 Here is the [PR](https://github.com/apache/iceberg/pull/9017) for Spark3.5 -- This is an automated message from the Apache Git Service.

[PR] Override useCommitCoordinator to false in BaseStreamingWriter [iceberg]

2023-11-10 Thread via GitHub
huaxingao opened a new pull request, #9027: URL: https://github.com/apache/iceberg/pull/9027 Followup PR for https://github.com/apache/iceberg/pull/9017: overriding `useCommitCoordinator` to false in `BaseStreamingWriter`

Re: [PR] Add list-refs cli command [iceberg-python]

2023-11-10 Thread via GitHub
Fokko merged PR #137: URL: https://github.com/apache/iceberg-python/pull/137

[PR] Docs: Add section on pandas [iceberg-python]

2023-11-10 Thread via GitHub
Fokko opened a new pull request, #138: URL: https://github.com/apache/iceberg-python/pull/138 (no comment)

Re: [PR] Override useCommitCoordinator to false [iceberg]

2023-11-10 Thread via GitHub
huaxingao commented on PR #9017: URL: https://github.com/apache/iceberg/pull/9017#issuecomment-1806335718 Thanks @aokolnychyi I will have a follow up to fix `BaseStreamingWrite`.

Re: [PR] Override useCommitCoordinator to false [iceberg]

2023-11-10 Thread via GitHub
aokolnychyi commented on PR #9017: URL: https://github.com/apache/iceberg/pull/9017#issuecomment-1806330150 Thanks, @huaxingao! Could you do a similar fix for `BaseStreamingWrite`? I

Re: [PR] Override useCommitCoordinator to false [iceberg]

2023-11-10 Thread via GitHub
aokolnychyi merged PR #9017: URL: https://github.com/apache/iceberg/pull/9017

Re: [I] Spark does not support time datatype [iceberg]

2023-11-10 Thread via GitHub
Fokko commented on issue #9006: URL: https://github.com/apache/iceberg/issues/9006#issuecomment-1806309748 Ah, I didn't realize that this isn't supported by Spark, see https://github.com/apache/spark/pull/25678#issuecomment-531585556 I'm sorry, but I think we can close this one. In the Gith

Re: [I] Spark does not support time datatype [iceberg]

2023-11-10 Thread via GitHub
Fokko closed issue #9006: Spark does not support time datatype URL: https://github.com/apache/iceberg/issues/9006

Re: [PR] Parquet: Remove the row position since parquet row group has it natively [iceberg]

2023-11-10 Thread via GitHub
flyrain commented on PR #6056: URL: https://github.com/apache/iceberg/pull/6056#issuecomment-1806287219 @Fokko, yes, we can revisit this with the new parquet release. It has the change Iceberg required. It should be safe and clean. Although I'm not sure the issue mentioned by @ricardoperei

Re: [I] Merge into second commit when with no changes [iceberg]

2023-11-10 Thread via GitHub
Fokko commented on issue #9024: URL: https://github.com/apache/iceberg/issues/9024#issuecomment-1806274922 I was able to reproduce this locally: ![image](https://github.com/apache/iceberg/assets/1134248/74d876cb-9b89-403f-abd9-8a36e8b6d12b) @aokolnychyi WDYT?

Re: [PR] Docs: Update spark-queries.md [iceberg]

2023-11-10 Thread via GitHub
Fokko merged PR #9021: URL: https://github.com/apache/iceberg/pull/9021

Re: [PR] Parquet: Remove the row position since parquet row group has it natively [iceberg]

2023-11-10 Thread via GitHub
Fokko commented on PR #6056: URL: https://github.com/apache/iceberg/pull/6056#issuecomment-1806267078 @flyrain @chenjunjiedada do we want to revisit this since we're on Parquet 1.13.1?

[I] Spark SQL DESCRIBE not showing proper schema on a branch [iceberg]

2023-11-10 Thread via GitHub
cccs-eric opened a new issue, #9026: URL: https://github.com/apache/iceberg/issues/9026 ### Apache Iceberg version 1.3.1 ### Query engine Spark ### Please describe the bug 🐞 This started with a [question](https://apache-iceberg.slack.com/archives/C025PH0G1D

Re: [PR] Nessie: Support views for NessieCatalog [iceberg]

2023-11-10 Thread via GitHub
ajantha-bhat commented on code in PR #8909: URL: https://github.com/apache/iceberg/pull/8909#discussion_r1389625938 ## nessie/src/test/java/org/apache/iceberg/nessie/TestNessieViewCatalog.java: ## @@ -0,0 +1,222 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under o

Re: [PR] Shift site build to use monorepo and gh-pages [iceberg]

2023-11-10 Thread via GitHub
bitsondatadev commented on code in PR #8919: URL: https://github.com/apache/iceberg/pull/8919#discussion_r1389541886 ## site/README.md: ## @@ -35,107 +35,101 @@ In MkDocs, the [`docs_dir`](https://www.mkdocs.org/user-guide/configuration/#doc ### Iceberg docs layout -In the

Re: [PR] Shift site build to use monorepo and gh-pages [iceberg]

2023-11-10 Thread via GitHub
bitsondatadev commented on code in PR #8919: URL: https://github.com/apache/iceberg/pull/8919#discussion_r1389531071 ## site/mkdocs.yml: ## @@ -23,33 +23,37 @@ theme: logo: assets/images/iceberg-logo-icon.png favicon: assets/images/favicon-96x96.png features: +- con

Re: [I] locationProvider should be marked as transient in SerializableTable [iceberg]

2023-11-10 Thread via GitHub
przemekd commented on issue #9025: URL: https://github.com/apache/iceberg/issues/9025#issuecomment-1805863822 @nastra could you assign me to this one please?

Re: [PR] Nessie: reimplement namespace operations [iceberg]

2023-11-10 Thread via GitHub
adutra commented on PR #8857: URL: https://github.com/apache/iceberg/pull/8857#issuecomment-1805861658 > @adutra have you checked whether all of these changes are still working with the Nessie integration in Trino/Presto? Just executed all tests in `io.trino.plugin.iceberg.catalog.ne

Re: [PR] Nessie: reimplement namespace operations [iceberg]

2023-11-10 Thread via GitHub
adutra commented on code in PR #8857: URL: https://github.com/apache/iceberg/pull/8857#discussion_r1389395174 ## nessie/src/main/java/org/apache/iceberg/nessie/NessieIcebergClient.java: ## @@ -181,133 +186,220 @@ public IcebergTable table(TableIdentifier tableIdentifier) { }

Re: [PR] Hive: Refactor TestHiveCatalog tests to use the core CatalogTests [iceberg]

2023-11-10 Thread via GitHub
pvary commented on code in PR #8918: URL: https://github.com/apache/iceberg/pull/8918#discussion_r1389371040 ## hive-metastore/src/main/java/org/apache/iceberg/hive/HiveCatalog.java: ## @@ -500,6 +511,9 @@ protected String defaultWarehouseLocation(TableIdentifier tableIdentifie

Re: [PR] Hive: Refactor TestHiveCatalog tests to use the core CatalogTests [iceberg]

2023-11-10 Thread via GitHub
pvary commented on code in PR #8918: URL: https://github.com/apache/iceberg/pull/8918#discussion_r1389370838 ## hive-metastore/src/main/java/org/apache/iceberg/hive/HiveCatalog.java: ## @@ -261,6 +261,12 @@ public void renameTable(TableIdentifier from, TableIdentifier originalT

Re: [PR] Core: Enable column statistics filtering after planning [iceberg]

2023-11-10 Thread via GitHub
pvary commented on code in PR #8803: URL: https://github.com/apache/iceberg/pull/8803#discussion_r1389360789 ## core/src/main/java/org/apache/iceberg/BaseFile.java: ## @@ -185,13 +190,25 @@ public PartitionData copy() { this.partitionType = toCopy.partitionType; this.r

Re: [PR] Hive: Refactor HiveTableOperations with common code for View. [iceberg]

2023-11-10 Thread via GitHub
pvary commented on code in PR #9011: URL: https://github.com/apache/iceberg/pull/9011#discussion_r1389355915 ## hive-metastore/src/main/java/org/apache/iceberg/hive/HiveTableOperations.java: ## @@ -156,7 +147,7 @@ protected void doRefresh() { String metadataLocation = null;

Re: [PR] Hive: Refactor HiveTableOperations with common code for View. [iceberg]

2023-11-10 Thread via GitHub
pvary commented on code in PR #9011: URL: https://github.com/apache/iceberg/pull/9011#discussion_r1389352401 ## hive-metastore/src/main/java/org/apache/iceberg/hive/HiveMetastoreConnector.java: ## @@ -0,0 +1,173 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under o

Re: [PR] Hive: Refactor HiveTableOperations with common code for View. [iceberg]

2023-11-10 Thread via GitHub
pvary commented on code in PR #9011: URL: https://github.com/apache/iceberg/pull/9011#discussion_r1389343659 ## hive-metastore/src/main/java/org/apache/iceberg/hive/HiveMetastoreConnector.java: ## @@ -0,0 +1,173 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under o

Re: [PR] Hive: Refactor HiveTableOperations with common code for View. [iceberg]

2023-11-10 Thread via GitHub
pvary commented on code in PR #9011: URL: https://github.com/apache/iceberg/pull/9011#discussion_r1389343294 ## hive-metastore/src/main/java/org/apache/iceberg/hive/HiveMetastoreConnector.java: ## @@ -0,0 +1,173 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under o

Re: [PR] Hive: Refactor HiveTableOperations with common code for View. [iceberg]

2023-11-10 Thread via GitHub
pvary commented on code in PR #9011: URL: https://github.com/apache/iceberg/pull/9011#discussion_r1389341091 ## hive-metastore/src/main/java/org/apache/iceberg/hive/HiveMetastoreConnector.java: ## @@ -0,0 +1,173 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under o

Re: [PR] Hive: Refactor HiveTableOperations with common code for View. [iceberg]

2023-11-10 Thread via GitHub
pvary commented on code in PR #9011: URL: https://github.com/apache/iceberg/pull/9011#discussion_r1389340756 ## hive-metastore/src/main/java/org/apache/iceberg/hive/HiveMetastoreConnector.java: ## @@ -0,0 +1,173 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under o

Re: [PR] Hive: Refactor HiveTableOperations with common code for View. [iceberg]

2023-11-10 Thread via GitHub
pvary commented on code in PR #9011: URL: https://github.com/apache/iceberg/pull/9011#discussion_r1389339429 ## hive-metastore/src/main/java/org/apache/iceberg/hive/HiveMetastoreConnector.java: ## @@ -0,0 +1,173 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under o

[I] Merge into second commit when with no changes [iceberg]

2023-11-10 Thread via GitHub
andreacfm opened a new issue, #9024: URL: https://github.com/apache/iceberg/issues/9024 ### Apache Iceberg version 1.4.1 ### Query engine Spark ### Please describe the bug 🐞 When executing a “merge into” query the source data is committed even if not matche

Re: [PR] Hive: Refactor HiveTableOperations with common code for View. [iceberg]

2023-11-10 Thread via GitHub
pvary commented on code in PR #9011: URL: https://github.com/apache/iceberg/pull/9011#discussion_r1389307575 ## hive-metastore/src/main/java/org/apache/iceberg/hive/HiveMetastoreConnector.java: ## @@ -0,0 +1,173 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under o

Re: [I] insert to hive table with icberg table format is failing [iceberg]

2023-11-10 Thread via GitHub
sickdatascientist commented on issue #7840: URL: https://github.com/apache/iceberg/issues/7840#issuecomment-1805578530 @infa-ibannatt I am experiencing the same issue. I have a pseudo-distributed Hadoop 3.3.6 cluster with Hive 3.1.3 installed, and copied iceberg-hive-runtime.1.4.1.jar. Wha

Re: [PR] Spec: Fix view example [iceberg]

2023-11-10 Thread via GitHub
nastra merged PR #8966: URL: https://github.com/apache/iceberg/pull/8966

Re: [PR] Spec: Fix view example [iceberg]

2023-11-10 Thread via GitHub
nastra commented on code in PR #8966: URL: https://github.com/apache/iceberg/pull/8966#discussion_r1389281815 ## format/view-spec.md: ## @@ -239,12 +239,14 @@ s3://bucket/warehouse/default.db/event_agg/metadata/1-(uuid).metadata.json ``` Each change creates a new metada

Re: [PR] Spec: Fix view example [iceberg]

2023-11-10 Thread via GitHub
nastra commented on code in PR #8966: URL: https://github.com/apache/iceberg/pull/8966#discussion_r1389276940 ## format/view-spec.md: ## @@ -239,12 +239,14 @@ s3://bucket/warehouse/default.db/event_agg/metadata/1-(uuid).metadata.json ``` Each change creates a new metada

Re: [PR] Core: Enable column statistics filtering after planning [iceberg]

2023-11-10 Thread via GitHub
pvary commented on code in PR #8803: URL: https://github.com/apache/iceberg/pull/8803#discussion_r1389233962 ## core/src/main/java/org/apache/iceberg/BaseFile.java: ## @@ -185,13 +190,25 @@ public PartitionData copy() { this.partitionType = toCopy.partitionType; this.r

Re: [PR] Core: Enable column statistics filtering after planning [iceberg]

2023-11-10 Thread via GitHub
pvary commented on code in PR #8803: URL: https://github.com/apache/iceberg/pull/8803#discussion_r1389225623 ## core/src/main/java/org/apache/iceberg/BaseScan.java: ## @@ -165,6 +170,19 @@ public ThisT includeColumnStats() { return newRefinedScan(table, schema, context.sho

Re: [PR] Hive: Refactor TestHiveCatalog tests to use the core CatalogTests [iceberg]

2023-11-10 Thread via GitHub
nk1506 commented on code in PR #8918: URL: https://github.com/apache/iceberg/pull/8918#discussion_r1389225295 ## hive-metastore/src/main/java/org/apache/iceberg/hive/HiveCatalog.java: ## @@ -261,6 +261,12 @@ public void renameTable(TableIdentifier from, TableIdentifier original

Re: [PR] Core: Enable column statistics filtering after planning [iceberg]

2023-11-10 Thread via GitHub
pvary commented on code in PR #8803: URL: https://github.com/apache/iceberg/pull/8803#discussion_r1389220119 ## api/src/main/java/org/apache/iceberg/ContentFile.java: ## @@ -165,6 +166,20 @@ default Long fileSequenceNumber() { */ F copyWithoutStats(); + /** + * Copie

Re: [PR] Add list-refs cli command [iceberg-python]

2023-11-10 Thread via GitHub
Fokko commented on code in PR #137: URL: https://github.com/apache/iceberg-python/pull/137#discussion_r1389203952 ## pyiceberg/cli/console.py: ## @@ -372,3 +377,50 @@ def table(ctx: Context, identifier: str, property_name: str) -> None: # noqa: F ctx.exit(1) else

Re: [PR] Core: Enable column statistics filtering after planning [iceberg]

2023-11-10 Thread via GitHub
pvary commented on code in PR #8803: URL: https://github.com/apache/iceberg/pull/8803#discussion_r1389171088 ## api/src/main/java/org/apache/iceberg/Scan.java: ## @@ -77,6 +77,21 @@ public interface Scan> { */ ThisT includeColumnStats(); + /** + * Create a new scan f

Re: [I] iceberg reports an error after upgrading to 1.4.2 [iceberg]

2023-11-10 Thread via GitHub
nastra commented on issue #9018: URL: https://github.com/apache/iceberg/issues/9018#issuecomment-1805326371 @huanghanyu-vungle can you also please share your entire catalog configuration?

Re: [I] NoSuchMethodError: 'scala.Option org.apache.spark.sql.connector.expressions.BucketTransform [iceberg]

2023-11-10 Thread via GitHub
nastra commented on issue #9023: URL: https://github.com/apache/iceberg/issues/9023#issuecomment-1805325131 @DeelFeel it appears you're using `iceberg-spark-runtime-3.2_2.12` which is Spark **3.2** while you're also using Spark **3.3.0** via `org.apache.spark spark-core_2.12 3.3.0`, resulti
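
The mismatch described in the comment above (an `iceberg-spark-runtime-3.2_2.12` artifact on the classpath together with Spark 3.3.0) can be caught before deployment. The following is only a minimal sketch, assuming the Spark major.minor version is encoded in the artifact name the way the comment shows; the helper names `spark_minor_from_runtime` and `is_compatible` are hypothetical and not part of Iceberg or Spark.

```python
import re


def spark_minor_from_runtime(artifact: str) -> str:
    """Return the Spark major.minor version encoded in a runtime artifact name.

    Hypothetical helper: assumes names of the form
    iceberg-spark-runtime-<spark major.minor>_<scala major.minor>.
    """
    match = re.fullmatch(r"iceberg-spark-runtime-(\d+\.\d+)_(\d+\.\d+)", artifact)
    if match is None:
        raise ValueError(f"unrecognized artifact name: {artifact}")
    return match.group(1)


def is_compatible(artifact: str, spark_version: str) -> bool:
    """Compare the artifact's Spark major.minor with the running Spark version."""
    running_minor = ".".join(spark_version.split(".")[:2])
    return spark_minor_from_runtime(artifact) == running_minor


if __name__ == "__main__":
    # The combination reported in the issue: a 3.2 runtime with Spark 3.3.0.
    print(is_compatible("iceberg-spark-runtime-3.2_2.12", "3.3.0"))
```

A check like this only compares version strings; it does not inspect the classpath, so the artifact name and Spark version have to be supplied by the build or deployment script.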

Re: [PR] Nessie: reimplement namespace operations [iceberg]

2023-11-10 Thread via GitHub
nastra commented on PR #8857: URL: https://github.com/apache/iceberg/pull/8857#issuecomment-1805311895 @adutra have you checked whether all of these changes are still working with the Nessie integration in Trino/Presto?