Re: [PR] Build: Bump pyarrow from 14.0.0 to 14.0.1 [iceberg-python]

2023-11-09 Thread via GitHub
Fokko merged PR #136: URL: https://github.com/apache/iceberg-python/pull/136 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@iceberg.

Re: [I] NoSuchMethodError: 'scala.Option org.apache.spark.sql.connector.expressions.BucketTransform [iceberg]

2023-11-09 Thread via GitHub
DeelFeel commented on issue #9023: URL: https://github.com/apache/iceberg/issues/9023#issuecomment-1805231453 finalDataFrame .sortWithinPartitions($"dt", $"dh", $"event_type") .writeTo(tableName) .option(SparkWriteOptions.SPARK_MERGE_SCHEMA, "tru

Re: [PR] Add log entry for bloom filter [iceberg]

2023-11-09 Thread via GitHub
amogh-jahagirdar commented on code in PR #9010: URL: https://github.com/apache/iceberg/pull/9010#discussion_r1389011616 ## parquet/src/main/java/org/apache/iceberg/parquet/ParquetBloomRowGroupFilter.java: ## @@ -114,10 +119,17 @@ private boolean eval( Set filterRefs =

[I] NoSuchMethodError: 'scala.Option org.apache.spark.sql.connector.expressions.BucketTransform [iceberg]

2023-11-09 Thread via GitHub
DeelFeel opened a new issue, #9023: URL: https://github.com/apache/iceberg/issues/9023 ### Query engine Spark 3.3.0 iceberg 1.4.1 ### Question As the title suggests, I had such trouble. I've been adjusting all day. Can't fix it. Please save the baby. `23/

[PR] Add list-refs cli command [iceberg-python]

2023-11-09 Thread via GitHub
amogh-jahagirdar opened a new pull request, #137: URL: https://github.com/apache/iceberg-python/pull/137 Same as https://github.com/apache/iceberg/pull/8570, forgot to raise the PR here ![list-refs](https://github.com/apache/iceberg-python/assets/87500546/d41bc552-1643-435f-ac0c-3710137

Re: [PR] Spark 3.5: Extend action for rewriting manifests to support deletes [iceberg]

2023-11-09 Thread via GitHub
ajantha-bhat commented on code in PR #9020: URL: https://github.com/apache/iceberg/pull/9020#discussion_r1388758670 ## spark/v3.5/spark/src/main/java/org/apache/iceberg/spark/actions/RewriteManifestsSparkAction.java: ## @@ -159,34 +169,49 @@ public RewriteManifests.Result execut

Re: [I] to_pandas() API which converts iceberg table scan to a pd.DataFrame will lost datetime data type and row order [iceberg-python]

2023-11-09 Thread via GitHub
zeddit commented on issue #132: URL: https://github.com/apache/iceberg-python/issues/132#issuecomment-1805097241 @Fokko It seems that Trino always using the fast-append operation is not a good choice. Is there any documentation describing how each engine modifies manifests, and could you giv

Re: [I] Substitue in memory data struct's timestamp type for DataTime rather i64 to simplify usage. [iceberg-rust]

2023-11-09 Thread via GitHub
liurenjie1024 commented on issue #90: URL: https://github.com/apache/iceberg-rust/issues/90#issuecomment-1805050925 > Shall we use ```chrono::DateTime``` here? I think it's great.

Re: [I] table created by pyiceberg could not interoperate well with trino [iceberg-python]

2023-11-09 Thread via GitHub
zeddit commented on issue #134: URL: https://github.com/apache/iceberg-python/issues/134#issuecomment-1805030211 I was using Trino version 423 before. After I upgraded it to 432, the table created in pyiceberg could be accessed by Trino.

Re: [I] table created by pyiceberg could not interoperate well with trino [iceberg-python]

2023-11-09 Thread via GitHub
zeddit closed issue #134: table created by pyiceberg could not interoperate well with trino URL: https://github.com/apache/iceberg-python/issues/134

Re: [I] Duplicate file name in Iceberg's metadata [iceberg]

2023-11-09 Thread via GitHub
amogh-jahagirdar commented on issue #8953: URL: https://github.com/apache/iceberg/issues/8953#issuecomment-1805005236 That being said, I do see there have been a few issues reported that are very similar as @github-raphael-douyere pointed out. I'm looking into this

Re: [I] Duplicate file name in Iceberg's metadata [iceberg]

2023-11-09 Thread via GitHub
amogh-jahagirdar commented on issue #8953: URL: https://github.com/apache/iceberg/issues/8953#issuecomment-1805003446 It looks like Spark creates a data writer per task, so we should be good there https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/execu

Re: [I] Duplicate file name in Iceberg's metadata [iceberg]

2023-11-09 Thread via GitHub
amogh-jahagirdar commented on issue #8953: URL: https://github.com/apache/iceberg/issues/8953#issuecomment-1805001659 Looking at the code, this shouldn't happen but would need to check more deeply. We create an `OutputFileFactory` per writer, https://github.com/apache/iceberg/blob/main/spa

Re: [I] manifest exception [iceberg]

2023-11-09 Thread via GitHub
wangtaohz commented on issue #8994: URL: https://github.com/apache/iceberg/issues/8994#issuecomment-1805001260 I guess the missing data file was caused by `deleteOrphanFiles set olderThan(System.currentTimeMillis())`, which mistakenly treated the uncommitted data files as orphan files and d
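
The failure mode described in this comment can be illustrated with a small simulation. This is only a sketch under stated assumptions, not Iceberg's implementation: `StoredFile`, `find_orphans`, the file paths, and the three-day cutoff are all hypothetical names and values chosen for illustration. It shows why a cutoff equal to the current time can classify files from a not-yet-committed write as orphans, while a cutoff further in the past leaves them alone.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StoredFile:
    """Hypothetical record of a file found in table storage."""
    path: str
    modified_ms: int  # last-modified timestamp in epoch milliseconds


def find_orphans(stored, referenced_paths, older_than_ms):
    """Return paths that are unreferenced AND last modified before the cutoff."""
    return [
        f.path
        for f in stored
        if f.path not in referenced_paths and f.modified_ms < older_than_ms
    ]


# Fixed "current time" so the example is deterministic (assumed value).
now_ms = 1_700_000_000_000
three_days_ms = 3 * 24 * 60 * 60 * 1000

stored = [
    StoredFile("data/committed.parquet", now_ms - 10 * three_days_ms),
    StoredFile("data/old-orphan.parquet", now_ms - 10 * three_days_ms),
    # Written one minute ago by a job that has not committed yet,
    # so no table metadata references it.
    StoredFile("data/in-flight.parquet", now_ms - 60_000),
]
referenced = {"data/committed.parquet"}

# Cutoff == now: the in-flight file is unreferenced and older than the cutoff.
aggressive = find_orphans(stored, referenced, older_than_ms=now_ms)

# Cutoff a few days back: the in-flight file is too recent to be considered.
conservative = find_orphans(stored, referenced, older_than_ms=now_ms - three_days_ms)

print(aggressive)
print(conservative)
```

With the cutoff set to the current time, the uncommitted file is selected for deletion, and a later commit would then reference a data file that no longer exists. The older cutoff only selects the genuinely abandoned file.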

[I] Flink SQL Select ORDER BY Loss Some [iceberg]

2023-11-09 Thread via GitHub
a8356555 opened a new issue, #9022: URL: https://github.com/apache/iceberg/issues/9022 ### Apache Iceberg version 1.4.0 ### Query engine Flink ### Please describe the bug 🐞 FlinkSQL Read From Iceberg using ORDER BY clause caused some data loss. Flink vers

Re: [PR] Core: Enable column statistics filtering after planning [iceberg]

2023-11-09 Thread via GitHub
aokolnychyi commented on code in PR #8803: URL: https://github.com/apache/iceberg/pull/8803#discussion_r1388806220 ## api/src/main/java/org/apache/iceberg/ContentFile.java: ## @@ -165,6 +166,20 @@ default Long fileSequenceNumber() { */ F copyWithoutStats(); + /** + *

Re: [PR] Core: Enable column statistics filtering after planning [iceberg]

2023-11-09 Thread via GitHub
aokolnychyi commented on code in PR #8803: URL: https://github.com/apache/iceberg/pull/8803#discussion_r1388756877 ## api/src/main/java/org/apache/iceberg/ContentFile.java: ## @@ -165,6 +166,20 @@ default Long fileSequenceNumber() { */ F copyWithoutStats(); + /** + *

[PR] Docs: Update spark-queries.md [iceberg]

2023-11-09 Thread via GitHub
ymZhao1001 opened a new pull request, #9021: URL: https://github.com/apache/iceberg/pull/9021 org.apache.spark.sql.DataFrameReader does not take parameters

Re: [PR] Spark 3.5: Extend action for rewriting manifests to support deletes [iceberg]

2023-11-09 Thread via GitHub
ajantha-bhat commented on code in PR #9020: URL: https://github.com/apache/iceberg/pull/9020#discussion_r1388793385 ## spark/v3.5/spark/src/main/java/org/apache/iceberg/spark/actions/RewriteManifestsSparkAction.java: ## @@ -159,34 +169,49 @@ public RewriteManifests.Result execut

Re: [PR] Spark 3.5: Extend action for rewriting manifests to support deletes [iceberg]

2023-11-09 Thread via GitHub
aokolnychyi commented on code in PR #9020: URL: https://github.com/apache/iceberg/pull/9020#discussion_r1388792950 ## spark/v3.5/spark/src/test/java/org/apache/iceberg/spark/source/TestSparkContentFiles.java: ## @@ -161,40 +173,132 @@ private void checkSparkDataFile(Table table)

Re: [PR] Spark 3.5: Extend action for rewriting manifests to support deletes [iceberg]

2023-11-09 Thread via GitHub
aokolnychyi commented on code in PR #9020: URL: https://github.com/apache/iceberg/pull/9020#discussion_r1388792770 ## spark/v3.5/spark/src/main/java/org/apache/iceberg/spark/actions/RewriteManifestsSparkAction.java: ## @@ -215,41 +240,45 @@ private Dataset buildManifestEntryDF(

Re: [PR] Spark 3.5: Extend action for rewriting manifests to support deletes [iceberg]

2023-11-09 Thread via GitHub
aokolnychyi commented on code in PR #9020: URL: https://github.com/apache/iceberg/pull/9020#discussion_r1388792477 ## spark/v3.5/spark/src/main/java/org/apache/iceberg/spark/actions/RewriteManifestsSparkAction.java: ## @@ -159,34 +169,49 @@ public RewriteManifests.Result execute

Re: [PR] Spark 3.5: Extend action for rewriting manifests to support deletes [iceberg]

2023-11-09 Thread via GitHub
aokolnychyi commented on code in PR #9020: URL: https://github.com/apache/iceberg/pull/9020#discussion_r1388791747 ## spark/v3.5/spark/src/main/java/org/apache/iceberg/spark/actions/RewriteManifestsSparkAction.java: ## @@ -264,14 +293,15 @@ private U withReusableDS(Dataset ds,

Re: [PR] Spark 3.5: Extend action for rewriting manifests to support deletes [iceberg]

2023-11-09 Thread via GitHub
ajantha-bhat commented on code in PR #9020: URL: https://github.com/apache/iceberg/pull/9020#discussion_r1388776171 ## spark/v3.5/spark/src/test/java/org/apache/iceberg/spark/actions/TestRewriteManifestsAction.java: ## @@ -649,6 +659,243 @@ public void testRewriteLargeManifests

Re: [PR] Spark 3.5: Extend action for rewriting manifests to support deletes [iceberg]

2023-11-09 Thread via GitHub
ajantha-bhat commented on PR #9020: URL: https://github.com/apache/iceberg/pull/9020#issuecomment-1804915887 > @ajantha-bhat, would you have time to review and test to see if it works for your needs? Already on it.

Re: [PR] Spark 3.5: Extend action for rewriting manifests to support deletes [iceberg]

2023-11-09 Thread via GitHub
aokolnychyi commented on PR #9020: URL: https://github.com/apache/iceberg/pull/9020#issuecomment-1804906071 Oh, I did not know that one existed. @ajantha-bhat, would you have time to review and test to see if it works for your needs?

Re: [PR] Spark 3.5: Extend action for rewriting manifests to support deletes [iceberg]

2023-11-09 Thread via GitHub
ajantha-bhat commented on PR #9020: URL: https://github.com/apache/iceberg/pull/9020#issuecomment-1804900328 @aokolnychyi: Can we please add this to PR description? Fixes: https://github.com/apache/iceberg/issues/6375

Re: [I] iceberg reports an error after upgrading to 1.4.2 [iceberg]

2023-11-09 Thread via GitHub
amogh-jahagirdar commented on issue #9018: URL: https://github.com/apache/iceberg/issues/9018#issuecomment-1804898116 Details on the query you're running would be helpful if possible. Also is it specific to 1.4.2 or are older versions working for you?

Re: [PR] Spark 3.5: Extend action for rewriting manifests to support deletes [iceberg]

2023-11-09 Thread via GitHub
aokolnychyi commented on code in PR #9020: URL: https://github.com/apache/iceberg/pull/9020#discussion_r1388703856 ## spark/v3.5/spark/src/main/java/org/apache/iceberg/spark/actions/RewriteManifestsSparkAction.java: ## @@ -339,18 +369,82 @@ private ManifestWriterFactory manifest

Re: [PR] Add log entry for bloom filter [iceberg]

2023-11-09 Thread via GitHub
huaxingao commented on code in PR #9010: URL: https://github.com/apache/iceberg/pull/9010#discussion_r1388735548 ## parquet/src/main/java/org/apache/iceberg/parquet/ParquetBloomRowGroupFilter.java: ## @@ -120,6 +124,9 @@ private boolean eval( return ROWS_MIGHT_MATCH;

Re: [PR] Add log entry for bloom filter [iceberg]

2023-11-09 Thread via GitHub
huaxingao commented on code in PR #9010: URL: https://github.com/apache/iceberg/pull/9010#discussion_r1388734801 ## parquet/src/main/java/org/apache/iceberg/parquet/ParquetBloomRowGroupFilter.java: ## @@ -120,6 +124,9 @@ private boolean eval( return ROWS_MIGHT_MATCH;

Re: [PR] Docs: Fix Javadoc for ManifestFile [iceberg]

2023-11-09 Thread via GitHub
aokolnychyi commented on PR #9016: URL: https://github.com/apache/iceberg/pull/9016#issuecomment-1804884792 Thanks, @dramaticlly @flyrain!

Re: [PR] Docs: Fix Javadoc for ManifestFile [iceberg]

2023-11-09 Thread via GitHub
aokolnychyi merged PR #9016: URL: https://github.com/apache/iceberg/pull/9016

Re: [I] Iceberg / Spark writing to s3 warehouse : Unable to load region from any of the providers in the chain software [iceberg]

2023-11-09 Thread via GitHub
github-actions[bot] commented on issue #7570: URL: https://github.com/apache/iceberg/issues/7570#issuecomment-1804872449 This issue has been automatically marked as stale because it has been open for 180 days with no activity. It will be closed in next 14 days if no further activity occurs.

Re: [PR] Core: Avro writers use BlockingBinaryEncoder to enable array/map size calculations. [iceberg]

2023-11-09 Thread via GitHub
aokolnychyi commented on PR #8625: URL: https://github.com/apache/iceberg/pull/8625#issuecomment-1804868833 I'd love to take a look early next week.

Re: [PR] Add dependabot to automatically update the site [iceberg]

2023-11-09 Thread via GitHub
aokolnychyi commented on PR #9004: URL: https://github.com/apache/iceberg/pull/9004#issuecomment-1804867762 Sorry, I misinterpreted the name of the PR.

Re: [PR] Add log entry for bloom filter [iceberg]

2023-11-09 Thread via GitHub
aokolnychyi commented on code in PR #9010: URL: https://github.com/apache/iceberg/pull/9010#discussion_r1388718109 ## parquet/src/main/java/org/apache/iceberg/parquet/ParquetBloomRowGroupFilter.java: ## @@ -120,6 +124,9 @@ private boolean eval( return ROWS_MIGHT_MATCH;

Re: [PR] Add log entry for bloom filter [iceberg]

2023-11-09 Thread via GitHub
aokolnychyi commented on code in PR #9010: URL: https://github.com/apache/iceberg/pull/9010#discussion_r1388717988 ## parquet/src/main/java/org/apache/iceberg/parquet/ParquetBloomRowGroupFilter.java: ## @@ -120,6 +124,9 @@ private boolean eval( return ROWS_MIGHT_MATCH;

Re: [PR] Add log entry for bloom filter [iceberg]

2023-11-09 Thread via GitHub
huaxingao commented on code in PR #9010: URL: https://github.com/apache/iceberg/pull/9010#discussion_r1388716831 ## parquet/src/main/java/org/apache/iceberg/parquet/ParquetBloomRowGroupFilter.java: ## @@ -48,8 +48,12 @@ import org.apache.parquet.schema.LogicalTypeAnnotation.De

Re: [PR] Add log entry for bloom filter [iceberg]

2023-11-09 Thread via GitHub
aokolnychyi commented on code in PR #9010: URL: https://github.com/apache/iceberg/pull/9010#discussion_r1388712294 ## parquet/src/main/java/org/apache/iceberg/parquet/ParquetBloomRowGroupFilter.java: ## @@ -48,8 +48,12 @@ import org.apache.parquet.schema.LogicalTypeAnnotation.

Re: [PR] Spark 3.5: Extend action for rewriting manifests to support deletes [iceberg]

2023-11-09 Thread via GitHub
aokolnychyi commented on code in PR #9020: URL: https://github.com/apache/iceberg/pull/9020#discussion_r1388703237 ## spark/v3.5/spark/src/main/java/org/apache/iceberg/spark/SparkContentFile.java: ## @@ -0,0 +1,243 @@ +/* + * Licensed to the Apache Software Foundation (ASF) unde

Re: [PR] Spark 3.4: Use rolling manifest writers when optimizing metadata [iceberg]

2023-11-09 Thread via GitHub
aokolnychyi commented on PR #9019: URL: https://github.com/apache/iceberg/pull/9019#issuecomment-1804841191 Thanks, @RussellSpitzer!

Re: [PR] Spark 3.4: Use rolling manifest writers when optimizing metadata [iceberg]

2023-11-09 Thread via GitHub
aokolnychyi merged PR #9019: URL: https://github.com/apache/iceberg/pull/9019

Re: [PR] Core: Support replacing delete manifests [iceberg]

2023-11-09 Thread via GitHub
aokolnychyi commented on PR #9000: URL: https://github.com/apache/iceberg/pull/9000#issuecomment-1804811157 Thank you, @singhpk234 @RussellSpitzer!

Re: [PR] Core: Support replacing delete manifests [iceberg]

2023-11-09 Thread via GitHub
aokolnychyi merged PR #9000: URL: https://github.com/apache/iceberg/pull/9000

Re: [PR] Spark 3.5: Fix rewriting manifests for evolved unpartitioned V1 tables [iceberg]

2023-11-09 Thread via GitHub
aokolnychyi merged PR #9015: URL: https://github.com/apache/iceberg/pull/9015

Re: [PR] Spark 3.5: Fix rewriting manifests for evolved unpartitioned V1 tables [iceberg]

2023-11-09 Thread via GitHub
aokolnychyi commented on PR #9015: URL: https://github.com/apache/iceberg/pull/9015#issuecomment-1804730712 Thank you, @singhpk234 @flyrain!

[PR] Spark 3.4: Use rolling manifest writers when optimizing metadata [iceberg]

2023-11-09 Thread via GitHub
aokolnychyi opened a new pull request, #9019: URL: https://github.com/apache/iceberg/pull/9019 This PR cherry-picks PR #8972 to Spark 3.4.

Re: [PR] Core: Support replacing delete manifests [iceberg]

2023-11-09 Thread via GitHub
aokolnychyi closed pull request #9000: Core: Support replacing delete manifests URL: https://github.com/apache/iceberg/pull/9000

Re: [PR] Shift site build to use monorepo and gh-pages [iceberg]

2023-11-09 Thread via GitHub
Fokko commented on code in PR #8919: URL: https://github.com/apache/iceberg/pull/8919#discussion_r1388598336 ## .github/workflows/site-ci.yml: ## @@ -0,0 +1,40 @@ +# +# Licensed to the Apache Software Foundation (ASF) under one +# or more contributor license agreements. See the

Re: [PR] Shift site build to use monorepo and gh-pages [iceberg]

2023-11-09 Thread via GitHub
Fokko commented on code in PR #8919: URL: https://github.com/apache/iceberg/pull/8919#discussion_r1388593968 ## site/mkdocs.yml: ## @@ -23,33 +23,37 @@ theme: logo: assets/images/iceberg-logo-icon.png favicon: assets/images/favicon-96x96.png features: +- content.tab

Re: [PR] Shift site build to use monorepo and gh-pages [iceberg]

2023-11-09 Thread via GitHub
Fokko commented on code in PR #8919: URL: https://github.com/apache/iceberg/pull/8919#discussion_r1388588215 ## site/README.md: ## @@ -35,107 +35,101 @@ In MkDocs, the [`docs_dir`](https://www.mkdocs.org/user-guide/configuration/#doc ### Iceberg docs layout -In the Iceberg

Re: [PR] Shift site build to use monorepo and gh-pages [iceberg]

2023-11-09 Thread via GitHub
Fokko commented on PR #8919: URL: https://github.com/apache/iceberg/pull/8919#issuecomment-1804689906 Niiice! Got the `make serve` running in one go now: https://github.com/apache/iceberg/assets/1134248/4d8dd416-48d2-4281-b9e6-f060d6eb8dbd

Re: [PR] Nessie: reimplement namespace operations [iceberg]

2023-11-09 Thread via GitHub
adutra commented on code in PR #8857: URL: https://github.com/apache/iceberg/pull/8857#discussion_r1388462986 ## nessie/src/main/java/org/apache/iceberg/nessie/NessieIcebergClient.java: ## @@ -181,133 +186,239 @@ public IcebergTable table(TableIdentifier tableIdentifier) { }

Re: [PR] Nessie: reimplement namespace operations [iceberg]

2023-11-09 Thread via GitHub
adutra commented on code in PR #8857: URL: https://github.com/apache/iceberg/pull/8857#discussion_r1388456396 ## nessie/src/main/java/org/apache/iceberg/nessie/NessieIcebergClient.java: ## @@ -181,133 +186,239 @@ public IcebergTable table(TableIdentifier tableIdentifier) { }

Re: [PR] Nessie: reimplement namespace operations [iceberg]

2023-11-09 Thread via GitHub
nastra commented on code in PR #8857: URL: https://github.com/apache/iceberg/pull/8857#discussion_r1388414728 ## nessie/src/main/java/org/apache/iceberg/nessie/NessieIcebergClient.java: ## @@ -181,133 +186,239 @@ public IcebergTable table(TableIdentifier tableIdentifier) { }

Re: [PR] Nessie: reimplement namespace operations [iceberg]

2023-11-09 Thread via GitHub
nastra commented on code in PR #8857: URL: https://github.com/apache/iceberg/pull/8857#discussion_r1388413439 ## nessie/src/main/java/org/apache/iceberg/nessie/NessieIcebergClient.java: ## @@ -181,133 +186,239 @@ public IcebergTable table(TableIdentifier tableIdentifier) { }

Re: [PR] Nessie: reimplement namespace operations [iceberg]

2023-11-09 Thread via GitHub
nastra commented on code in PR #8857: URL: https://github.com/apache/iceberg/pull/8857#discussion_r1388412737 ## nessie/src/main/java/org/apache/iceberg/nessie/NessieIcebergClient.java: ## @@ -181,133 +186,239 @@ public IcebergTable table(TableIdentifier tableIdentifier) { }

Re: [PR] Nessie: reimplement namespace operations [iceberg]

2023-11-09 Thread via GitHub
nastra commented on code in PR #8857: URL: https://github.com/apache/iceberg/pull/8857#discussion_r1388407522 ## nessie/src/main/java/org/apache/iceberg/nessie/NessieIcebergClient.java: ## @@ -181,133 +186,239 @@ public IcebergTable table(TableIdentifier tableIdentifier) { }

Re: [PR] Add Description on Using a Separate Authorization Server [iceberg]

2023-11-09 Thread via GitHub
syun64 commented on PR #8998: URL: https://github.com/apache/iceberg/pull/8998#issuecomment-1804303632 pinging @danielcweeks for review - thank you!

Re: [PR] Parquet: Remove the row position since parquet row group has it natively [iceberg]

2023-11-09 Thread via GitHub
ricardopereira33 commented on PR #6056: URL: https://github.com/apache/iceberg/pull/6056#issuecomment-1804297603 Hi @flyrain @wypoon! Are there any updates regarding this issue? We have a case where we write with Trino, and then we do data file compaction (or even just read a file),

Re: [PR] Override useCommitCoordinator to false [iceberg]

2023-11-09 Thread via GitHub
huaxingao commented on PR #9017: URL: https://github.com/apache/iceberg/pull/9017#issuecomment-1804257088 @aokolnychyi I just override `useCommitCoordinator` in `PositionDeltaBatchWrite` inside `SparkPositionDeltaWrite`. `SparkShufflingDataRewriter` and `OrderedWrite` and not

Re: [PR] Core: Support replacing delete manifests [iceberg]

2023-11-09 Thread via GitHub
aokolnychyi commented on code in PR #9000: URL: https://github.com/apache/iceberg/pull/9000#discussion_r1388322693 ## core/src/test/java/org/apache/iceberg/TestRewriteManifests.java: ## @@ -1105,6 +1108,530 @@ public void testRewriteManifestsOnBranchUnsupported() {

Re: [PR] Core: Support replacing delete manifests [iceberg]

2023-11-09 Thread via GitHub
RussellSpitzer commented on code in PR #9000: URL: https://github.com/apache/iceberg/pull/9000#discussion_r1388316731 ## core/src/test/java/org/apache/iceberg/TestRewriteManifests.java: ## @@ -1105,6 +1108,530 @@ public void testRewriteManifestsOnBranchUnsupported() {

Re: [PR] Nessie: reimplement namespace operations [iceberg]

2023-11-09 Thread via GitHub
adutra commented on PR #8857: URL: https://github.com/apache/iceberg/pull/8857#issuecomment-1804180229 Thanks @nastra for the review! I think I addressed all your remarks by now.

Re: [PR] Nessie: reimplement namespace operations [iceberg]

2023-11-09 Thread via GitHub
adutra commented on code in PR #8857: URL: https://github.com/apache/iceberg/pull/8857#discussion_r1388272964 ## nessie/src/main/java/org/apache/iceberg/nessie/NessieIcebergClient.java: ## @@ -181,133 +186,239 @@ public IcebergTable table(TableIdentifier tableIdentifier) { }

Re: [PR] Override useCommitCoordinator to false [iceberg]

2023-11-09 Thread via GitHub
aokolnychyi commented on PR #9017: URL: https://github.com/apache/iceberg/pull/9017#issuecomment-1804174776 @huaxingao, shall we also override these places? ``` ```

[PR] Override useCommitCoordinator to false [iceberg]

2023-11-09 Thread via GitHub
huaxingao opened a new pull request, #9017: URL: https://github.com/apache/iceberg/pull/9017 In Spark's `BatchWrite.java`, `useCommitCoordinator` defaults to true. This PR overrides `useCommitCoordinator` to false, so we can rewrite a task after it fails. Here is the [issue](https://github.

Re: [PR] Spark 3.5: Fix rewriting manifests for evolved unpartitioned V1 tables [iceberg]

2023-11-09 Thread via GitHub
aokolnychyi commented on PR #9015: URL: https://github.com/apache/iceberg/pull/9015#issuecomment-1804100316 @singhpk234 @flyrain, could you check this one?

[PR] Spark 3.5: Fix rewriting manifests for evolved unpartitioned V1 tables [iceberg]

2023-11-09 Thread via GitHub
aokolnychyi opened a new pull request, #9015: URL: https://github.com/apache/iceberg/pull/9015 This PR fixes our action for rewriting manifests for evolved unpartitioned V1 tables. There is no need to repartition by range and locally order manifest entries if the spec contains only void tra

Re: [I] Add view support for Hive catalog [iceberg]

2023-11-09 Thread via GitHub
deniskuzZ commented on issue #8698: URL: https://github.com/apache/iceberg/issues/8698#issuecomment-1804095465 > @deniskuzZ: Is there something like this ongoing in the Hive codebase? Hi Peter, we have iceberg MV backed by iceberg table support in Hive where query definitions a

Re: [PR] Add log entry for bloom filter [iceberg]

2023-11-09 Thread via GitHub
huaxingao commented on PR #9010: URL: https://github.com/apache/iceberg/pull/9010#issuecomment-1804081291 cc @RussellSpitzer

Re: [PR] Add dependabot to automatically update the site [iceberg]

2023-11-09 Thread via GitHub
Fokko commented on PR #9004: URL: https://github.com/apache/iceberg/pull/9004#issuecomment-1804078908 Hey @aokolnychyi This PR means that the [`mkdocs` dependencies](https://github.com/apache/iceberg/blob/main/site/requirements.txt) will be kept up to date. @bitsondatadev is working

Re: [PR] Support force option on RegisterTable procedure [iceberg]

2023-11-09 Thread via GitHub
rushilshah1 commented on PR #5327: URL: https://github.com/apache/iceberg/pull/5327#issuecomment-1804076381 I am interested in this functionality as well! Are there still plans to add this?

Re: [PR] Add dependabot to automatically update the site [iceberg]

2023-11-09 Thread via GitHub
aokolnychyi commented on PR #9004: URL: https://github.com/apache/iceberg/pull/9004#issuecomment-1804073272 Does it mean the spec will be published prior to the release?

Re: [I] Substitue in memory data struct's timestamp type for DataTime rather i64 to simplify usage. [iceberg-rust]

2023-11-09 Thread via GitHub
my-vegetable-has-exploded commented on issue #90: URL: https://github.com/apache/iceberg-rust/issues/90#issuecomment-1804052188 Shall we use DateTime here?

Re: [PR] Core: Support replacing delete manifests [iceberg]

2023-11-09 Thread via GitHub
aokolnychyi commented on code in PR #9000: URL: https://github.com/apache/iceberg/pull/9000#discussion_r1388159792 ## core/src/test/java/org/apache/iceberg/TestRewriteManifests.java: ## @@ -1105,6 +1108,499 @@ public void testRewriteManifestsOnBranchUnsupported() {

Re: [PR] Core: Use InMemoryCatalog as backend catalog [iceberg]

2023-11-09 Thread via GitHub
nastra commented on PR #9014: URL: https://github.com/apache/iceberg/pull/9014#issuecomment-1803965545 thanks for the reviews @bryanck and @Fokko

Re: [PR] Core: Use InMemoryCatalog as backend catalog [iceberg]

2023-11-09 Thread via GitHub
nastra merged PR #9014: URL: https://github.com/apache/iceberg/pull/9014

Re: [I] to_pandas() API which converts iceberg table scan to a pd.DataFrame will lost datetime data type and row order [iceberg-python]

2023-11-09 Thread via GitHub
zeddit commented on issue #132: URL: https://github.com/apache/iceberg-python/issues/132#issuecomment-1803817044 @Fokko many thanks for your help. I ran some more experiments following your advice. I wonder whether I understand `global sort` correctly. I think it's a way t

Re: [PR] Core: Support replacing delete manifests [iceberg]

2023-11-09 Thread via GitHub
RussellSpitzer commented on code in PR #9000: URL: https://github.com/apache/iceberg/pull/9000#discussion_r1387975982 ## core/src/test/java/org/apache/iceberg/TestRewriteManifests.java: ## @@ -1105,6 +1108,499 @@ public void testRewriteManifestsOnBranchUnsupported() {

Re: [PR] Data: TableMigrationUtil shouldn't use path to get partition values [iceberg]

2023-11-09 Thread via GitHub
gaborkaszab closed pull request #7738: Data: TableMigrationUtil shouldn't use path to get partition values URL: https://github.com/apache/iceberg/pull/7738

Re: [PR] Core: Improve view/table detection when replacing a table/view [iceberg]

2023-11-09 Thread via GitHub
nastra commented on PR #9012: URL: https://github.com/apache/iceberg/pull/9012#issuecomment-1803644558 The test failures are because the backend catalog in `TestRESTCatalog` doesn't support views, hence I switched it to `InMemoryCatalog` in https://github.com/apache/iceberg/pull/9014

[PR] Core: Use InMemoryCatalog as backend catalog [iceberg]

2023-11-09 Thread via GitHub
nastra opened a new pull request, #9014: URL: https://github.com/apache/iceberg/pull/9014 (no comment)

Re: [I] Creating a hive Managed Table? [iceberg]

2023-11-09 Thread via GitHub
melin commented on issue #9013: URL: https://github.com/apache/iceberg/issues/9013#issuecomment-1803627550 If a user drops a table, no data is deleted and the storage space remains occupied. Deleting the files from S3 separately is not very convenient. If catalog supports adding parameters, set manage

Re: [I] Creating a hive Managed Table? [iceberg]

2023-11-09 Thread via GitHub
nastra commented on issue #9013: URL: https://github.com/apache/iceberg/issues/9013#issuecomment-1803614187 @melin thanks for opening the issue, however it's not really clear what you're asking for and what problem you're facing. Could you please re-phrase your question so that we can bette

Re: [I] table created by pyiceberg could not interoperate well with trino [iceberg-python]

2023-11-09 Thread via GitHub
Fokko commented on issue #134: URL: https://github.com/apache/iceberg-python/issues/134#issuecomment-1803574403 @zeddit which version of Trino are you using? I don't see any issues with the latest version: ``` trino> describe "fokkos-warehouse"."default"."table1"; Column | Type

[I] Creating a hive Managed Table? [iceberg]

2023-11-09 Thread via GitHub
melin opened a new issue, #9013: URL: https://github.com/apache/iceberg/issues/9013 ### Query engine spark ### Question The iceberg table is created using spark sql, and the hms metadata is the EXTERNAL table, which I hope is the managed table

Re: [PR] Nessie: Support views for NessieCatalog [iceberg]

2023-11-09 Thread via GitHub
ajantha-bhat commented on code in PR #8909: URL: https://github.com/apache/iceberg/pull/8909#discussion_r1387790517 ## nessie/src/main/java/org/apache/iceberg/nessie/NessieViewOperations.java: ## @@ -0,0 +1,159 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under on

Re: [PR] Nessie: Support views for NessieCatalog [iceberg]

2023-11-09 Thread via GitHub
ajantha-bhat commented on code in PR #8909: URL: https://github.com/apache/iceberg/pull/8909#discussion_r1387788306 ## nessie/src/main/java/org/apache/iceberg/nessie/NessieIcebergClient.java: ## @@ -540,4 +630,72 @@ public void close() { api.close(); } } + + publ

Re: [PR] Nessie: Support views for NessieCatalog [iceberg]

2023-11-09 Thread via GitHub
ajantha-bhat commented on code in PR #8909: URL: https://github.com/apache/iceberg/pull/8909#discussion_r1387787475 ## core/src/test/java/org/apache/iceberg/view/ViewCatalogTests.java: ## @@ -400,8 +400,15 @@ public void replaceTableViaTransactionThatAlreadyExistsAsView() {

Re: [PR] Hive: Use timezone of the object inspector while constructing primitive Java objects for timestamp with local time zone [iceberg]

2023-11-09 Thread via GitHub
deniskuzZ commented on PR #8113: URL: https://github.com/apache/iceberg/pull/8113#issuecomment-1803541545 > > From @deniskuzZ comment (thanks Denys for the information), we can see that by default the session timezone used by Hive is the JVM timezone. > > The question isn't how the de

Re: [PR] Spec: Clarify which columns can be used for equality delete files. [iceberg]

2023-11-09 Thread via GitHub
gaborkaszab commented on code in PR #8981: URL: https://github.com/apache/iceberg/pull/8981#discussion_r1387779741 ## format/spec.md: ## @@ -842,7 +842,8 @@ The rows in the delete file must be sorted by `file_path` then `pos` to optimize Equality delete files identify delete

Re: [I] java.lang.IllegalArgumentException: requirement failed while read migrated parquet table [iceberg]

2023-11-09 Thread via GitHub
Hathoute commented on issue #8863: URL: https://github.com/apache/iceberg/issues/8863#issuecomment-1803532887 Upgrading to 1.4.1 or later **AND** rewriting manifests seems to resolve this.

Re: [PR] Nessie: Support views for NessieCatalog [iceberg]

2023-11-09 Thread via GitHub
nastra commented on code in PR #8909: URL: https://github.com/apache/iceberg/pull/8909#discussion_r1387752354 ## core/src/test/java/org/apache/iceberg/view/ViewCatalogTests.java: ## @@ -400,8 +400,15 @@ public void replaceTableViaTransactionThatAlreadyExistsAsView() {

[PR] Core: Improve view/table detection when replacing a table/view [iceberg]

2023-11-09 Thread via GitHub
nastra opened a new pull request, #9012: URL: https://github.com/apache/iceberg/pull/9012 (no comment)
