Re: [PR] Nessie: reimplement create and drop namespace operations [iceberg]

2023-10-19 Thread via GitHub
adutra commented on code in PR #8857: URL: https://github.com/apache/iceberg/pull/8857#discussion_r1365027806 ## nessie/src/main/java/org/apache/iceberg/nessie/NessieIcebergClient.java: ## @@ -181,21 +182,76 @@ public IcebergTable table(TableIdentifier tableIdentifier) { }

Re: [PR] Nessie: reimplement create and drop namespace operations [iceberg]

2023-10-19 Thread via GitHub
adutra commented on code in PR #8857: URL: https://github.com/apache/iceberg/pull/8857#discussion_r1365035897 ## nessie/src/main/java/org/apache/iceberg/nessie/NessieIcebergClient.java: ## @@ -181,23 +182,84 @@ public IcebergTable table(TableIdentifier tableIdentifier) { }

Re: [PR] Nessie: reimplement create and drop namespace operations [iceberg]

2023-10-19 Thread via GitHub
ajantha-bhat commented on code in PR #8857: URL: https://github.com/apache/iceberg/pull/8857#discussion_r1365041518 ## nessie/src/main/java/org/apache/iceberg/nessie/NessieIcebergClient.java: ## @@ -181,23 +182,84 @@ public IcebergTable table(TableIdentifier tableIdentifier) {

Re: [PR] Nessie: reimplement create and drop namespace operations [iceberg]

2023-10-19 Thread via GitHub
adutra commented on code in PR #8857: URL: https://github.com/apache/iceberg/pull/8857#discussion_r1365042954 ## nessie/src/main/java/org/apache/iceberg/nessie/NessieIcebergClient.java: ## @@ -181,23 +182,84 @@ public IcebergTable table(TableIdentifier tableIdentifier) { }

Re: [PR] Flink: Read parquet BINARY column as String for expected [iceberg]

2023-10-19 Thread via GitHub
pvary commented on PR #8808: URL: https://github.com/apache/iceberg/pull/8808#issuecomment-1770228648 @RussellSpitzer: Could you please take a look at the Spark change? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use t

[I] Iceberg vs Parquet [iceberg]

2023-10-19 Thread via GitHub
hieuLapTop77 opened a new issue, #8876: URL: https://github.com/apache/iceberg/issues/8876 ### Query engine I am using Spark version 3.4.1 ### Question I wonder about the strengths of Apache Iceberg compared to Apache Parquet. I have used Spark SQL to compare query speed

Re: [PR] Core: Enable column statistics filtering after planning [iceberg]

2023-10-19 Thread via GitHub
pvary commented on PR #8803: URL: https://github.com/apache/iceberg/pull/8803#issuecomment-1770262566 > I understand the current code is for metadata column selection/projection, not the columns selected to include stats My understanding is that the planning might need stats which are

Re: [I] Add view support for Hive catalog [iceberg]

2023-10-19 Thread via GitHub
nk1506 commented on issue #8698: URL: https://github.com/apache/iceberg/issues/8698#issuecomment-1770264064 As part of this I have a couple of questions. For IcebergTable we use TableType as [EXTERNAL_TABLE](https://github.com/apache/hive/blob/b02cef4fe943b9aba597dcdfd3b8f3d3a5efca3e/stand

Re: [PR] Add missing license headers [iceberg]

2023-10-19 Thread via GitHub
jbonofre commented on PR #8875: URL: https://github.com/apache/iceberg/pull/8875#issuecomment-1770269623 @Fokko I discussed with @nastra : I have a couple of PRs related to rat and fix proposals (for binary and license checks).

Re: [PR] Build: Replace Thread.Sleep() usage with org.Awaitility from Tests. [iceberg]

2023-10-19 Thread via GitHub
nastra commented on code in PR #8804: URL: https://github.com/apache/iceberg/pull/8804#discussion_r1365076551 ## aws/src/integration/java/org/apache/iceberg/aws/glue/TestGlueCatalogLock.java: ## @@ -139,14 +141,11 @@ public void testParallelCommitMultiThreadMultiCommit() {

Re: [PR] Build: Replace Thread.Sleep() usage with org.Awaitility from Tests. [iceberg]

2023-10-19 Thread via GitHub
nastra commented on code in PR #8804: URL: https://github.com/apache/iceberg/pull/8804#discussion_r1365077424 ## aws/src/integration/java/org/apache/iceberg/aws/lakeformation/LakeFormationTestBase.java: ## @@ -357,8 +359,20 @@ String getRandomTableName() { return LF_TEST_TA

Re: [PR] Build: Replace Thread.Sleep() usage with org.Awaitility from Tests. [iceberg]

2023-10-19 Thread via GitHub
nastra commented on code in PR #8804: URL: https://github.com/apache/iceberg/pull/8804#discussion_r1365078356 ## aws/src/integration/java/org/apache/iceberg/aws/lakeformation/LakeFormationTestBase.java: ## @@ -357,8 +359,20 @@ String getRandomTableName() { return LF_TEST_TA

Re: [PR] Build: Replace Thread.Sleep() usage with org.Awaitility from Tests. [iceberg]

2023-10-19 Thread via GitHub
nastra commented on code in PR #8804: URL: https://github.com/apache/iceberg/pull/8804#discussion_r1365081118 ## aws/src/integration/java/org/apache/iceberg/aws/TestAssumeRoleAwsClientFactory.java: ## @@ -189,7 +192,18 @@ public void testAssumeRoleS3FileIO() throws Exception {

Re: [PR] Build: Replace Thread.Sleep() usage with org.Awaitility from Tests. [iceberg]

2023-10-19 Thread via GitHub
nastra commented on code in PR #8804: URL: https://github.com/apache/iceberg/pull/8804#discussion_r1365080413 ## core/src/test/java/org/apache/iceberg/jdbc/TestJdbcTableConcurrency.java: ## @@ -92,14 +94,11 @@ public synchronized void testConcurrentFastAppends() throws IOExcept

Re: [PR] Add missing license headers [iceberg]

2023-10-19 Thread via GitHub
nastra merged PR #8875: URL: https://github.com/apache/iceberg/pull/8875

Re: [PR] Build: Replace Thread.Sleep() usage with org.Awaitility from Tests. [iceberg]

2023-10-19 Thread via GitHub
nastra commented on code in PR #8804: URL: https://github.com/apache/iceberg/pull/8804#discussion_r1365080751 ## core/src/test/java/org/apache/iceberg/jdbc/TestJdbcTableConcurrency.java: ## @@ -92,14 +94,11 @@ public synchronized void testConcurrentFastAppends() throws IOExcept

Re: [PR] AWS: Glue catalog strip trailing slash on DB URI [iceberg]

2023-10-19 Thread via GitHub
nastra commented on code in PR #8870: URL: https://github.com/apache/iceberg/pull/8870#discussion_r1365085396 ## aws/src/main/java/org/apache/iceberg/aws/glue/GlueCatalog.java: ## @@ -281,6 +281,7 @@ protected String defaultWarehouseLocation(TableIdentifier tableIdentifier) {

Re: [PR] AWS: Glue catalog strip trailing slash on DB URI [iceberg]

2023-10-19 Thread via GitHub
nastra commented on PR #8870: URL: https://github.com/apache/iceberg/pull/8870#issuecomment-1770287991 Just a wild guess, but could it be possible that we're missing to strip a trailing slash in https://github.com/apache/iceberg/blob/81bf8d30766b1b129b87abde15239645cb127046/core/src/main/ja
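
The trailing-slash concern in this thread can be illustrated with a minimal sketch. The helpers `strip_trailing_slash` and `default_table_location` are hypothetical names chosen for illustration, and the example URIs are made up; this is not the GlueCatalog code, only a demonstration of why an unstripped slash on the database location produces a double slash in the derived table path.

```python
def strip_trailing_slash(uri: str) -> str:
    """Drop trailing slashes, but leave a bare scheme such as 's3://' intact.

    Hypothetical helper for illustration only.
    """
    while uri.endswith("/") and not uri.endswith("://"):
        uri = uri[:-1]
    return uri


def default_table_location(db_location: str, table_name: str) -> str:
    """Join a database location and a table name with exactly one slash."""
    return f"{strip_trailing_slash(db_location)}/{table_name}"


# Without stripping, "s3://bucket/db/" + "/" + "tbl" would contain "db//tbl".
naive = "s3://bucket/db/" + "/" + "tbl"
fixed = default_table_location("s3://bucket/db/", "tbl")
```

With the strip in place, a database location written with or without a trailing slash resolves to the same table path.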

Re: [I] Support Nessie catalog [iceberg-python]

2023-10-19 Thread via GitHub
zeddit commented on issue #19: URL: https://github.com/apache/iceberg-python/issues/19#issuecomment-1770291671 Looking forward to this feature to conduct testing.

Re: [PR] Nessie: reimplement create and drop namespace operations [iceberg]

2023-10-19 Thread via GitHub
snazy commented on code in PR #8857: URL: https://github.com/apache/iceberg/pull/8857#discussion_r1365087588 ## nessie/src/main/java/org/apache/iceberg/nessie/NessieIcebergClient.java: ## @@ -223,27 +284,57 @@ namespace, getRef().getName()), } public boolean dropNamespac

Re: [PR] Flink 1.17: Use awaitility instead of Thread.sleep() [iceberg]

2023-10-19 Thread via GitHub
nk1506 commented on PR #8852: URL: https://github.com/apache/iceberg/pull/8852#issuecomment-1770333911 Yeah sure @nastra. I will create PRs for the other versions.

Re: [PR] Core: Improvements around View catalog tests [iceberg]

2023-10-19 Thread via GitHub
nastra commented on code in PR #8865: URL: https://github.com/apache/iceberg/pull/8865#discussion_r1365154084 ## core/src/test/java/org/apache/iceberg/view/ViewCatalogTests.java: ## @@ -1446,7 +1544,14 @@ public void updateViewLocationConflict() { // the view was already

Re: [PR] Core: Improvements around View catalog tests [iceberg]

2023-10-19 Thread via GitHub
nastra commented on code in PR #8865: URL: https://github.com/apache/iceberg/pull/8865#discussion_r1365171588 ## core/src/test/java/org/apache/iceberg/view/ViewCatalogTests.java: ## @@ -225,8 +243,9 @@ public void createViewErrorCases() { .withQuery(trino.di

Re: [PR] feat: support ser/deser of value [iceberg-rust]

2023-10-19 Thread via GitHub
liurenjie1024 commented on code in PR #82: URL: https://github.com/apache/iceberg-rust/pull/82#discussion_r1365064528 ## crates/iceberg/src/avro/schema.rs: ## @@ -203,8 +197,8 @@ impl SchemaVisitor for SchemaToAvroSchema { PrimitiveType::Timestamp => AvroSchema::Tim

Re: [PR] Core: Enable column statistics filtering after planning [iceberg]

2023-10-19 Thread via GitHub
pvary commented on code in PR #8803: URL: https://github.com/apache/iceberg/pull/8803#discussion_r1365196518 ## core/src/main/java/org/apache/iceberg/BaseScan.java: ## @@ -121,6 +121,10 @@ protected boolean shouldReturnColumnStats() { return context().returnColumnStats();

Re: [PR] Core: Enable column statistics filtering after planning [iceberg]

2023-10-19 Thread via GitHub
pvary commented on code in PR #8803: URL: https://github.com/apache/iceberg/pull/8803#discussion_r1365195690 ## api/src/main/java/org/apache/iceberg/ContentFile.java: ## @@ -177,4 +191,26 @@ default Long fileSequenceNumber() { default F copy(boolean withStats) { return w

Re: [PR] Core: Enable column statistics filtering after planning [iceberg]

2023-10-19 Thread via GitHub
pvary commented on code in PR #8803: URL: https://github.com/apache/iceberg/pull/8803#discussion_r1365196098 ## api/src/main/java/org/apache/iceberg/ContentFile.java: ## @@ -165,6 +166,19 @@ default Long fileSequenceNumber() { */ F copyWithoutStats(); + /** + * Copie

Re: [PR] Core: Enable column statistics filtering after planning [iceberg]

2023-10-19 Thread via GitHub
pvary commented on code in PR #8803: URL: https://github.com/apache/iceberg/pull/8803#discussion_r1365196782 ## core/src/main/java/org/apache/iceberg/BaseFile.java: ## @@ -174,8 +176,9 @@ public PartitionData copy() { * * @param toCopy a generic data file to copy. *

Re: [PR] Core: Enable column statistics filtering after planning [iceberg]

2023-10-19 Thread via GitHub
pvary commented on code in PR #8803: URL: https://github.com/apache/iceberg/pull/8803#discussion_r1365197840 ## core/src/main/java/org/apache/iceberg/BaseFile.java: ## @@ -186,12 +189,23 @@ public PartitionData copy() { this.recordCount = toCopy.recordCount; this.fileS

Re: [I] Iceberg vs Parquet [iceberg]

2023-10-19 Thread via GitHub
pvary commented on issue #8876: URL: https://github.com/apache/iceberg/issues/8876#issuecomment-1770427907 @hieuLapTop77: Apache Parquet is a file format - it describes how the data is written to files, while Apache Iceberg is a table format - while it defines how the data is written to fil

Re: [I] Iceberg vs Parquet [iceberg]

2023-10-19 Thread via GitHub
pvary closed issue #8876: Iceberg vs Parquet URL: https://github.com/apache/iceberg/issues/8876

Re: [I] Add view support for Hive catalog [iceberg]

2023-10-19 Thread via GitHub
pvary commented on issue #8698: URL: https://github.com/apache/iceberg/issues/8698#issuecomment-1770431955 @deniskuzZ: Is there something like this ongoing in the Hive codebase?

Re: [PR] Build: Bump urllib3 from 1.26.17 to 1.26.18 [iceberg-python]

2023-10-19 Thread via GitHub
Fokko merged PR #84: URL: https://github.com/apache/iceberg-python/pull/84

Re: [PR] Doc: Fix "Verifying Checksums" script in verify-release.md [iceberg-python]

2023-10-19 Thread via GitHub
Fokko commented on PR #82: URL: https://github.com/apache/iceberg-python/pull/82#issuecomment-1770436467 I just checked the Iceberg Java 1.4.1 release and noticed this as well.

Re: [PR] Add sort_order_id to STATS_COLUMNS to address null sort order ID in p… [iceberg]

2023-10-19 Thread via GitHub
nastra commented on code in PR #8873: URL: https://github.com/apache/iceberg/pull/8873#discussion_r1365243595 ## core/src/main/java/org/apache/iceberg/BaseScan.java: ## @@ -57,7 +57,8 @@ abstract class BaseScan> "nan_value_counts", "lower_bounds",

Re: [PR] Add sort_order_id to STATS_COLUMNS to address null sort order ID in p… [iceberg]

2023-10-19 Thread via GitHub
nastra commented on code in PR #8873: URL: https://github.com/apache/iceberg/pull/8873#discussion_r1365245858 ## core/src/test/java/org/apache/iceberg/ScanTestBase.java: ## @@ -224,4 +224,34 @@ public void testReAddingPartitionField() throws Exception { } } } + +

Re: [PR] Nessie: reimplement create and drop namespace operations [iceberg]

2023-10-19 Thread via GitHub
adutra commented on code in PR #8857: URL: https://github.com/apache/iceberg/pull/8857#discussion_r1365252557 ## nessie/src/main/java/org/apache/iceberg/nessie/NessieIcebergClient.java: ## @@ -181,23 +182,83 @@ public IcebergTable table(TableIdentifier tableIdentifier) { }

Re: [PR] Nessie: reimplement create and drop namespace operations [iceberg]

2023-10-19 Thread via GitHub
adutra commented on code in PR #8857: URL: https://github.com/apache/iceberg/pull/8857#discussion_r1365251661 ## nessie/src/main/java/org/apache/iceberg/nessie/NessieIcebergClient.java: ## @@ -181,23 +182,83 @@ public IcebergTable table(TableIdentifier tableIdentifier) { }

Re: [PR] Nessie: reimplement create and drop namespace operations [iceberg]

2023-10-19 Thread via GitHub
adutra commented on code in PR #8857: URL: https://github.com/apache/iceberg/pull/8857#discussion_r1365256655 ## nessie/src/main/java/org/apache/iceberg/nessie/NessieIcebergClient.java: ## @@ -223,27 +284,57 @@ namespace, getRef().getName()), } public boolean dropNamespa

Re: [PR] Nessie: reimplement create and drop namespace operations [iceberg]

2023-10-19 Thread via GitHub
adutra commented on PR #8857: URL: https://github.com/apache/iceberg/pull/8857#issuecomment-1770502216 > I suspect, the setProperties + removeProperties methods need the same changes? Yes... that was lurking around since the beginning, I guess I'm in to change them too :-D

Re: [PR] Make to_arrow function capable of handling parquet files with sanitized name due to Avro restirction [iceberg-python]

2023-10-19 Thread via GitHub
Fokko commented on code in PR #83: URL: https://github.com/apache/iceberg-python/pull/83#discussion_r1363628652 ## pyiceberg/schema.py: ## @@ -1273,6 +1273,102 @@ def primitive(self, primitive: PrimitiveType) -> PrimitiveType: return primitive +# Implementation cop

Re: [PR] Doc: Fix "Verifying Checksums" script in verify-release.md [iceberg-python]

2023-10-19 Thread via GitHub
Fokko merged PR #82: URL: https://github.com/apache/iceberg-python/pull/82

Re: [PR] feat: First version of rest catalog. [iceberg-rust]

2023-10-19 Thread via GitHub
liurenjie1024 commented on code in PR #78: URL: https://github.com/apache/iceberg-rust/pull/78#discussion_r1365327942 ## crates/catalog/rest/src/catalog.rs: ## @@ -0,0 +1,845 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreem

Re: [I] Add view support for Hive catalog [iceberg]

2023-10-19 Thread via GitHub
ajantha-bhat commented on issue #8698: URL: https://github.com/apache/iceberg/issues/8698#issuecomment-1770694291 > Please share your thoughts. CC: @ajantha-bhat , @jbonofre . Also tag the relevant people. I think we can use `VIRTUAL_VIEW` type to avoid adding custom properties to

Re: [PR] Nessie: Adapt to Nessie 0.71.1 release [iceberg]

2023-10-19 Thread via GitHub
ajantha-bhat commented on PR #8798: URL: https://github.com/apache/iceberg/pull/8798#issuecomment-1770707163 can this be merged? @nastra, @Fokko

Re: [I] Bug: PostgreSql integration [iceberg-python]

2023-10-19 Thread via GitHub
gkaretka commented on issue #78: URL: https://github.com/apache/iceberg-python/issues/78#issuecomment-1770718118 👍 same issue

Re: [PR] Add sort_order_id to STATS_COLUMNS to address null sort order ID in p… [iceberg]

2023-10-19 Thread via GitHub
zhangminglei commented on code in PR #8873: URL: https://github.com/apache/iceberg/pull/8873#discussion_r1365374541 ## core/src/main/java/org/apache/iceberg/BaseScan.java: ## @@ -57,7 +57,8 @@ abstract class BaseScan> "nan_value_counts", "lower_bounds",

Re: [I] Bug: PostgreSql integration [iceberg-python]

2023-10-19 Thread via GitHub
Fokko commented on issue #78: URL: https://github.com/apache/iceberg-python/issues/78#issuecomment-1770729964 @mobley-trent @gkaretka Thanks for reaching out here. The tables are not created by default, but I think that might be the wrong behaviour since you both expected them to be created

Re: [PR] Nessie: Adapt to Nessie 0.71.1 release [iceberg]

2023-10-19 Thread via GitHub
Fokko merged PR #8798: URL: https://github.com/apache/iceberg/pull/8798

Re: [PR] Nessie: Adapt to Nessie 0.71.1 release [iceberg]

2023-10-19 Thread via GitHub
Fokko commented on PR #8798: URL: https://github.com/apache/iceberg/pull/8798#issuecomment-1770736345 Certainly @ajantha-bhat, thanks for pinging me. Thanks for the PR @nk1506 👍 and @dimas-b for the review

Re: [PR] Add sort_order_id to STATS_COLUMNS to address null sort order ID in p… [iceberg]

2023-10-19 Thread via GitHub
zhangminglei commented on code in PR #8873: URL: https://github.com/apache/iceberg/pull/8873#discussion_r1365382850 ## core/src/main/java/org/apache/iceberg/BaseScan.java: ## @@ -57,7 +57,8 @@ abstract class BaseScan> "nan_value_counts", "lower_bounds",

Re: [PR] Add sort_order_id to STATS_COLUMNS to address null sort order ID in p… [iceberg]

2023-10-19 Thread via GitHub
zhangminglei commented on PR #8873: URL: https://github.com/apache/iceberg/pull/8873#issuecomment-1770739966 @nastra Could you please take another look? Thanks.

Re: [I] Bug: PostgreSql integration [iceberg-python]

2023-10-19 Thread via GitHub
gkaretka commented on issue #78: URL: https://github.com/apache/iceberg-python/issues/78#issuecomment-1770748027 Thanks 👍 was actually trying to investigate a little further and came to the same conclusion. Creating tables by calling: `catalog.create_tables()` solved my problem

Re: [PR] Update CachingCatalog to use expireAfterWrite instead of expireAfterAccess [iceberg]

2023-10-19 Thread via GitHub
zhangminglei commented on PR #8844: URL: https://github.com/apache/iceberg/pull/8844#issuecomment-1770804416 @nastra Can you please do a final review? 😺

Re: [PR] AWS: Glue catalog strip trailing slash on DB URI [iceberg]

2023-10-19 Thread via GitHub
amogh-jahagirdar commented on PR #8870: URL: https://github.com/apache/iceberg/pull/8870#issuecomment-1770872534 > Just a wild guess, but could it be possible that we're missing to strip a trailing slash in > > https://github.com/apache/iceberg/blob/81bf8d30766b1b129b87abde15239645cb

Re: [PR] Add sort_order_id to STATS_COLUMNS to address null sort order ID in p… [iceberg]

2023-10-19 Thread via GitHub
nastra commented on code in PR #8873: URL: https://github.com/apache/iceberg/pull/8873#discussion_r1365453843 ## core/src/main/java/org/apache/iceberg/BaseScan.java: ## @@ -57,7 +57,8 @@ abstract class BaseScan> "nan_value_counts", "lower_bounds",

Re: [PR] AWS: Glue catalog strip trailing slash on DB URI [iceberg]

2023-10-19 Thread via GitHub
amogh-jahagirdar commented on code in PR #8870: URL: https://github.com/apache/iceberg/pull/8870#discussion_r1365470046 ## aws/src/main/java/org/apache/iceberg/aws/glue/GlueCatalog.java: ## @@ -281,6 +281,7 @@ protected String defaultWarehouseLocation(TableIdentifier tableIdent

Re: [PR] Build: Bump org.apache.pig:pig from 0.14.0 to 0.17.0 [iceberg]

2023-10-19 Thread via GitHub
nastra merged PR #8774: URL: https://github.com/apache/iceberg/pull/8774

Re: [PR] Add sort_order_id to SCAN_COLUMNS to address null sort order ID in p… [iceberg]

2023-10-19 Thread via GitHub
zhangminglei commented on PR #8873: URL: https://github.com/apache/iceberg/pull/8873#issuecomment-1770931874 Thanks @nastra @Fokko for review!

Re: [I] Rest Catalog UpdateTableRequest IOException handling could cause data discrepancy in case of response getting lost [iceberg]

2023-10-19 Thread via GitHub
nastra closed issue #6778: Rest Catalog UpdateTableRequest IOException handling could cause data discrepancy in case of response getting lost URL: https://github.com/apache/iceberg/issues/6778

[PR] Disable merging explicitly [iceberg]

2023-10-19 Thread via GitHub
Fokko opened a new pull request, #8878: URL: https://github.com/apache/iceberg/pull/8878 This was done earlier before there was a `.asf.yaml` through an INFRA ticket, but I think it is good to also add it to the `yaml`.

Re: [I] Rest Catalog UpdateTableRequest IOException handling could cause data discrepancy in case of response getting lost [iceberg]

2023-10-19 Thread via GitHub
nastra commented on issue #6778: URL: https://github.com/apache/iceberg/issues/6778#issuecomment-1770949997 I believe this issue should be fixed by https://github.com/apache/iceberg/pull/8397 and https://github.com/apache/iceberg/pull/8599, where we only perform cleanup on exceptions that

[PR] Update release template [iceberg]

2023-10-19 Thread via GitHub
Fokko opened a new pull request, #8879: URL: https://github.com/apache/iceberg/pull/8879 I think we should remove the excluding weekends part of the wait period. With a patch release like we're doing now, I don't think we want to be constrained by this. In practice, I don't think many relea

Re: [PR] Update release template [iceberg]

2023-10-19 Thread via GitHub
Fokko commented on PR #8879: URL: https://github.com/apache/iceberg/pull/8879#issuecomment-1770987327 Some historical context: https://github.com/apache/iceberg-docs/pull/187#discussion_r1086933656

[PR] Update pre-commit [iceberg-python]

2023-10-19 Thread via GitHub
Fokko opened a new pull request, #85: URL: https://github.com/apache/iceberg-python/pull/85 (no comment)

Re: [PR] AWS: Glue catalog strip trailing slash on DB URI [iceberg]

2023-10-19 Thread via GitHub
amogh-jahagirdar merged PR #8870: URL: https://github.com/apache/iceberg/pull/8870

Re: [PR] API, Core: Add UUID API to Table [iceberg]

2023-10-19 Thread via GitHub
amogh-jahagirdar commented on code in PR #8800: URL: https://github.com/apache/iceberg/pull/8800#discussion_r1365617906 ## api/src/main/java/org/apache/iceberg/Table.java: ## @@ -333,6 +333,15 @@ default UpdateStatistics updateStatistics() { */ Map refs(); + /** + *

[PR] Add flake8-pie to ruff [iceberg-python]

2023-10-19 Thread via GitHub
Fokko opened a new pull request, #86: URL: https://github.com/apache/iceberg-python/pull/86 (no comment)

[PR] Add Refurb to ruff [iceberg-python]

2023-10-19 Thread via GitHub
Fokko opened a new pull request, #87: URL: https://github.com/apache/iceberg-python/pull/87 Seems to do some nice checks: https://docs.astral.sh/ruff/rules/#refurb-furb

Re: [I] Iceberg Rest Catalog Support for a Separate OIDC Authorization Server URI [iceberg]

2023-10-19 Thread via GitHub
syun64 commented on issue #8869: URL: https://github.com/apache/iceberg/issues/8869#issuecomment-1771122573 > @syun64 In my org, we have very similar situation where we, unfortunately, can only use an internal procedure to grab auth token (that is quite different from OIDC flow). Based on w

Re: [PR] Flink 1.16: Use awaitility instead of Thread.sleep() [iceberg]

2023-10-19 Thread via GitHub
nk1506 commented on PR #8880: URL: https://github.com/apache/iceberg/pull/8880#issuecomment-1771122007 @nastra , Please take a look.

Re: [PR] Flink 1.15: Use awaitility instead of Thread.sleep() [iceberg]

2023-10-19 Thread via GitHub
nk1506 commented on PR #8877: URL: https://github.com/apache/iceberg/pull/8877#issuecomment-1771122305 @nastra , Please take a look.

Re: [PR] Flink 1.15: Use awaitility instead of Thread.sleep() [iceberg]

2023-10-19 Thread via GitHub
nastra merged PR #8877: URL: https://github.com/apache/iceberg/pull/8877

Re: [PR] Flink 1.16: Use awaitility instead of Thread.sleep() [iceberg]

2023-10-19 Thread via GitHub
nastra merged PR #8880: URL: https://github.com/apache/iceberg/pull/8880

Re: [PR] Build: Replace Thread.Sleep() usage with org.Awaitility from Tests. [iceberg]

2023-10-19 Thread via GitHub
nastra commented on code in PR #8804: URL: https://github.com/apache/iceberg/pull/8804#discussion_r1365689434 ## api/src/test/java/org/apache/iceberg/metrics/TestDefaultTimer.java: ## @@ -20,6 +20,7 @@ import static java.util.concurrent.Executors.newFixedThreadPool; import s

Re: [I] Replace deprecated `set-output` command with environment file [iceberg]

2023-10-19 Thread via GitHub
nastra closed issue #8665: Replace deprecated `set-output` command with environment file URL: https://github.com/apache/iceberg/issues/8665

Re: [PR] Build: Replace deprecated command with environment file [iceberg]

2023-10-19 Thread via GitHub
nastra merged PR #8666: URL: https://github.com/apache/iceberg/pull/8666

Re: [I] Bug: PostgreSql integration [iceberg-python]

2023-10-19 Thread via GitHub
gkaretka commented on issue #78: URL: https://github.com/apache/iceberg-python/issues/78#issuecomment-1771206442 Maybe this issue can be relabeled as _documentation_ rather than _bug_. What do you think, @mobley-trent?

Re: [I] Bug: PostgreSql integration [iceberg-python]

2023-10-19 Thread via GitHub
Fokko commented on issue #78: URL: https://github.com/apache/iceberg-python/issues/78#issuecomment-1771214572 I have the same feeling, I think this should be done once, and not every time you initialize the catalog (if you run a lot of jobs in parallel using Airflow, these calls can add up)
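
The point about creating the catalog tables once, rather than on every catalog initialization, can be sketched as follows. This is a minimal illustration using the standard-library `sqlite3` module instead of PostgreSQL; the function names `create_tables` and `table_exists` and the simplified `iceberg_tables` schema are assumptions made for the example, not PyIceberg's implementation.

```python
import sqlite3


def table_exists(conn: sqlite3.Connection, name: str) -> bool:
    """Return True if a table with this name exists (SQLite-specific lookup)."""
    row = conn.execute(
        "SELECT 1 FROM sqlite_master WHERE type = 'table' AND name = ?",
        (name,),
    ).fetchone()
    return row is not None


def create_tables(conn: sqlite3.Connection) -> None:
    """One-time, idempotent setup step, run separately from catalog init.

    The schema below is a simplified assumption for illustration.
    """
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS iceberg_tables (
            catalog_name      TEXT NOT NULL,
            table_namespace   TEXT NOT NULL,
            table_name        TEXT NOT NULL,
            metadata_location TEXT,
            PRIMARY KEY (catalog_name, table_namespace, table_name)
        )
        """
    )
    conn.commit()


conn = sqlite3.connect(":memory:")
existed_before = table_exists(conn, "iceberg_tables")
create_tables(conn)  # explicit setup, done once
create_tables(conn)  # calling it again is harmless
exists_after = table_exists(conn, "iceberg_tables")
```

Keeping the setup explicit means parallel jobs that only open the catalog issue no DDL at all, which is the cost concern raised in the comment.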

[I] fast_forward command not merging branches within AWS Glue [iceberg]

2023-10-19 Thread via GitHub
lime-squeeze opened a new issue, #8881: URL: https://github.com/apache/iceberg/issues/8881 ### Query engine Spark 3.3 within AWS Glue 4.0 and using iceberg-spark-runtime-3.3_2.12-1.4.0.jar ### Question I am attempting to test branching via SparkSQL within AWS Glue. I am

Re: [PR] Make to_arrow function capable of handling parquet files with sanitized name due to Avro restirction [iceberg-python]

2023-10-19 Thread via GitHub
puchengy commented on code in PR #83: URL: https://github.com/apache/iceberg-python/pull/83#discussion_r1365868994 ## pyiceberg/schema.py: ## @@ -1273,6 +1273,102 @@ def primitive(self, primitive: PrimitiveType) -> PrimitiveType: return primitive +# Implementation

Re: [PR] Make to_arrow function capable of handling parquet files with sanitized name due to Avro restirction [iceberg-python]

2023-10-19 Thread via GitHub
puchengy commented on code in PR #83: URL: https://github.com/apache/iceberg-python/pull/83#discussion_r1365869931 ## pyiceberg/schema.py: ## @@ -1273,6 +1273,102 @@ def primitive(self, primitive: PrimitiveType) -> PrimitiveType: return primitive +# Implementation

Re: [PR] Core: Reduce unnecessary add operations in deletedPaths set [iceberg]

2023-10-19 Thread via GitHub
RussellSpitzer commented on PR #8868: URL: https://github.com/apache/iceberg/pull/8868#issuecomment-1771557957 Not sure I understand this change. Seems like you are removing an optimization to avoid recomputing whether a path is deleted if we already determined it is deleted?

Re: [PR] Add Refurb to ruff [iceberg-python]

2023-10-19 Thread via GitHub
jayceslesar commented on PR #87: URL: https://github.com/apache/iceberg-python/pull/87#issuecomment-1771566427 You might want to have a config here that enables and disables certain checks, but that can come with trial and error.

Re: [PR] Flink: Read parquet BINARY column as String for expected [iceberg]

2023-10-19 Thread via GitHub
RussellSpitzer commented on PR #8808: URL: https://github.com/apache/iceberg/pull/8808#issuecomment-1771569346 I'm also a little nervous about this change. How are we guaranteed that the binary is parsable as UTF-8 bytes? Seems like we should just be fixing the type annotations rather than c

Re: [I] Add view support for Hive catalog [iceberg]

2023-10-19 Thread via GitHub
pvary commented on issue #8698: URL: https://github.com/apache/iceberg/issues/8698#issuecomment-1771644929 @ajantha-bhat: The question is whether the Hive 3 HMS API is enough for the integration, or not. I would prefer if it would be enough, but we should definitely try to involve the Hive

Re: [PR] Add Refurb to ruff [iceberg-python]

2023-10-19 Thread via GitHub
Fokko commented on PR #87: URL: https://github.com/apache/iceberg-python/pull/87#issuecomment-1771655439 @jayceslesar Thanks for chiming in here. I believe that all checks are enabled when the plugin is enabled. In the current codebase, there are no violations. Enabling this will make sure

Re: [I] [BUG] to_arrow conversion does not support iceberg table column name containing slash [iceberg-python]

2023-10-19 Thread via GitHub
Fokko closed issue #81: [BUG] to_arrow conversion does not support iceberg table column name containing slash URL: https://github.com/apache/iceberg-python/issues/81

Re: [I] [BUG] to_arrow conversion does not support iceberg table column name containing slash [iceberg-python]

2023-10-19 Thread via GitHub
Fokko commented on issue #81: URL: https://github.com/apache/iceberg-python/issues/81#issuecomment-1771666165 Thanks for fixing this @puchengy

Re: [PR] Make to_arrow function capable of handling parquet files with sanitized name due to Avro restirction [iceberg-python]

2023-10-19 Thread via GitHub
Fokko merged PR #83: URL: https://github.com/apache/iceberg-python/pull/83

Re: [I] Docs: PostgreSql integration [iceberg-python]

2023-10-19 Thread via GitHub
Fokko commented on issue #78: URL: https://github.com/apache/iceberg-python/issues/78#issuecomment-1771667298 Are you interested in providing a PR? :)

[PR] Doc: fix iceberg-javadoc link [iceberg]

2023-10-19 Thread via GitHub
dramaticlly opened a new pull request, #8885: URL: https://github.com/apache/iceberg/pull/8885 The latest javadoc is at https://iceberg.apache.org/javadoc/latest. Can you help take a look, @nastra @Fokko @amogh-jahagirdar?

Re: [PR] Doc: fix iceberg-javadoc link [iceberg]

2023-10-19 Thread via GitHub
Fokko merged PR #8885: URL: https://github.com/apache/iceberg/pull/8885

Re: [PR] Phase 1 - New Docs Deployment [iceberg]

2023-10-19 Thread via GitHub
bitsondatadev commented on PR #8659: URL: https://github.com/apache/iceberg/pull/8659#issuecomment-1771793120 @rdblue I have this now in a good state relative to the 1.4.0 docs in nightly, could you PTAL? Thanks!

Re: [I] Some question about zorder [iceberg]

2023-10-19 Thread via GitHub
github-actions[bot] commented on issue #7405: URL: https://github.com/apache/iceberg/issues/7405#issuecomment-1771870475 This issue has been automatically marked as stale because it has been open for 180 days with no activity. It will be closed in next 14 days if no further activity occurs.

Re: [I] Migrate catalog from hive catalog to jdbc/rest catalog [iceberg]

2023-10-19 Thread via GitHub
github-actions[bot] closed issue #7208: Migrate catalog from hive catalog to jdbc/rest catalog URL: https://github.com/apache/iceberg/issues/7208

Re: [I] iceberg supports incremental computation based on iceberg tags and data changes between tags in computing engines such as spark and presto. [iceberg]

2023-10-19 Thread via GitHub
github-actions[bot] closed issue #7193: iceberg supports incremental computation based on iceberg tags and data changes between tags in computing engines such as spark and presto. URL: https://github.com/apache/iceberg/issues/7193

Re: [I] Migrate catalog from hive catalog to jdbc/rest catalog [iceberg]

2023-10-19 Thread via GitHub
github-actions[bot] commented on issue #7208: URL: https://github.com/apache/iceberg/issues/7208#issuecomment-1771870633 This issue has been closed because it has not received any activity in the last 14 days since being marked as 'stale'

Re: [I] iceberg supports incremental computation based on iceberg tags and data changes between tags in computing engines such as spark and presto. [iceberg]

2023-10-19 Thread via GitHub
github-actions[bot] commented on issue #7193: URL: https://github.com/apache/iceberg/issues/7193#issuecomment-1771870665 This issue has been closed because it has not received any activity in the last 14 days since being marked as 'stale'
