Re: [PR] Build: Bump mkdocstrings-python from 1.7.1 to 1.7.2 [iceberg-python]

2023-10-09 Thread via GitHub
Fokko merged PR #52: URL: https://github.com/apache/iceberg-python/pull/52 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@iceberg.ap

Re: [PR] Build: Bump fastavro from 1.8.3 to 1.8.4 [iceberg-python]

2023-10-09 Thread via GitHub
Fokko merged PR #51: URL: https://github.com/apache/iceberg-python/pull/51

Re: [PR] Build: Bump pypa/cibuildwheel from 2.16.0 to 2.16.2 [iceberg-python]

2023-10-09 Thread via GitHub
Fokko merged PR #47: URL: https://github.com/apache/iceberg-python/pull/47

Re: [PR] Dell : Migrate Files using TestRule to Junit5. [iceberg]

2023-10-09 Thread via GitHub
ashutosh-roy commented on code in PR #8707: URL: https://github.com/apache/iceberg/pull/8707#discussion_r1349930946 ## dell/src/test/java/org/apache/iceberg/dell/mock/ecs/EcsS3MockRule.java: ## @@ -178,4 +163,16 @@ public String bucket() { public String randomObjectName() {

Re: [PR] Dell : Migrate Files using TestRule to Junit5. [iceberg]

2023-10-09 Thread via GitHub
ashutosh-roy commented on code in PR #8707: URL: https://github.com/apache/iceberg/pull/8707#discussion_r1349931419 ## build.gradle: ## @@ -912,7 +912,6 @@ project(':iceberg-dell') { implementation project(':iceberg-common') implementation project(path: ':iceberg-bundl

Re: [PR] Build: Bump psycopg2-binary from 2.9.8 to 2.9.9 [iceberg-python]

2023-10-09 Thread via GitHub
Fokko merged PR #49: URL: https://github.com/apache/iceberg-python/pull/49

Re: [PR] Build: Bump coverage from 7.3.1 to 7.3.2 [iceberg-python]

2023-10-09 Thread via GitHub
Fokko merged PR #50: URL: https://github.com/apache/iceberg-python/pull/50

Re: [PR] Nessie: Remove dead code in NessieCatalog [iceberg]

2023-10-09 Thread via GitHub
nastra merged PR #8750: URL: https://github.com/apache/iceberg/pull/8750

Re: [PR] Build: Bump cython from 3.0.2 to 3.0.3 [iceberg-python]

2023-10-09 Thread via GitHub
Fokko merged PR #48: URL: https://github.com/apache/iceberg-python/pull/48

Re: [I] write.target-file-size-bytes isn't respected when writing data [iceberg]

2023-10-09 Thread via GitHub
paulpaul1076 commented on issue #8729: URL: https://github.com/apache/iceberg/issues/8729#issuecomment-1752528574 Thanks @RussellSpitzer, you helped me with this in Slack. I understand it now. I think the doc should add some extra explanation about this though.

Re: [PR] Docs: Update spark-getting-started.md [iceberg]

2023-10-09 Thread via GitHub
Priyansh121096 commented on code in PR #8748: URL: https://github.com/apache/iceberg/pull/8748#discussion_r1349984957 ## docs/spark-getting-started.md: ## @@ -69,7 +69,7 @@ To create your first Iceberg table in Spark, use the `spark-sql` shell or `spark ```sql -- local is t

Re: [I] Unable to write to iceberg table using spark [iceberg]

2023-10-09 Thread via GitHub
di2mot commented on issue #8419: URL: https://github.com/apache/iceberg/issues/8419#issuecomment-1752621192 Are you sure it's this way `sparkConf=(SparkConf() .set("spark.jars.packages", "org.apache.iceberg:iceberg-spark-runtime-3.3_2.12:1.3.1,software.amazon.awssdk:bundle:2.20.18,sof

Re: [PR] Dell : Migrate Files using TestRule to Junit5. [iceberg]

2023-10-09 Thread via GitHub
nastra commented on code in PR #8707: URL: https://github.com/apache/iceberg/pull/8707#discussion_r1350083511 ## build.gradle: ## @@ -987,5 +987,4 @@ String getJavadocVersion() { apply from: 'jmh.gradle' apply from: 'baseline.gradle' apply from: 'deploy.gradle' -apply from: '

Re: [I] S3 compression Issue with Iceberg [iceberg]

2023-10-09 Thread via GitHub
swat1234 commented on issue #8713: URL: https://github.com/apache/iceberg/issues/8713#issuecomment-1752720100 Hi @jhchee, thanks for your response. We are mainly looking for compression using SNAPPY, but Snappy is increasing the file size.

Re: [PR] Build: Bump org.immutables:value from 2.9.2 to 2.10.0 [iceberg]

2023-10-09 Thread via GitHub
nastra merged PR #8736: URL: https://github.com/apache/iceberg/pull/8736

Re: [PR] Build: Bump slf4j from 1.7.36 to 2.0.9 [iceberg]

2023-10-09 Thread via GitHub
nastra commented on PR #8737: URL: https://github.com/apache/iceberg/pull/8737#issuecomment-1752755466 @dependabot rebase

[PR] Docs: Fix repo name and url [iceberg-python]

2023-10-09 Thread via GitHub
manuzhang opened a new pull request, #54: URL: https://github.com/apache/iceberg-python/pull/54 (no comment)

Re: [PR] Spec: add view-uuid to view spec [iceberg]

2023-10-09 Thread via GitHub
nastra commented on PR #6551: URL: https://github.com/apache/iceberg/pull/6551#issuecomment-1752805845 Closing this as it has been addressed as part of https://github.com/apache/iceberg/pull/8591

Re: [PR] Spec: add view-uuid to view spec [iceberg]

2023-10-09 Thread via GitHub
nastra closed pull request #6551: Spec: add view-uuid to view spec URL: https://github.com/apache/iceberg/pull/6551

Re: [I] Support Hudi `DeltaStreamer` compatible feature [iceberg]

2023-10-09 Thread via GitHub
pvary commented on issue #8724: URL: https://github.com/apache/iceberg/issues/8724#issuecomment-1752813481 > Yes, Flink is great but still we need to write some code for ingestion, right..? Yes, you need to write the Flink job's code. If your goal is just a simple dump, then it could

Re: [I] Upsert support for keyless Apache Flink tables [iceberg]

2023-10-09 Thread via GitHub
pvary commented on issue #8719: URL: https://github.com/apache/iceberg/issues/8719#issuecomment-1752968284 An upsert (append-only) stream, by definition, only contains inserts in case of an update. So I would say that the example above is expected. If you need the `-U` records, you ne

Re: [I] Null support in Apache Flink [iceberg]

2023-10-09 Thread via GitHub
pvary commented on issue #8720: URL: https://github.com/apache/iceberg/issues/8720#issuecomment-1752978742 Nice! 😄 Thanks for testing this out. If we want to officially support it, we need to write unit tests for it for all of the supported file types, and this way we could be sure t

Re: [PR] Core: Optimize computing user-facing state in data task [iceberg]

2023-10-09 Thread via GitHub
findepi commented on code in PR #8346: URL: https://github.com/apache/iceberg/pull/8346#discussion_r1350301245 ## core/src/main/java/org/apache/iceberg/BaseFileScanTask.java: ## @@ -45,31 +50,67 @@ protected FileScanTask self() { @Override protected FileScanTask newSplit

[PR] API: Fix REST Catalog schema error: uniqueItems is not valid on type object [iceberg]

2023-10-09 Thread via GitHub
johanhenriksson opened a new pull request, #8751: URL: https://github.com/apache/iceberg/pull/8751 OpenAPI schema fails code generation on row 1173: ``` - rest-catalog-open-api.yaml:1173:24 -> uniqueItems: unexpected field for type "object" ``` As far as I can tell `uniqueI

[PR] Core: Use visibility string instead of enum for Immutable visibility [iceberg]

2023-10-09 Thread via GitHub
nastra opened a new pull request, #8752: URL: https://github.com/apache/iceberg/pull/8752 Now that https://github.com/immutables/immutables/pull/1474 has been fixed and was shipped as part of Immutables [2.10.0](https://github.com/immutables/immutables/releases), we can switch to using the

Re: [PR] API: Fix REST Catalog schema error: uniqueItems is not valid on type object [iceberg]

2023-10-09 Thread via GitHub
nastra merged PR #8751: URL: https://github.com/apache/iceberg/pull/8751

[PR] OpenAPI: Add description for AssignUUID [iceberg]

2023-10-09 Thread via GitHub
nastra opened a new pull request, #8753: URL: https://github.com/apache/iceberg/pull/8753 (no comment)

Re: [PR] Dell : Migrate Files using TestRule to Junit5. [iceberg]

2023-10-09 Thread via GitHub
nastra commented on code in PR #8707: URL: https://github.com/apache/iceberg/pull/8707#discussion_r1350381873 ## build.gradle: ## @@ -988,4 +988,3 @@ apply from: 'jmh.gradle' apply from: 'baseline.gradle' apply from: 'deploy.gradle' apply from: 'tasks.gradle' - Review Commen

Re: [PR] Dell : Migrate Files using TestRule to Junit5. [iceberg]

2023-10-09 Thread via GitHub
nastra commented on code in PR #8707: URL: https://github.com/apache/iceberg/pull/8707#discussion_r1350383443 ## dell/src/test/java/org/apache/iceberg/dell/mock/ecs/EcsS3MockRule.java: ## @@ -33,16 +33,16 @@ import org.apache.iceberg.dell.DellClientFactories; import org.apache

[I] Some questions about Iceberg's capabilities in Flink [iceberg]

2023-10-09 Thread via GitHub
jonathf opened a new issue, #8754: URL: https://github.com/apache/iceberg/issues/8754 ### Query engine Flink 1.15.2 ### Question There might be a page for this, but I am wondering if there exists an overview of capabilities of iceberg on Flink. Here are some that come to

Re: [PR] Dell : Migrate Files using TestRule to Junit5. [iceberg]

2023-10-09 Thread via GitHub
ashutosh-roy commented on code in PR #8707: URL: https://github.com/apache/iceberg/pull/8707#discussion_r1350435267 ## dell/src/test/java/org/apache/iceberg/dell/mock/ecs/EcsS3MockRule.java: ## @@ -33,16 +33,16 @@ import org.apache.iceberg.dell.DellClientFactories; import org.

Re: [PR] Dell : Migrate Files using TestRule to Junit5. [iceberg]

2023-10-09 Thread via GitHub
nastra commented on code in PR #8707: URL: https://github.com/apache/iceberg/pull/8707#discussion_r1350451880 ## dell/src/test/java/org/apache/iceberg/dell/mock/ecs/EcsS3MockRule.java: ## @@ -33,16 +33,16 @@ import org.apache.iceberg.dell.DellClientFactories; import org.apache

Re: [I] Some questions about Iceberg's capabilities in Flink [iceberg]

2023-10-09 Thread via GitHub
pvary commented on issue #8754: URL: https://github.com/apache/iceberg/issues/8754#issuecomment-1753220042 > 1. Does iceberg support being a lookup table? No, I do not think so. > 2. If so, does it support processing time, event time or both? See above > 3. Does Ic

Re: [PR] Core: Use visibility string instead of enum for Immutable visibility [iceberg]

2023-10-09 Thread via GitHub
danielcweeks merged PR #8752: URL: https://github.com/apache/iceberg/pull/8752

Re: [PR] Spec: add types timstamp_ns and timestamptz_ns [iceberg]

2023-10-09 Thread via GitHub
jacobmarble commented on code in PR #8683: URL: https://github.com/apache/iceberg/pull/8683#discussion_r1350502762 ## format/spec.md: ## @@ -167,30 +167,34 @@ A **`map`** is a collection of key-value pairs with a key type and a value type. Primitive Types -| Primitive

Re: [PR] Standard key manager [iceberg]

2023-10-09 Thread via GitHub
rdblue commented on code in PR #6884: URL: https://github.com/apache/iceberg/pull/6884#discussion_r1350549932 ## core/src/main/java/org/apache/iceberg/encryption/PlaintextEncryptionManager.java: ## @@ -18,26 +18,32 @@ */ package org.apache.iceberg.encryption; -import java.n

Re: [PR] OpenAPI: Add description for AssignUUID [iceberg]

2023-10-09 Thread via GitHub
singhpk234 commented on code in PR #8753: URL: https://github.com/apache/iceberg/pull/8753#discussion_r1350563256 ## open-api/rest-catalog-open-api.py: ## @@ -230,6 +230,10 @@ class BaseUpdate(BaseModel): class AssignUUIDUpdate(BaseUpdate): +""" +Assigning a UUID to

[PR] [WIP] API, Core, Spark 3.5: Parallelize reading of deletes and cache them on executors [iceberg]

2023-10-09 Thread via GitHub
aokolnychyi opened a new pull request, #8755: URL: https://github.com/apache/iceberg/pull/8755 This WIP PR has code to parallelize reading of deletes and enable caching them on executors. If we decide to go ahead with this change, it will be split into multiple smaller PRs with docs and tes

Re: [PR] [WIP] API, Core, Spark 3.5: Parallelize reading of deletes and cache them on executors [iceberg]

2023-10-09 Thread via GitHub
aokolnychyi commented on code in PR #8755: URL: https://github.com/apache/iceberg/pull/8755#discussion_r1350591568 ## core/src/main/java/org/apache/iceberg/SystemConfigs.java: ## @@ -42,6 +42,13 @@ private SystemConfigs() {} Math.max(2, Runtime.getRuntime().availableP

Re: [PR] [WIP] API, Core, Spark 3.5: Parallelize reading of deletes and cache them on executors [iceberg]

2023-10-09 Thread via GitHub
aokolnychyi commented on code in PR #8755: URL: https://github.com/apache/iceberg/pull/8755#discussion_r1350592657 ## api/src/main/java/org/apache/iceberg/util/CharSequenceMap.java: ## @@ -0,0 +1,208 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or mo

Re: [PR] [WIP] API, Core, Spark 3.5: Parallelize reading of deletes and cache them on executors [iceberg]

2023-10-09 Thread via GitHub
aokolnychyi commented on code in PR #8755: URL: https://github.com/apache/iceberg/pull/8755#discussion_r1350593264 ## core/src/main/java/org/apache/iceberg/TableProperties.java: ## @@ -236,6 +236,8 @@ private TableProperties() {} public static final String DELETE_PLANNING_MOD

Re: [PR] [WIP] API, Core, Spark 3.5: Parallelize reading of deletes and cache them on executors [iceberg]

2023-10-09 Thread via GitHub
aokolnychyi commented on code in PR #8755: URL: https://github.com/apache/iceberg/pull/8755#discussion_r1350597273 ## core/src/main/java/org/apache/iceberg/deletes/Deletes.java: ## @@ -125,6 +126,25 @@ public static StructLikeSet toEqualitySet( } } + public static Ch

Re: [PR] [WIP] API, Core, Spark 3.5: Parallelize reading of deletes and cache them on executors [iceberg]

2023-10-09 Thread via GitHub
aokolnychyi commented on code in PR #8755: URL: https://github.com/apache/iceberg/pull/8755#discussion_r1350598721 ## core/src/main/java/org/apache/iceberg/util/ThreadPools.java: ## @@ -59,6 +65,20 @@ public static ExecutorService getWorkerPool() { return WORKER_POOL; }

Re: [PR] [WIP] API, Core, Spark 3.5: Parallelize reading of deletes and cache them on executors [iceberg]

2023-10-09 Thread via GitHub
aokolnychyi commented on code in PR #8755: URL: https://github.com/apache/iceberg/pull/8755#discussion_r1350598111 ## core/src/main/java/org/apache/iceberg/deletes/PositionDeleteIndex.java: ## @@ -44,4 +47,23 @@ public interface PositionDeleteIndex { /** Returns true if thi

Re: [PR] [WIP] API, Core, Spark 3.5: Parallelize reading of deletes and cache them on executors [iceberg]

2023-10-09 Thread via GitHub
aokolnychyi commented on code in PR #8755: URL: https://github.com/apache/iceberg/pull/8755#discussion_r1350599382 ## data/src/main/java/org/apache/iceberg/data/BaseDeleteLoader.java: ## @@ -0,0 +1,209 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or

Re: [PR] [WIP] API, Core, Spark 3.5: Parallelize reading of deletes and cache them on executors [iceberg]

2023-10-09 Thread via GitHub
aokolnychyi commented on code in PR #8755: URL: https://github.com/apache/iceberg/pull/8755#discussion_r1350599664 ## data/src/main/java/org/apache/iceberg/data/BaseDeleteLoader.java: ## @@ -0,0 +1,209 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or

Re: [PR] [WIP] API, Core, Spark 3.5: Parallelize reading of deletes and cache them on executors [iceberg]

2023-10-09 Thread via GitHub
aokolnychyi commented on code in PR #8755: URL: https://github.com/apache/iceberg/pull/8755#discussion_r1350602336 ## data/src/main/java/org/apache/iceberg/data/DeleteFilter.java: ## @@ -243,20 +236,9 @@ private CloseableIterable applyPosDeletes(CloseableIterable records) {

Re: [PR] [WIP] API, Core, Spark 3.5: Parallelize reading of deletes and cache them on executors [iceberg]

2023-10-09 Thread via GitHub
aokolnychyi commented on code in PR #8755: URL: https://github.com/apache/iceberg/pull/8755#discussion_r1350603878 ## data/src/main/java/org/apache/iceberg/data/DeleteLoader.java: ## @@ -0,0 +1,31 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more

Re: [PR] Dell : Migrate Files using TestRule to Junit5. [iceberg]

2023-10-09 Thread via GitHub
ashutosh-roy commented on code in PR #8707: URL: https://github.com/apache/iceberg/pull/8707#discussion_r1350632269 ## dell/src/test/java/org/apache/iceberg/dell/mock/ecs/EcsS3MockRule.java: ## @@ -33,16 +33,16 @@ import org.apache.iceberg.dell.DellClientFactories; import org.

Re: [PR] Spec: Add partition stats spec [iceberg]

2023-10-09 Thread via GitHub
aokolnychyi commented on PR #7105: URL: https://github.com/apache/iceberg/pull/7105#issuecomment-1753540465 Getting to this today!

Re: [I] Some questions about Iceberg's capabilities in Flink [iceberg]

2023-10-09 Thread via GitHub
pvary commented on issue #8754: URL: https://github.com/apache/iceberg/issues/8754#issuecomment-1753543389 Sorry for the confusion 😄 I meant Out Of The Box, but missed the T

Re: [PR] Add Spark UI metrics from Iceberg scan metrics [iceberg]

2023-10-09 Thread via GitHub
aokolnychyi commented on PR #8717: URL: https://github.com/apache/iceberg/pull/8717#issuecomment-1753550229 Let me take a look.

[I] Cannot create a V1 table with `CREATE OR REPLACE TABLE` [iceberg]

2023-10-09 Thread via GitHub
Fokko opened a new issue, #8756: URL: https://github.com/apache/iceberg/issues/8756 ### Apache Iceberg version 1.4.0 (latest release) ### Query engine Spark ### Please describe the bug 🐞 Backed by the rest catalog: ``` spark-sql ()> CREATE OR REPLACE T

Re: [PR] Docs: Fix repo name and url [iceberg-python]

2023-10-09 Thread via GitHub
Fokko merged PR #54: URL: https://github.com/apache/iceberg-python/pull/54

[PR] Run integration tests with Iceberg 1.4.0 [iceberg-python]

2023-10-09 Thread via GitHub
Fokko opened a new pull request, #56: URL: https://github.com/apache/iceberg-python/pull/56 (no comment)

[PR] Add logic for table format-version updates [iceberg-python]

2023-10-09 Thread via GitHub
Fokko opened a new pull request, #55: URL: https://github.com/apache/iceberg-python/pull/55 Add a few more tests

Re: [PR] Construct a writer tree [iceberg-python]

2023-10-09 Thread via GitHub
Fokko commented on code in PR #40: URL: https://github.com/apache/iceberg-python/pull/40#discussion_r1350771798 ## pyiceberg/avro/resolver.py: ## @@ -233,7 +255,107 @@ def skip(self, decoder: BinaryDecoder) -> None: pass -class SchemaResolver(PrimitiveWithPartnerVis

Re: [PR] Add Spark UI metrics from Iceberg scan metrics [iceberg]

2023-10-09 Thread via GitHub
aokolnychyi commented on code in PR #8717: URL: https://github.com/apache/iceberg/pull/8717#discussion_r1350766227 ## spark/v3.5/spark/src/main/java/org/apache/iceberg/spark/source/SparkScan.java: ## @@ -207,6 +223,15 @@ public CustomTaskMetric[] reportDriverMetrics() { dri

Re: [PR] Spark: Parameterize backup suffix in migrate procedure [iceberg]

2023-10-09 Thread via GitHub
edgarRd commented on PR #7121: URL: https://github.com/apache/iceberg/pull/7121#issuecomment-1753919401 I suppose we were okay with changing the backup name on `migrate` after all: https://github.com/apache/iceberg/pull/8227/files - closing this PR.

Re: [PR] Spark: Parameterize backup suffix in migrate procedure [iceberg]

2023-10-09 Thread via GitHub
edgarRd closed pull request #7121: Spark: Parameterize backup suffix in migrate procedure URL: https://github.com/apache/iceberg/pull/7121

Re: [PR] Spec: Clarify spec_id field in Data File [iceberg]

2023-10-09 Thread via GitHub
aokolnychyi commented on code in PR #8730: URL: https://github.com/apache/iceberg/pull/8730#discussion_r1350806829 ## format/spec.md: ## @@ -443,13 +443,13 @@ The schema of a manifest file is a struct called `manifest_entry` with the follo | _optional_ | _optional_ | **`132 s

Re: [PR] Update spark-getting-started.md [iceberg-docs]

2023-10-09 Thread via GitHub
Priyansh121096 commented on PR #281: URL: https://github.com/apache/iceberg-docs/pull/281#issuecomment-1753937357 Closing as per https://github.com/apache/iceberg/pull/8748#issuecomment-1752381404.

Re: [PR] Update spark-getting-started.md [iceberg-docs]

2023-10-09 Thread via GitHub
Priyansh121096 closed pull request #281: Update spark-getting-started.md URL: https://github.com/apache/iceberg-docs/pull/281

Re: [PR] Docs: Update spark-getting-started.md [iceberg]

2023-10-09 Thread via GitHub
Priyansh121096 commented on code in PR #8748: URL: https://github.com/apache/iceberg/pull/8748#discussion_r1350931538 ## docs/spark-getting-started.md: ## @@ -69,7 +69,7 @@ To create your first Iceberg table in Spark, use the `spark-sql` shell or `spark ```sql -- local is t

Re: [PR] Run integration tests with Iceberg 1.4.0 [iceberg-python]

2023-10-09 Thread via GitHub
rdblue commented on PR #56: URL: https://github.com/apache/iceberg-python/pull/56#issuecomment-1754065952 Thanks, @Fokko!

Re: [PR] Run integration tests with Iceberg 1.4.0 [iceberg-python]

2023-10-09 Thread via GitHub
rdblue merged PR #56: URL: https://github.com/apache/iceberg-python/pull/56

Re: [PR] Add logic for table format-version updates [iceberg-python]

2023-10-09 Thread via GitHub
rdblue commented on PR #55: URL: https://github.com/apache/iceberg-python/pull/55#issuecomment-1754068424 Looks good to me. I personally like using more descriptive verbs, like "upgrade" instead of "set" since this only allows upgrading the version, but I don't think it's a blocker (especia

Re: [PR] Construct a writer tree [iceberg-python]

2023-10-09 Thread via GitHub
rdblue commented on code in PR #40: URL: https://github.com/apache/iceberg-python/pull/40#discussion_r1351003830 ## pyiceberg/avro/resolver.py: ## @@ -233,7 +255,107 @@ def skip(self, decoder: BinaryDecoder) -> None: pass -class SchemaResolver(PrimitiveWithPartnerVi

Re: [PR] Construct a writer tree [iceberg-python]

2023-10-09 Thread via GitHub
rdblue commented on code in PR #40: URL: https://github.com/apache/iceberg-python/pull/40#discussion_r1351005623 ## pyiceberg/avro/resolver.py: ## @@ -192,7 +195,26 @@ def visit_binary(self, binary_type: BinaryType) -> Writer: return BinaryWriter() -def resolve( +CO

Re: [PR] Construct a writer tree [iceberg-python]

2023-10-09 Thread via GitHub
rdblue commented on code in PR #40: URL: https://github.com/apache/iceberg-python/pull/40#discussion_r1351005850 ## pyiceberg/avro/resolver.py: ## @@ -192,7 +195,26 @@ def visit_binary(self, binary_type: BinaryType) -> Writer: return BinaryWriter() -def resolve( +CO

Re: [PR] Construct a writer tree [iceberg-python]

2023-10-09 Thread via GitHub
rdblue commented on code in PR #40: URL: https://github.com/apache/iceberg-python/pull/40#discussion_r1351006286 ## pyiceberg/avro/resolver.py: ## @@ -192,7 +194,28 @@ def visit_binary(self, binary_type: BinaryType) -> Writer: return BinaryWriter() -def resolve( +CO

Re: [PR] Construct a writer tree [iceberg-python]

2023-10-09 Thread via GitHub
rdblue commented on code in PR #40: URL: https://github.com/apache/iceberg-python/pull/40#discussion_r1351007903 ## pyiceberg/avro/resolver.py: ## @@ -233,7 +256,93 @@ def skip(self, decoder: BinaryDecoder) -> None: pass -class SchemaResolver(PrimitiveWithPartnerVis

Re: [PR] Construct a writer tree [iceberg-python]

2023-10-09 Thread via GitHub
rdblue commented on code in PR #40: URL: https://github.com/apache/iceberg-python/pull/40#discussion_r1351008344 ## pyiceberg/avro/resolver.py: ## @@ -233,7 +256,93 @@ def skip(self, decoder: BinaryDecoder) -> None: pass -class SchemaResolver(PrimitiveWithPartnerVis

Re: [PR] Construct a writer tree [iceberg-python]

2023-10-09 Thread via GitHub
rdblue commented on code in PR #40: URL: https://github.com/apache/iceberg-python/pull/40#discussion_r1351010254 ## pyiceberg/avro/resolver.py: ## @@ -233,7 +256,93 @@ def skip(self, decoder: BinaryDecoder) -> None: pass -class SchemaResolver(PrimitiveWithPartnerVis

Re: [PR] Core: Add View support for REST catalog [iceberg]

2023-10-09 Thread via GitHub
rdblue commented on code in PR #7913: URL: https://github.com/apache/iceberg/pull/7913#discussion_r1351018901 ## core/src/main/java/org/apache/iceberg/catalog/BaseSessionCatalog.java: ## @@ -30,8 +30,10 @@ import org.apache.iceberg.exceptions.NamespaceNotEmptyException; import

Re: [PR] Core: Add View support for REST catalog [iceberg]

2023-10-09 Thread via GitHub
rdblue commented on code in PR #7913: URL: https://github.com/apache/iceberg/pull/7913#discussion_r1351025602 ## core/src/main/java/org/apache/iceberg/rest/CatalogHandlers.java: ## @@ -374,4 +385,107 @@ static TableMetadata commit(TableOperations ops, UpdateTableRequest request

Re: [I] How to improve performance of RewriteManifests procedure? [iceberg]

2023-10-09 Thread via GitHub
github-actions[bot] commented on issue #7325: URL: https://github.com/apache/iceberg/issues/7325#issuecomment-1754094275 This issue has been automatically marked as stale because it has been open for 180 days with no activity. It will be closed in next 14 days if no further activity occurs.

Re: [I] Manifest List Writer Design [iceberg-rust]

2023-10-09 Thread via GitHub
barronw commented on issue #72: URL: https://github.com/apache/iceberg-rust/issues/72#issuecomment-1754114530 > The ManifestList is a simple wrapper of Vec, so I think providing a method for iterator of entries would not be huge effort. I'm not sure I'm following. To clarify, I was wo

Re: [PR] Spec: Add partition stats spec [iceberg]

2023-10-09 Thread via GitHub
aokolnychyi commented on code in PR #7105: URL: https://github.com/apache/iceberg/pull/7105#discussion_r1351007189 ## format/spec.md: ## @@ -702,6 +703,58 @@ Blob metadata is a struct with the following fields: | _optional_ | _optional_ | **`properties`** | `map` | Additional

Re: [PR] Spec: Add partition stats spec [iceberg]

2023-10-09 Thread via GitHub
aokolnychyi commented on code in PR #7105: URL: https://github.com/apache/iceberg/pull/7105#discussion_r1351072100 ## format/spec.md: ## @@ -702,6 +703,58 @@ Blob metadata is a struct with the following fields: | _optional_ | _optional_ | **`properties`** | `map` | Additional

Re: [PR] [WIP] API, Core, Spark 3.5: Parallelize reading of deletes and cache them on executors [iceberg]

2023-10-09 Thread via GitHub
singhpk234 commented on PR #8755: URL: https://github.com/apache/iceberg/pull/8755#issuecomment-1754135275 Should we apply some intelligence to how we distribute the taskGroups so that we get the most out of the executor cache? For example, let's say we could prefer sending those s

Re: [PR] [WIP] API, Core, Spark 3.5: Parallelize reading of deletes and cache them on executors [iceberg]

2023-10-09 Thread via GitHub
aokolnychyi commented on PR #8755: URL: https://github.com/apache/iceberg/pull/8755#issuecomment-1754147825 > Should we apply some intelligence to how we distribute the tasks so that we get the most out of the executor cache? For example, let's say we could prefer sending those set

Re: [PR] [WIP] API, Core, Spark 3.5: Parallelize reading of deletes and cache them on executors [iceberg]

2023-10-09 Thread via GitHub
singhpk234 commented on PR #8755: URL: https://github.com/apache/iceberg/pull/8755#issuecomment-1754148814 Thanks @aokolnychyi looking forward to it :) !

Re: [PR] [WIP] API, Core, Spark 3.5: Parallelize reading of deletes and cache them on executors [iceberg]

2023-10-09 Thread via GitHub
aokolnychyi commented on PR #8755: URL: https://github.com/apache/iceberg/pull/8755#issuecomment-1754155151 I tested this PR on a cluster a bit. It would be nice if someone could also play around with it in their environment.

Re: [I] Manifest List Writer Design [iceberg-rust]

2023-10-09 Thread via GitHub
liurenjie1024 commented on issue #72: URL: https://github.com/apache/iceberg-rust/issues/72#issuecomment-1754168678 > > The ManifestList is a simple wrapper of Vec, so I think providing a method for iterator of entries would not be huge effort. > > I'm not sure I'm following. To clari

Re: [PR] Flink:backport PR to 1.15 #7360: Implement data statistics coordinator to aggregate data statistics from operator subtasks [iceberg]

2023-10-09 Thread via GitHub
yegangy0718 commented on code in PR #8749: URL: https://github.com/apache/iceberg/pull/8749#discussion_r1351169031 ## flink/v1.15/flink/src/main/java/org/apache/iceberg/flink/sink/shuffle/DataStatisticsCoordinator.java: ## @@ -0,0 +1,322 @@ +/* + * Licensed to the Apache Softwar

[I] How is iceberg compatible with hive's tez engine [iceberg]

2023-10-09 Thread via GitHub
dragon-feng opened a new issue, #8757: URL: https://github.com/apache/iceberg/issues/8757 ### Query engine _No response_ ### Question How is iceberg compatible with hive's tez engine

Re: [I] flink1.14.4+iceberg0.13.1+hive-metastore3.1.2+minio(S3) error! [iceberg]

2023-10-09 Thread via GitHub
ramdas-jagtap commented on issue #4743: URL: https://github.com/apache/iceberg/issues/4743#issuecomment-1754378062 I have the same setup: flink1.13.2+iceberg0.13.0+hive-metastore3.0.0+minio(S3). How are you specifying S3 storage properties while creating tables? CREATE TABLE test_table (

Re: [I] How is iceberg compatible with hive's tez engine [iceberg]

2023-10-09 Thread via GitHub
pvary commented on issue #8757: URL: https://github.com/apache/iceberg/issues/8757#issuecomment-1754397636 There are some missing features in Hive3's Tez dependency, so DML operations are not supported with Hive3 and Tez. See: https://iceberg.apache.org/docs/latest/hive/#feature-support

Re: [I] flink1.14.4+iceberg0.13.1+hive-metastore3.1.2+minio(S3) error! [iceberg]

2023-10-09 Thread via GitHub
pvary commented on issue #4743: URL: https://github.com/apache/iceberg/issues/4743#issuecomment-1754400735 This last error seems different than the original one in the issue. 403 means that the authentication is failing, and S3 access is not available.

Re: [I] flink1.14.4+iceberg0.13.1+hive-metastore3.1.2+minio(S3) error! [iceberg]

2023-10-09 Thread via GitHub
ramdas-jagtap commented on issue #4743: URL: https://github.com/apache/iceberg/issues/4743#issuecomment-1754409710 @pvary I know the error is different from the one in the issue. Do we have documentation on how to configure Flink with Iceberg, Hive, and MinIO? I am more interested in config