Re: [PR] feat: suport read/write Manifest [iceberg-rust]

2023-10-16 Thread via GitHub
JanKaul commented on code in PR #79: URL: https://github.com/apache/iceberg-rust/pull/79#discussion_r1360204175 ## crates/iceberg/src/spec/values.rs: ## @@ -966,6 +978,547 @@ mod timestamptz { } } +mod serde { Review Comment: I think using the RawLiteral makes sense.

Re: [PR] Build: Bump ray from 2.7.0 to 2.7.1 [iceberg-python]

2023-10-16 Thread via GitHub
Fokko merged PR #77: URL: https://github.com/apache/iceberg-python/pull/77 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@iceberg.ap

Re: [PR] Build: Bump griffe from 0.36.4 to 0.36.5 [iceberg-python]

2023-10-16 Thread via GitHub
Fokko merged PR #76: URL: https://github.com/apache/iceberg-python/pull/76

Re: [PR] Build: Bump mypy-boto3-glue from 1.28.36 to 1.28.63 [iceberg-python]

2023-10-16 Thread via GitHub
Fokko merged PR #75: URL: https://github.com/apache/iceberg-python/pull/75

Re: [PR] Build: Bump mkdocstrings-python from 1.7.2 to 1.7.3 [iceberg-python]

2023-10-16 Thread via GitHub
Fokko merged PR #74: URL: https://github.com/apache/iceberg-python/pull/74

Re: [PR] Build: Bump moto from 4.2.5 to 4.2.6 [iceberg-python]

2023-10-16 Thread via GitHub
Fokko merged PR #73: URL: https://github.com/apache/iceberg-python/pull/73

Re: [PR] feat: suport read/write Manifest [iceberg-rust]

2023-10-16 Thread via GitHub
JanKaul commented on code in PR #79: URL: https://github.com/apache/iceberg-rust/pull/79#discussion_r1360224491 ## crates/iceberg/src/spec/manifest.rs: ## @@ -0,0 +1,671 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements.

Re: [PR] Infra: Cleanup labeler.yml [iceberg]

2023-10-16 Thread via GitHub
Fokko commented on code in PR #8795: URL: https://github.com/apache/iceberg/pull/8795#discussion_r1360239997 ## .github/labeler.yml: ## @@ -55,16 +55,12 @@ HIVE: - hive3/**/* - hive-metastore/**/* - hive-runtime/**/* + - hive3-orc-bundle/**/* DATA: - data/**/* SPA

Re: [PR] Infra: Cleanup labeler.yml [iceberg]

2023-10-16 Thread via GitHub
Fokko merged PR #8795: URL: https://github.com/apache/iceberg/pull/8795

Re: [I] Manage dependencies using workspace. [iceberg-rust]

2023-10-16 Thread via GitHub
liurenjie1024 commented on issue #24: URL: https://github.com/apache/iceberg-rust/issues/24#issuecomment-1763911352 We'll do this after #78 get merged.

Re: [PR] feat: In memory catalog [iceberg-rust]

2023-10-16 Thread via GitHub
JanKaul closed pull request #74: feat: In memory catalog URL: https://github.com/apache/iceberg-rust/pull/74

Re: [PR] Core: Add View support for REST catalog [iceberg]

2023-10-16 Thread via GitHub
nastra commented on code in PR #7913: URL: https://github.com/apache/iceberg/pull/7913#discussion_r1360265755 ## open-api/rest-catalog-open-api.yaml: ## @@ -1630,6 +1990,102 @@ components: metadata-log: $ref: '#/components/schemas/MetadataLog' +SQLViewR

Re: [PR] Build: Bump org.xerial:sqlite-jdbc from 3.43.0.0 to 3.43.2.0 [iceberg]

2023-10-16 Thread via GitHub
nastra merged PR #8837: URL: https://github.com/apache/iceberg/pull/8837

Re: [I] Manage dependencies using workspace. [iceberg-rust]

2023-10-16 Thread via GitHub
Xuanwo commented on issue #24: URL: https://github.com/apache/iceberg-rust/issues/24#issuecomment-1763934264 > Following #15 , we should provide workspace wide dependency management. From my daily routine, I find it inconvenient when a library has workspace-wide dependencies. I have t

Re: [PR] Build: add gradle configuration to enforce reproducible build [iceberg]

2023-10-16 Thread via GitHub
nastra merged PR #8826: URL: https://github.com/apache/iceberg/pull/8826

Re: [I] Build: enforce reproducible build [iceberg]

2023-10-16 Thread via GitHub
nastra closed issue #8825: Build: enforce reproducible build URL: https://github.com/apache/iceberg/issues/8825

Re: [I] Manage dependencies using workspace. [iceberg-rust]

2023-10-16 Thread via GitHub
liurenjie1024 commented on issue #24: URL: https://github.com/apache/iceberg-rust/issues/24#issuecomment-1763942829 > > Following #15 , we should provide workspace wide dependency management. > > From my daily routine, I find it inconvenient when a library has workspace-wide dependenc

Re: [PR] feat: First version of rest catalog. [iceberg-rust]

2023-10-16 Thread via GitHub
Xuanwo commented on code in PR #78: URL: https://github.com/apache/iceberg-rust/pull/78#discussion_r1360330437 ## crates/iceberg-rest/Cargo.toml: ## @@ -0,0 +1,41 @@ +# Licensed to the Apache Software Foundation (ASF) under one +# or more contributor license agreements. See the

Re: [PR] feat: First version of rest catalog. [iceberg-rust]

2023-10-16 Thread via GitHub
liurenjie1024 commented on code in PR #78: URL: https://github.com/apache/iceberg-rust/pull/78#discussion_r1360337146 ## crates/iceberg-rest/Cargo.toml: ## @@ -0,0 +1,41 @@ +# Licensed to the Apache Software Foundation (ASF) under one +# or more contributor license agreements.

Re: [PR] feat: First version of rest catalog. [iceberg-rust]

2023-10-16 Thread via GitHub
Xuanwo commented on code in PR #78: URL: https://github.com/apache/iceberg-rust/pull/78#discussion_r1360351583 ## crates/iceberg-rest/Cargo.toml: ## @@ -0,0 +1,41 @@ +# Licensed to the Apache Software Foundation (ASF) under one +# or more contributor license agreements. See the

Re: [PR] feat: First version of rest catalog. [iceberg-rust]

2023-10-16 Thread via GitHub
liurenjie1024 commented on code in PR #78: URL: https://github.com/apache/iceberg-rust/pull/78#discussion_r1360370023 ## crates/iceberg-rest/Cargo.toml: ## @@ -0,0 +1,41 @@ +# Licensed to the Apache Software Foundation (ASF) under one +# or more contributor license agreements.

Re: [I] [JdbcCatalog] Issue with Namespace Exists [iceberg]

2023-10-16 Thread via GitHub
amogh-jahagirdar commented on issue #8832: URL: https://github.com/apache/iceberg/issues/8832#issuecomment-1764106743 > Sounds like a similar issue to #8321. @amogh-jahagirdar should we get #8321 reviewed first and then we can address any leftovers in this PR? Ooh missed #8321, yes le

Re: [PR] JDBC: JDBC Catalog should do exact namespace search for get namespace queries [iceberg]

2023-10-16 Thread via GitHub
amogh-jahagirdar commented on PR #8833: URL: https://github.com/apache/iceberg/pull/8833#issuecomment-1764118823 Looks like a similar PR was already put up in the past https://github.com/apache/iceberg/pull/8340, we can just review that.

Re: [PR] feat: suport read/write Manifest [iceberg-rust]

2023-10-16 Thread via GitHub
ZENOTME commented on code in PR #79: URL: https://github.com/apache/iceberg-rust/pull/79#discussion_r1360421087 ## crates/iceberg/src/spec/manifest.rs: ## @@ -0,0 +1,671 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements.

Re: [PR] Core: fix reading of split offsets in manifests [iceberg]

2023-10-16 Thread via GitHub
bryanck commented on PR #8834: URL: https://github.com/apache/iceberg/pull/8834#issuecomment-1764216831 > Other than to revert the optimize in #8336, is it better to invalidate the cached `splitOffsetList`? The proposed change is in the `org.apache.iceberg.BaseFile#put` function: > `

Re: [PR] Add an expireAfterWrite cache eviction policy to CachingCatalog [iceberg]

2023-10-16 Thread via GitHub
nastra commented on code in PR #8844: URL: https://github.com/apache/iceberg/pull/8844#discussion_r1360480900 ## core/src/main/java/org/apache/iceberg/CachingCatalog.java: ## @@ -110,6 +110,8 @@ private Cache createTableCache(Ticker ticker) { .removalListener(new Met

Re: [PR] push down min/max/count to iceberg [iceberg]

2023-10-16 Thread via GitHub
atifiu commented on PR #6252: URL: https://github.com/apache/iceberg/pull/6252#issuecomment-1764246045 @huaxingao Based on your suggestion, I have narrowed the filter criteria so that even considering the timezone problem, we dont filter on more than two partitions so that filter can be pus

Re: [PR] Core: fix reading of split offsets in manifests [iceberg]

2023-10-16 Thread via GitHub
nastra commented on code in PR #8834: URL: https://github.com/apache/iceberg/pull/8834#discussion_r1360559226 ## core/src/test/java/org/apache/iceberg/TestManifestReader.java: ## @@ -61,16 +70,14 @@ public void testReaderWithFilterWithoutSelect() throws IOException { Manif

Re: [PR] Core: fix reading of split offsets in manifests [iceberg]

2023-10-16 Thread via GitHub
advancedxy commented on PR #8834: URL: https://github.com/apache/iceberg/pull/8834#issuecomment-1764363047 > > Other than to revert the optimize in #8336, is it better to invalidate the cached `splitOffsetList`? The proposed change is in the `org.apache.iceberg.BaseFile#put` function: >

Re: [PR] Add an expireAfterWrite cache eviction policy to CachingCatalog [iceberg]

2023-10-16 Thread via GitHub
nastra commented on code in PR #8844: URL: https://github.com/apache/iceberg/pull/8844#discussion_r1360588962 ## spark/v3.5/spark-extensions/src/test/java/org/apache/iceberg/spark/extensions/TestCachingCatalogExpirationAfterWrite.java: ## @@ -0,0 +1,82 @@ +/* + * Licensed to the

Re: [PR] Add an expireAfterWrite cache eviction policy to CachingCatalog [iceberg]

2023-10-16 Thread via GitHub
nastra commented on code in PR #8844: URL: https://github.com/apache/iceberg/pull/8844#discussion_r1360589513 ## spark/v3.5/spark-extensions/src/test/java/org/apache/iceberg/spark/extensions/TestCachingCatalogExpirationAfterWrite.java: ## @@ -0,0 +1,82 @@ +/* + * Licensed to the

Re: [PR] Add an expireAfterWrite cache eviction policy to CachingCatalog [iceberg]

2023-10-16 Thread via GitHub
nastra commented on code in PR #8844: URL: https://github.com/apache/iceberg/pull/8844#discussion_r1360589867 ## spark/v3.5/spark-extensions/src/test/java/org/apache/iceberg/spark/extensions/TestCachingCatalogExpirationAfterWrite.java: ## @@ -0,0 +1,82 @@ +/* + * Licensed to the

Re: [PR] Add an expireAfterWrite cache eviction policy to CachingCatalog [iceberg]

2023-10-16 Thread via GitHub
nastra commented on code in PR #8844: URL: https://github.com/apache/iceberg/pull/8844#discussion_r1360600506 ## spark/v3.5/spark-extensions/src/test/java/org/apache/iceberg/spark/extensions/TestCachingCatalogExpirationAfterWrite.java: ## @@ -0,0 +1,82 @@ +/* + * Licensed to the

Re: [PR] Nessie: Adapt to Nessie 0.71.1 release [iceberg]

2023-10-16 Thread via GitHub
dimas-b commented on code in PR #8798: URL: https://github.com/apache/iceberg/pull/8798#discussion_r1360635935 ## nessie/src/test/java/org/apache/iceberg/nessie/TestCustomNessieClient.java: ## @@ -78,30 +77,11 @@ public void testNonExistentCustomClient() {

Re: [PR] feat: manifest list writer [iceberg-rust]

2023-10-16 Thread via GitHub
Fokko commented on code in PR #76: URL: https://github.com/apache/iceberg-rust/pull/76#discussion_r1360636839 ## crates/iceberg/src/spec/manifest_list.rs: ## @@ -940,4 +1025,108 @@ mod test { r#"[{"manifest_path":"s3a://icebergdata/demo/s1/t1/metadata/05ffe08b-810f

[I] Bug: PostgreSql integration [iceberg-python]

2023-10-16 Thread via GitHub
mobley-trent opened a new issue, #78: URL: https://github.com/apache/iceberg-python/issues/78 ### Apache Iceberg version 0.5.0 (latest release) ### Please describe the bug 🐞 Python = 3.11 PostgreSql = v16 I'm having issues setting up the initial connection to po

Re: [PR] Add an expireAfterWrite cache eviction policy to CachingCatalog [iceberg]

2023-10-16 Thread via GitHub
zhangminglei commented on code in PR #8844: URL: https://github.com/apache/iceberg/pull/8844#discussion_r1360727462 ## core/src/main/java/org/apache/iceberg/CachingCatalog.java: ## @@ -110,6 +110,8 @@ private Cache createTableCache(Ticker ticker) { .removalListener(n

Re: [PR] Add an expireAfterWrite cache eviction policy to CachingCatalog [iceberg]

2023-10-16 Thread via GitHub
zhangminglei commented on code in PR #8844: URL: https://github.com/apache/iceberg/pull/8844#discussion_r1360798631 ## spark/v3.5/spark-extensions/src/test/java/org/apache/iceberg/spark/extensions/TestCachingCatalogExpirationAfterWrite.java: ## @@ -0,0 +1,82 @@ +/* + * Licensed

Re: [PR] Flink: Read parquet BINARY column as String for expected [iceberg]

2023-10-16 Thread via GitHub
nastra commented on PR #8808: URL: https://github.com/apache/iceberg/pull/8808#issuecomment-1764681891 > Some systems like older versions of Impala do not annotate String type as UTF-8 columns in Parquet files. When importing these Parquet files into Iceberg, reading these Binary columns wi

Re: [PR] Core: fix reading of split offsets in manifests [iceberg]

2023-10-16 Thread via GitHub
bryanck commented on code in PR #8834: URL: https://github.com/apache/iceberg/pull/8834#discussion_r1360817555 ## core/src/test/java/org/apache/iceberg/TestManifestReader.java: ## @@ -61,16 +70,14 @@ public void testReaderWithFilterWithoutSelect() throws IOException { Mani

Re: [PR] feat: Implement Iceberg values [iceberg-rust]

2023-10-16 Thread via GitHub
ZENOTME commented on code in PR #20: URL: https://github.com/apache/iceberg-rust/pull/20#discussion_r1360827510 ## crates/iceberg/src/spec/values.rs: ## @@ -0,0 +1,964 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements.

Re: [PR] Spark 3.4: Fix issue when partitioning by UUID [iceberg]

2023-10-16 Thread via GitHub
nastra commented on code in PR #8250: URL: https://github.com/apache/iceberg/pull/8250#discussion_r1360824697 ## spark/v3.4/spark/src/main/java/org/apache/iceberg/spark/source/InternalRowWrapper.java: ## @@ -71,8 +77,12 @@ public void set(int pos, T value) { row.update(pos

Re: [PR] Spec: add nanosecond timestamp types [iceberg]

2023-10-16 Thread via GitHub
rdblue commented on code in PR #8683: URL: https://github.com/apache/iceberg/pull/8683#discussion_r1360863357 ## format/spec.md: ## @@ -862,10 +864,12 @@ Maps with non-string keys must use an array representation with the `map` logica |**`float`**|`float`|| |**`double`**|`dou

Re: [PR] Spec: add nanosecond timestamp types [iceberg]

2023-10-16 Thread via GitHub
rdblue commented on code in PR #8683: URL: https://github.com/apache/iceberg/pull/8683#discussion_r1360866229 ## format/spec.md: ## @@ -874,6 +878,11 @@ Maps with non-string keys must use an array representation with the `map` logica |**`list`**|`array`|| |**`map`**|`array` o

Re: [PR] Spec: add nanosecond timestamp types [iceberg]

2023-10-16 Thread via GitHub
rdblue commented on code in PR #8683: URL: https://github.com/apache/iceberg/pull/8683#discussion_r1360870420 ## format/spec.md: ## @@ -948,6 +961,7 @@ Lists must use the [3-level representation](https://github.com/apache/parquet-fo Notes: 1. ORC's [TimestampColumnVector](

Re: [PR] Spec: add nanosecond timestamp types [iceberg]

2023-10-16 Thread via GitHub
rdblue commented on code in PR #8683: URL: https://github.com/apache/iceberg/pull/8683#discussion_r1360871361 ## format/spec.md: ## @@ -971,8 +985,10 @@ The 32-bit hash implementation is 32-bit Murmur3 hash, x86 variant, seeded with | **`decimal(P,S)`** | `hashBytes(minBigEndi

Re: [PR] Spec: add nanosecond timestamp types [iceberg]

2023-10-16 Thread via GitHub
rdblue commented on code in PR #8683: URL: https://github.com/apache/iceberg/pull/8683#discussion_r1360873696 ## format/spec.md: ## @@ -177,8 +177,10 @@ A **`map`** is a collection of key-value pairs with a key type and a value type. | **`decimal(P,S)`** | Fixed-point decimal;

Re: [PR] Core: fix reading of split offsets in manifests [iceberg]

2023-10-16 Thread via GitHub
rdblue commented on code in PR #8834: URL: https://github.com/apache/iceberg/pull/8834#discussion_r1360880736 ## core/src/main/java/org/apache/iceberg/BaseFile.java: ## @@ -463,11 +460,7 @@ public ByteBuffer keyMetadata() { @Override public List splitOffsets() { -if

Re: [PR] Core: fix reading of split offsets in manifests [iceberg]

2023-10-16 Thread via GitHub
rdblue merged PR #8834: URL: https://github.com/apache/iceberg/pull/8834

[PR] Core: Do not use a lazy split offset list in manifests (#8834) [iceberg]

2023-10-16 Thread via GitHub
nastra opened a new pull request, #8845: URL: https://github.com/apache/iceberg/pull/8845 This backports #8834 to 1.4.1

Re: [PR] Spec: add nanosecond timestamp types [iceberg]

2023-10-16 Thread via GitHub
Fokko commented on code in PR #8683: URL: https://github.com/apache/iceberg/pull/8683#discussion_r1360913216 ## format/spec.md: ## @@ -874,6 +878,11 @@ Maps with non-string keys must use an array representation with the `map` logica |**`list`**|`array`|| |**`map`**|`array` of

Re: [PR] Spec: add nanosecond timestamp types [iceberg]

2023-10-16 Thread via GitHub
jacobmarble commented on code in PR #8683: URL: https://github.com/apache/iceberg/pull/8683#discussion_r1360930771 ## format/spec.md: ## @@ -874,6 +878,11 @@ Maps with non-string keys must use an array representation with the `map` logica |**`list`**|`array`|| |**`map`**|`arr

Re: [PR] Spec: add nanosecond timestamp types [iceberg]

2023-10-16 Thread via GitHub
Fokko commented on code in PR #8683: URL: https://github.com/apache/iceberg/pull/8683#discussion_r1360934237 ## format/spec.md: ## @@ -874,6 +878,11 @@ Maps with non-string keys must use an array representation with the `map` logica |**`list`**|`array`|| |**`map`**|`array` of

Re: [PR] Spec: add nanosecond timestamp types [iceberg]

2023-10-16 Thread via GitHub
jacobmarble commented on code in PR #8683: URL: https://github.com/apache/iceberg/pull/8683#discussion_r1360938772 ## format/spec.md: ## @@ -874,6 +878,11 @@ Maps with non-string keys must use an array representation with the `map` logica |**`list`**|`array`|| |**`map`**|`arr

Re: [PR] push down min/max/count to iceberg [iceberg]

2023-10-16 Thread via GitHub
huaxingao commented on PR #6252: URL: https://github.com/apache/iceberg/pull/6252#issuecomment-1764861269 @atifiu Based on the log, only `IsNotNull(initial_page_view_dtm)` is completely evaluated on iceberg side. Both `(initial_page_view_dtm#3 >= 2023-06-02 06:00:00)` and `initial_page_view

Re: [PR] Spec: add nanosecond timestamp types [iceberg]

2023-10-16 Thread via GitHub
Fokko commented on code in PR #8683: URL: https://github.com/apache/iceberg/pull/8683#discussion_r1360945526 ## format/spec.md: ## @@ -874,6 +878,11 @@ Maps with non-string keys must use an array representation with the `map` logica |**`list`**|`array`|| |**`map`**|`array` of

Re: [PR] Remove python working directory [iceberg-python]

2023-10-16 Thread via GitHub
rdblue merged PR #71: URL: https://github.com/apache/iceberg-python/pull/71

Re: [PR] JDBC catalog fix namespaceExists check [iceberg]

2023-10-16 Thread via GitHub
amogh-jahagirdar commented on code in PR #8340: URL: https://github.com/apache/iceberg/pull/8340#discussion_r1360982717 ## core/src/main/java/org/apache/iceberg/jdbc/JdbcUtil.java: ## @@ -135,7 +135,7 @@ final class JdbcUtil { + CATALOG_NAME + " = ? AND "

Re: [PR] Spec: add nanosecond timestamp types [iceberg]

2023-10-16 Thread via GitHub
jacobmarble commented on code in PR #8683: URL: https://github.com/apache/iceberg/pull/8683#discussion_r1360997998 ## format/spec.md: ## @@ -971,8 +985,10 @@ The 32-bit hash implementation is 32-bit Murmur3 hash, x86 variant, seeded with | **`decimal(P,S)`** | `hashBytes(minBi

Re: [PR] push down min/max/count to iceberg [iceberg]

2023-10-16 Thread via GitHub
atifiu commented on PR #6252: URL: https://github.com/apache/iceberg/pull/6252#issuecomment-1764912976 @huaxingao So finally it is working but without `between `and `<=` operators. Yes, I have to tweak my query to adjust the timezone so that entire partition is picked by query. ```

Re: [PR] JDBC catalog fix namespaceExists check [iceberg]

2023-10-16 Thread via GitHub
dramaticlly commented on code in PR #8340: URL: https://github.com/apache/iceberg/pull/8340#discussion_r1361000799 ## core/src/main/java/org/apache/iceberg/jdbc/JdbcUtil.java: ## @@ -135,7 +135,7 @@ final class JdbcUtil { + CATALOG_NAME + " = ? AND "

Re: [PR] Spec: add nanosecond timestamp types [iceberg]

2023-10-16 Thread via GitHub
jacobmarble commented on code in PR #8683: URL: https://github.com/apache/iceberg/pull/8683#discussion_r1361003504 ## format/spec.md: ## @@ -177,8 +177,10 @@ A **`map`** is a collection of key-value pairs with a key type and a value type. | **`decimal(P,S)`** | Fixed-point dec

Re: [PR] Spec: add nanosecond timestamp types [iceberg]

2023-10-16 Thread via GitHub
jacobmarble commented on code in PR #8683: URL: https://github.com/apache/iceberg/pull/8683#discussion_r1361004482 ## format/spec.md: ## @@ -948,6 +961,7 @@ Lists must use the [3-level representation](https://github.com/apache/parquet-fo Notes: 1. ORC's [TimestampColumnVec

Re: [PR] Spec: add nanosecond timestamp types [iceberg]

2023-10-16 Thread via GitHub
jacobmarble commented on code in PR #8683: URL: https://github.com/apache/iceberg/pull/8683#discussion_r1361006657 ## format/spec.md: ## @@ -874,6 +878,11 @@ Maps with non-string keys must use an array representation with the `map` logica |**`list`**|`array`|| |**`map`**|`arr

[PR] Remove `example` since it is deprecated [iceberg-python]

2023-10-16 Thread via GitHub
Fokko opened a new pull request, #79: URL: https://github.com/apache/iceberg-python/pull/79 ``` E pydantic.warnings.PydanticDeprecatedSince20: Using extra keyword arguments on `Field` is deprecated and will be removed. Use `json_schema_extra` instead. (Extra keys: 'example'). Depreca

Re: [PR] Spec: add nanosecond timestamp types [iceberg]

2023-10-16 Thread via GitHub
jacobmarble commented on code in PR #8683: URL: https://github.com/apache/iceberg/pull/8683#discussion_r1361135225 ## format/spec.md: ## @@ -971,8 +985,10 @@ The 32-bit hash implementation is 32-bit Murmur3 hash, x86 variant, seeded with | **`decimal(P,S)`** | `hashBytes(minBi

Re: [I] [bug] Spark SQL phase optimization failed on concurrent write attempt [iceberg]

2023-10-16 Thread via GitHub
kangyang-wang commented on issue #7800: URL: https://github.com/apache/iceberg/issues/7800#issuecomment-1765093921 Got the same issue here while trying to write to s3... Any solutions?

[PR] Don't fail on warning when releasing [iceberg-python]

2023-10-16 Thread via GitHub
Fokko opened a new pull request, #80: URL: https://github.com/apache/iceberg-python/pull/80 (no comment)

Re: [PR] JDBC catalog fix namespaceExists check [iceberg]

2023-10-16 Thread via GitHub
ismailsimsek commented on code in PR #8340: URL: https://github.com/apache/iceberg/pull/8340#discussion_r1361153332 ## core/src/main/java/org/apache/iceberg/jdbc/JdbcUtil.java: ## @@ -135,7 +135,7 @@ final class JdbcUtil { + CATALOG_NAME + " = ? AND "

Re: [PR] Don't fail on warning when releasing [iceberg-python]

2023-10-16 Thread via GitHub
rdblue commented on PR #80: URL: https://github.com/apache/iceberg-python/pull/80#issuecomment-1765238294 Should we make -Werror part of CI?

Re: [PR] Don't fail on warning when releasing [iceberg-python]

2023-10-16 Thread via GitHub
Fokko commented on PR #80: URL: https://github.com/apache/iceberg-python/pull/80#issuecomment-1765242399 @rdblue Yes, this effort is going on at https://github.com/apache/iceberg-python/pull/33. It is tricky because it also catches warnings from external libraries (Ray threw some warnings),

Re: [PR] Don't fail on warning when releasing [iceberg-python]

2023-10-16 Thread via GitHub
Fokko merged PR #80: URL: https://github.com/apache/iceberg-python/pull/80

Re: [PR] Remove `example` since it is deprecated [iceberg-python]

2023-10-16 Thread via GitHub
Fokko commented on PR #79: URL: https://github.com/apache/iceberg-python/pull/79#issuecomment-1765243284 Thanks @rdblue ! 🙌

Re: [PR] Remove `example` since it is deprecated [iceberg-python]

2023-10-16 Thread via GitHub
Fokko merged PR #79: URL: https://github.com/apache/iceberg-python/pull/79

Re: [I] De-Duping Rows While Compacting [iceberg]

2023-10-16 Thread via GitHub
W-I-D-EE commented on issue #8702: URL: https://github.com/apache/iceberg/issues/8702#issuecomment-1765308407 Further to this, i have actually had a lot of trouble getting delete from or merge into working with removing duplicate rows. Today the only way i have been able to remove deuplicat

[I] Enable Partition Transforms and/or Spark SQL In Spark `rewrite_data_files` Procedure [iceberg]

2023-10-16 Thread via GitHub
RLashofRegas opened a new issue, #8846: URL: https://github.com/apache/iceberg/issues/8846 ### Feature Request / Improvement I am using iceberg v0.14.0 w/ Spark 3.3.0 on Amazon EMR 6.8.0. We are trying to implement regular table maintenance on a table that uses partition transf

Re: [PR] Python: Add support for Python 3.12 [iceberg-python]

2023-10-16 Thread via GitHub
jayceslesar commented on PR #35: URL: https://github.com/apache/iceberg-python/pull/35#issuecomment-1765396170 @steinsgateted looks like there are no 3.12 wheels yet see the discussion on https://github.com/aio-libs/aiohttp/issues/7639

Re: [I] Flink: revert the automatic custom partitioner for bucketing column with hash distribution [iceberg]

2023-10-16 Thread via GitHub
rdblue commented on issue #8847: URL: https://github.com/apache/iceberg/issues/8847#issuecomment-1765411414 @stevenzwu, can you help us understand what is a problem with this and why it should be removed from the 1.4.1 release?

Re: [I] Flink: revert the automatic application of custom partitioner for bucketing column with hash distribution [iceberg]

2023-10-16 Thread via GitHub
stevenzwu commented on issue #8847: URL: https://github.com/apache/iceberg/issues/8847#issuecomment-1765429213 @rdblue here is the recap from the discussions on the PR #7161. https://github.com/apache/iceberg/pull/7161#issuecomment-1761169778 PR #7161 automatically apply the custom bu

Re: [I] DeleteOrphanFilesSparkAction doesn't use the Catalog's FileIO [iceberg]

2023-10-16 Thread via GitHub
github-actions[bot] commented on issue #7280: URL: https://github.com/apache/iceberg/issues/7280#issuecomment-1765457345 This issue has been closed because it has not received any activity in the last 14 days since being marked as 'stale'

Re: [I] Error when add custom Spark logicalPlan using injectResolutionRule [iceberg]

2023-10-16 Thread via GitHub
github-actions[bot] closed issue #7271: Error when add custom Spark logicalPlan using injectResolutionRule URL: https://github.com/apache/iceberg/issues/7271

Re: [I] DeleteOrphanFilesSparkAction doesn't use the Catalog's FileIO [iceberg]

2023-10-16 Thread via GitHub
github-actions[bot] closed issue #7280: DeleteOrphanFilesSparkAction doesn't use the Catalog's FileIO URL: https://github.com/apache/iceberg/issues/7280

Re: [I] Error when add custom Spark logicalPlan using injectResolutionRule [iceberg]

2023-10-16 Thread via GitHub
github-actions[bot] commented on issue #7271: URL: https://github.com/apache/iceberg/issues/7271#issuecomment-1765457375 This issue has been closed because it has not received any activity in the last 14 days since being marked as 'stale'.

Re: [I] Flink: revert the automatic application of custom partitioner for bucketing column with hash distribution [iceberg]

2023-10-16 Thread via GitHub
rdblue commented on issue #8847: URL: https://github.com/apache/iceberg/issues/8847#issuecomment-1765460065 Thanks, @stevenzwu! I agree that reverting the behavior change makes the most sense. We should be careful about default behavior changes and rolling back the change (but not the featu

Re: [I] Enable Partition Transforms and/or Spark SQL In Spark `rewrite_data_files` Procedure [iceberg]

2023-10-16 Thread via GitHub
RussellSpitzer commented on issue #8846: URL: https://github.com/apache/iceberg/issues/8846#issuecomment-1765475158 This is supported in Iceberg 1.4.

[I] [BUG] to_arrow conversion does not support iceberg table column name containing slash [iceberg-python]

2023-10-16 Thread via GitHub
puchengy opened a new issue, #81: URL: https://github.com/apache/iceberg-python/issues/81 ### Apache Iceberg version main (development) ### Please describe the bug 🐞 PR to reproduce https://github.com/puchengy/iceberg-python/commit/68081491641b0d7bada13a18b98ded3e08e127a

Re: [PR] Support timestamp type in partition string when importing files [iceberg]

2023-10-16 Thread via GitHub
camper42 commented on PR #7291: URL: https://github.com/apache/iceberg/pull/7291#issuecomment-1765491318 Any progress on this PR? We're having the same problem.

Re: [PR] Flink: Read parquet BINARY column as String for expected [iceberg]

2023-10-16 Thread via GitHub
fengjiajie commented on PR #8808: URL: https://github.com/apache/iceberg/pull/8808#issuecomment-1765499071 > > Some systems like older versions of Impala do not annotate String type as UTF-8 columns in Parquet files. When importing these Parquet files into Iceberg, reading these Binary colu

Re: [I] DeleteOrphanFiles or ExpireSnapshots outofmemory [iceberg]

2023-10-16 Thread via GitHub
RLashofRegas commented on issue #3703: URL: https://github.com/apache/iceberg/issues/3703#issuecomment-1765512670 @dchristle What was your solution that fixed the `Cannot broadcast the table that is larger than 8GB` issue? I just ran into the same problem on expire snapshots. I am using `ma

Re: [PR] feat: manifest list writer [iceberg-rust]

2023-10-16 Thread via GitHub
liurenjie1024 commented on code in PR #76: URL: https://github.com/apache/iceberg-rust/pull/76#discussion_r1361411731 ## crates/iceberg/src/spec/manifest_list.rs: ## @@ -940,4 +1025,108 @@ mod test { r#"[{"manifest_path":"s3a://icebergdata/demo/s1/t1/metadata/05ffe

Re: [PR] Flink: Custom partitioner for bucket partitions [iceberg]

2023-10-16 Thread via GitHub
chenwyi2 commented on PR #7161: URL: https://github.com/apache/iceberg/pull/7161#issuecomment-1765530967 Under normal conditions, only the data of the current minute is written. However, if the data is delayed (for example, at 11:50, the data has not been written until 11:55), then at 11:56

Re: [PR] feat: manifest list writer [iceberg-rust]

2023-10-16 Thread via GitHub
barronw commented on code in PR #76: URL: https://github.com/apache/iceberg-rust/pull/76#discussion_r1361422612 ## crates/iceberg/src/spec/manifest_list.rs: ## @@ -940,4 +1025,108 @@ mod test { r#"[{"manifest_path":"s3a://icebergdata/demo/s1/t1/metadata/05ffe08b-81

Re: [PR] feat: manifest list writer [iceberg-rust]

2023-10-16 Thread via GitHub
barronw commented on code in PR #76: URL: https://github.com/apache/iceberg-rust/pull/76#discussion_r1361425336 ## crates/iceberg/src/spec/manifest_list.rs: ## @@ -940,4 +1025,108 @@ mod test { r#"[{"manifest_path":"s3a://icebergdata/demo/s1/t1/metadata/05ffe08b-81
