Re: [PR] [WIP] Add `ManifestEvaluator` to allow filtering of files in a table scan (Issue #152) [iceberg-rust]

2024-04-08 Thread via GitHub
sdd commented on PR #241: URL: https://github.com/apache/iceberg-rust/pull/241#issuecomment-2044262473 Closing this PR as it has been split into https://github.com/apache/iceberg-rust/pull/317, https://github.com/apache/iceberg-rust/pull/320, https://github.com/apache/iceberg-rust/pull/321

Re: [PR] [WIP] Add `ManifestEvaluator` to allow filtering of files in a table scan (Issue #152) [iceberg-rust]

2024-04-08 Thread via GitHub
sdd closed pull request #241: [WIP] Add `ManifestEvaluator` to allow filtering of files in a table scan (Issue #152) URL: https://github.com/apache/iceberg-rust/pull/241 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the

Re: [I] Discussion: Next steps / requirements to support `append` files [iceberg-rust]

2024-04-08 Thread via GitHub
sdd commented on issue #329: URL: https://github.com/apache/iceberg-rust/issues/329#issuecomment-2044235422 > This should probably accept a RecordBatch as a param, create a new Transaction, and delegate further action to the transaction. Is there a reason why append wouldn't take a `

Re: [I] [JDBC Catalog] Table commit fails if iceberg_type field is NULL [iceberg]

2024-04-08 Thread via GitHub
jbonofre commented on issue #10046: URL: https://github.com/apache/iceberg/issues/10046#issuecomment-2044233172 I have two changes in the fix: - set `iceberg_type` to `TABLE` instead of `NULL` (and update the other SQL statement accordingly) when upgrading the schema as it can be problema

Re: [I] [JDBC Catalog] Table commit fails if iceberg_type field is NULL [iceberg]

2024-04-08 Thread via GitHub
jbonofre commented on issue #10046: URL: https://github.com/apache/iceberg/issues/10046#issuecomment-2044198936 @jurossiar Thanks for the report. It's normal that the `iceberg_type` is added and the records updated with `NULL` if you define `jdbc.schema-version=V1` (`V0` "preserves" the "ol

Re: [PR] feat: init iceberg writer [iceberg-rust]

2024-04-08 Thread via GitHub
ZENOTME commented on code in PR #275: URL: https://github.com/apache/iceberg-rust/pull/275#discussion_r1556935437 ## crates/iceberg/src/writer/base_writer/data_file_writer.rs: ## @@ -0,0 +1,323 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contrib

Re: [I] iceberg-flink: Switch tests to JUnit5 + AssertJ-style assertions [iceberg]

2024-04-08 Thread via GitHub
tomtongue commented on issue #9087: URL: https://github.com/apache/iceberg/issues/9087#issuecomment-2044183666 @nastra If no one is assigned to this, could you assign it to me? I believe there are three remaining tasks, like flink, spark other versions, data for the migration completio

Re: [PR] feat: init iceberg writer [iceberg-rust]

2024-04-08 Thread via GitHub
Xuanwo commented on code in PR #275: URL: https://github.com/apache/iceberg-rust/pull/275#discussion_r1556917470 ## crates/iceberg/src/writer/base_writer/data_file_writer.rs: ## @@ -0,0 +1,323 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contribu

Re: [I] Implement `Closable` interface for class `HiveCatalog` and `HiveClientPool` [iceberg]

2024-04-08 Thread via GitHub
yuqi1129 commented on issue #10100: URL: https://github.com/apache/iceberg/issues/10100#issuecomment-2044152196 > @yuqi1129: Do I understand correctly, that your main issue is that the cleaner threads are remaining, and those are the only ones causing the issue? Yeah, I want a mechani

Re: [I] [BUG] Valid column characters fail on to_arrow() or to_pandas() ArrowInvalid: No match for FieldRef.Name [iceberg-python]

2024-04-08 Thread via GitHub
kevinjqliu commented on issue #584: URL: https://github.com/apache/iceberg-python/issues/584#issuecomment-2044152708 > I would argue that the Python one is correct Yeah me too. But I think Java Iceberg doesn't support this since parquet files with `ABC-GG-1-A` column will be read as I

Re: [I] Implement `Closable` interface for class `HiveCatalog` and `HiveClientPool` [iceberg]

2024-04-08 Thread via GitHub
yuqi1129 commented on issue #10100: URL: https://github.com/apache/iceberg/issues/10100#issuecomment-2044151328 > Can you use the daemon thread from the jdk (via systemScheduler)? Then how can I use the `systemScheduler` here? I can't change the following code from an external project

Re: [I] Implement `Closable` interface for class `HiveCatalog` and `HiveClientPool` [iceberg]

2024-04-08 Thread via GitHub
pvary commented on issue #10100: URL: https://github.com/apache/iceberg/issues/10100#issuecomment-2044147391 @yuqi1129: Do I understand correctly, that your main issue is that the cleaner threads are remaining, and those are the only ones causing the issue?

Re: [I] Implement `Closable` interface for class `HiveCatalog` and `HiveClientPool` [iceberg]

2024-04-08 Thread via GitHub
ben-manes commented on issue #10100: URL: https://github.com/apache/iceberg/issues/10100#issuecomment-2044074963 Can you use the daemon thread from the jdk (via systemScheduler)?

Re: [PR] Add Pagination To List Apis [iceberg]

2024-04-08 Thread via GitHub
rahil-c commented on code in PR #9782: URL: https://github.com/apache/iceberg/pull/9782#discussion_r1556673446 ## core/src/main/java/org/apache/iceberg/rest/RESTSessionCatalog.java: ## @@ -490,12 +522,29 @@ public void createNamespace( @Override public List listNamespace

Re: [I] Residuals cannot be applied to dates/times/timestamps read by BaseParquetReaders [iceberg]

2024-04-08 Thread via GitHub
github-actions[bot] closed issue #2253: Residuals cannot be applied to dates/times/timestamps read by BaseParquetReaders URL: https://github.com/apache/iceberg/issues/2253

Re: [I] Unset no longer present table props while propagating props to HMS [iceberg]

2024-04-08 Thread via GitHub
github-actions[bot] closed issue #2249: Unset no longer present table props while propagating props to HMS URL: https://github.com/apache/iceberg/issues/2249

Re: [I] Parquet: Vectorized reads should support reading INT96 timestamps [iceberg]

2024-04-08 Thread via GitHub
github-actions[bot] closed issue #2236: Parquet: Vectorized reads should support reading INT96 timestamps URL: https://github.com/apache/iceberg/issues/2236

Re: [I] NPE while closing OrcFileAppender [iceberg]

2024-04-08 Thread via GitHub
github-actions[bot] commented on issue #2470: URL: https://github.com/apache/iceberg/issues/2470#issuecomment-2043917692 This issue has been automatically marked as stale because it has been open for 180 days with no activity. It will be closed in next 14 days if no further activity occurs.

Re: [I] Residuals cannot be applied to dates/times/timestamps read by BaseParquetReaders [iceberg]

2024-04-08 Thread via GitHub
github-actions[bot] commented on issue #2253: URL: https://github.com/apache/iceberg/issues/2253#issuecomment-2043917536 This issue has been closed because it has not received any activity in the last 14 days since being marked as 'stale'

Re: [I] Unset no longer present table props while propagating props to HMS [iceberg]

2024-04-08 Thread via GitHub
github-actions[bot] commented on issue #2249: URL: https://github.com/apache/iceberg/issues/2249#issuecomment-2043917514 This issue has been closed because it has not received any activity in the last 14 days since being marked as 'stale'

Re: [I] Parquet: Vectorized reads should support reading INT96 timestamps [iceberg]

2024-04-08 Thread via GitHub
github-actions[bot] commented on issue #2236: URL: https://github.com/apache/iceberg/issues/2236#issuecomment-2043917494 This issue has been closed because it has not received any activity in the last 14 days since being marked as 'stale'

Re: [I] Support creating tags [iceberg-python]

2024-04-08 Thread via GitHub
enkidulan commented on issue #573: URL: https://github.com/apache/iceberg-python/issues/573#issuecomment-2043897402 > Are you interested in creating the API for this? :) I am. I'll try to create a PR this week.

Re: [I] [JDBC Catalog] Table commit fails if iceberg_type field is NULL [iceberg]

2024-04-08 Thread via GitHub
jurossiar commented on issue #10046: URL: https://github.com/apache/iceberg/issues/10046#issuecomment-2043653883 Note: We had the same issue. The table iceberg_tables was updated, adding the `iceberg_type` column (with NULL values), even without setting the environment variable `jdbc.schema-ver

Re: [PR] Docs: Document support for binary in truncate transform [iceberg]

2024-04-08 Thread via GitHub
amogh-jahagirdar commented on PR #10079: URL: https://github.com/apache/iceberg/pull/10079#issuecomment-2043652460 Merged. Thanks for addressing this @TheNeuralBit! thanks for the reviews @Fokko @rdblue

Re: [I] Discrepancy between partition truncation transform in spec vs code (org.apache.iceberg.transforms.Transform) [iceberg]

2024-04-08 Thread via GitHub
amogh-jahagirdar closed issue #5251: Discrepancy between partition truncation transform in spec vs code (org.apache.iceberg.transforms.Transform) URL: https://github.com/apache/iceberg/issues/5251

Re: [PR] Docs: Document support for binary in truncate transform [iceberg]

2024-04-08 Thread via GitHub
amogh-jahagirdar merged PR #10079: URL: https://github.com/apache/iceberg/pull/10079

Re: [PR] Add Pagination To List Apis [iceberg]

2024-04-08 Thread via GitHub
rahil-c commented on code in PR #9782: URL: https://github.com/apache/iceberg/pull/9782#discussion_r1556425249 ## core/src/main/java/org/apache/iceberg/rest/RESTSessionCatalog.java: ## @@ -274,14 +277,34 @@ public void setConf(Object newConf) { @Override public List listTa

Re: [PR] Add Pagination To List Apis [iceberg]

2024-04-08 Thread via GitHub
rahil-c commented on code in PR #9782: URL: https://github.com/apache/iceberg/pull/9782#discussion_r1556399763 ## core/src/main/java/org/apache/iceberg/rest/RESTSessionCatalog.java: ## @@ -224,6 +226,7 @@ public void initialize(String name, Map unresolved) { clie

[PR] StructType field `required` optional by default [iceberg-python]

2024-04-08 Thread via GitHub
MehulBatra opened a new pull request, #592: URL: https://github.com/apache/iceberg-python/pull/592 * Type change default = false * Updated test cases to align with recent changes in the codebase (Unit + Integration)

Re: [PR] Docs: Fix On-screen display issues and minor expressions on Branching and Tagging DDL [iceberg]

2024-04-08 Thread via GitHub
lawofcycles commented on code in PR #10091: URL: https://github.com/apache/iceberg/pull/10091#discussion_r1556378194 ## docs/docs/spark-ddl.md: ## @@ -499,17 +500,18 @@ AS OF VERSION 1234 -- CREATE audit-branch at snapshot 1234, retain audit-branch for 31 days, and retain th

Re: [PR] Add Pagination To List Apis [iceberg]

2024-04-08 Thread via GitHub
danielcweeks commented on code in PR #9782: URL: https://github.com/apache/iceberg/pull/9782#discussion_r1556370624 ## core/src/main/java/org/apache/iceberg/rest/RESTSessionCatalog.java: ## @@ -224,6 +226,7 @@ public void initialize(String name, Map unresolved) {

Re: [PR] Add Pagination To List Apis [iceberg]

2024-04-08 Thread via GitHub
danielcweeks commented on code in PR #9782: URL: https://github.com/apache/iceberg/pull/9782#discussion_r1556366558 ## core/src/main/java/org/apache/iceberg/rest/RESTSessionCatalog.java: ## @@ -274,14 +277,34 @@ public void setConf(Object newConf) { @Override public List l

Re: [PR] Docs: Added Upsolver to vendor list [iceberg]

2024-04-08 Thread via GitHub
jasonf20 commented on code in PR #10096: URL: https://github.com/apache/iceberg/pull/10096#discussion_r1556353763 ## site/docs/vendors.md: ## @@ -71,3 +71,7 @@ Starburst is a commercial offering for the [Trino query engine](https://trino.io ### [Tabular](https://tabular.io)

Re: [PR] Docs: Added Upsolver to vendor list [iceberg]

2024-04-08 Thread via GitHub
bitsondatadev commented on code in PR #10096: URL: https://github.com/apache/iceberg/pull/10096#discussion_r1556290485 ## site/docs/vendors.md: ## @@ -71,3 +71,7 @@ Starburst is a commercial offering for the [Trino query engine](https://trino.io ### [Tabular](https://tabular.i

Re: [PR] Validate overwrite filter [iceberg-python]

2024-04-08 Thread via GitHub
Fokko commented on PR #582: URL: https://github.com/apache/iceberg-python/pull/582#issuecomment-2043480905 > If we wanted to handle the validation only in the delete function by checking if we would end up rewriting files, above pattern would succeed by deleting level = 'INFO' and dt = '202

Re: [PR] Validate overwrite filter [iceberg-python]

2024-04-08 Thread via GitHub
syun64 commented on PR #582: URL: https://github.com/apache/iceberg-python/pull/582#issuecomment-2043441654 Hi @Fokko @adrianqin, I think the goal of this PR is to distinguish the semantics of a 'static overwrite' on a partitioned table from those of a 'delete' + 'append'.

Re: [PR] Docs: Added Upsolver to vendor list [iceberg]

2024-04-08 Thread via GitHub
jasonf20 commented on code in PR #10096: URL: https://github.com/apache/iceberg/pull/10096#discussion_r1556236365 ## site/docs/vendors.md: ## @@ -71,3 +71,7 @@ Starburst is a commercial offering for the [Trino query engine](https://trino.io ### [Tabular](https://tabular.io)

Re: [PR] Docs: Added Upsolver to vendor list [iceberg]

2024-04-08 Thread via GitHub
jasonf20 commented on code in PR #10096: URL: https://github.com/apache/iceberg/pull/10096#discussion_r1556232872 ## site/docs/vendors.md: ## @@ -71,3 +71,7 @@ Starburst is a commercial offering for the [Trino query engine](https://trino.io ### [Tabular](https://tabular.io)

[PR] [draft] Use Parquet's getRowIndexOffset support instead of calculating it [iceberg]

2024-04-08 Thread via GitHub
wypoon opened a new pull request, #10107: URL: https://github.com/apache/iceberg/pull/10107 (no comment)

Re: [PR] Manifest list encryption [iceberg]

2024-04-08 Thread via GitHub
RussellSpitzer commented on code in PR #7770: URL: https://github.com/apache/iceberg/pull/7770#discussion_r1556171156 ## core/src/main/java/org/apache/iceberg/BaseSnapshot.java: ## @@ -90,7 +127,9 @@ class BaseSnapshot implements Snapshot { this.summary = summary; this

Re: [PR] Docs: Added Upsolver to vendor list [iceberg]

2024-04-08 Thread via GitHub
ajantha-bhat commented on code in PR #10096: URL: https://github.com/apache/iceberg/pull/10096#discussion_r1556087620 ## site/docs/vendors.md: ## @@ -71,3 +71,7 @@ Starburst is a commercial offering for the [Trino query engine](https://trino.io ### [Tabular](https://tabular.io

[I] Spark procedure to compute partition stats. [iceberg]

2024-04-08 Thread via GitHub
ajantha-bhat opened a new issue, #10106: URL: https://github.com/apache/iceberg/issues/10106 ### Feature Request / Improvement Based on the experiments from https://github.com/apache/iceberg/pull/9437, spark action is not effective as the serialization cost of each partition stats en

[I] Add a table API to compute partition stats. [iceberg]

2024-04-08 Thread via GitHub
ajantha-bhat opened a new issue, #10105: URL: https://github.com/apache/iceberg/issues/10105 ### Feature Request / Improvement Based on the experiments from https://github.com/apache/iceberg/pull/9437, spark action is not effective as the serialization cost of each partition stats en

[PR] Docs: Update releases.md for Spark scala versions [iceberg]

2024-04-08 Thread via GitHub
liko9 opened a new pull request, #10104: URL: https://github.com/apache/iceberg/pull/10104 I found the Releases page for downloads to be quite confusing for the versions involved, so I updated the language and links a bit to make them clearer.

Re: [PR] Docs: Added Upsolver to vendor list [iceberg]

2024-04-08 Thread via GitHub
bitsondatadev commented on code in PR #10096: URL: https://github.com/apache/iceberg/pull/10096#discussion_r1555998294 ## site/docs/vendors.md: ## @@ -71,3 +71,7 @@ Starburst is a commercial offering for the [Trino query engine](https://trino.io ### [Tabular](https://tabular.i

Re: [I] Implement `Closable` interface for class `HiveCatalog` and `HiveClientPool` [iceberg]

2024-04-08 Thread via GitHub
yuqi1129 commented on issue #10100: URL: https://github.com/apache/iceberg/issues/10100#issuecomment-2042946236 > @yuqi1129: Could setting the `client.pool.cache.eviction-interval-ms`, and decreasing the `clients` size help your case? I'm afraid it won't work for me, decreasing `clie

Re: [PR] Extend HTTPClient Builder to allow setting a proxy server [iceberg]

2024-04-08 Thread via GitHub
nastra commented on code in PR #10052: URL: https://github.com/apache/iceberg/pull/10052#discussion_r1555942482 ## core/src/test/java/org/apache/iceberg/rest/TestHTTPClient.java: ## @@ -121,6 +128,103 @@ public void testHeadFailure() throws JsonProcessingException { testHt

Re: [PR] Extend HTTPClient Builder to allow setting a proxy server [iceberg]

2024-04-08 Thread via GitHub
nastra commented on code in PR #10052: URL: https://github.com/apache/iceberg/pull/10052#discussion_r1555940535 ## core/src/test/java/org/apache/iceberg/rest/TestHTTPClient.java: ## @@ -121,6 +128,103 @@ public void testHeadFailure() throws JsonProcessingException { testHt

Re: [PR] Extend HTTPClient Builder to allow setting a proxy server [iceberg]

2024-04-08 Thread via GitHub
nastra commented on code in PR #10052: URL: https://github.com/apache/iceberg/pull/10052#discussion_r1555939815 ## core/src/test/java/org/apache/iceberg/rest/TestHTTPClient.java: ## @@ -121,6 +128,103 @@ public void testHeadFailure() throws JsonProcessingException { testHt

Re: [PR] Change DataScan to accept Metadata and io [iceberg-python]

2024-04-08 Thread via GitHub
Fokko merged PR #581: URL: https://github.com/apache/iceberg-python/pull/581

Re: [I] Implement `Closable` interface for class `HiveCatalog` and `HiveClientPool` [iceberg]

2024-04-08 Thread via GitHub
pvary commented on issue #10100: URL: https://github.com/apache/iceberg/issues/10100#issuecomment-2042811946 @yuqi1129: Could setting the `client.pool.cache.eviction-interval-ms`, and decreasing the `clients` size help your case?

Re: [I] Pyarrow type error [iceberg-python]

2024-04-08 Thread via GitHub
bigluck commented on issue #541: URL: https://github.com/apache/iceberg-python/issues/541#issuecomment-2042792612 @Fokko it sounds good to me! :)

Re: [I] Pyarrow type error [iceberg-python]

2024-04-08 Thread via GitHub
Fokko commented on issue #541: URL: https://github.com/apache/iceberg-python/issues/541#issuecomment-2042789157 Ciao @bigluck. Thanks for jumping in here. Until V3 is finalized, we can add a flag to cast nanosecond precision down to microsecond precision. Would that work for you?
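As a rough illustration of what such a cast involves, here is a minimal sketch in plain Python. It is not pyiceberg's or pyarrow's API: the function name `ns_to_us` and the `allow_truncation` flag are hypothetical, and the values are treated as plain integer epoch offsets.

```python
def ns_to_us(value_ns: int, allow_truncation: bool = False) -> int:
    """Downcast an epoch offset from nanoseconds to microseconds.

    `allow_truncation` is a hypothetical flag standing in for the one
    proposed in the comment above; it is not an existing option.
    """
    value_us, remainder = divmod(value_ns, 1_000)
    if remainder and not allow_truncation:
        # Sub-microsecond digits would be lost, so refuse by default.
        raise ValueError(f"{value_ns} ns is not a whole number of microseconds")
    return value_us


# A value that is already microsecond-aligned converts without loss.
aligned = ns_to_us(1_712_577_600_000_000_000)

# A value with sub-microsecond digits converts only when truncation is allowed.
truncated = ns_to_us(1_500, allow_truncation=True)
```

Raising by default keeps the lossy path opt-in, which matches the idea of gating the downcast behind a flag.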

Re: [PR] Flink: Don't fail to serialize IcebergSourceSplit when there is too many delete files [iceberg]

2024-04-08 Thread via GitHub
elkhand commented on PR #9464: URL: https://github.com/apache/iceberg/pull/9464#issuecomment-2042732429 @javrasya wanted to check if you will have time to update the PR this week. If not, I can update this PR accordingly.

Re: [I] Pyarrow type error [iceberg-python]

2024-04-08 Thread via GitHub
bigluck commented on issue #541: URL: https://github.com/apache/iceberg-python/issues/541#issuecomment-2042703204 So, `timestamp_ns` & `timestamptz_ns` have been added in `v3` of the Iceberg spec; pyiceberg currently supports `v1` & `v2`. In my case, the column has been generated b

Re: [I] Pyarrow type error [iceberg-python]

2024-04-08 Thread via GitHub
bigluck commented on issue #541: URL: https://github.com/apache/iceberg-python/issues/541#issuecomment-2042634726 I'm facing a similar issue in my code. Tested using main@7fcdb8d25dfa2498ba98a2b8e8d2b327d85fa7c9 (the commit after `Minor fixes, #523 followup (#563)` and `Cast data to I

Re: [I] Iceberg use PARTITIONED BY (days(ts)) cause wrong partition name [iceberg]

2024-04-08 Thread via GitHub
tcguanshuhuai closed issue #10102: Iceberg use PARTITIONED BY (days(ts)) cause wrong partition name URL: https://github.com/apache/iceberg/issues/10102

Re: [I] Iceberg use PARTITIONED BY (days(ts)) cause wrong partition name [iceberg]

2024-04-08 Thread via GitHub
tcguanshuhuai commented on issue #10102: URL: https://github.com/apache/iceberg/issues/10102#issuecomment-2042571758 Thanks a lot.

Re: [I] Iceberg use PARTITIONED BY (days(ts)) cause wrong partition name [iceberg]

2024-04-08 Thread via GitHub
Kapkiai commented on issue #10102: URL: https://github.com/apache/iceberg/issues/10102#issuecomment-2042569175 I don’t think it’s a bug. Converting the epoch gives ‘2024-06-20 16:00:00’ in UTC. I believe Spark interprets timestamps in UTC based on its tz/system settings and you are in diffe
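The effect described in this comment can be sketched with the Python standard library alone. The sketch assumes the day transform counts whole days from 1970-01-01 in UTC; the UTC+8 session timezone is a hypothetical choice for illustration, since the reporter's actual timezone is not shown here.

```python
from datetime import date, datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def days_transform(ts: datetime) -> int:
    """Whole days since 1970-01-01, counted in UTC."""
    return (ts.astimezone(timezone.utc) - EPOCH).days


# The UTC instant quoted in the comment above.
ts = datetime(2024, 6, 20, 16, 0, 0, tzinfo=timezone.utc)

# Hypothetical session timezone (UTC+8), an assumption for illustration only.
session_tz = timezone(timedelta(hours=8))

# Day used for the partition value (UTC) versus the day the user sees locally.
partition_day = date(1970, 1, 1) + timedelta(days=days_transform(ts))
local_day = ts.astimezone(session_tz).date()

print(partition_day)
print(local_day)
```

Under these assumptions the same instant falls on different calendar days in UTC and in the local timezone, which would explain a partition name that looks one day off.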

Re: [PR] Docs: Added Upsolver to vendor list [iceberg]

2024-04-08 Thread via GitHub
jasonf20 commented on code in PR #10096: URL: https://github.com/apache/iceberg/pull/10096#discussion_r1555685176 ## site/docs/vendors.md: ## @@ -71,3 +71,7 @@ Starburst is a commercial offering for the [Trino query engine](https://trino.io ### [Tabular](https://tabular.io)

Re: [PR] Docs: Added Upsolver to vendor list [iceberg]

2024-04-08 Thread via GitHub
bitsondatadev commented on code in PR #10096: URL: https://github.com/apache/iceberg/pull/10096#discussion_r1554976560 ## site/docs/vendors.md: ## @@ -71,3 +71,7 @@ Starburst is a commercial offering for the [Trino query engine](https://trino.io ### [Tabular](https://tabular.i

Re: [I] iceberg-aws: Switch tests to JUnit5 + AssertJ-style assertions [iceberg]

2024-04-08 Thread via GitHub
nastra closed issue #9080: iceberg-aws: Switch tests to JUnit5 + AssertJ-style assertions URL: https://github.com/apache/iceberg/issues/9080

Re: [I] Move iceberg-orc files to iceberg-core module [iceberg]

2024-04-08 Thread via GitHub
ajantha-bhat closed issue #8454: Move iceberg-orc files to iceberg-core module URL: https://github.com/apache/iceberg/issues/8454

Re: [I] Move iceberg-parquet files to iceberg-core module [iceberg]

2024-04-08 Thread via GitHub
ajantha-bhat closed issue #8453: Move iceberg-parquet files to iceberg-core module URL: https://github.com/apache/iceberg/issues/8453

Re: [PR] Migrate AWS tests to JUnit5 [iceberg]

2024-04-08 Thread via GitHub
nastra merged PR #10086: URL: https://github.com/apache/iceberg/pull/10086

Re: [I] Implement spark action to compute partition stats [iceberg]

2024-04-08 Thread via GitHub
ajantha-bhat closed issue #8459: Implement spark action to compute partition stats URL: https://github.com/apache/iceberg/issues/8459

Re: [I] Implement spark action to compute partition stats [iceberg]

2024-04-08 Thread via GitHub
ajantha-bhat commented on issue #8459: URL: https://github.com/apache/iceberg/issues/8459#issuecomment-2042446172 Based on the experiments from https://github.com/apache/iceberg/pull/9437, spark action is not effective as the serialization cost of each partition stats entry is expensive.

Re: [I] Implement incremental update using commit stats (SnapshotSummary) [iceberg]

2024-04-08 Thread via GitHub
ajantha-bhat commented on issue #8461: URL: https://github.com/apache/iceberg/issues/8461#issuecomment-2042443146 Lower priority; I will work on this once all other planned activities from #8450 are done.

Re: [PR] Migrate AWS tests to JUnit5 [iceberg]

2024-04-08 Thread via GitHub
nastra commented on code in PR #10086: URL: https://github.com/apache/iceberg/pull/10086#discussion_r1555620195 ## aws/src/integration/java/org/apache/iceberg/aws/glue/TestGlueCatalogTable.java: ## @@ -229,10 +218,10 @@ public void testRenameTable() { glueCatalog.renameTabl

[I] Implement incremental update using commit stats (SnapshotSummary) [iceberg]

2024-04-08 Thread via GitHub
ajantha-bhat opened a new issue, #8461: URL: https://github.com/apache/iceberg/issues/8461 ### Feature Request / Improvement This could be an experimental direction and can be controlled by a flag. This might bloat up the table metadata file size when millions of partitions are

Re: [PR] Migrate AWS tests to JUnit5 [iceberg]

2024-04-08 Thread via GitHub
nk1506 commented on code in PR #10086: URL: https://github.com/apache/iceberg/pull/10086#discussion_r186399 ## aws/src/integration/java/org/apache/iceberg/aws/glue/TestGlueCatalogTable.java: ## @@ -229,10 +218,10 @@ public void testRenameTable() { glueCatalog.renameTabl

Re: [PR] Change DataScan to accept Metadata and io [iceberg-python]

2024-04-08 Thread via GitHub
Fokko commented on code in PR #581: URL: https://github.com/apache/iceberg-python/pull/581#discussion_r142312 ## pyiceberg/io/pyarrow.py: ## @@ -1089,7 +1091,7 @@ def project_table( deletes_per_file.get(task.file.file_path), case_sensitive,

Re: [PR] Introduce two properties for reading the connection timeout and socke… [iceberg]

2024-04-08 Thread via GitHub
nastra commented on code in PR #10053: URL: https://github.com/apache/iceberg/pull/10053#discussion_r128393 ## core/src/test/java/org/apache/iceberg/rest/TestHTTPClient.java: ## @@ -133,6 +138,59 @@ public void testDynamicHttpRequestInterceptorLoading() { assertThat(((T

Re: [PR] Validate overwrite filter [iceberg-python]

2024-04-08 Thread via GitHub
Fokko commented on code in PR #582: URL: https://github.com/apache/iceberg-python/pull/582#discussion_r1555374364 ## pyiceberg/io/pyarrow.py: ## @@ -1776,7 +1776,10 @@ def write_parquet(task: WriteTask) -> DataFile: fo = io.new_output(file_path) with fo.create(

Re: [PR] Introduce two properties for reading the connection timeout and socke… [iceberg]

2024-04-08 Thread via GitHub
nastra commented on code in PR #10053: URL: https://github.com/apache/iceberg/pull/10053#discussion_r125438 ## core/src/test/java/org/apache/iceberg/rest/TestHTTPClient.java: ## @@ -133,6 +138,59 @@ public void testDynamicHttpRequestInterceptorLoading() { assertThat(((T

Re: [PR] Migrate AWS tests to JUnit5 [iceberg]

2024-04-08 Thread via GitHub
tomtongue commented on code in PR #10086: URL: https://github.com/apache/iceberg/pull/10086#discussion_r117972 ## aws/src/integration/java/org/apache/iceberg/aws/glue/TestGlueCatalogTable.java: ## @@ -258,10 +247,10 @@ public void testRenameTableFailsToCreateNewTable() {

Re: [PR] Migrate AWS tests to JUnit5 [iceberg]

2024-04-08 Thread via GitHub
tomtongue commented on code in PR #10086: URL: https://github.com/apache/iceberg/pull/10086#discussion_r119947 ## aws/src/integration/java/org/apache/iceberg/aws/glue/TestGlueCatalogTable.java: ## @@ -84,43 +84,35 @@ public void testCreateTable() { // verify table exist

Re: [PR] Migrate AWS tests to JUnit5 [iceberg]

2024-04-08 Thread via GitHub
tomtongue commented on code in PR #10086: URL: https://github.com/apache/iceberg/pull/10086#discussion_r117631 ## aws/src/integration/java/org/apache/iceberg/aws/glue/TestGlueCatalogTable.java: ## @@ -229,10 +218,10 @@ public void testRenameTable() { glueCatalog.renameT

Re: [PR] Migrate AWS tests to JUnit5 [iceberg]

2024-04-08 Thread via GitHub
nk1506 commented on code in PR #10086: URL: https://github.com/apache/iceberg/pull/10086#discussion_r113397 ## aws/src/integration/java/org/apache/iceberg/aws/glue/TestGlueCatalogTable.java: ## @@ -229,10 +218,10 @@ public void testRenameTable() { glueCatalog.renameTabl

Re: [I] read from Iceberg table throw java.lang.ArrayIndexOutOfBoundsException: 3 [iceberg]

2024-04-08 Thread via GitHub
nastra commented on issue #10103: URL: https://github.com/apache/iceberg/issues/10103#issuecomment-2042278015 @jiantao-vungle do you have a small reproducible example? Without that it's quite difficult to reproduce this

Re: [PR] Migrate AWS tests to JUnit5 [iceberg]

2024-04-08 Thread via GitHub
nk1506 commented on code in PR #10086: URL: https://github.com/apache/iceberg/pull/10086#discussion_r103258 ## aws/src/integration/java/org/apache/iceberg/aws/glue/TestGlueCatalogTable.java: ## @@ -84,43 +84,35 @@ public void testCreateTable() { // verify table exists i

Re: [PR] Migrate AWS tests to JUnit5 [iceberg]

2024-04-08 Thread via GitHub
nk1506 commented on code in PR #10086: URL: https://github.com/apache/iceberg/pull/10086#discussion_r1555485467 ## aws/src/integration/java/org/apache/iceberg/aws/dynamodb/TestDynamoDbCatalog.java: ## @@ -99,22 +100,23 @@ public void testCreateNamespace() { .tab

Re: [PR] Introduce two properties for reading the connection timeout and socke… [iceberg]

2024-04-08 Thread via GitHub
harishch1998 commented on code in PR #10053: URL: https://github.com/apache/iceberg/pull/10053#discussion_r1555456733 ## core/src/test/java/org/apache/iceberg/rest/TestHTTPClient.java: ## @@ -133,6 +136,67 @@ public void testDynamicHttpRequestInterceptorLoading() { assertTh

Re: [PR] Validate overwrite filter [iceberg-python]

2024-04-08 Thread via GitHub
Fokko commented on PR #582: URL: https://github.com/apache/iceberg-python/pull/582#issuecomment-2042152141 Hi Adrian, thanks for working on this and the very comprehensive write-up. My first question is, what is the main goal of this PR? Let me elaborate with more context. Looking at

[I] java.lang.ArrayIndexOutOfBoundsException: 3 [iceberg]

2024-04-08 Thread via GitHub
jiantao-vungle opened a new issue, #10103: URL: https://github.com/apache/iceberg/issues/10103 ### Apache Iceberg version 1.3.1 ### Query engine Spark ### Please describe the bug 🐞 Environment Spark: 3.4.1 Iceberg: 1.3.1 Description

Re: [I] [BUG] Valid column characters fail on to_arrow() or to_pandas() ArrowInvalid: No match for FieldRef.Name [iceberg-python]

2024-04-08 Thread via GitHub
Fokko commented on issue #584: URL: https://github.com/apache/iceberg-python/issues/584#issuecomment-2042077008 Generated a Parquet file using both Spark and Python: ![image](https://github.com/apache/iceberg-python/assets/1134248/b382632a-e5ef-4c3d-82bd-6efbe2ced53f) ![image](htt

Re: [PR] Introduce two properties for reading the connection timeout and socke… [iceberg]

2024-04-08 Thread via GitHub
nastra commented on code in PR #10053: URL: https://github.com/apache/iceberg/pull/10053#discussion_r1555349405 ## core/src/test/java/org/apache/iceberg/rest/TestHTTPClient.java: ## @@ -133,6 +136,67 @@ public void testDynamicHttpRequestInterceptorLoading() { assertThat(((T

Re: [PR] Docs: Fix On-screen display issues and minor expressions on Branching and Tagging DDL [iceberg]

2024-04-08 Thread via GitHub
nastra commented on code in PR #10091: URL: https://github.com/apache/iceberg/pull/10091#discussion_r1555345466 ## docs/docs/spark-ddl.md: ## @@ -499,17 +500,18 @@ AS OF VERSION 1234 -- CREATE audit-branch at snapshot 1234, retain audit-branch for 31 days, and retain the lat

Re: [PR] #9073 Junit 4 tests switched to JUnit 5 [iceberg]

2024-04-08 Thread via GitHub
nastra commented on code in PR #9793: URL: https://github.com/apache/iceberg/pull/9793#discussion_r1555322359 ## data/src/test/java/org/apache/iceberg/data/TestDataFileIndexStatsFilters.java: ## @@ -137,9 +135,11 @@ public void testPositionDeletePlanningPath() throws IOExceptio

Re: [PR] Migrate AWS tests to JUnit5 [iceberg]

2024-04-08 Thread via GitHub
tomtongue commented on code in PR #10086: URL: https://github.com/apache/iceberg/pull/10086#discussion_r1555328887 ## aws/src/integration/java/org/apache/iceberg/aws/dynamodb/TestDynamoDbCatalog.java: ## @@ -99,22 +100,23 @@ public void testCreateNamespace() { .

Re: [I] Hive Catalog cannot create table with TimestamptzType field [iceberg-python]

2024-04-08 Thread via GitHub
HonahX closed issue #583: Hive Catalog cannot create table with TimestamptzType field URL: https://github.com/apache/iceberg-python/issues/583

Re: [PR] [Bug Fix] Allow HiveCatalog to create table with TimestamptzType [iceberg-python]

2024-04-08 Thread via GitHub
HonahX merged PR #585: URL: https://github.com/apache/iceberg-python/pull/585

Re: [PR] [Bug Fix] Allow HiveCatalog to create table with TimestamptzType [iceberg-python]

2024-04-08 Thread via GitHub
HonahX commented on PR #585: URL: https://github.com/apache/iceberg-python/pull/585#issuecomment-2042016912 @Fokko Thanks for reviewing!

Re: [PR] Migrate AWS tests to JUnit5 [iceberg]

2024-04-08 Thread via GitHub
tomtongue commented on code in PR #10086: URL: https://github.com/apache/iceberg/pull/10086#discussion_r1555324898 ## aws/src/integration/java/org/apache/iceberg/aws/lakeformation/TestLakeFormationAwsClientFactory.java: ## @@ -165,7 +165,8 @@ public void testLakeFormationEnabled

[I] Iceberg use PARTITIONED BY (days(ts)) cause wrong partition name [iceberg]

2024-04-08 Thread via GitHub
tcguanshuhuai opened a new issue, #10102: URL: https://github.com/apache/iceberg/issues/10102 ### Apache Iceberg version 1.5.0 (latest release) ### Query engine Spark ### Please describe the bug 🐞 Step1: start iceberg spark-sql --jars /root/myjars/iceberg

Re: [PR] Migrate AWS tests to JUnit5 [iceberg]

2024-04-08 Thread via GitHub
nastra commented on code in PR #10086: URL: https://github.com/apache/iceberg/pull/10086#discussion_r1555315428 ## aws/src/integration/java/org/apache/iceberg/aws/lakeformation/TestLakeFormationAwsClientFactory.java: ## @@ -165,7 +165,8 @@ public void testLakeFormationEnabledGlu