Re: [PR] Build: Bump com.azure:azure-sdk-bom from 1.2.18 to 1.2.20 [iceberg]

2024-02-07 Thread via GitHub
Fokko merged PR #9571: URL: https://github.com/apache/iceberg/pull/9571 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@iceberg.apach

Re: [PR] Build: Bump net.snowflake:snowflake-jdbc from 3.14.4 to 3.14.5 [iceberg]

2024-02-07 Thread via GitHub
Fokko commented on PR #9570: URL: https://github.com/apache/iceberg/pull/9570#issuecomment-1931493191 @dependabot rebase

Re: [PR] Build: Bump org.testcontainers:testcontainers from 1.19.3 to 1.19.4 [iceberg]

2024-02-07 Thread via GitHub
Fokko merged PR #9577: URL: https://github.com/apache/iceberg/pull/9577

Re: [PR] Build: Bump io.delta:delta-standalone_2.12 from 0.6.0 to 3.0.0 [iceberg]

2024-02-07 Thread via GitHub
Fokko commented on PR #8895: URL: https://github.com/apache/iceberg/pull/8895#issuecomment-1931494015 @dependabot rebase

Re: [PR] Build: Bump io.delta:delta-standalone_2.12 from 0.6.0 to 3.0.0 [iceberg]

2024-02-07 Thread via GitHub
dependabot[bot] commented on PR #8895: URL: https://github.com/apache/iceberg/pull/8895#issuecomment-1931494503 Looks like io.delta:delta-standalone_2.12 is up-to-date now, so this is no longer needed.

Re: [PR] Build: Bump io.delta:delta-standalone_2.12 from 0.6.0 to 3.0.0 [iceberg]

2024-02-07 Thread via GitHub
dependabot[bot] closed pull request #8895: Build: Bump io.delta:delta-standalone_2.12 from 0.6.0 to 3.0.0 URL: https://github.com/apache/iceberg/pull/8895

Re: [I] Cannot load a binary column of many rows via the `to_arrow` method. [iceberg-python]

2024-02-07 Thread via GitHub
castedice commented on issue #344: URL: https://github.com/apache/iceberg-python/issues/344#issuecomment-1931497403 This week was a tough week and I didn't have time to work on it. I'll try to submit a PR tomorrow or the day after. The way you support `large_string` in #382 is similar t

[I] Iceberg Rewrite DataFiles unmanageable behavior [iceberg]

2024-02-07 Thread via GitHub
supsupsap opened a new issue, #9674: URL: https://github.com/apache/iceberg/issues/9674 ### Apache Iceberg version 1.4.3 (latest release) ### Query engine Spark ### Please describe the bug 🐞 Hi all! I have a problem with Iceberg Rewrite DataFiles unmanage

Re: [PR] Spark 3.3: Move the Writer to a visitor [iceberg]

2024-02-07 Thread via GitHub
nastra merged PR #9672: URL: https://github.com/apache/iceberg/pull/9672

Re: [PR] Spark 3.4: Move the Writer to a visitor [iceberg]

2024-02-07 Thread via GitHub
nastra merged PR #9673: URL: https://github.com/apache/iceberg/pull/9673

Re: [PR] Spark: Detect temp functions in views [iceberg]

2024-02-07 Thread via GitHub
nastra closed pull request #9675: Spark: Detect temp functions in views URL: https://github.com/apache/iceberg/pull/9675

Re: [PR] Aliyun: Add security token to OSS client properties [iceberg]

2024-02-07 Thread via GitHub
wgtmac commented on PR #9671: URL: https://github.com/apache/iceberg/pull/9671#issuecomment-1931608673 cc @openinx

Re: [PR] Centralized table properties management [iceberg-python]

2024-02-07 Thread via GitHub
Fokko commented on code in PR #388: URL: https://github.com/apache/iceberg-python/pull/388#discussion_r1481153291 ## pyiceberg/table/__init__.py: ## @@ -1493,7 +1536,8 @@ def union_by_name(self, new_schema: Union[Schema, "pa.Schema"]) -> UpdateSchema: visit_with_partne

Re: [PR] Hive: Refactor hive-table commit operation to be used for other operations like view [iceberg]

2024-02-07 Thread via GitHub
nk1506 commented on code in PR #9461: URL: https://github.com/apache/iceberg/pull/9461#discussion_r1481156673 ## hive-metastore/src/main/java/org/apache/iceberg/hive/HiveTableOperations.java: ## @@ -166,153 +163,58 @@ protected void doRefresh() { refreshFromMetadataLocation

Re: [PR] Spark: Avoid NPE when catalog config doesn't have "type" set [iceberg]

2024-02-07 Thread via GitHub
nastra merged PR #9676: URL: https://github.com/apache/iceberg/pull/9676

Re: [PR] [WIP] Migrate Write sub-classes in spark-extensions to JUnit5 and AssertJ style [iceberg]

2024-02-07 Thread via GitHub
nastra commented on code in PR #9670: URL: https://github.com/apache/iceberg/pull/9670#discussion_r1481183458 ## spark/v3.5/spark-extensions/src/test/java/org/apache/iceberg/spark/extensions/TestWriteAborts.java: ## @@ -74,20 +77,16 @@ public static Object[][] parameters() {

Re: [PR] [WIP] Migrate Write sub-classes in spark-extensions to JUnit5 and AssertJ style [iceberg]

2024-02-07 Thread via GitHub
tomtongue commented on code in PR #9670: URL: https://github.com/apache/iceberg/pull/9670#discussion_r1481212824 ## spark/v3.5/spark-extensions/src/test/java/org/apache/iceberg/spark/extensions/TestWriteAborts.java: ## @@ -74,20 +77,16 @@ public static Object[][] parameters() {

[PR] Docs: Add required Cargo version to install guide [iceberg-rust]

2024-02-07 Thread via GitHub
manuzhang opened a new pull request, #191: URL: https://github.com/apache/iceberg-rust/pull/191 (no comment)

Re: [PR] Website: Add release schedule on the releases page [iceberg]

2024-02-07 Thread via GitHub
bitsondatadev commented on code in PR #9666: URL: https://github.com/apache/iceberg/pull/9666#discussion_r1481366981 ## site/docs/releases.md: ## @@ -77,6 +77,15 @@ Apache Iceberg 1.4.3 was released on December 27, 2023. The main issue it solves - Core: Expired Snapshot files

Re: [PR] Website: Add release schedule on the releases page [iceberg]

2024-02-07 Thread via GitHub
ajantha-bhat commented on code in PR #9666: URL: https://github.com/apache/iceberg/pull/9666#discussion_r1481007795 ## site/docs/releases.md: ## @@ -77,6 +77,15 @@ Apache Iceberg 1.4.3 was released on December 27, 2023. The main issue it solves - Core: Expired Snapshot files i

[I] Add `rust-version` metadata in `Cargo.toml` [iceberg-rust]

2024-02-07 Thread via GitHub
Xuanwo opened a new issue, #192: URL: https://github.com/apache/iceberg-rust/issues/192 This PR reminds me that we need to add `rust-version` metadata in `Cargo.toml`. So old rustc will give a build error. _Originally posted by @Xuanwo in https://github.com/apache/icebe

Re: [PR] Docs: Add required Cargo version to install guide [iceberg-rust]

2024-02-07 Thread via GitHub
Xuanwo commented on PR #191: URL: https://github.com/apache/iceberg-rust/pull/191#issuecomment-1931941319 This PR reminds me that we need to add `rust-version` metadata in `Cargo.toml`. So old rustc will give a build error.

Re: [PR] Docs: Add required Cargo version to install guide [iceberg-rust]

2024-02-07 Thread via GitHub
Fokko merged PR #191: URL: https://github.com/apache/iceberg-rust/pull/191

[PR] Allow setting `{row-group,page}` limit [iceberg-python]

2024-02-07 Thread via GitHub
Fokko opened a new pull request, #390: URL: https://github.com/apache/iceberg-python/pull/390 On top of @HonahX's work in https://github.com/apache/iceberg-python/pull/388

Re: [I] duckdb don't seems to read the iceberg table [iceberg-python]

2024-02-07 Thread via GitHub
Fokko commented on issue #387: URL: https://github.com/apache/iceberg-python/issues/387#issuecomment-1931968281 Hey @djouallah Thanks for raising this. I was able to reproduce this on the PyIceberg side. However, removing `file://` from the warehouse directory fixed the problem. See https:/
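
The workaround mentioned in this comment (dropping the `file://` prefix from the warehouse directory) can be sketched as a small helper. This is only an illustration: `strip_file_scheme` is a hypothetical name and is not part of PyIceberg, and the example path is made up.

```python
def strip_file_scheme(location: str) -> str:
    """Return the location without a leading file:// scheme (hypothetical helper)."""
    prefix = "file://"
    if location.startswith(prefix):
        # file:///tmp/warehouse keeps its leading slash: /tmp/warehouse
        return location[len(prefix):]
    return location


# Example path is an assumption, not taken from the issue.
warehouse = strip_file_scheme("file:///tmp/warehouse")
print(warehouse)
```

Locations that carry no `file://` prefix, such as `s3://` URIs, pass through unchanged.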

[I] Operations on partition columns in `WHERE` clause not used in pruning [iceberg]

2024-02-07 Thread via GitHub
mgmarino opened a new issue, #9678: URL: https://github.com/apache/iceberg/issues/9678 ### Query engine AWS Athena, verified on Spark/EMR as well (3.4.1, Iceberg 1.4.3) ### Question We are currently looking to migrate several of our old Hive tables to Iceberg. Our tables

Re: [PR] Arrow: Support large-string [iceberg-python]

2024-02-07 Thread via GitHub
Fokko merged PR #382: URL: https://github.com/apache/iceberg-python/pull/382

Re: [I] Saving Polars Dataframe as Parquet to Iceberg [iceberg-python]

2024-02-07 Thread via GitHub
Fokko closed issue #226: Saving Polars Dataframe as Parquet to Iceberg URL: https://github.com/apache/iceberg-python/issues/226

Re: [PR] Add tests and fixes for Daft integration [iceberg-python]

2024-02-07 Thread via GitHub
Fokko merged PR #381: URL: https://github.com/apache/iceberg-python/pull/381

[PR] Arrow: Support int8 and int16 types [iceberg-python]

2024-02-07 Thread via GitHub
Fokko opened a new pull request, #391: URL: https://github.com/apache/iceberg-python/pull/391 I've checked with Spark, and here byte and short types are converted to integers. I think it makes sense to do this for Arrow as well. Closes https://github.com/apache/iceberg-python/issues/3
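
The widening described in this PR (byte and short values treated as integers) can be sketched as a lookup from Arrow integer type names to Iceberg type names. This is a simplified illustration, not the code from the PR: `ARROW_TO_ICEBERG_INT` and `iceberg_int_type` are hypothetical names, and plain strings stand in for the real Arrow and Iceberg type objects.

```python
# Hypothetical mapping: Iceberg has a 32-bit `int` and a 64-bit `long`,
# so the narrower Arrow integer widths are widened to `int`.
ARROW_TO_ICEBERG_INT = {
    "int8": "int",
    "int16": "int",
    "int32": "int",
    "int64": "long",
}


def iceberg_int_type(arrow_type: str) -> str:
    """Map an Arrow signed-integer type name to an Iceberg type name."""
    try:
        return ARROW_TO_ICEBERG_INT[arrow_type]
    except KeyError:
        raise ValueError(f"unsupported Arrow integer type: {arrow_type}") from None


for name in ("int8", "int16", "int32", "int64"):
    print(name, "->", iceberg_int_type(name))
```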

Re: [PR] Docs: Add/Update Snowflake docs to new docs site [iceberg]

2024-02-07 Thread via GitHub
bitsondatadev commented on PR #9557: URL: https://github.com/apache/iceberg/pull/9557#issuecomment-1932093312 @scottteal Sorry about that I saw that you added this PR and figured we would just use this. Could you reopen this please?

Re: [I] Difference between iceberg/docs and https://iceberg.apache.org/docs/latest/ [iceberg]

2024-02-07 Thread via GitHub
Waterkin commented on issue #9663: URL: https://github.com/apache/iceberg/issues/9663#issuecomment-1932123468 @bitsondatadev, Thank you for recognizing my efforts! Looking forward to contributing to the Mandarin translation of Iceberg docs. Happy to discuss further details when the document

Re: [I] Cannot create a V1 table with `CREATE OR REPLACE TABLE` [iceberg]

2024-02-07 Thread via GitHub
nastra commented on issue #8756: URL: https://github.com/apache/iceberg/issues/8756#issuecomment-1932137730 @Fokko is this a new table or an existing one that was created with `format-version=2`?

Re: [PR] Spark 3.4: Handle concurrently dropped view during CREATE OR REPLACE [iceberg]

2024-02-07 Thread via GitHub
nastra merged PR #9677: URL: https://github.com/apache/iceberg/pull/9677

Re: [PR] Core: Add view support for JDBC catalog [iceberg]

2024-02-07 Thread via GitHub
jbonofre commented on code in PR #9487: URL: https://github.com/apache/iceberg/pull/9487#discussion_r1481559062 ## core/src/main/java/org/apache/iceberg/jdbc/JdbcCatalog.java: ## @@ -80,19 +83,37 @@ public class JdbcCatalog extends BaseMetastoreCatalog private final Function,

Re: [PR] Core: Add view support for JDBC catalog [iceberg]

2024-02-07 Thread via GitHub
jbonofre commented on code in PR #9487: URL: https://github.com/apache/iceberg/pull/9487#discussion_r1481560466 ## core/src/test/java/org/apache/iceberg/jdbc/TestJdbcUtil.java: ## @@ -18,14 +18,116 @@ */ package org.apache.iceberg.jdbc; +import static org.assertj.core.api.A

[I] RewriteDataFiles works from Scala API, but not from SQL API [iceberg]

2024-02-07 Thread via GitHub
paulpaul1076 opened a new issue, #9679: URL: https://github.com/apache/iceberg/issues/9679 ### Apache Iceberg version 1.4.3 (latest release) ### Query engine Spark ### Please describe the bug 🐞 I wrote this code in Scala: ``` HiveCatalog ca

Re: [PR] Core: Add view support for JDBC catalog [iceberg]

2024-02-07 Thread via GitHub
nastra commented on code in PR #9487: URL: https://github.com/apache/iceberg/pull/9487#discussion_r1481631049 ## core/src/test/java/org/apache/iceberg/jdbc/TestJdbcUtil.java: ## @@ -18,14 +18,116 @@ */ package org.apache.iceberg.jdbc; +import static org.assertj.core.api.Ass

Re: [PR] Core: Add view support for JDBC catalog [iceberg]

2024-02-07 Thread via GitHub
jbonofre commented on code in PR #9487: URL: https://github.com/apache/iceberg/pull/9487#discussion_r1481637607 ## core/src/test/java/org/apache/iceberg/jdbc/TestJdbcUtil.java: ## @@ -18,14 +18,116 @@ */ package org.apache.iceberg.jdbc; +import static org.assertj.core.api.A

Re: [I] RewriteManifest with more options [iceberg]

2024-02-07 Thread via GitHub
zachdisc commented on issue #9615: URL: https://github.com/apache/iceberg/issues/9615#issuecomment-1932280121 Love the idea, I can take a crack at it!

Re: [PR] Website: Add release schedule on the releases page [iceberg]

2024-02-07 Thread via GitHub
jbonofre commented on code in PR #9666: URL: https://github.com/apache/iceberg/pull/9666#discussion_r1481653302 ## site/docs/releases.md: ## @@ -77,6 +77,15 @@ Apache Iceberg 1.4.3 was released on December 27, 2023. The main issue it solves - Core: Expired Snapshot files in a

Re: [I] Support Parquet v2 Spark vectorized read [iceberg]

2024-02-07 Thread via GitHub
wgtmac commented on issue #7162: URL: https://github.com/apache/iceberg/issues/7162#issuecomment-1932286170 @jackye1995 I can work on this if you see fit.

Re: [I] rewrite_data_files procedure fails with Premature end of Content-Length when using S3 client [iceberg]

2024-02-07 Thread via GitHub
paulpaul1076 commented on issue #9679: URL: https://github.com/apache/iceberg/issues/9679#issuecomment-1932287892 @nastra the Scala code works fine, the problem is inside iceberg.

Re: [PR] Fix: add required rust version in cargo.toml [iceberg-rust]

2024-02-07 Thread via GitHub
odysa commented on PR #193: URL: https://github.com/apache/iceberg-rust/pull/193#issuecomment-1932288716 It closes #192

Re: [I] rewrite_data_files procedure fails with Premature end of Content-Length when using S3 client [iceberg]

2024-02-07 Thread via GitHub
nastra commented on issue #9679: URL: https://github.com/apache/iceberg/issues/9679#issuecomment-1932300265 @paulpaul1076 do you have a chance to try with `http-client.type=urlconnection`? It's of course also possible that there's a bug in `RewriteDataFilesSparkAction` that went unnoticed.

Re: [PR] Core: HadoopTable needs to skip file cleanup after task failure under some boundary conditions. [iceberg]

2024-02-07 Thread via GitHub
RussellSpitzer commented on code in PR #9546: URL: https://github.com/apache/iceberg/pull/9546#discussion_r1481706889 ## core/src/main/java/org/apache/iceberg/hadoop/HadoopTableOperations.java: ## @@ -59,7 +61,7 @@ */ public class HadoopTableOperations implements TableOperati

Re: [PR] Core: HadoopTable needs to skip file cleanup after task failure under some boundary conditions. [iceberg]

2024-02-07 Thread via GitHub
BsoBird commented on code in PR #9546: URL: https://github.com/apache/iceberg/pull/9546#discussion_r1481709724 ## core/src/main/java/org/apache/iceberg/hadoop/HadoopTableOperations.java: ## @@ -59,7 +61,7 @@ */ public class HadoopTableOperations implements TableOperations {

Re: [PR] Core: HadoopTable needs to skip file cleanup after task failure under some boundary conditions. [iceberg]

2024-02-07 Thread via GitHub
BsoBird commented on code in PR #9546: URL: https://github.com/apache/iceberg/pull/9546#discussion_r1481713978 ## core/src/main/java/org/apache/iceberg/hadoop/HadoopTableOperations.java: ## @@ -59,7 +61,7 @@ */ public class HadoopTableOperations implements TableOperations {

Re: [PR] Core: HadoopTable needs to skip file cleanup after task failure under some boundary conditions. [iceberg]

2024-02-07 Thread via GitHub
RussellSpitzer commented on code in PR #9546: URL: https://github.com/apache/iceberg/pull/9546#discussion_r1481727849 ## core/src/main/java/org/apache/iceberg/hadoop/HadoopTableOperations.java: ## @@ -149,26 +183,71 @@ public void commit(TableMetadata base, TableMetadata metada

Re: [PR] Core: HadoopTable needs to skip file cleanup after task failure under some boundary conditions. [iceberg]

2024-02-07 Thread via GitHub
RussellSpitzer commented on code in PR #9546: URL: https://github.com/apache/iceberg/pull/9546#discussion_r1481728197 ## core/src/main/java/org/apache/iceberg/hadoop/HadoopTableOperations.java: ## @@ -129,6 +133,36 @@ public TableMetadata refresh() { @Override public voi

Re: [PR] Allow setting `{row-group,page}` limit [iceberg-python]

2024-02-07 Thread via GitHub
amogh-jahagirdar commented on code in PR #390: URL: https://github.com/apache/iceberg-python/pull/390#discussion_r1481728511 ## pyiceberg/table/__init__.py: ## @@ -134,6 +133,53 @@ _JAVA_LONG_MAX = 9223372036854775807 +class TableProperties: +PARQUET_ROW_GROUP_SIZE_BYTE

Re: [PR] Core: HadoopTable needs to skip file cleanup after task failure under some boundary conditions. [iceberg]

2024-02-07 Thread via GitHub
RussellSpitzer commented on code in PR #9546: URL: https://github.com/apache/iceberg/pull/9546#discussion_r1481731851 ## core/src/main/java/org/apache/iceberg/hadoop/HadoopTableOperations.java: ## @@ -149,26 +183,71 @@ public void commit(TableMetadata base, TableMetadata metada

Re: [PR] Core: HadoopTable needs to skip file cleanup after task failure under some boundary conditions. [iceberg]

2024-02-07 Thread via GitHub
RussellSpitzer commented on code in PR #9546: URL: https://github.com/apache/iceberg/pull/9546#discussion_r1481733543 ## core/src/main/java/org/apache/iceberg/hadoop/HadoopTableOperations.java: ## @@ -149,26 +183,71 @@ public void commit(TableMetadata base, TableMetadata metada

Re: [PR] Core: HadoopTable needs to skip file cleanup after task failure under some boundary conditions. [iceberg]

2024-02-07 Thread via GitHub
RussellSpitzer commented on code in PR #9546: URL: https://github.com/apache/iceberg/pull/9546#discussion_r1481736484 ## core/src/main/java/org/apache/iceberg/hadoop/HadoopTableOperations.java: ## @@ -149,26 +183,71 @@ public void commit(TableMetadata base, TableMetadata metada

Re: [PR] Core: HadoopTable needs to skip file cleanup after task failure under some boundary conditions. [iceberg]

2024-02-07 Thread via GitHub
RussellSpitzer commented on code in PR #9546: URL: https://github.com/apache/iceberg/pull/9546#discussion_r1481737021 ## core/src/main/java/org/apache/iceberg/hadoop/HadoopTableOperations.java: ## @@ -149,26 +183,71 @@ public void commit(TableMetadata base, TableMetadata metada

Re: [PR] Core: HadoopTable needs to skip file cleanup after task failure under some boundary conditions. [iceberg]

2024-02-07 Thread via GitHub
RussellSpitzer commented on code in PR #9546: URL: https://github.com/apache/iceberg/pull/9546#discussion_r1481737972 ## core/src/main/java/org/apache/iceberg/hadoop/HadoopTableOperations.java: ## @@ -149,26 +183,71 @@ public void commit(TableMetadata base, TableMetadata metada

Re: [PR] Core: HadoopTable needs to skip file cleanup after task failure under some boundary conditions. [iceberg]

2024-02-07 Thread via GitHub
RussellSpitzer commented on code in PR #9546: URL: https://github.com/apache/iceberg/pull/9546#discussion_r1481739516 ## core/src/main/java/org/apache/iceberg/hadoop/HadoopTableOperations.java: ## @@ -234,7 +313,16 @@ public long newSnapshotId() { } @VisibleForTesting -

Re: [PR] Core: HadoopTable needs to skip file cleanup after task failure under some boundary conditions. [iceberg]

2024-02-07 Thread via GitHub
RussellSpitzer commented on code in PR #9546: URL: https://github.com/apache/iceberg/pull/9546#discussion_r1481741236 ## core/src/main/java/org/apache/iceberg/hadoop/HadoopTableOperations.java: ## @@ -289,64 +377,153 @@ Path versionHintFile() { return metadataPath(Util.VERS

Re: [PR] Core: HadoopTable needs to skip file cleanup after task failure under some boundary conditions. [iceberg]

2024-02-07 Thread via GitHub
RussellSpitzer commented on code in PR #9546: URL: https://github.com/apache/iceberg/pull/9546#discussion_r1481743290 ## core/src/main/java/org/apache/iceberg/hadoop/HadoopTableOperations.java: ## @@ -289,64 +377,153 @@ Path versionHintFile() { return metadataPath(Util.VERS

Re: [PR] Core: HadoopTable needs to skip file cleanup after task failure under some boundary conditions. [iceberg]

2024-02-07 Thread via GitHub
RussellSpitzer commented on code in PR #9546: URL: https://github.com/apache/iceberg/pull/9546#discussion_r1481744452 ## core/src/main/java/org/apache/iceberg/hadoop/HadoopTableOperations.java: ## @@ -289,64 +377,153 @@ Path versionHintFile() { return metadataPath(Util.VERS

Re: [PR] Core: HadoopTable needs to skip file cleanup after task failure under some boundary conditions. [iceberg]

2024-02-07 Thread via GitHub
BsoBird commented on code in PR #9546: URL: https://github.com/apache/iceberg/pull/9546#discussion_r1481750312 ## core/src/main/java/org/apache/iceberg/hadoop/HadoopTableOperations.java: ## @@ -129,6 +133,36 @@ public TableMetadata refresh() { @Override public void commi

Re: [PR] Fix: add required rust version in cargo.toml [iceberg-rust]

2024-02-07 Thread via GitHub
Xuanwo commented on code in PR #193: URL: https://github.com/apache/iceberg-rust/pull/193#discussion_r1481752869 ## Cargo.toml: ## @@ -26,6 +26,7 @@ homepage = "https://rust.iceberg.apache.org/" repository = "https://github.com/apache/iceberg-rust" license = "Apache-2.0" +

Re: [PR] Core: HadoopTable needs to skip file cleanup after task failure under some boundary conditions. [iceberg]

2024-02-07 Thread via GitHub
BsoBird commented on code in PR #9546: URL: https://github.com/apache/iceberg/pull/9546#discussion_r1481752513 ## core/src/main/java/org/apache/iceberg/hadoop/HadoopTableOperations.java: ## @@ -149,26 +183,71 @@ public void commit(TableMetadata base, TableMetadata metadata) {

Re: [PR] Core: HadoopTable needs to skip file cleanup after task failure under some boundary conditions. [iceberg]

2024-02-07 Thread via GitHub
BsoBird commented on code in PR #9546: URL: https://github.com/apache/iceberg/pull/9546#discussion_r1481753874 ## core/src/main/java/org/apache/iceberg/hadoop/HadoopTableOperations.java: ## @@ -149,26 +183,71 @@ public void commit(TableMetadata base, TableMetadata metadata) {

Re: [PR] Core: HadoopTable needs to skip file cleanup after task failure under some boundary conditions. [iceberg]

2024-02-07 Thread via GitHub
BsoBird commented on code in PR #9546: URL: https://github.com/apache/iceberg/pull/9546#discussion_r1481758443 ## core/src/main/java/org/apache/iceberg/hadoop/HadoopTableOperations.java: ## @@ -149,26 +183,71 @@ public void commit(TableMetadata base, TableMetadata metadata) {

Re: [PR] Core: HadoopTable needs to skip file cleanup after task failure under some boundary conditions. [iceberg]

2024-02-07 Thread via GitHub
BsoBird commented on code in PR #9546: URL: https://github.com/apache/iceberg/pull/9546#discussion_r1481759800 ## core/src/main/java/org/apache/iceberg/hadoop/HadoopTableOperations.java: ## @@ -289,64 +377,153 @@ Path versionHintFile() { return metadataPath(Util.VERSION_HIN

Re: [PR] Core: HadoopTable needs to skip file cleanup after task failure under some boundary conditions. [iceberg]

2024-02-07 Thread via GitHub
BsoBird commented on code in PR #9546: URL: https://github.com/apache/iceberg/pull/9546#discussion_r1481762541 ## core/src/main/java/org/apache/iceberg/hadoop/HadoopTableOperations.java: ## @@ -289,64 +377,153 @@ Path versionHintFile() { return metadataPath(Util.VERSION_HIN

Re: [PR] Core: HadoopTable needs to skip file cleanup after task failure under some boundary conditions. [iceberg]

2024-02-07 Thread via GitHub
BsoBird commented on PR #9546: URL: https://github.com/apache/iceberg/pull/9546#issuecomment-1932492668 I added exception catching for fs.rename calls. If the fs.rename call succeeds, I will swallow all exceptions.

Re: [PR] Core: HadoopTable needs to skip file cleanup after task failure under some boundary conditions. [iceberg]

2024-02-07 Thread via GitHub
BsoBird commented on code in PR #9546: URL: https://github.com/apache/iceberg/pull/9546#discussion_r1481844750 ## core/src/main/java/org/apache/iceberg/hadoop/HadoopTableOperations.java: ## @@ -149,26 +183,71 @@ public void commit(TableMetadata base, TableMetadata metadata) {

Re: [I] rewrite_data_files procedure fails with Premature end of Content-Length when using S3 client [iceberg]

2024-02-07 Thread via GitHub
paulpaul1076 commented on issue #9679: URL: https://github.com/apache/iceberg/issues/9679#issuecomment-1932545715 Yea, I can try with that setting, where do I set it, by the way? Do I have to rebuild iceberg jars? The problem is not the RewriteDataFiles Spark action, it's the procedur

Re: [I] rewrite_data_files procedure fails with Premature end of Content-Length when using S3 client [iceberg]

2024-02-07 Thread via GitHub
paulpaul1076 commented on issue #9679: URL: https://github.com/apache/iceberg/issues/9679#issuecomment-1932550064 I just need to set a spark option like this, right: `spark.sql.catalog.my_catalog.http-client.type=urlconnection` ?
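
A minimal sketch of how the option in this comment could be assembled, assuming catalog properties are passed to Spark under the `spark.sql.catalog.<catalog-name>.` prefix. The catalog name `my_catalog` and the property `http-client.type=urlconnection` come from the thread; the helper `catalog_conf` is a hypothetical name used only for illustration.

```python
def catalog_conf(catalog_name: str, props: dict) -> dict:
    """Prefix each catalog property with spark.sql.catalog.<catalog_name>."""
    prefix = f"spark.sql.catalog.{catalog_name}."
    return {prefix + key: value for key, value in props.items()}


# Property and catalog name as quoted in the thread.
conf = catalog_conf("my_catalog", {"http-client.type": "urlconnection"})

# Render the settings as spark-submit style flags.
for key, value in conf.items():
    print(f"--conf {key}={value}")
```

The same key/value pairs could equally be set on a session builder; the flag form is shown only because it needs no Spark installation to run.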

Re: [PR] Core: HadoopTable needs to skip file cleanup after task failure under some boundary conditions. [iceberg]

2024-02-07 Thread via GitHub
BsoBird commented on code in PR #9546: URL: https://github.com/apache/iceberg/pull/9546#discussion_r1481851042 ## core/src/main/java/org/apache/iceberg/hadoop/HadoopTableOperations.java: ## @@ -289,64 +377,153 @@ Path versionHintFile() { return metadataPath(Util.VERSION_HIN

Re: [PR] Core: Add view support for JDBC catalog [iceberg]

2024-02-07 Thread via GitHub
jbonofre commented on code in PR #9487: URL: https://github.com/apache/iceberg/pull/9487#discussion_r1481853805 ## core/src/test/java/org/apache/iceberg/jdbc/TestJdbcUtil.java: ## @@ -18,14 +18,116 @@ */ package org.apache.iceberg.jdbc; +import static org.assertj.core.api.A

Re: [I] Difference between iceberg/docs and https://iceberg.apache.org/docs/latest/ [iceberg]

2024-02-07 Thread via GitHub
bitsondatadev commented on issue #9663: URL: https://github.com/apache/iceberg/issues/9663#issuecomment-1932568002 Thank you for your initiative to make Iceberg more accessible to the Chinese-speaking community!

Re: [I] rewrite_data_files procedure fails with Premature end of Content-Length when using S3 client [iceberg]

2024-02-07 Thread via GitHub
paulpaul1076 commented on issue #9679: URL: https://github.com/apache/iceberg/issues/9679#issuecomment-1932570894 Looks like iceberg-aws-bundle doesn't have this class: `Exception in thread "main" java.lang.NoClassDefFoundError: software/amazon/awssdk/http/urlconnection/UrlConnectionH

Re: [I] rewrite_data_files procedure fails with Premature end of Content-Length when using S3 client [iceberg]

2024-02-07 Thread via GitHub
paulpaul1076 commented on issue #9679: URL: https://github.com/apache/iceberg/issues/9679#issuecomment-1932587930 Anyway, got it to work. Now there's a similar exception, but worded a bit differently: ``` org.apache.iceberg.exceptions.RuntimeIOException: java.io.EOFException: Reac

Re: [PR] Spark: Detect temp functions in views [iceberg]

2024-02-07 Thread via GitHub
singhpk234 commented on code in PR #9675: URL: https://github.com/apache/iceberg/pull/9675#discussion_r1481899840 ## spark/v3.5/spark-extensions/src/main/scala/org/apache/spark/sql/catalyst/analysis/RewriteViewCommands.scala: ## @@ -149,4 +167,20 @@ case class RewriteViewCommand

Re: [PR] Docs: Add/update Snowflake [iceberg]

2024-02-07 Thread via GitHub
bitsondatadev commented on code in PR #9669: URL: https://github.com/apache/iceberg/pull/9669#discussion_r1481948831 ## docs/mkdocs.yml: ## @@ -48,15 +48,16 @@ nav: - flink-actions.md - flink-configuration.md - hive.md - - Trino: https://trino.io/docs/current/conne

Re: [I] Updating a property map in a iceberg table [iceberg]

2024-02-07 Thread via GitHub
namrathamyske commented on issue #9659: URL: https://github.com/apache/iceberg/issues/9659#issuecomment-1932693605 @amogh-jahagirdar it would be simple for 1-2 keys, but it can get complex as the number of keys increases

Re: [PR] Docs: Add/update Snowflake [iceberg]

2024-02-07 Thread via GitHub
flyrain commented on code in PR #9669: URL: https://github.com/apache/iceberg/pull/9669#discussion_r1481994517 ## docs/mkdocs.yml: ## @@ -48,15 +48,16 @@ nav: - flink-actions.md - flink-configuration.md - hive.md - - Trino: https://trino.io/docs/current/connector/i

[PR] Bug Fix: Rest Catalog update partition spec and sort order when fresh schema is created [iceberg-python]

2024-02-07 Thread via GitHub
syun64 opened a new pull request, #392: URL: https://github.com/apache/iceberg-python/pull/392 Right now, if fresh schema IDs are assigned on the Iceberg table schema that doesn't map exactly to the schema that was originally passed in, the partition and sort order may map to a different fi
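
To illustrate the kind of mismatch described above: when field IDs are reassigned, a partition field that still carries the old source ID can end up pointing at the wrong column. The sketch below re-points source IDs by looking columns up by name. It is an illustration only and not the change made in this PR; `PartitionField`, `remap_source_ids`, and the example schemas are hypothetical and do not use the PyIceberg API.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class PartitionField:
    source_id: int  # ID of the schema field this partition field is derived from
    name: str


def remap_source_ids(fields, old_schema, fresh_schema):
    """Re-point partition fields at fresh IDs by matching columns by name."""
    id_to_name = {field_id: name for name, field_id in old_schema.items()}
    return [
        replace(f, source_id=fresh_schema[id_to_name[f.source_id]])
        for f in fields
    ]


# Hypothetical schemas, expressed as column name -> field ID.
old_schema = {"id": 5, "ts": 7}
fresh_schema = {"id": 1, "ts": 2}  # IDs reassigned starting from 1

spec = [PartitionField(source_id=7, name="ts_day")]
remapped = remap_source_ids(spec, old_schema, fresh_schema)
print(remapped)
```

Without the remapping step, source ID 7 would not exist in the fresh schema at all, and in less fortunate cases an old ID could silently match a different column.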

Re: [PR] Arrow: Support `int8` and `int16` types [iceberg-python]

2024-02-07 Thread via GitHub
geruh commented on PR #391: URL: https://github.com/apache/iceberg-python/pull/391#issuecomment-1932886280 Does this imply that when we convert an Iceberg table to an Arrow schema, the `int8` and `int16` types will be lost?

Re: [PR] Arrow: Support `int8` and `int16` types [iceberg-python]

2024-02-07 Thread via GitHub
Fokko commented on PR #391: URL: https://github.com/apache/iceberg-python/pull/391#issuecomment-1932897418 @geruh The types will be widened. But when you write the data, the wider type will be used (resulting in more memory/disk usage).
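
A rough sense of the cost mentioned above can be had from raw value widths. The sketch below assumes the wider type is a 32-bit integer (Iceberg's `int`) and uses only the standard-library `struct` module; the helper names are hypothetical, and nothing here calls PyIceberg or PyArrow.

```python
import struct

# Little-endian struct format codes: b = int8, h = int16, i = int32.
FORMATS = {"int8": "<b", "int16": "<h", "int32": "<i"}


def bytes_per_value(type_name: str) -> int:
    """Size in bytes of one unencoded value of the given integer type."""
    return struct.calcsize(FORMATS[type_name])


def widening_factor(narrow: str, wide: str = "int32") -> int:
    """How many times larger one raw value becomes after widening."""
    return bytes_per_value(wide) // bytes_per_value(narrow)


for name in ("int8", "int16"):
    print(f"{name} -> int32: {widening_factor(name)}x bytes per raw value")
```

These factors describe uncompressed in-memory values only. File formats such as Parquet apply encoding and compression, so the on-disk difference may be considerably smaller.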

Re: [PR] Bug Fix: Rest Catalog update partition spec and sort order when fresh schema is created [iceberg-python]

2024-02-07 Thread via GitHub
syun64 commented on PR #392: URL: https://github.com/apache/iceberg-python/pull/392#issuecomment-1932926267 > do you know if this is captured by the REST integration test? would be a good testcase if not Yes, I'm in the process of writing it up now :) Thank you for the suggesti

Re: [PR] Bug Fix: Rest Catalog update partition spec and sort order when fresh schema is created [iceberg-python]

2024-02-07 Thread via GitHub
kevinjqliu commented on PR #392: URL: https://github.com/apache/iceberg-python/pull/392#issuecomment-1932951935 btw, i have a toy REST catalog here. it's easier than spinning up the integration docker image every time. `~/.pyiceberg.yaml` ``` catalog: default: uri: ht

Re: [PR] Bug Fix: Rest Catalog update partition spec and sort order when fresh schema is created [iceberg-python]

2024-02-07 Thread via GitHub
kevinjqliu commented on code in PR #392: URL: https://github.com/apache/iceberg-python/pull/392#discussion_r1482123800 ## tests/integration/test_rest_schema.py: ## @@ -2497,3 +2500,32 @@ def test_two_add_schemas_in_a_single_transaction(catalog: Catalog) -> None: assert "Up

Re: [PR] Bug Fix: Rest Catalog update partition spec and sort order when fresh schema is created [iceberg-python]

2024-02-07 Thread via GitHub
syun64 commented on code in PR #392: URL: https://github.com/apache/iceberg-python/pull/392#discussion_r1482132916 ## tests/integration/test_rest_schema.py: ## @@ -2497,3 +2500,32 @@ def test_two_add_schemas_in_a_single_transaction(catalog: Catalog) -> None: assert "Update

Re: [PR] Bug Fix: Rest Catalog update partition spec and sort order when fresh schema is created [iceberg-python]

2024-02-07 Thread via GitHub
syun64 commented on code in PR #392: URL: https://github.com/apache/iceberg-python/pull/392#discussion_r1482134934 ## tests/integration/test_rest_schema.py: ## @@ -2497,3 +2500,32 @@ def test_two_add_schemas_in_a_single_transaction(catalog: Catalog) -> None: assert "Update

Re: [PR] Bug Fix: Rest Catalog update partition spec and sort order when fresh schema is created [iceberg-python]

2024-02-07 Thread via GitHub
syun64 commented on PR #392: URL: https://github.com/apache/iceberg-python/pull/392#issuecomment-1932991909 > some comments on equality check. > > Also, it looks like we are reassigning `id`s from 1. I'm not familiar with this process so bear with me. Is there a reason why we can't cr

Re: [PR] Build: Bump pytest from 7.4.4 to 8.0.0 [iceberg-python]

2024-02-07 Thread via GitHub
hussein-awala commented on PR #393: URL: https://github.com/apache/iceberg-python/pull/393#issuecomment-1932997731 CI is green 🎉 Please check https://github.com/TvoroG/pytest-lazy-fixture/issues/65 for the issue cc: @Fokko

[PR] detect breaking changes [iceberg-python]

2024-02-07 Thread via GitHub
syun64 opened a new pull request, #394: URL: https://github.com/apache/iceberg-python/pull/394 Implement https://github.com/apache/iceberg-python/issues/334

[PR] Build: Bump mkdocs-material from 9.5.7 to 9.5.8 [iceberg-python]

2024-02-07 Thread via GitHub
dependabot[bot] opened a new pull request, #395: URL: https://github.com/apache/iceberg-python/pull/395 Bumps [mkdocs-material](https://github.com/squidfunk/mkdocs-material) from 9.5.7 to 9.5.8. Release notes Sourced from https://github.com/squidfunk/mkdocs-material/releases mkdo

Re: [PR] Add pagination to open api spec for listing of namespaces, tables, views [iceberg]

2024-02-07 Thread via GitHub
rahil-c commented on code in PR #9660: URL: https://github.com/apache/iceberg/pull/9660#discussion_r1482206498 ## open-api/rest-catalog-open-api.yaml: ## @@ -212,6 +212,34 @@ paths: schema: type: string example: "accounting%1Ftax" +- na

Re: [PR] Add pagination to open api spec for listing of namespaces, tables, views [iceberg]

2024-02-07 Thread via GitHub
rahil-c commented on code in PR #9660: URL: https://github.com/apache/iceberg/pull/9660#discussion_r1482207982 ## open-api/rest-catalog-open-api.yaml: ## @@ -212,6 +212,34 @@ paths: schema: type: string example: "accounting%1Ftax" +- na
