Re: [I] Caused by: java.net.SocketException: Connection reset [iceberg]

2024-01-17 Thread via GitHub
pvary commented on issue #9444: URL: https://github.com/apache/iceberg/issues/9444#issuecomment-1895290384 > Because it can also be an intermittent network issue for anyone using Iceberg with Flink and failing the entire stream for that sounds a bit harsh. For the record, the fix you

Re: [PR] Add formatting for toml files [iceberg-rust]

2024-01-17 Thread via GitHub
liurenjie1024 commented on code in PR #167: URL: https://github.com/apache/iceberg-rust/pull/167#discussion_r1454897761 ## Makefile: ## @@ -32,7 +32,11 @@ cargo-sort: cargo install cargo-sort cargo sort -c -w -check: check-fmt check-clippy cargo-sort +fmt-toml:

Re: [I] Caused by: java.net.SocketException: Connection reset [iceberg]

2024-01-17 Thread via GitHub
javrasya commented on issue #9444: URL: https://github.com/apache/iceberg/issues/9444#issuecomment-1895303601 True, the same FileIO can be used for Spark too, I forgot that :-) Exactly, I saw that implementation for retries in the source code. Thanks for sharing it, it is a good referenc

Re: [PR] init writer framework [iceberg-rust]

2024-01-17 Thread via GitHub
ZENOTME commented on PR #135: URL: https://github.com/apache/iceberg-rust/pull/135#issuecomment-1895304460 I feel that for this writer framework, we may need more discussion, so I can separate this framework as `IcebergWriter` and `FileWriter` parts. The `IcebergWriter` part is about how

Re: [PR] Flink: Don't fail to serialize IcebergSourceSplit when there is too many delete files [iceberg]

2024-01-17 Thread via GitHub
pvary commented on PR #9464: URL: https://github.com/apache/iceberg/pull/9464#issuecomment-1895305741 > The split/FileScanTask should only contain ASCII. If I understand correctly, the `FileScanTask` json will contain the `Schema`. The `Schema` has a `doc` field for comments. Do we ha

Re: [PR] Hive: Refactor hive-table commit operation to be used for other operations like view [iceberg]

2024-01-17 Thread via GitHub
nk1506 commented on code in PR #9461: URL: https://github.com/apache/iceberg/pull/9461#discussion_r1454918307 ## core/src/main/java/org/apache/iceberg/BaseMetastoreTableOperations.java: ## @@ -309,65 +300,19 @@ protected enum CommitStatus { * @return Commit Status of Success

Re: [PR] init writer framework [iceberg-rust]

2024-01-17 Thread via GitHub
Xuanwo commented on PR #135: URL: https://github.com/apache/iceberg-rust/pull/135#issuecomment-1895313954 > And we can work on the `FileWriter` part first if it looks good. How do you think? @Xuanwo @Fokko LGTM! It's good to merge things in small chunks and polish them during the rea

Re: [PR] Spark: Support creating views via SQL [iceberg]

2024-01-17 Thread via GitHub
nastra commented on code in PR #9423: URL: https://github.com/apache/iceberg/pull/9423#discussion_r1454928120 ## spark/v3.5/spark-extensions/src/main/scala/org/apache/spark/sql/execution/datasources/v2/CreateV2ViewExec.scala: ## @@ -0,0 +1,147 @@ +/* + * Licensed to the Apache S

Re: [PR] Spark: Support creating views via SQL [iceberg]

2024-01-17 Thread via GitHub
nastra commented on code in PR #9423: URL: https://github.com/apache/iceberg/pull/9423#discussion_r1454951834 ## spark/v3.5/spark-extensions/src/main/scala/org/apache/spark/sql/execution/datasources/v2/CreateV2ViewExec.scala: ## @@ -0,0 +1,147 @@ +/* + * Licensed to the Apache S

[I] `write.parquet.compression-codec` being set even if file-format is not parquet [iceberg]

2024-01-17 Thread via GitHub
oneonestar opened a new issue, #9490: URL: https://github.com/apache/iceberg/issues/9490 ### Apache Iceberg version 1.4.2 (latest release) ### Query engine Trino ### Please describe the bug 🐞 In Trino 436 (Iceberg 1.4.3), `write.parquet.compression-codec` pr

Re: [PR] Spark: Support creating views via SQL [iceberg]

2024-01-17 Thread via GitHub
nastra commented on code in PR #9423: URL: https://github.com/apache/iceberg/pull/9423#discussion_r1454958750 ## spark/v3.5/spark-extensions/src/main/scala/org/apache/spark/sql/execution/datasources/v2/CreateV2ViewExec.scala: ## @@ -0,0 +1,147 @@ +/* + * Licensed to the Apache S

Re: [I] Caused by: java.net.SocketException: Connection reset [iceberg]

2024-01-17 Thread via GitHub
pvary commented on issue #9444: URL: https://github.com/apache/iceberg/issues/9444#issuecomment-1895346429 Maybe @jackye1995? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

Re: [PR] init writer framework [iceberg-rust]

2024-01-17 Thread via GitHub
liurenjie1024 commented on PR #135: URL: https://github.com/apache/iceberg-rust/pull/135#issuecomment-1895352135 > And we can work on the FileWriter part first if it looks good. How do you think? @Xuanwo @Fokko +1

Re: [PR] Hive: Unwrap RuntimeException for Hive TException with rename table [iceberg]

2024-01-17 Thread via GitHub
pvary commented on code in PR #9432: URL: https://github.com/apache/iceberg/pull/9432#discussion_r1454984558 ## hive-metastore/src/main/java/org/apache/iceberg/hive/HiveCatalog.java: ## @@ -251,12 +251,14 @@ public void renameTable(TableIdentifier from, TableIdentifier original

Re: [I] `write.parquet.compression-codec` being set even if file-format is not parquet [iceberg]

2024-01-17 Thread via GitHub
findinpath commented on issue #9490: URL: https://github.com/apache/iceberg/issues/9490#issuecomment-1895363913 cc @aokolnychyi pls see `org.apache.iceberg.TableMetadata#persistedProperties` in 2e291c2b

Re: [PR] Spark: Support creating views via SQL [iceberg]

2024-01-17 Thread via GitHub
nastra commented on code in PR #9423: URL: https://github.com/apache/iceberg/pull/9423#discussion_r1455012760 ## spark/v3.5/spark-extensions/src/main/scala/org/apache/spark/sql/execution/datasources/v2/CreateV2ViewExec.scala: ## @@ -0,0 +1,147 @@ +/* + * Licensed to the Apache S

Re: [PR] Spark: Support creating views via SQL [iceberg]

2024-01-17 Thread via GitHub
nastra commented on code in PR #9423: URL: https://github.com/apache/iceberg/pull/9423#discussion_r1455041489 ## spark/v3.5/spark-extensions/src/main/scala/org/apache/spark/sql/catalyst/analysis/ViewCheck.scala: ## @@ -0,0 +1,39 @@ +/* + * Licensed to the Apache Software Foundat

Re: [PR] Spark: Support creating views via SQL [iceberg]

2024-01-17 Thread via GitHub
nastra commented on code in PR #9423: URL: https://github.com/apache/iceberg/pull/9423#discussion_r1455090708 ## spark/v3.5/spark-extensions/src/main/scala/org/apache/spark/sql/catalyst/analysis/ViewCheck.scala: ## @@ -0,0 +1,39 @@ +/* + * Licensed to the Apache Software Foundat

Re: [PR] Spark: Support creating views via SQL [iceberg]

2024-01-17 Thread via GitHub
nastra commented on code in PR #9423: URL: https://github.com/apache/iceberg/pull/9423#discussion_r1455100866 ## spark/v3.5/spark-extensions/src/main/scala/org/apache/spark/sql/catalyst/analysis/ViewCheck.scala: ## @@ -0,0 +1,39 @@ +/* + * Licensed to the Apache Software Foundat

Re: [PR] Check stale issues in ascending order [iceberg]

2024-01-17 Thread via GitHub
ajantha-bhat commented on code in PR #9489: URL: https://github.com/apache/iceberg/pull/9489#discussion_r1455140931 ## .github/workflows/stale.yml: ## @@ -47,3 +47,4 @@ jobs: close-issue-message: > This issue has been closed because it has not received an

Re: [PR] Hive: Unwrap RuntimeException for Hive TException with rename table [iceberg]

2024-01-17 Thread via GitHub
pvary merged PR #9432: URL: https://github.com/apache/iceberg/pull/9432

Re: [PR] Hive: Unwrap RuntimeException for Hive TException with rename table [iceberg]

2024-01-17 Thread via GitHub
pvary commented on PR #9432: URL: https://github.com/apache/iceberg/pull/9432#issuecomment-1895498122 Thanks @nk1506 for the PR!

Re: [PR] Update iceberg_bug_report.yml to 1.4.3 [iceberg]

2024-01-17 Thread via GitHub
nastra merged PR #9491: URL: https://github.com/apache/iceberg/pull/9491

Re: [PR] Check stale issues in ascending order [iceberg]

2024-01-17 Thread via GitHub
manuzhang commented on code in PR #9489: URL: https://github.com/apache/iceberg/pull/9489#discussion_r1455190604 ## .github/workflows/stale.yml: ## @@ -47,3 +47,4 @@ jobs: close-issue-message: > This issue has been closed because it has not received any

Re: [PR] Check stale issues in ascending order [iceberg]

2024-01-17 Thread via GitHub
nastra merged PR #9489: URL: https://github.com/apache/iceberg/pull/9489

Re: [PR] Core: Add view support for JDBC catalog [iceberg]

2024-01-17 Thread via GitHub
ajantha-bhat commented on code in PR #9487: URL: https://github.com/apache/iceberg/pull/9487#discussion_r1455179512 ## core/src/main/java/org/apache/iceberg/jdbc/JdbcUtil.java: ## @@ -53,21 +58,19 @@ final class JdbcUtil { + " WHERE " + CATALOG_NAME

Re: [PR] Spark: Support creating views via SQL [iceberg]

2024-01-17 Thread via GitHub
nastra commented on code in PR #9423: URL: https://github.com/apache/iceberg/pull/9423#discussion_r1455299682 ## spark/v3.5/spark-extensions/src/main/scala/org/apache/spark/sql/execution/datasources/v2/CreateV2ViewExec.scala: ## @@ -0,0 +1,147 @@ +/* + * Licensed to the Apache S

Re: [PR] Core: Close the MetricsReporter when the Catalog is closed. [iceberg]

2024-01-17 Thread via GitHub
nastra commented on code in PR #9353: URL: https://github.com/apache/iceberg/pull/9353#discussion_r1455363550 ## core/src/main/java/org/apache/iceberg/jdbc/JdbcCatalog.java: ## @@ -482,7 +489,13 @@ public boolean removeProperties(Namespace namespace, Set properties) @Overr

Re: [PR] Core: Close the MetricsReporter when the Catalog is closed. [iceberg]

2024-01-17 Thread via GitHub
nastra commented on code in PR #9353: URL: https://github.com/apache/iceberg/pull/9353#discussion_r1455378169 ## aws/src/main/java/org/apache/iceberg/aws/dynamodb/DynamoDbCatalog.java: ## @@ -143,6 +142,7 @@ void initialize( this.closeableGroup = new CloseableGroup();

Re: [PR] Docs: Add distribution mode not respected for CTAS/RTAS before Spark 3.5.0 [iceberg]

2024-01-17 Thread via GitHub
nastra commented on code in PR #9439: URL: https://github.com/apache/iceberg/pull/9439#discussion_r1455411644 ## docs/spark-writes.md: ## @@ -343,7 +343,8 @@ data.writeTo("prod.db.sample").option("mergeSchema","true").append() Iceberg's default Spark writers require that the d

Re: [PR] Build: Bump Spark 3.3 from 3.3.3 to 3.3.4 [iceberg]

2024-01-17 Thread via GitHub
nastra merged PR #9492: URL: https://github.com/apache/iceberg/pull/9492

Re: [PR] Add SqlCatalog _commit_table support [iceberg-python]

2024-01-17 Thread via GitHub
Fokko commented on PR #265: URL: https://github.com/apache/iceberg-python/pull/265#issuecomment-1895723645 Thanks @syun64 for working on this 👍

Re: [PR] Add SqlCatalog _commit_table support [iceberg-python]

2024-01-17 Thread via GitHub
Fokko merged PR #265: URL: https://github.com/apache/iceberg-python/pull/265

Re: [I] SqlCatalog _commit_table() support [iceberg-python]

2024-01-17 Thread via GitHub
Fokko closed issue #262: SqlCatalog _commit_table() support URL: https://github.com/apache/iceberg-python/issues/262

Re: [I] SqlCatalog _commit_table() support [iceberg-python]

2024-01-17 Thread via GitHub
Fokko commented on issue #262: URL: https://github.com/apache/iceberg-python/issues/262#issuecomment-1895726712 Has been fixed in #265

Re: [PR] Core: Add view support for JDBC catalog [iceberg]

2024-01-17 Thread via GitHub
jbonofre commented on code in PR #9487: URL: https://github.com/apache/iceberg/pull/9487#discussion_r1455472169 ## core/src/main/java/org/apache/iceberg/jdbc/JdbcCatalog.java: ## @@ -157,8 +159,10 @@ private void initializeCatalogTables() throws InterruptedException, SQLExcepti

Re: [PR] Core: Add view support for JDBC catalog [iceberg]

2024-01-17 Thread via GitHub
jbonofre commented on code in PR #9487: URL: https://github.com/apache/iceberg/pull/9487#discussion_r1455473193 ## core/src/main/java/org/apache/iceberg/jdbc/JdbcUtil.java: ## @@ -53,21 +58,19 @@ final class JdbcUtil { + " WHERE " + CATALOG_NAME

Re: [PR] Core: Add view support for JDBC catalog [iceberg]

2024-01-17 Thread via GitHub
jbonofre commented on code in PR #9487: URL: https://github.com/apache/iceberg/pull/9487#discussion_r1455474276 ## core/src/main/java/org/apache/iceberg/jdbc/JdbcUtil.java: ## @@ -139,7 +134,22 @@ final class JdbcUtil { + " LIKE ? ESCAPE '\\' " + " ) "

[I] Build/Release: Upgrade to Apache RAT 0.16 and scan hidden directories [iceberg]

2024-01-17 Thread via GitHub
jbonofre opened a new issue, #9494: URL: https://github.com/apache/iceberg/issues/9494 ### Feature Request / Improvement As identified on a previous Iceberg release, apache-rat 0.15 doesn't scan hidden directories. It's not good as the hidden directories are part of the released Iceb

[PR] Build: Upgrade to Apache RAT 0.16, scanning hidden directories and adding missing ASF header [iceberg]

2024-01-17 Thread via GitHub
jbonofre opened a new pull request, #9495: URL: https://github.com/apache/iceberg/pull/9495 This PR does: - upgrade to Apache RAT 0.16 - add `--scan-hidden-directories` option - add ASF header where missing - add new excluded file from RAT check

Re: [PR] Build: Upgrade to Apache RAT 0.16, scanning hidden directories and adding missing ASF header [iceberg]

2024-01-17 Thread via GitHub
jbonofre commented on PR #9495: URL: https://github.com/apache/iceberg/pull/9495#issuecomment-1895818508 @Fokko can you please take a look? Thanks!

[PR] Docs: Fix typo in tag reading example [iceberg]

2024-01-17 Thread via GitHub
pvary opened a new pull request, #9496: URL: https://github.com/apache/iceberg/pull/9496 Small fix in the docs

[PR] Add small test on duplicate changes [iceberg-python]

2024-01-17 Thread via GitHub
Fokko opened a new pull request, #273: URL: https://github.com/apache/iceberg-python/pull/273 (no comment)

Re: [PR] Docs: Fix typo in tag reading example [iceberg]

2024-01-17 Thread via GitHub
nastra merged PR #9496: URL: https://github.com/apache/iceberg/pull/9496

Re: [PR] Set `ghp_path` to `/` [iceberg]

2024-01-17 Thread via GitHub
Fokko merged PR #9493: URL: https://github.com/apache/iceberg/pull/9493

Re: [PR] Spark: Support creating views via SQL [iceberg]

2024-01-17 Thread via GitHub
nastra commented on code in PR #9423: URL: https://github.com/apache/iceberg/pull/9423#discussion_r1455627217 ## spark/v3.5/spark-extensions/src/main/scala/org/apache/spark/sql/catalyst/analysis/ViewCheck.scala: ## @@ -0,0 +1,39 @@ +/* + * Licensed to the Apache Software Foundat

Re: [PR] Build: Upgrade to Apache RAT 0.16, scanning hidden directories and adding missing ASF header [iceberg]

2024-01-17 Thread via GitHub
ajantha-bhat commented on code in PR #9495: URL: https://github.com/apache/iceberg/pull/9495#discussion_r1455712834 ## dev/check-license: ## @@ -68,7 +68,7 @@ mkdir -p "$FWDIR"/lib } mkdir -p build -$java_cmd -jar "$rat_jar" -E "$FWDIR"/dev/.rat-excludes -d "$FWDIR" > build

Re: [PR] Spark 3.5: Spark action to compute the partition stats [iceberg]

2024-01-17 Thread via GitHub
ajantha-bhat commented on code in PR #9437: URL: https://github.com/apache/iceberg/pull/9437#discussion_r1454359159 ## spark/v3.5/spark/src/main/java/org/apache/iceberg/spark/actions/BaseSparkAction.java: ## @@ -150,6 +154,21 @@ protected Dataset contentFileDS(Table table, Set

Re: [PR] Add Hive integration tests [iceberg-python]

2024-01-17 Thread via GitHub
Fokko commented on code in PR #207: URL: https://github.com/apache/iceberg-python/pull/207#discussion_r1455755578 ## tests/integration/test_hive.py: ## @@ -0,0 +1,409 @@ +# Licensed to the Apache Software Foundation (ASF) under one +# or more contributor license agreements. See

Re: [PR] Spark 3.5: Spark action to compute the partition stats [iceberg]

2024-01-17 Thread via GitHub
ajantha-bhat commented on code in PR #9437: URL: https://github.com/apache/iceberg/pull/9437#discussion_r1455763573 ## spark/v3.5/spark/src/main/java/org/apache/iceberg/spark/actions/BaseSparkAction.java: ## @@ -150,6 +154,21 @@ protected Dataset contentFileDS(Table table, Set

Re: [PR] Spark 3.5: Spark action to compute the partition stats [iceberg]

2024-01-17 Thread via GitHub
ajantha-bhat commented on code in PR #9437: URL: https://github.com/apache/iceberg/pull/9437#discussion_r1455766754 ## api/src/main/java/org/apache/iceberg/actions/ComputePartitionStats.java: ## @@ -0,0 +1,35 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one

[I] Hive Catalog: Implement `_commit_table` [iceberg-python]

2024-01-17 Thread via GitHub
Fokko opened a new issue, #275: URL: https://github.com/apache/iceberg-python/issues/275 ### Feature Request / Improvement Probably very similar to the Glue/Sql one :)

[I] Two-level parquet read EOF error: org.apache.parquet.io.ParquetDecodingException: Can't read value in column [a, array] repeated int32 array = 2 at value 4 out of 4 in current page. repetition lev

2024-01-17 Thread via GitHub
gaoshihang opened a new issue, #9497: URL: https://github.com/apache/iceberg/issues/9497 ### Apache Iceberg version 1.4.3 (latest release) ### Query engine Spark ### Please describe the bug 🐞 We have a two-level parquet list, the schema like below: ![ima

Re: [PR] Add Hive integration tests [iceberg-python]

2024-01-17 Thread via GitHub
Fokko commented on PR #207: URL: https://github.com/apache/iceberg-python/pull/207#issuecomment-1896052914 Thanks @HonahX for the review 👍

Re: [PR] Add Hive integration tests [iceberg-python]

2024-01-17 Thread via GitHub
Fokko merged PR #207: URL: https://github.com/apache/iceberg-python/pull/207

[PR] Core: Hadoop: Fix: HadoopTableOperations renameToFinal [iceberg]

2024-01-17 Thread via GitHub
N-o-Z opened a new pull request, #9498: URL: https://github.com/apache/iceberg/pull/9498 Closes #9485

Re: [PR] Core: Hadoop: Fix: HadoopTableOperations renameToFinal [iceberg]

2024-01-17 Thread via GitHub
N-o-Z commented on PR #9498: URL: https://github.com/apache/iceberg/pull/9498#issuecomment-1896152644 @amogh-jahagirdar, FYI

Re: [I] Purge support for Iceberg view [iceberg]

2024-01-17 Thread via GitHub
rdblue commented on issue #9433: URL: https://github.com/apache/iceberg/issues/9433#issuecomment-1896190645 What is the proposed behavior for a purge operation? How does this apply to views?

Re: [PR] Flink: Don't fail to serialize IcebergSourceSplit when there is too many delete files [iceberg]

2024-01-17 Thread via GitHub
stevenzwu commented on PR #9464: URL: https://github.com/apache/iceberg/pull/9464#issuecomment-1896232286 > If I understand correctly, the FileScanTask json will contain the Schema. The Schema has a doc field for comments. Do we have restrictions defined for the doc field? @pvary yo

Re: [PR] Core: Hadoop: Fix: HadoopTableOperations renameToFinal [iceberg]

2024-01-17 Thread via GitHub
amogh-jahagirdar commented on code in PR #9498: URL: https://github.com/apache/iceberg/pull/9498#discussion_r1456088754 ## core/src/main/java/org/apache/iceberg/hadoop/HadoopTableOperations.java: ## @@ -360,7 +360,10 @@ int findVersion() { */ private void renameToFinal(Fi

Re: [PR] Core: Fix lock acquisition logic in HadoopTableOperations rename [iceberg]

2024-01-17 Thread via GitHub
amogh-jahagirdar commented on PR #9498: URL: https://github.com/apache/iceberg/pull/9498#issuecomment-1896295071 looks like spotless checks are failing: if you could run `./gradlew spotlessApply` before pushing your next changes that would fix it!

Re: [I] Cannot write nullable values to non-null column in the Iceberg Table [iceberg]

2024-01-17 Thread via GitHub
abharath9 commented on issue #9488: URL: https://github.com/apache/iceberg/issues/9488#issuecomment-1896331390 @nastra Yes, I am aware of that. How do I write data from optional fields to mandatory fields? It is mentioned in this issue that it is possible by setting "spark.sql.iceberg.ch

Re: [I] Two-level parquet read EOF error: org.apache.parquet.io.ParquetDecodingException: Can't read value in column [a, array] repeated int32 array = 2 at value 4 out of 4 in current page. repetition

2024-01-17 Thread via GitHub
gaoshihang commented on issue #9497: URL: https://github.com/apache/iceberg/issues/9497#issuecomment-1896362482 And here is the iceberg schema [v8.metadata.json](https://github.com/apache/iceberg/files/13967089/v8.metadata.json)

Re: [I] Two-level parquet read EOF error: org.apache.parquet.io.ParquetDecodingException: Can't read value in column [a, array] repeated int32 array = 2 at value 4 out of 4 in current page. repetition

2024-01-17 Thread via GitHub
gaoshihang commented on issue #9497: URL: https://github.com/apache/iceberg/issues/9497#issuecomment-1896369837 And here is the parquet file we used to add_files. (need to change the .log to .parquet) [user_error_parquet.log](https://github.com/apache/iceberg/files/13967114/user_error_

Re: [PR] Core: Fix lock acquisition logic in HadoopTableOperations rename [iceberg]

2024-01-17 Thread via GitHub
N-o-Z commented on PR #9498: URL: https://github.com/apache/iceberg/pull/9498#issuecomment-1896383730 @amogh-jahagirdar Done!

Re: [PR] API, Core, Spark: Add fastForwardOrCreate API and integrate that with Spark fast forward procedure [iceberg]

2024-01-17 Thread via GitHub
rdblue commented on PR #9196: URL: https://github.com/apache/iceberg/pull/9196#issuecomment-1896389067 @amogh-jahagirdar, I think I would prefer the second alternative, to change the behavior of fast-forward. I doubt that anyone relies on fast-forward _not_ creating a branch and failing ins

Re: [PR] Write support [iceberg-python]

2024-01-17 Thread via GitHub
rdblue commented on code in PR #41: URL: https://github.com/apache/iceberg-python/pull/41#discussion_r1456300895 ## mkdocs/docs/api.md: ## @@ -175,6 +175,104 @@ static_table = StaticTable.from_metadata( The static-table is considered read-only. +## Write support + +With PyI

Re: [PR] Write support [iceberg-python]

2024-01-17 Thread via GitHub
Fokko commented on code in PR #41: URL: https://github.com/apache/iceberg-python/pull/41#discussion_r1456327592 ## mkdocs/docs/api.md: ## @@ -175,6 +175,104 @@ static_table = StaticTable.from_metadata( The static-table is considered read-only. +## Write support + +With PyIc

Re: [I] Purge support for Iceberg view [iceberg]

2024-01-17 Thread via GitHub
nk1506 commented on issue #9433: URL: https://github.com/apache/iceberg/issues/9433#issuecomment-1896529369 With purge enablement similar to [dropTable](https://github.com/apache/iceberg/blob/66b1aa662761606d4d68d99371c62505e7ac2f1e/api/src/main/java/org/apache/iceberg/catalog/Catalog.java

Re: [PR] Apply Name mapping, new_schema_for_table [iceberg-python]

2024-01-17 Thread via GitHub
syun64 commented on code in PR #219: URL: https://github.com/apache/iceberg-python/pull/219#discussion_r1456437583 ## pyiceberg/io/pyarrow.py: ## @@ -733,42 +854,178 @@ def _get_field_id(field: pa.Field) -> Optional[int]: ) -class _ConvertToIceberg(PyArrowSchemaVisitor[

Re: [PR] Write support [iceberg-python]

2024-01-17 Thread via GitHub
rdblue commented on code in PR #41: URL: https://github.com/apache/iceberg-python/pull/41#discussion_r1456467981 ## pyiceberg/table/__init__.py: ## @@ -856,6 +909,61 @@ def history(self) -> List[SnapshotLogEntry]: def update_schema(self, allow_incompatible_changes: bool = F

Re: [PR] Write support [iceberg-python]

2024-01-17 Thread via GitHub
rdblue commented on code in PR #41: URL: https://github.com/apache/iceberg-python/pull/41#discussion_r1456469457 ## pyiceberg/table/__init__.py: ## @@ -856,6 +909,61 @@ def history(self) -> List[SnapshotLogEntry]: def update_schema(self, allow_incompatible_changes: bool = F

Re: [PR] Write support [iceberg-python]

2024-01-17 Thread via GitHub
rdblue commented on code in PR #41: URL: https://github.com/apache/iceberg-python/pull/41#discussion_r1456478299 ## pyiceberg/table/__init__.py: ## @@ -831,6 +887,46 @@ def history(self) -> List[SnapshotLogEntry]: def update_schema(self, allow_incompatible_changes: bool = F

Re: [I] [HadoopCatalog]: [HadoopTableOperations]: Commit flow, renameToFinal does not actually check if lock acquired [iceberg]

2024-01-17 Thread via GitHub
amogh-jahagirdar closed issue #9485: [HadoopCatalog]: [HadoopTableOperations]: Commit flow, renameToFinal does not actually check if lock acquired URL: https://github.com/apache/iceberg/issues/9485

Re: [PR] Core: Fix lock acquisition logic in HadoopTableOperations rename [iceberg]

2024-01-17 Thread via GitHub
amogh-jahagirdar merged PR #9498: URL: https://github.com/apache/iceberg/pull/9498

Re: [PR] Write support [iceberg-python]

2024-01-17 Thread via GitHub
rdblue commented on code in PR #41: URL: https://github.com/apache/iceberg-python/pull/41#discussion_r1456486632 ## pyiceberg/table/__init__.py: ## @@ -1935,3 +2043,184 @@ def _generate_snapshot_id() -> int: snapshot_id = snapshot_id if snapshot_id >= 0 else snapshot_id * -

Re: [PR] Write support [iceberg-python]

2024-01-17 Thread via GitHub
rdblue commented on code in PR #41: URL: https://github.com/apache/iceberg-python/pull/41#discussion_r1456487418 ## pyiceberg/table/__init__.py: ## @@ -1935,3 +2043,184 @@ def _generate_snapshot_id() -> int: snapshot_id = snapshot_id if snapshot_id >= 0 else snapshot_id * -

Re: [PR] Write support [iceberg-python]

2024-01-17 Thread via GitHub
rdblue commented on code in PR #41: URL: https://github.com/apache/iceberg-python/pull/41#discussion_r1456489014 ## pyiceberg/table/__init__.py: ## @@ -1935,3 +2043,184 @@ def _generate_snapshot_id() -> int: snapshot_id = snapshot_id if snapshot_id >= 0 else snapshot_id * -

Re: [PR] Write support [iceberg-python]

2024-01-17 Thread via GitHub
rdblue commented on code in PR #41: URL: https://github.com/apache/iceberg-python/pull/41#discussion_r1456491428 ## pyiceberg/table/__init__.py: ## @@ -1935,3 +2043,184 @@ def _generate_snapshot_id() -> int: snapshot_id = snapshot_id if snapshot_id >= 0 else snapshot_id * -

Re: [PR] Write support [iceberg-python]

2024-01-17 Thread via GitHub
rdblue commented on code in PR #41: URL: https://github.com/apache/iceberg-python/pull/41#discussion_r1456493214 ## pyiceberg/table/__init__.py: ## @@ -1935,3 +2043,184 @@ def _generate_snapshot_id() -> int: snapshot_id = snapshot_id if snapshot_id >= 0 else snapshot_id * -

Re: [PR] Write support [iceberg-python]

2024-01-17 Thread via GitHub
rdblue commented on code in PR #41: URL: https://github.com/apache/iceberg-python/pull/41#discussion_r1456495179 ## pyiceberg/table/__init__.py: ## @@ -1935,3 +2043,184 @@ def _generate_snapshot_id() -> int: snapshot_id = snapshot_id if snapshot_id >= 0 else snapshot_id * -

Re: [PR] Write support [iceberg-python]

2024-01-17 Thread via GitHub
rdblue commented on code in PR #41: URL: https://github.com/apache/iceberg-python/pull/41#discussion_r1456498171 ## pyiceberg/table/__init__.py: ## @@ -1935,3 +2043,184 @@ def _generate_snapshot_id() -> int: snapshot_id = snapshot_id if snapshot_id >= 0 else snapshot_id * -

[PR] Add 1.4.3 docs [iceberg]

2024-01-17 Thread via GitHub
bitsondatadev opened a new pull request, #9499: URL: https://github.com/apache/iceberg/pull/9499 Add 1.4.3 docs

Re: [PR] Write support [iceberg-python]

2024-01-17 Thread via GitHub
rdblue commented on code in PR #41: URL: https://github.com/apache/iceberg-python/pull/41#discussion_r1456502771 ## pyiceberg/table/__init__.py: ## @@ -1935,3 +2043,184 @@ def _generate_snapshot_id() -> int: snapshot_id = snapshot_id if snapshot_id >= 0 else snapshot_id * -

Re: [PR] Write support [iceberg-python]

2024-01-17 Thread via GitHub
rdblue commented on code in PR #41: URL: https://github.com/apache/iceberg-python/pull/41#discussion_r1456523045 ## mkdocs/docs/api.md: ## @@ -175,6 +175,104 @@ static_table = StaticTable.from_metadata( The static-table is considered read-only. +## Write support + +With PyI

Re: [PR] Write support [iceberg-python]

2024-01-17 Thread via GitHub
rdblue commented on PR #41: URL: https://github.com/apache/iceberg-python/pull/41#issuecomment-1896944638 @Fokko, this works great and I don't see any blockers so I've approved it. I think there are a few things to consider in terms of how we want to do this moving forward (whether to

Re: [PR] Add 1.4.3 docs [iceberg]

2024-01-17 Thread via GitHub
dramaticlly commented on code in PR #9499: URL: https://github.com/apache/iceberg/pull/9499#discussion_r1456526671 ## 1.4.3/mkdocs.yml: ## @@ -0,0 +1,70 @@ +# Licensed to the Apache Software Foundation (ASF) under one +# or more contributor license agreements. See the NOTICE fi

Re: [PR] Flink: implement range partitioner for map data statistics [iceberg]

2024-01-17 Thread via GitHub
stevenzwu commented on code in PR #9321: URL: https://github.com/apache/iceberg/pull/9321#discussion_r1456527386 ## flink/v1.17/flink/src/main/java/org/apache/iceberg/flink/sink/shuffle/MapRangePartitioner.java: ## @@ -0,0 +1,288 @@ +/* + * Licensed to the Apache Software Founda

Re: [PR] Flink: implement range partitioner for map data statistics [iceberg]

2024-01-17 Thread via GitHub
stevenzwu commented on code in PR #9321: URL: https://github.com/apache/iceberg/pull/9321#discussion_r1456527386 ## flink/v1.17/flink/src/main/java/org/apache/iceberg/flink/sink/shuffle/MapRangePartitioner.java: ## @@ -0,0 +1,288 @@ +/* + * Licensed to the Apache Software Founda

Re: [PR] Flink: implement range partitioner for map data statistics [iceberg]

2024-01-17 Thread via GitHub
stevenzwu commented on code in PR #9321: URL: https://github.com/apache/iceberg/pull/9321#discussion_r1456529072 ## flink/v1.17/flink/src/main/java/org/apache/iceberg/flink/sink/shuffle/MapRangePartitioner.java: ## @@ -0,0 +1,288 @@ +/* + * Licensed to the Apache Software Founda

[PR] Fix community link [iceberg]

2024-01-17 Thread via GitHub
bitsondatadev opened a new pull request, #9500: URL: https://github.com/apache/iceberg/pull/9500 The community link works only on top-level site links. Link this to the static site for now; eventually we need to consider a site-wide variable solution, but that's not a priority yet. --

Re: [PR] Add 1.4.3 docs [iceberg]

2024-01-17 Thread via GitHub
bitsondatadev commented on code in PR #9499: URL: https://github.com/apache/iceberg/pull/9499#discussion_r1456528989 ## 1.4.3/mkdocs.yml: ## @@ -0,0 +1,70 @@ +# Licensed to the Apache Software Foundation (ASF) under one +# or more contributor license agreements. See the NOTICE

[PR] [Bug Fix] TruncateTransform for falsey values [iceberg-python]

2024-01-17 Thread via GitHub
syun64 opened a new pull request, #276: URL: https://github.com/apache/iceberg-python/pull/276 Currently, any falsey values will return None for their **TruncateTransform**. This results in **fill_parquet_file_metadata** throwing an exception whenever there is a falsey lower bound as the mi
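A minimal sketch of the kind of bug described above, assuming the cause is a truthiness check (`if value`) where a `None` check was intended. The names `truncate_int_buggy` and `truncate_int_fixed` are hypothetical illustrations, not PyIceberg's API; the message only states that falsey values currently come back as None.

```python
from typing import Optional


def truncate_int_buggy(value: Optional[int], width: int) -> Optional[int]:
    # Hypothetical buggy form: the truthiness test treats 0 the same as None,
    # so a legitimate lower bound of 0 is silently dropped.
    return value - (value % width) if value else None


def truncate_int_fixed(value: Optional[int], width: int) -> Optional[int]:
    # Only a genuinely missing value (None) maps to None; 0 is truncated normally.
    return value - (value % width) if value is not None else None
```

Under this assumption, the same pattern would affect other falsey inputs such as empty strings or empty byte sequences, which is why an explicit `is not None` comparison is the usual guard.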

Re: [PR] Spark: Support creating views via SQL [iceberg]

2024-01-17 Thread via GitHub
rdblue commented on code in PR #9423: URL: https://github.com/apache/iceberg/pull/9423#discussion_r1456546379 ## spark/v3.5/spark-extensions/src/main/scala/org/apache/spark/sql/execution/datasources/v2/CreateV2ViewExec.scala: ## @@ -0,0 +1,147 @@ +/* + * Licensed to the Apache S

Re: [PR] Flink: Added error handling and default logic for Flink version detection [iceberg]

2024-01-17 Thread via GitHub
gjacoby126 commented on code in PR #9452: URL: https://github.com/apache/iceberg/pull/9452#discussion_r1456547407 ## flink/v1.16/flink/src/main/java/org/apache/iceberg/flink/util/FlinkPackage.java: ## @@ -19,15 +19,31 @@ package org.apache.iceberg.flink.util; import org.apac

Re: [PR] Flink: implement range partitioner for map data statistics [iceberg]

2024-01-17 Thread via GitHub
stevenzwu commented on code in PR #9321: URL: https://github.com/apache/iceberg/pull/9321#discussion_r1456527386 ## flink/v1.17/flink/src/main/java/org/apache/iceberg/flink/sink/shuffle/MapRangePartitioner.java: ## @@ -0,0 +1,288 @@ +/* + * Licensed to the Apache Software Founda

Re: [PR] Spark: Support creating views via SQL [iceberg]

2024-01-17 Thread via GitHub
rdblue commented on code in PR #9423: URL: https://github.com/apache/iceberg/pull/9423#discussion_r1456586630 ## spark/v3.5/spark-extensions/src/main/scala/org/apache/spark/sql/execution/datasources/v2/CreateV2ViewExec.scala: ## @@ -0,0 +1,144 @@ +/* + * Licensed to the Apache S

Re: [PR] Fix community link [iceberg]

2024-01-17 Thread via GitHub
amogh-jahagirdar merged PR #9500: URL: https://github.com/apache/iceberg/pull/9500

[PR] [Reference PR] [API + Avro] Add default value APIs and Avro implementation [iceberg]

2024-01-17 Thread via GitHub
wmoustafa opened a new pull request, #9502: URL: https://github.com/apache/iceberg/pull/9502 This PR adds default value APIs according to the default value spec, and implements it in the `GenericAvroReader` case. It uses a `ConstantReader` to fill in the default values of fields from their
