Re: [PR] Spark 3.5: Add Parallelism Parameter Validation to AddFilesProcedure. [iceberg]

2025-04-21 Thread via GitHub
nastra merged PR #12784: URL: https://github.com/apache/iceberg/pull/12784 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@iceberg.ap

Re: [PR] Spark3.4: Migrate tests in spark, extensions and functions [iceberg]

2025-04-21 Thread via GitHub
nastra commented on code in PR #12853: URL: https://github.com/apache/iceberg/pull/12853#discussion_r2053443377 ## spark/v3.4/spark/src/test/java/org/apache/iceberg/spark/actions/TestCreateActions.java: ## @@ -972,33 +974,38 @@ private void assertSnapshotFileCount(SnapshotTable

Re: [PR] Core: Pass storage credentials from LoadTableResponse to FileIO [iceberg]

2025-04-21 Thread via GitHub
nastra commented on code in PR #12591: URL: https://github.com/apache/iceberg/pull/12591#discussion_r2053420927 ## aws/src/main/java/org/apache/iceberg/aws/s3/S3FileIO.java: ## @@ -547,4 +563,28 @@ private boolean recoverObject(ObjectVersion version, String bucket) { ret

[PR] bug: Fix glue integration test failures [iceberg-go]

2025-04-21 Thread via GitHub
lliangyu-lin opened a new pull request, #400: URL: https://github.com/apache/iceberg-go/pull/400 ### Description * Fix glue integration test failures. Mostly caused by asserting table identifiers, which require both database name and table name. ### Testing * Ran the integ tests

Re: [PR] Use assumeThat instead of assumeTrue [iceberg]

2025-04-21 Thread via GitHub
nastra commented on code in PR #12822: URL: https://github.com/apache/iceberg/pull/12822#discussion_r2053416463 ## .baseline/checkstyle/checkstyle.xml: ## @@ -439,6 +439,11 @@ + + Review Comment: ```suggestion

Re: [PR] Core: Broaden exception handling in writer clean up logic [iceberg]

2025-04-21 Thread via GitHub
geruh commented on code in PR #12863: URL: https://github.com/apache/iceberg/pull/12863#discussion_r2053396195 ## core/src/main/java/org/apache/iceberg/io/BaseTaskWriter.java: ## @@ -347,9 +346,9 @@ private void closeCurrent() throws IOException { if (currentRows == 0

Re: [D] [Question] How to generate manifest files and manifest list [iceberg-rust]

2025-04-21 Thread via GitHub
GitHub user dentiny edited a comment on the discussion: [Question] How to generate manifest files and manifest list I confirm the following python version works, and would like to find an equivalent solution in rust. ```python import os from pyiceberg.catalog import load_catalog from pyiceberg

Re: [D] [Question] How to generate manifest files and manifest list [iceberg-rust]

2025-04-21 Thread via GitHub
GitHub user dentiny added a comment to the discussion: [Question] How to generate manifest files and manifest list I confirm the following python version works: ```python import os from pyiceberg.catalog import load_catalog from pyiceberg.schema import Schema from pyiceberg.types import Integer

Re: [PR] Add table property to disable/enable parquet column statistics #12770 [iceberg]

2025-04-21 Thread via GitHub
huaxiangsun commented on PR #12771: URL: https://github.com/apache/iceberg/pull/12771#issuecomment-2820163664 Hi Folks, I uploaded a new patch which addresses the comments, please take a look.

Re: [PR] Add table property to disable/enable parquet column statistics #12770 [iceberg]

2025-04-21 Thread via GitHub
huaxiangsun commented on code in PR #12771: URL: https://github.com/apache/iceberg/pull/12771#discussion_r2053387227 ## parquet/src/main/java/org/apache/iceberg/parquet/Parquet.java: ## @@ -306,33 +308,29 @@ private WriteBuilder createContextFunc( return this; } +

Re: [PR] Add table property to disable/enable parquet column statistics #12770 [iceberg]

2025-04-21 Thread via GitHub
huaxiangsun commented on code in PR #12771: URL: https://github.com/apache/iceberg/pull/12771#discussion_r2053366572 ## docs/docs/configuration.md: ## @@ -52,6 +52,8 @@ Iceberg tables support table properties to configure table behavior, like the de | write.parquet.bloom-filte

Re: [PR] Add table property to disable/enable parquet column statistics #12770 [iceberg]

2025-04-21 Thread via GitHub
huaxiangsun commented on code in PR #12771: URL: https://github.com/apache/iceberg/pull/12771#discussion_r2053366037 ## parquet/src/test/java/org/apache/iceberg/parquet/TestParquet.java: ## @@ -219,6 +221,50 @@ public void testTwoLevelList() throws IOException { assertThat(

Re: [PR] Add table property to disable/enable parquet column statistics #12770 [iceberg]

2025-04-21 Thread via GitHub
huaxiangsun commented on code in PR #12771: URL: https://github.com/apache/iceberg/pull/12771#discussion_r2053365460 ## docs/docs/configuration.md: ## @@ -52,6 +52,8 @@ Iceberg tables support table properties to configure table behavior, like the de | write.parquet.bloom-filte

Re: [PR] Add table property to disable/enable parquet column statistics #12770 [iceberg]

2025-04-21 Thread via GitHub
huaxiangsun commented on code in PR #12771: URL: https://github.com/apache/iceberg/pull/12771#discussion_r2053349866 ## core/src/main/java/org/apache/iceberg/TableProperties.java: ## @@ -174,6 +174,13 @@ private TableProperties() {} public static final String PARQUET_BLOOM_FI

Re: [PR] Add table property to disable/enable parquet column statistics #12770 [iceberg]

2025-04-21 Thread via GitHub
huaxiangsun commented on code in PR #12771: URL: https://github.com/apache/iceberg/pull/12771#discussion_r2053349200 ## parquet/src/main/java/org/apache/iceberg/parquet/Parquet.java: ## @@ -477,6 +513,9 @@ private static class Context { private final int bloomFilterMaxByt

[PR] refactor: TableCreation::builder()::properties accept an `IntoIterator` [iceberg-rust]

2025-04-21 Thread via GitHub
drmingdrmer opened a new pull request, #1233: URL: https://github.com/apache/iceberg-rust/pull/1233 ## What changes are included in this PR? `TableCreation::builder().properties()` should be able to accept any `IntoIterator` type that can be used to build a `HashMap`.
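A minimal sketch of the idea described in this PR, assuming a simplified builder. `TableCreationSketch`, `TableCreationSketchBuilder`, and `demo` are hypothetical names made up for illustration and are not the iceberg-rust types; the bounds on `K` and `V`, and the choice to replace rather than merge existing properties, are likewise assumptions and not taken from the PR.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for a table-creation request (not the real iceberg-rust type).
#[derive(Debug, Default)]
pub struct TableCreationSketch {
    pub name: String,
    pub properties: HashMap<String, String>,
}

/// Hypothetical builder used only to illustrate the generic `properties` setter.
#[derive(Debug, Default)]
pub struct TableCreationSketchBuilder {
    name: String,
    properties: HashMap<String, String>,
}

impl TableCreationSketchBuilder {
    pub fn name(mut self, name: impl Into<String>) -> Self {
        self.name = name.into();
        self
    }

    /// Accepts any iterable of key/value pairs (array, Vec, HashMap, ...)
    /// and collects it into the internal `HashMap`, replacing earlier properties.
    pub fn properties<I, K, V>(mut self, props: I) -> Self
    where
        I: IntoIterator<Item = (K, V)>,
        K: Into<String>,
        V: Into<String>,
    {
        self.properties = props
            .into_iter()
            .map(|(k, v)| (k.into(), v.into()))
            .collect();
        self
    }

    pub fn build(self) -> TableCreationSketch {
        TableCreationSketch {
            name: self.name,
            properties: self.properties,
        }
    }
}

/// Example call site: an array of string-slice pairs is passed directly,
/// with no need to build a `HashMap<String, String>` first.
pub fn demo() -> TableCreationSketch {
    TableCreationSketchBuilder::default()
        .name("events")
        .properties([("write.format.default", "parquet"), ("owner", "analytics")])
        .build()
}
```

With a signature like this, callers could pass an array, a `Vec` of pairs, or an existing `HashMap` without converting first; whether the actual PR uses these exact bounds is not shown in the snippet above.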

Re: [PR] [Spark]Add max files rewrite option for RewriteAction [iceberg]

2025-04-21 Thread via GitHub
coderfender commented on PR #12824: URL: https://github.com/apache/iceberg/pull/12824#issuecomment-2819966147 @manuzhang, I already pinged the dev channel to get some feedback: https://apache-iceberg.slack.com/archives/C03LG1D563F/p1744871503652659. Let me bump the message again.

Re: [PR] Core: Make namespace separator configurable [iceberg]

2025-04-21 Thread via GitHub
ZacBlanco commented on PR #10877: URL: https://github.com/apache/iceberg/pull/10877#issuecomment-2819963728 @nastra Do you intend to continue working on this? I am seeing issues in our test environments and would like to get a fix in upstream. I would be happy to take over the PR if you are

[I] Iceberg Rest Catalog APIs for Multi-Statement Multi-Table Transactions [iceberg]

2025-04-21 Thread via GitHub
jagdeeps91 opened a new issue, #12865: URL: https://github.com/apache/iceberg/issues/12865 ### Proposed Change This document proposes IRC APIs that engines can use to build multi-statement multi-table transactions across catalog. These APIs aim to solve 2 main use cases: 1. Ab

Re: [PR] Spark 3.5 row lineage [iceberg]

2025-04-21 Thread via GitHub
amogh-jahagirdar commented on code in PR #12736: URL: https://github.com/apache/iceberg/pull/12736#discussion_r2053235855 ## spark/v3.5/spark-extensions/src/main/scala/org/apache/spark/sql/catalyst/analysis/RewriteMergeIntoTableForRowLineage.scala: ## @@ -0,0 +1,95 @@ +/* + * Li

Re: [PR] Core: Add test cases for row lineage metadata [iceberg]

2025-04-21 Thread via GitHub
rdblue commented on PR #12843: URL: https://github.com/apache/iceberg/pull/12843#issuecomment-2819866925 Thanks for reviewing, @amogh-jahagirdar!

Re: [PR] Core: Add test cases for row lineage metadata [iceberg]

2025-04-21 Thread via GitHub
rdblue merged PR #12843: URL: https://github.com/apache/iceberg/pull/12843

Re: [PR] Spark: Add 'skip_file_list' option to RewriteTablePathProcedure for optional file-list generation [iceberg]

2025-04-21 Thread via GitHub
slfan1989 commented on PR #12844: URL: https://github.com/apache/iceberg/pull/12844#issuecomment-2819858442 > > Interesting, is it all that you need to do Hive -> Iceberg conversion. Seems simple and make sense to me. cc @flyrain @dramaticlly for any thoughts > > Glad to hear RewriteT

Re: [PR] chore: use chrono::milliseconds in snapshot and consolidate error usage [iceberg-cpp]

2025-04-21 Thread via GitHub
zhjwpku commented on PR #83: URL: https://github.com/apache/iceberg-cpp/pull/83#issuecomment-2819856014 @Fokko @Xuanwo This PR should benefit future dev with better error handling, I'd appreciate if you can take a look, thanks.

Re: [PR] Spark: Add 'skip_file_list' option to RewriteTablePathProcedure for optional file-list generation [iceberg]

2025-04-21 Thread via GitHub
slfan1989 commented on code in PR #12844: URL: https://github.com/apache/iceberg/pull/12844#discussion_r2053192186 ## api/src/main/java/org/apache/iceberg/actions/RewriteTablePath.java: ## @@ -86,6 +86,18 @@ public interface RewriteTablePath extends Action

Re: [PR] Build and test hive-metastore with Hive 2, 3 and 4 with a single source set [iceberg]

2025-04-21 Thread via GitHub
wypoon commented on PR #12721: URL: https://github.com/apache/iceberg/pull/12721#issuecomment-2819850170 @danielcweeks we are in agreement that we should produce a single set of artifacts for the hive-metastore module, that should have runtime compatibility across supported Hive versions. A

Re: [PR] Add timestamp_ns, time and UUID types for Variant [iceberg]

2025-04-21 Thread via GitHub
aihuaxu commented on code in PR #12682: URL: https://github.com/apache/iceberg/pull/12682#discussion_r2052976424 ## api/src/main/java/org/apache/iceberg/variants/PhysicalType.java: ## @@ -41,6 +41,10 @@ public enum PhysicalType { FLOAT(LogicalType.FLOAT, Float.class), BINA

Re: [PR] Skip producing empty parquet files [iceberg-rust]

2025-04-21 Thread via GitHub
liurenjie1024 commented on PR #1230: URL: https://github.com/apache/iceberg-rust/pull/1230#issuecomment-2819838094 > I tried the unit test directly and it can pass. And I try to reproduce #1224 using master branch before this PR and there is no file written. So I guess we can revert this PR

Re: [PR] Added ExpireSnapshots Feature [iceberg-python]

2025-04-21 Thread via GitHub
ForeverAngry commented on code in PR #1880: URL: https://github.com/apache/iceberg-python/pull/1880#discussion_r2053172440 ## pyiceberg/table/update/snapshot.py: ## @@ -843,3 +849,52 @@ def remove_branch(self, branch_name: str) -> ManageSnapshots: This for method c

Re: [PR] Remove usage of Map.containsKey(null) [iceberg]

2025-04-21 Thread via GitHub
eric-maynard closed pull request #12864: Remove usage of Map.containsKey(null) URL: https://github.com/apache/iceberg/pull/12864

Re: [PR] Added ExpireSnapshots Feature [iceberg-python]

2025-04-21 Thread via GitHub
ForeverAngry commented on code in PR #1880: URL: https://github.com/apache/iceberg-python/pull/1880#discussion_r2053170133 ## tests/table/test_expire_snapshots.py: ## @@ -0,0 +1,43 @@ +from unittest.mock import MagicMock +from uuid import uuid4 + +from pyiceberg.table import Com

Re: [PR] Spark 3.5 row lineage [iceberg]

2025-04-21 Thread via GitHub
amogh-jahagirdar commented on code in PR #12736: URL: https://github.com/apache/iceberg/pull/12736#discussion_r2053165632 ## spark/v3.5/spark-extensions/src/main/scala/org/apache/spark/sql/catalyst/analysis/RewriteMergeIntoTableForRowLineage.scala: ## @@ -0,0 +1,95 @@ +/* + * Li

Re: [PR] Spark 3.5 row lineage [iceberg]

2025-04-21 Thread via GitHub
amogh-jahagirdar commented on code in PR #12736: URL: https://github.com/apache/iceberg/pull/12736#discussion_r2053163626 ## spark/v3.5/spark-extensions/src/main/scala/org/apache/spark/sql/catalyst/analysis/RewriteMergeIntoTableForRowLineage.scala: ## @@ -0,0 +1,95 @@ +/* + * Li

[PR] Remove usage of Map.containsKey(null) [iceberg]

2025-04-21 Thread via GitHub
eric-maynard opened a new pull request, #12864: URL: https://github.com/apache/iceberg/pull/12864 When working on a change for Apache Polaris, I hit an error of the following form: ``` java.lang.NullPointerException at java.base/java.util.Objects.requireNonNull(Objects.java

Re: [PR] Data: Handle case where partition location is missing for `TableMigrationUtil` [iceberg]

2025-04-21 Thread via GitHub
jshmchenxi commented on PR #12212: URL: https://github.com/apache/iceberg/pull/12212#issuecomment-2819779110 Kindly pinging @RussellSpitzer: when you have a moment, could you please help review this change? Looking forward to getting it merged. Thanks in advance!

Re: [PR] Spark 3.5 row lineage [iceberg]

2025-04-21 Thread via GitHub
amogh-jahagirdar commented on code in PR #12736: URL: https://github.com/apache/iceberg/pull/12736#discussion_r2053145064 ## spark/v3.5/spark-extensions/src/main/scala/org/apache/spark/sql/catalyst/analysis/RewriteMergeIntoTableForRowLineage.scala: ## @@ -0,0 +1,95 @@ +/* + * Li

Re: [I] Docs: REST OAuth2 Client Authentication Guide [iceberg]

2025-04-21 Thread via GitHub
github-actions[bot] commented on issue #11286: URL: https://github.com/apache/iceberg/issues/11286#issuecomment-2819755193 This issue has been closed because it has not received any activity in the last 14 days since being marked as 'stale'

Re: [I] How to create an iceberg table under a custom catalog name like iceberg instead of hive, using HiveCatalog [iceberg]

2025-04-21 Thread via GitHub
github-actions[bot] commented on issue #10786: URL: https://github.com/apache/iceberg/issues/10786#issuecomment-2819755049 This issue has been automatically marked as stale because it has been open for 180 days with no activity. It will be closed in next 14 days if no further activity occur

Re: [I] Docs: REST OAuth2 Client Authentication Guide [iceberg]

2025-04-21 Thread via GitHub
github-actions[bot] closed issue #11286: Docs: REST OAuth2 Client Authentication Guide URL: https://github.com/apache/iceberg/issues/11286

Re: [PR] Spark 3.5 row lineage [iceberg]

2025-04-21 Thread via GitHub
rdblue commented on code in PR #12736: URL: https://github.com/apache/iceberg/pull/12736#discussion_r2053135137 ## spark/v3.5/spark-extensions/src/test/java/org/apache/iceberg/spark/extensions/TestRowLevelOperationsWithLineage.java: ## @@ -0,0 +1,443 @@ +/* + * Licensed to the A

Re: [PR] Spark 3.5 row lineage [iceberg]

2025-04-21 Thread via GitHub
rdblue commented on code in PR #12736: URL: https://github.com/apache/iceberg/pull/12736#discussion_r2053133869 ## spark/v3.5/spark-extensions/src/test/java/org/apache/iceberg/spark/extensions/TestRowLevelOperationsWithLineage.java: ## @@ -0,0 +1,443 @@ +/* + * Licensed to the A

Re: [PR] Spark 3.5 row lineage [iceberg]

2025-04-21 Thread via GitHub
rdblue commented on code in PR #12736: URL: https://github.com/apache/iceberg/pull/12736#discussion_r2053127840 ## spark/v3.5/spark-extensions/src/main/scala/org/apache/spark/sql/catalyst/optimizer/RemoveRowLineageOutputFromOriginalTable.scala: ## @@ -0,0 +1,56 @@ +/* + * Licens

Re: [PR] Spark 3.5 row lineage [iceberg]

2025-04-21 Thread via GitHub
rdblue commented on code in PR #12736: URL: https://github.com/apache/iceberg/pull/12736#discussion_r2053127517 ## spark/v3.5/spark-extensions/src/main/scala/org/apache/spark/sql/catalyst/analysis/RewriteUpdateTableForRowLineage.scala: ## @@ -0,0 +1,66 @@ +/* + * Licensed to the

Re: [PR] Spark 3.5 row lineage [iceberg]

2025-04-21 Thread via GitHub
rdblue commented on code in PR #12736: URL: https://github.com/apache/iceberg/pull/12736#discussion_r2053126840 ## spark/v3.5/spark-extensions/src/main/scala/org/apache/spark/sql/catalyst/analysis/RewriteMergeIntoTableForRowLineage.scala: ## @@ -0,0 +1,95 @@ +/* + * Licensed to

Re: [PR] Spark 3.5 row lineage [iceberg]

2025-04-21 Thread via GitHub
rdblue commented on code in PR #12736: URL: https://github.com/apache/iceberg/pull/12736#discussion_r2053123798 ## spark/v3.5/spark-extensions/src/main/scala/org/apache/spark/sql/catalyst/analysis/RewriteMergeIntoTableForRowLineage.scala: ## @@ -0,0 +1,95 @@ +/* + * Licensed to

Re: [PR] Spark 3.5 row lineage [iceberg]

2025-04-21 Thread via GitHub
rdblue commented on code in PR #12736: URL: https://github.com/apache/iceberg/pull/12736#discussion_r2053123443 ## spark/v3.5/spark-extensions/src/main/scala/org/apache/spark/sql/catalyst/analysis/RewriteMergeIntoTableForRowLineage.scala: ## @@ -0,0 +1,95 @@ +/* + * Licensed to

Re: [PR] Spark 3.5 row lineage [iceberg]

2025-04-21 Thread via GitHub
rdblue commented on code in PR #12736: URL: https://github.com/apache/iceberg/pull/12736#discussion_r2053122266 ## spark/v3.5/spark-extensions/src/main/scala/org/apache/spark/sql/catalyst/analysis/RewriteMergeIntoTableForRowLineage.scala: ## @@ -0,0 +1,95 @@ +/* + * Licensed to

Re: [PR] Spark 3.5 row lineage [iceberg]

2025-04-21 Thread via GitHub
rdblue commented on code in PR #12736: URL: https://github.com/apache/iceberg/pull/12736#discussion_r2053121924 ## spark/v3.5/spark-extensions/src/main/scala/org/apache/spark/sql/catalyst/analysis/RewriteMergeIntoTableForRowLineage.scala: ## @@ -0,0 +1,95 @@ +/* + * Licensed to

Re: [PR] Spark 3.5 row lineage [iceberg]

2025-04-21 Thread via GitHub
rdblue commented on code in PR #12736: URL: https://github.com/apache/iceberg/pull/12736#discussion_r2053119851 ## spark/v3.5/spark-extensions/src/main/scala/org/apache/spark/sql/catalyst/analysis/RewriteMergeIntoTableForRowLineage.scala: ## @@ -0,0 +1,95 @@ +/* + * Licensed to

Re: [PR] Spark 3.5 row lineage [iceberg]

2025-04-21 Thread via GitHub
rdblue commented on code in PR #12736: URL: https://github.com/apache/iceberg/pull/12736#discussion_r2053119414 ## spark/v3.5/spark-extensions/src/main/scala/org/apache/spark/sql/catalyst/analysis/RewriteMergeIntoTableForRowLineage.scala: ## @@ -0,0 +1,95 @@ +/* + * Licensed to

Re: [PR] Spark 3.5 row lineage [iceberg]

2025-04-21 Thread via GitHub
rdblue commented on code in PR #12736: URL: https://github.com/apache/iceberg/pull/12736#discussion_r2053112107 ## core/src/main/java/org/apache/iceberg/TableUtil.java: ## @@ -60,4 +61,28 @@ public static String metadataFileLocation(Table table) { "%s does not hav

Re: [PR] Fix kerberized hive client [iceberg-python]

2025-04-21 Thread via GitHub
kevinjqliu commented on PR #1941: URL: https://github.com/apache/iceberg-python/pull/1941#issuecomment-2819693549 Thanks for confirming @mnzpk

Re: [PR] Fix kerberized hive client [iceberg-python]

2025-04-21 Thread via GitHub
kevinjqliu commented on code in PR #1941: URL: https://github.com/apache/iceberg-python/pull/1941#discussion_r2053100497 ## tests/catalog/test_hive.py: ## @@ -183,6 +192,61 @@ def hive_database(tmp_path_factory: pytest.TempPathFactory) -> HiveDatabase: ) +class SaslSer

[PR] Build: Bump moto from 5.1.3 to 5.1.4 [iceberg-python]

2025-04-21 Thread via GitHub
dependabot[bot] opened a new pull request, #1944: URL: https://github.com/apache/iceberg-python/pull/1944 Bumps [moto](https://github.com/getmoto/moto) from 5.1.3 to 5.1.4. Changelog Sourced from https://github.com/getmoto/moto/blob/master/CHANGELOG.md";>moto's changelog. 5.

[PR] Build: Bump mkdocs-material from 9.6.11 to 9.6.12 [iceberg-python]

2025-04-21 Thread via GitHub
dependabot[bot] opened a new pull request, #1943: URL: https://github.com/apache/iceberg-python/pull/1943 Bumps [mkdocs-material](https://github.com/squidfunk/mkdocs-material) from 9.6.11 to 9.6.12. Release notes Sourced from https://github.com/squidfunk/mkdocs-material/releases";>

Re: [PR] Azure: Support vended credentials refresh in ADLSFileIO. [iceberg]

2025-04-21 Thread via GitHub
tedyu commented on code in PR #11577: URL: https://github.com/apache/iceberg/pull/11577#discussion_r2053022552 ## azure/src/main/java/org/apache/iceberg/azure/adlsv2/ADLSFileIO.java: ## @@ -212,4 +220,13 @@ public void deletePrefix(String prefix) { } } } + + @Ove

Re: [PR] Spec: Avoid struct field conflicts in default values [iceberg]

2025-04-21 Thread via GitHub
RussellSpitzer commented on code in PR #12841: URL: https://github.com/apache/iceberg/pull/12841#discussion_r2053019016 ## format/spec.md: ## @@ -266,7 +266,9 @@ The `initial-default` is set only when a field is added to an existing schema. T The `initial-default` and `write

Re: [PR] Spec: Make next-row-id required in v3 [iceberg]

2025-04-21 Thread via GitHub
rdblue commented on PR #12757: URL: https://github.com/apache/iceberg/pull/12757#issuecomment-2819541992 Merged as part of #12781.

Re: [PR] Spec: Avoid struct field conflicts in default values [iceberg]

2025-04-21 Thread via GitHub
RussellSpitzer commented on code in PR #12841: URL: https://github.com/apache/iceberg/pull/12841#discussion_r2053016120 ## format/spec.md: ## @@ -315,7 +317,7 @@ Struct evolution requires the following rules for default values: * The `write-default` must be set when a field is

Re: [PR] API: Use normalized JSON path to identify Variant fields [iceberg]

2025-04-21 Thread via GitHub
amogh-jahagirdar commented on PR #12835: URL: https://github.com/apache/iceberg/pull/12835#issuecomment-2819521337 thanks @rdblue and thank you @aihuaxu for the spec change for this and for reviewing! I'll go ahead and merge

Re: [PR] API: Use normalized JSON path to identify Variant fields [iceberg]

2025-04-21 Thread via GitHub
amogh-jahagirdar merged PR #12835: URL: https://github.com/apache/iceberg/pull/12835

[PR] Core: Broaden exception handling in writer clean up logic [iceberg]

2025-04-21 Thread via GitHub
xiaoxuandev opened a new pull request, #12863: URL: https://github.com/apache/iceberg/pull/12863 Expanded the exception handling during writer cleanup to address scenarios where users may lack delete permissions, ensuring consistent behavior with [uncommitted metadata file handling](https:

Re: [I] Panic writing nullable struct with required fields [iceberg-go]

2025-04-21 Thread via GitHub
zeroshade commented on issue #398: URL: https://github.com/apache/iceberg-go/issues/398#issuecomment-2819475496 Thanks for filing this, looks like you're correct. It's coming from `NewStructArrayWithFields`, which and the `ToRequestedSchema` looks like it doesn't properly carry over the nul

Re: [PR] Core: Fix failure when reading files table with branch [iceberg]

2025-04-21 Thread via GitHub
dramaticlly commented on PR #11719: URL: https://github.com/apache/iceberg/pull/11719#issuecomment-2819452522 not stale

Re: [PR] Core: Fix failure when reading files table with branch [iceberg]

2025-04-21 Thread via GitHub
dramaticlly commented on code in PR #11719: URL: https://github.com/apache/iceberg/pull/11719#discussion_r2052966756 ## spark/v3.5/spark-extensions/src/test/java/org/apache/iceberg/spark/extensions/TestMetadataTables.java: ## @@ -671,6 +671,76 @@ public void testFilesTableTimeTr

Re: [PR] spec: Variant lower/upper bounds [iceberg]

2025-04-21 Thread via GitHub
amogh-jahagirdar merged PR #12658: URL: https://github.com/apache/iceberg/pull/12658

Re: [PR] spec: Variant lower/upper bounds [iceberg]

2025-04-21 Thread via GitHub
amogh-jahagirdar commented on PR #12658: URL: https://github.com/apache/iceberg/pull/12658#issuecomment-2819450022 Thanks @RussellSpitzer @rdblue @Fokko @flyrain @huaxingao @XBaith for reviewing and everyone for voting. Since the vote passed, I'll go ahead and merge

Re: [PR] Fix kerberized hive client [iceberg-python]

2025-04-21 Thread via GitHub
mnzpk commented on PR #1941: URL: https://github.com/apache/iceberg-python/pull/1941#issuecomment-2819384657 @kevinjqliu can confirm that with this fix I can no longer reproduce the issue.

Re: [I] Add support for Avro's timestamp-millis LogicalType in DataReader [iceberg]

2025-04-21 Thread via GitHub
raphaelauv commented on issue #12395: URL: https://github.com/apache/iceberg/issues/12395#issuecomment-2819383032 until it's supported by the kafka-connector , I do ```json "transforms": "TimestampConverter2", "transforms.TimestampConverter2.type": "org.apache.kafka.conn

Re: [I] [SparkMicroBatchStream] Executors prematurely close I/O client during Spark broadcast cleanup [iceberg]

2025-04-21 Thread via GitHub
bk-mz commented on issue #12858: URL: https://github.com/apache/iceberg/issues/12858#issuecomment-2819380897 >Is this share between tasks b/w tasks. >do we need to implement reference count ? before we called close of IO ? TBH I dunno -> the matter looks quite complicated to hav

Re: [PR] Add detailed debug and warn logging to SparkMicroBatchStream [iceberg]

2025-04-21 Thread via GitHub
bk-mz commented on code in PR #12856: URL: https://github.com/apache/iceberg/pull/12856#discussion_r2052921635 ## spark/v3.5/spark/src/main/java/org/apache/iceberg/spark/source/SparkMicroBatchStream.java: ## @@ -138,24 +167,32 @@ public InputPartition[] planInputPartitions(Offse

Re: [PR] Add detailed debug and warn logging to SparkMicroBatchStream [iceberg]

2025-04-21 Thread via GitHub
bk-mz commented on code in PR #12856: URL: https://github.com/apache/iceberg/pull/12856#discussion_r2052921366 ## spark/v3.5/spark/src/main/java/org/apache/iceberg/spark/source/SparkMicroBatchStream.java: ## @@ -105,28 +108,54 @@ public class SparkMicroBatchStream implements Mi

Re: [PR] Add detailed debug and warn logging to SparkMicroBatchStream [iceberg]

2025-04-21 Thread via GitHub
bk-mz commented on code in PR #12856: URL: https://github.com/apache/iceberg/pull/12856#discussion_r2052921152 ## spark/v3.5/spark/src/main/java/org/apache/iceberg/spark/source/SparkMicroBatchStream.java: ## @@ -222,37 +264,74 @@ private List planFiles(StreamingOffset startOffs

[I] remove redundant ci tests [iceberg-python]

2025-04-21 Thread via GitHub
kevinjqliu opened a new issue, #1942: URL: https://github.com/apache/iceberg-python/issues/1942 ### Apache Iceberg version None ### Please describe the bug 🐞 [python-ci.yml](https://github.com/apache/iceberg-python/blob/main/.github/workflows/python-ci.yml) and [python

Re: [I] [SparkMicroBatchStream] Executors prematurely close I/O client during Spark broadcast cleanup [iceberg]

2025-04-21 Thread via GitHub
singhpk234 commented on issue #12858: URL: https://github.com/apache/iceberg/issues/12858#issuecomment-2819270523 I see this was ideally intended to be called when done processing at executor end, If GC is triggering this we need a way to protect this from these case if we are still need it

Re: [PR] Fallback for upsert when arrow cannot compare source rows with target rows [iceberg-python]

2025-04-21 Thread via GitHub
kevinjqliu commented on code in PR #1878: URL: https://github.com/apache/iceberg-python/pull/1878#discussion_r2052857498 ## pyiceberg/table/upsert_util.py: ## @@ -82,14 +82,54 @@ def get_rows_to_update(source_table: pa.Table, target_table: pa.Table, join_cols ], )

Re: [PR] Fallback for upsert when arrow cannot compare source rows with target rows [iceberg-python]

2025-04-21 Thread via GitHub
kevinjqliu commented on code in PR #1878: URL: https://github.com/apache/iceberg-python/pull/1878#discussion_r2052855516 ## pyiceberg/table/upsert_util.py: ## @@ -82,14 +82,54 @@ def get_rows_to_update(source_table: pa.Table, target_table: pa.Table, join_cols ], )

[PR] update links to daft docs [iceberg]

2025-04-21 Thread via GitHub
ccmao1130 opened a new pull request, #12860: URL: https://github.com/apache/iceberg/pull/12860 We did a major refresh of Daft docs on our end, so want to make sure the links are also updated on your end! One thing to note: On our docs, we renamed the below fields, let me know if it

Re: [I] [DISCUSS] A catalog loader api. [iceberg-rust]

2025-04-21 Thread via GitHub
kevinjqliu commented on issue #1228: URL: https://github.com/apache/iceberg-rust/issues/1228#issuecomment-2819221311 +1 this is similar to pyiceberg's [`load_catalog`](https://github.com/apache/iceberg-python/blob/f5978bbf525174cf7c9d49297680ecf2fce7b159/pyiceberg/catalog/__init__.py#L215-L2

Re: [PR] Added ExpireSnapshots Feature [iceberg-python]

2025-04-21 Thread via GitHub
kevinjqliu commented on code in PR #1880: URL: https://github.com/apache/iceberg-python/pull/1880#discussion_r2051592966 ## .gitignore: ## @@ -50,3 +50,5 @@ htmlcov pyiceberg/avro/decoder_fast.c pyiceberg/avro/*.html pyiceberg/avro/*.so +.vscode/settings.json +pyiceberg/table

Re: [PR] Add timestamp_ns, time and UUID types for Variant [iceberg]

2025-04-21 Thread via GitHub
rdblue commented on code in PR #12682: URL: https://github.com/apache/iceberg/pull/12682#discussion_r2052762453 ## parquet/src/test/java/org/apache/iceberg/parquet/TestVariantMetrics.java: ## @@ -496,6 +506,13 @@ private static VariantValue increment(VariantValue value) {

Re: [PR] Add timestamp_ns, time and UUID types for Variant [iceberg]

2025-04-21 Thread via GitHub
rdblue commented on code in PR #12682: URL: https://github.com/apache/iceberg/pull/12682#discussion_r2052761580 ## parquet/src/main/java/org/apache/iceberg/parquet/VariantWriterBuilder.java: ## @@ -242,6 +242,12 @@ public Optional> visit(TimestampLogicalTypeAnnotation time

Re: [PR] Spark: Add _row_id and _last_updated_sequence_number readers [iceberg]

2025-04-21 Thread via GitHub
amogh-jahagirdar commented on code in PR #12836: URL: https://github.com/apache/iceberg/pull/12836#discussion_r2052686530 ## parquet/src/main/java/org/apache/iceberg/parquet/ParquetValueReaders.java: ## @@ -237,36 +278,9 @@ private static class ConstantReader implements Parquet

Re: [PR] Spark: Add _row_id and _last_updated_sequence_number readers [iceberg]

2025-04-21 Thread via GitHub
amogh-jahagirdar commented on PR #12836: URL: https://github.com/apache/iceberg/pull/12836#issuecomment-2819098664 Thanks @rdblue!

Re: [PR] Spark: Add _row_id and _last_updated_sequence_number readers [iceberg]

2025-04-21 Thread via GitHub
amogh-jahagirdar merged PR #12836: URL: https://github.com/apache/iceberg/pull/12836

Re: [I] Is there any way on Flink to read newly appended data only (NOT in current Iceberg table snapshot)? [iceberg]

2025-04-21 Thread via GitHub
stevenzwu closed issue #9955: Is there any way on Flink to read newly appended data only (NOT in current Iceberg table snapshot)? URL: https://github.com/apache/iceberg/issues/9955

Re: [PR] Flink: Add StreamingStartingStrategy.INCREMENTAL_FROM_LATEST_SNAPSHOT_EXCLUSIVE [iceberg]

2025-04-21 Thread via GitHub
stevenzwu merged PR #12839: URL: https://github.com/apache/iceberg/pull/12839

Re: [PR] Flink: Add StreamingStartingStrategy.INCREMENTAL_FROM_LATEST_SNAPSHOT_EXCLUSIVE [iceberg]

2025-04-21 Thread via GitHub
stevenzwu commented on PR #12839: URL: https://github.com/apache/iceberg/pull/12839#issuecomment-2819094014 I am going to merge this PR so that the backport PR can proceed. If there are more comments, they can be followed up separately.

Re: [PR] Add timestamp_ns, time and UUID types for Variant [iceberg]

2025-04-21 Thread via GitHub
rdblue commented on code in PR #12682: URL: https://github.com/apache/iceberg/pull/12682#discussion_r2052757722 ## core/src/main/java/org/apache/iceberg/variants/Variants.java: ## @@ -200,4 +201,36 @@ public static VariantPrimitive of(ByteBuffer value) { public static Varian

Re: [PR] Add timestamp_ns, time and UUID types for Variant [iceberg]

2025-04-21 Thread via GitHub
rdblue commented on code in PR #12682: URL: https://github.com/apache/iceberg/pull/12682#discussion_r2052757248 ## api/src/main/java/org/apache/iceberg/variants/PhysicalType.java: ## @@ -41,6 +41,10 @@ public enum PhysicalType { FLOAT(LogicalType.FLOAT, Float.class), BINAR

Re: [PR] Add timestamp_ns, time and UUID types for Variant [iceberg]

2025-04-21 Thread via GitHub
rdblue commented on code in PR #12682: URL: https://github.com/apache/iceberg/pull/12682#discussion_r2052756028 ## core/src/main/java/org/apache/iceberg/variants/PrimitiveWrapper.java: ## @@ -204,6 +217,23 @@ public int writeTo(ByteBuffer outBuffer, int offset) { outBuf

Re: [I] Add support for pyarrow DurationType [iceberg-python]

2025-04-21 Thread via GitHub
jayceslesar commented on issue #1900: URL: https://github.com/apache/iceberg-python/issues/1900#issuecomment-2819085145 This was just formally proposed to the dev mailing list via https://docs.google.com/document/d/12ghQxWxyAhSQeZyy0IWiwJ02gTqFOgfYm8x851HZFLk/edit?tab=t.0#heading=h.rt0cvesd

Re: [PR] Add timestamp_ns, time and UUID types for Variant [iceberg]

2025-04-21 Thread via GitHub
rdblue commented on code in PR #12682: URL: https://github.com/apache/iceberg/pull/12682#discussion_r2052751355 ## api/src/test/java/org/apache/iceberg/variants/TestSerializedPrimitives.java: ## @@ -451,11 +452,143 @@ public void testShortString() { assertThat(value.get()).

Re: [PR] Add timestamp_ns, time and UUID types for Variant [iceberg]

2025-04-21 Thread via GitHub
rdblue commented on code in PR #12682: URL: https://github.com/apache/iceberg/pull/12682#discussion_r2052750931 ## api/src/main/java/org/apache/iceberg/variants/VariantObject.java: ## @@ -44,9 +46,13 @@ default VariantObject asObject() { static String asString(VariantObject o

Re: [PR] Add timestamp_ns, time and UUID types for Variant [iceberg]

2025-04-21 Thread via GitHub
rdblue commented on code in PR #12682: URL: https://github.com/apache/iceberg/pull/12682#discussion_r2052749038 ## api/src/main/java/org/apache/iceberg/variants/Primitives.java: ## @@ -36,7 +36,10 @@ class Primitives { static final int TYPE_FLOAT = 14; static final int TYP

Re: [PR] Add timestamp_ns, time and UUID types for Variant [iceberg]

2025-04-21 Thread via GitHub
rdblue commented on code in PR #12682: URL: https://github.com/apache/iceberg/pull/12682#discussion_r2052748279 ## api/src/main/java/org/apache/iceberg/variants/LogicalType.java: ## @@ -29,6 +29,8 @@ enum LogicalType { TIMESTAMPNTZ, BINARY, STRING, + TIMENTZ, Review C

Re: [PR] Fix kerberized hive client [iceberg-python]

2025-04-21 Thread via GitHub
kevinjqliu commented on PR #1941: URL: https://github.com/apache/iceberg-python/pull/1941#issuecomment-2819063677 Thanks for following up @mnzpk, really appreciate it. I incorporated the tests in this PR as well. Please give it a try and let me know if that fully resolves the issue

Re: [PR] Fix kerberized hive client [iceberg-python]

2025-04-21 Thread via GitHub
kevinjqliu closed pull request #1939: Fix kerberized hive client URL: https://github.com/apache/iceberg-python/pull/1939
