Re: [PR] Spark: SnapshotTableSparkAction add validation for non-overlapping source/dest table paths. [iceberg]

2025-04-17 Thread via GitHub
sririshindra commented on code in PR #12779: URL: https://github.com/apache/iceberg/pull/12779#discussion_r2049123660 ## spark/v3.5/spark/src/test/java/org/apache/iceberg/spark/actions/TestSnapshotTableAction.java: ## @@ -65,4 +69,40 @@ public void testSnapshotWithParallelTasks(

Re: [PR] [Spark]Add max files rewrite option for RewriteAction [iceberg]

2025-04-17 Thread via GitHub
coderfender commented on PR #12824: URL: https://github.com/apache/iceberg/pull/12824#issuecomment-2813124077 Issue : https://github.com/apache/iceberg/issues/12832 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the U

Re: [PR] Fixed force_virtual_addressing problem [iceberg-python]

2025-04-17 Thread via GitHub
Fokko commented on code in PR #1923: URL: https://github.com/apache/iceberg-python/pull/1923#discussion_r2049108649 ## pyiceberg/io/pyarrow.py: ## @@ -408,7 +408,7 @@ def _initialize_oss_fs(self) -> FileSystem: "access_key": get_first_property_value(self.properties,

Re: [PR] [1.8.x] Fix versions in LICENSE and NOTICE files [iceberg]

2025-04-17 Thread via GitHub
nastra merged PR #12834: URL: https://github.com/apache/iceberg/pull/12834

Re: [PR] [1.8.x] Remove version in LICENSE files [iceberg]

2025-04-17 Thread via GitHub
jbonofre commented on PR #12815: URL: https://github.com/apache/iceberg/pull/12815#issuecomment-2813377650 Superseded by #12834

Re: [PR] Fix versions in LICENSE and NOTICE [iceberg]

2025-04-17 Thread via GitHub
nastra merged PR #12831: URL: https://github.com/apache/iceberg/pull/12831

[PR] Revert ignore duckdb test [iceberg-python]

2025-04-17 Thread via GitHub
kevinjqliu opened a new pull request, #1927: URL: https://github.com/apache/iceberg-python/pull/1927 # Rationale for this change Issue is fixed upstream, https://github.com/duckdb/duckdb-iceberg/issues/185 This reverts commit eb8756a00311955c6bea7ee3cc02320e5

Re: [PR] Revert ignore duckdb test [iceberg-python]

2025-04-17 Thread via GitHub
Fokko merged PR #1927: URL: https://github.com/apache/iceberg-python/pull/1927

Re: [PR] Use version-hint.text for StaticTable [iceberg-python]

2025-04-17 Thread via GitHub
Fokko merged PR #1887: URL: https://github.com/apache/iceberg-python/pull/1887

Re: [I] [feat] Ability to read table using `version-hint.txt` [iceberg-python]

2025-04-17 Thread via GitHub
Fokko closed issue #763: [feat] Ability to read table using `version-hint.txt` URL: https://github.com/apache/iceberg-python/issues/763

Re: [PR] [Spark]Add max files rewrite option for RewriteAction [iceberg]

2025-04-17 Thread via GitHub
coderfender commented on PR #12824: URL: https://github.com/apache/iceberg/pull/12824#issuecomment-2813097314 @manuzhang, resolved conflicts and updated the branch; the PR is ready for review now

Re: [PR] Core: Test loading table/view with non-existing namespace [iceberg]

2025-04-17 Thread via GitHub
tedyu commented on code in PR #12812: URL: https://github.com/apache/iceberg/pull/12812#discussion_r2049292649 ## core/src/test/java/org/apache/iceberg/catalog/CatalogTests.java: ## @@ -917,6 +917,15 @@ public void testLoadTable() { .containsAll(properties.entrySet());

Re: [PR] Flink: backport fix TriggerManager to unlock task execution when previous job left an orphaned lock for Flink 1.19 [iceberg]

2025-04-17 Thread via GitHub
tedyu commented on code in PR #12801: URL: https://github.com/apache/iceberg/pull/12801#discussion_r2049300254 ## flink/v1.19/flink/src/main/java/org/apache/iceberg/flink/maintenance/operator/TriggerManager.java: ## @@ -189,6 +189,9 @@ public void initializeState(FunctionInitial

Re: [PR] Add table property to disable/enable parquet column statistics #12770 [iceberg]

2025-04-17 Thread via GitHub
huaxiangsun commented on code in PR #12771: URL: https://github.com/apache/iceberg/pull/12771#discussion_r2049348066 ## docs/docs/configuration.md: ## @@ -52,6 +52,8 @@ Iceberg tables support table properties to configure table behavior, like the de | write.parquet.bloom-filte

Re: [I] ExpireSnapshots customizable cleanup strategy [iceberg]

2025-04-17 Thread via GitHub
gaborkaszab commented on issue #12819: URL: https://github.com/apache/iceberg/issues/12819#issuecomment-2813576254 It seems package private in RemoveSnapshots. For me it makes sense to expose it through the API.

Re: [PR] Iceberg time type fix [iceberg]

2025-04-17 Thread via GitHub
kumarpritam863 commented on PR #12725: URL: https://github.com/apache/iceberg/pull/12725#issuecomment-2813580642 @bryanck I have added that to SMT in a separate PR, but the only caveat to that is extra latency it would incur.

Re: [PR] Flink: backport fix TriggerManager to unlock task execution when previous job left an orphaned lock for Flink 1.19 [iceberg]

2025-04-17 Thread via GitHub
Guosmilesmile commented on code in PR #12801: URL: https://github.com/apache/iceberg/pull/12801#discussion_r2049320785 ## flink/v1.19/flink/src/main/java/org/apache/iceberg/flink/maintenance/operator/TriggerManager.java: ## @@ -189,6 +189,9 @@ public void initializeState(Functio

Re: [PR] AWS: Use custom Execution interceptor to support multiple storage credentials [iceberg]

2025-04-17 Thread via GitHub
danielcweeks commented on PR #12827: URL: https://github.com/apache/iceberg/pull/12827#issuecomment-2813587792 @nastra I'm a little concerned about this approach because we're doing a lot of hand-crafting/manipulation of the request as opposed to using features of the SDK. There are two al

Re: [PR] Fixed force_virtual_addressing problem [iceberg-python]

2025-04-17 Thread via GitHub
helmiazizm commented on code in PR #1923: URL: https://github.com/apache/iceberg-python/pull/1923#discussion_r2049361189 ## pyiceberg/io/pyarrow.py: ## @@ -408,7 +408,7 @@ def _initialize_oss_fs(self) -> FileSystem: "access_key": get_first_property_value(self.proper

Re: [PR] Core: Support first-row-id for manifests and manifest lists [iceberg]

2025-04-17 Thread via GitHub
rdblue commented on code in PR #12672: URL: https://github.com/apache/iceberg/pull/12672#discussion_r2049517618 ## core/src/test/java/org/apache/iceberg/TestManifestWriterVersions.java: ## @@ -213,27 +228,125 @@ public void testV2ManifestRewriteWithInheritance() throws IOExcept

Re: [PR] Parquet: Add variant array reader in Parquet [iceberg]

2025-04-17 Thread via GitHub
rdblue commented on code in PR #12512: URL: https://github.com/apache/iceberg/pull/12512#discussion_r2049555097 ## parquet/src/main/java/org/apache/iceberg/parquet/VariantReaderBuilder.java: ## @@ -159,8 +159,16 @@ public VariantValueReader object( @Override public Varia

Re: [PR] Flink: backport fix TriggerManager to unlock task execution when previous job left an orphaned lock for Flink 1.19 [iceberg]

2025-04-17 Thread via GitHub
tedyu commented on code in PR #12801: URL: https://github.com/apache/iceberg/pull/12801#discussion_r2049743332 ## flink/v1.19/flink/src/main/java/org/apache/iceberg/flink/maintenance/operator/TriggerManager.java: ## @@ -189,6 +189,9 @@ public void initializeState(FunctionInitial

Re: [PR] spec: Variant lower/upper bounds [iceberg]

2025-04-17 Thread via GitHub
rdblue commented on code in PR #12658: URL: https://github.com/apache/iceberg/pull/12658#discussion_r2049500773 ## format/spec.md: ## @@ -648,6 +648,9 @@ Notes: 5. The `content_offset` and `content_size_in_bytes` fields are used to reference a specific blob for direct access t

Re: [PR] Core: Support first-row-id for manifests and manifest lists [iceberg]

2025-04-17 Thread via GitHub
danielcweeks commented on code in PR #12672: URL: https://github.com/apache/iceberg/pull/12672#discussion_r2049732360 ## core/src/main/java/org/apache/iceberg/ContentFileParser.java: ## @@ -164,6 +165,7 @@ static ContentFile fromJson(JsonNode jsonNode, PartitionSpec spec) {

Re: [PR] Core: Support first-row-id for manifests and manifest lists [iceberg]

2025-04-17 Thread via GitHub
danielcweeks commented on code in PR #12672: URL: https://github.com/apache/iceberg/pull/12672#discussion_r2049736814 ## core/src/main/java/org/apache/iceberg/GenericManifestFile.java: ## @@ -481,14 +503,15 @@ private CopyBuilder(ManifestFile toCopy) { toCopy.se

Re: [PR] Spark 3.5 row lineage [iceberg]

2025-04-17 Thread via GitHub
amogh-jahagirdar commented on code in PR #12736: URL: https://github.com/apache/iceberg/pull/12736#discussion_r2050020503 ## spark/v3.5/spark-extensions/src/test/java/org/apache/iceberg/spark/extensions/TestRowLineagePropagation.java: ## @@ -0,0 +1,489 @@ +/* + * Licensed to the

Re: [PR] refactor: add expression subdirectory [iceberg-cpp]

2025-04-17 Thread via GitHub
Xuanwo merged PR #81: URL: https://github.com/apache/iceberg-cpp/pull/81

Re: [PR] refactor: add expression subdirectory [iceberg-cpp]

2025-04-17 Thread via GitHub
wgtmac commented on PR #81: URL: https://github.com/apache/iceberg-cpp/pull/81#issuecomment-2814544743 Thank you @Xuanwo!

Re: [PR] feat: introduce Scalar and StructLike interfaces for Transform evaluation [iceberg-cpp]

2025-04-17 Thread via GitHub
wgtmac commented on code in PR #77: URL: https://github.com/apache/iceberg-cpp/pull/77#discussion_r2049989575 ## src/iceberg/struct_like.h: ## @@ -0,0 +1,278 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See th

Re: [PR] Refactor `Metadata` in `Transaction` [iceberg-python]

2025-04-17 Thread via GitHub
smaheshwar-pltr commented on code in PR #1903: URL: https://github.com/apache/iceberg-python/pull/1903#discussion_r2050068724 ## pyiceberg/table/__init__.py: ## @@ -255,12 +254,15 @@ def __init__(self, table: Table, autocommit: bool = False): table: The table that

Re: [PR] Fallback for upsert when arrow cannot compare source rows with target rows [iceberg-python]

2025-04-17 Thread via GitHub
Fokko commented on code in PR #1878: URL: https://github.com/apache/iceberg-python/pull/1878#discussion_r2049541362 ## pyiceberg/table/upsert_util.py: ## @@ -82,14 +82,56 @@ def get_rows_to_update(source_table: pa.Table, target_table: pa.Table, join_cols ], ) -

Re: [PR] Rewrite manifests [iceberg-python]

2025-04-17 Thread via GitHub
Fokko commented on PR #1661: URL: https://github.com/apache/iceberg-python/pull/1661#issuecomment-2812970719 @amitgilad3 gentle ping, are you still interested in working on this?

Re: [PR] Move implementation of upsert from Table to Transaction [iceberg-python]

2025-04-17 Thread via GitHub
Fokko commented on PR #1817: URL: https://github.com/apache/iceberg-python/pull/1817#issuecomment-2813868016 > @Fokko should it be possible to read uncommitted changes? Yes, it should. If you do a subsequent upsert with the same data, it should be a no-op. This should be the case toda

Re: [PR] Parquet: Add variant array reader in Parquet [iceberg]

2025-04-17 Thread via GitHub
rdblue commented on code in PR #12512: URL: https://github.com/apache/iceberg/pull/12512#discussion_r2049552501 ## parquet/src/main/java/org/apache/iceberg/parquet/ParquetVariantReaders.java: ## @@ -332,6 +346,57 @@ public void setPageSource(PageReadStore pageStore) { }

Re: [PR] Fallback for upsert when arrow cannot compare source rows with target rows [iceberg-python]

2025-04-17 Thread via GitHub
koenvo commented on code in PR #1878: URL: https://github.com/apache/iceberg-python/pull/1878#discussion_r2049603657 ## pyiceberg/table/upsert_util.py: ## @@ -82,14 +82,56 @@ def get_rows_to_update(source_table: pa.Table, target_table: pa.Table, join_cols ], ) -

Re: [PR] Flink: backport fix TriggerManager to unlock task execution when previous job left an orphaned lock for Flink 1.19 [iceberg]

2025-04-17 Thread via GitHub
Guosmilesmile commented on code in PR #12801: URL: https://github.com/apache/iceberg/pull/12801#discussion_r2049714118 ## flink/v1.19/flink/src/main/java/org/apache/iceberg/flink/maintenance/operator/TriggerManager.java: ## @@ -189,6 +189,9 @@ public void initializeState(Functio

Re: [PR] [Spark]Add max files rewrite option for RewriteAction [iceberg]

2025-04-17 Thread via GitHub
coderfender commented on PR #12824: URL: https://github.com/apache/iceberg/pull/12824#issuecomment-2814148621 @yogevyuval, the goal here is to give users an option to limit the number of files to be rewritten (through compaction, data rewrite, etc.). In a use case (like mine) wher

Re: [PR] [Spark]Add max files rewrite option for RewriteAction [iceberg]

2025-04-17 Thread via GitHub
coderfender commented on code in PR #12824: URL: https://github.com/apache/iceberg/pull/12824#discussion_r2049724630 ## spark/v3.5/spark/src/main/java/org/apache/iceberg/spark/actions/RewriteDataFilesSparkAction.java: ## @@ -407,15 +409,49 @@ private Builder doExecuteWithPartial

Re: [PR] Fixed force_virtual_addressing problem [iceberg-python]

2025-04-17 Thread via GitHub
Fokko commented on code in PR #1923: URL: https://github.com/apache/iceberg-python/pull/1923#discussion_r2049488052 ## pyiceberg/io/pyarrow.py: ## @@ -408,7 +408,7 @@ def _initialize_oss_fs(self) -> FileSystem: "access_key": get_first_property_value(self.properties,

Re: [PR] spec: Variant lower/upper bounds [iceberg]

2025-04-17 Thread via GitHub
aihuaxu commented on code in PR #12658: URL: https://github.com/apache/iceberg/pull/12658#discussion_r2049538827 ## format/spec.md: ## @@ -648,6 +648,9 @@ Notes: 5. The `content_offset` and `content_size_in_bytes` fields are used to reference a specific blob for direct access

Re: [PR] [Spark]Add max files rewrite option for RewriteAction [iceberg]

2025-04-17 Thread via GitHub
coderfender commented on code in PR #12824: URL: https://github.com/apache/iceberg/pull/12824#discussion_r2049717807 ## spark/v3.5/spark/src/main/java/org/apache/iceberg/spark/actions/RewriteDataFilesSparkAction.java: ## @@ -407,15 +409,49 @@ private Builder doExecuteWithPartial

Re: [PR] Core: Fix locationProvider implementation for SerializableTable [iceberg]

2025-04-17 Thread via GitHub
github-actions[bot] commented on PR #12564: URL: https://github.com/apache/iceberg/pull/12564#issuecomment-2814237751 This pull request has been marked as stale due to 30 days of inactivity. It will be closed in 1 week if no further activity occurs. If you think that’s incorrect or this pul

Re: [PR] AWS: Use custom Execution interceptor to support multiple storage credentials [iceberg]

2025-04-17 Thread via GitHub
singhpk234 commented on PR #12827: URL: https://github.com/apache/iceberg/pull/12827#issuecomment-2814270644 Thanks @danielcweeks, my understanding of the approaches is the following: For 1, my only concern with the following was each client maintaining its own connection pool, with

Re: [PR] Materialized View Spec [iceberg]

2025-04-17 Thread via GitHub
hashhar commented on code in PR #11041: URL: https://github.com/apache/iceberg/pull/11041#discussion_r2048360343 ## format/view-spec.md: ## @@ -42,12 +42,28 @@ An atomic swap of one view metadata file for another provides the basis for maki Writers create view metadata files

Re: [PR] Materialized View Spec [iceberg]

2025-04-17 Thread via GitHub
hashhar commented on code in PR #11041: URL: https://github.com/apache/iceberg/pull/11041#discussion_r2048363317 ## format/view-spec.md: ## @@ -160,6 +179,57 @@ Each entry in `version-log` is a struct with the following fields: | _required_ | `timestamp-ms` | Timestamp when t

[PR] feat: implement initial MemoryCatalog functionality with namespace and table support [iceberg-cpp]

2025-04-17 Thread via GitHub
gty404 opened a new pull request, #80: URL: https://github.com/apache/iceberg-cpp/pull/80 - Added NamespaceContainer class to manage namespace hierarchy and table metadata - Implemented MemoryCatalog methods: ListTables, TableExists, DropTable, RegisterTable

Re: [PR] Materialized View Spec [iceberg]

2025-04-17 Thread via GitHub
hashhar commented on code in PR #11041: URL: https://github.com/apache/iceberg/pull/11041#discussion_r2048365318 ## format/view-spec.md: ## @@ -160,6 +179,57 @@ Each entry in `version-log` is a struct with the following fields: | _required_ | `timestamp-ms` | Timestamp when t

Re: [PR] Use assumeThat instead of assumeTrue [iceberg]

2025-04-17 Thread via GitHub
nastra commented on code in PR #12822: URL: https://github.com/apache/iceberg/pull/12822#discussion_r2048422582 ## core/src/test/java/org/apache/iceberg/util/TestEnvironmentUtil.java: ## @@ -19,19 +19,20 @@ package org.apache.iceberg.util; import static org.assertj.core.api.

Re: [PR] feat: update the supported catalog operations with some operation implementation [iceberg-go]

2025-04-17 Thread via GitHub
xuhui-lu commented on PR #396: URL: https://github.com/apache/iceberg-go/pull/396#issuecomment-2811746167 hey @zeroshade , do you mind help take a look here? : ) thanks!

Re: [PR] Use assumeThat instead of assumeTrue [iceberg]

2025-04-17 Thread via GitHub
slfan1989 commented on PR #12822: URL: https://github.com/apache/iceberg/pull/12822#issuecomment-2812127522 > @slfan1989 can you also please add a checkstyle rule to `checkstyle.xml` that prevents using `assumeTrue()` from the JUnit package? This can probably be an `IllegalImport` check sim

Re: [PR] Use assumeThat instead of assumeTrue [iceberg]

2025-04-17 Thread via GitHub
slfan1989 commented on code in PR #12822: URL: https://github.com/apache/iceberg/pull/12822#discussion_r2048461757 ## core/src/test/java/org/apache/iceberg/catalog/CatalogTests.java: ## @@ -448,8 +453,9 @@ public void testListNamespaces() { @Test public void testListNest

Re: [PR] Spark 4.0 integration [iceberg]

2025-04-17 Thread via GitHub
huaxingao commented on code in PR #12494: URL: https://github.com/apache/iceberg/pull/12494#discussion_r2048242486 ## build.gradle: ## @@ -760,6 +763,7 @@ project(':iceberg-hive-metastore') { testImplementation project(path: ':iceberg-api', configuration: 'testArtifacts')

Re: [PR] Spark 4.0 integration [iceberg]

2025-04-17 Thread via GitHub
huaxingao commented on code in PR #12494: URL: https://github.com/apache/iceberg/pull/12494#discussion_r2048298289 ## spark/v4.0/spark-extensions/src/test/java/org/apache/iceberg/spark/extensions/TestViews.java: ## @@ -538,7 +538,7 @@ public void readFromViewReferencingTempFunct

Re: [PR] Spark 4.0 integration [iceberg]

2025-04-17 Thread via GitHub
huaxingao commented on PR #12494: URL: https://github.com/apache/iceberg/pull/12494#issuecomment-2814615640 > The git commits need to be tidied up before merging. Or, commit > > mv spark-3.5 to spark-4.0 > copy spark-4.0 spark-3.5 Right, when this PR is ready, I will squash t

Re: [PR] Spark: SnapshotTableSparkAction add validation for non-overlapping source/dest table paths. [iceberg]

2025-04-17 Thread via GitHub
slfan1989 commented on code in PR #12779: URL: https://github.com/apache/iceberg/pull/12779#discussion_r2050107263 ## spark/v3.5/spark/src/test/java/org/apache/iceberg/spark/actions/TestSnapshotTableAction.java: ## @@ -65,4 +69,40 @@ public void testSnapshotWithParallelTasks() t

Re: [I] Kafka Connect: Add dead letter queue support [iceberg]

2025-04-17 Thread via GitHub
kumarpritam863 commented on issue #10840: URL: https://github.com/apache/iceberg/issues/10840#issuecomment-2814556039 @bryanck as I was investigating integrating DLQ with the Iceberg Sink, I was thinking of not putting the DLQ logic in the Iceberg Sink itself, and in this case I think pu

Re: [PR] Spark 4.0 integration [iceberg]

2025-04-17 Thread via GitHub
huaxingao commented on code in PR #12494: URL: https://github.com/apache/iceberg/pull/12494#discussion_r2048257067 ## gradle/libs.versions.toml: ## @@ -81,6 +82,7 @@ slf4j = "2.0.17" snowflake-jdbc = "3.23.2" spark34 = "3.4.4" spark35 = "3.5.5" +spark40 = "4.0.1-SNAPSHOT" Re

Re: [PR] Docs: Add the recommended style for ArrayAssertions [iceberg]

2025-04-17 Thread via GitHub
tomtongue commented on PR #12820: URL: https://github.com/apache/iceberg/pull/12820#issuecomment-2811888095 Thanks for the review and fix! @nastra, and thanks for requesting the review! @Fokko

[PR] Core: Ensure reactivated view version uses correct timestamp [iceberg]

2025-04-17 Thread via GitHub
lliangyu-lin opened a new pull request, #12821: URL: https://github.com/apache/iceberg/pull/12821 ### Description * Fixes #12809 * Apply the fixes from https://github.com/apache/iceberg-rust/pull/1218 by @c-thiel

Re: [I] PyIceberg permits OAuth URI in REST config but Iceberg Java does not [iceberg-python]

2025-04-17 Thread via GitHub
smaheshwar-pltr commented on issue #1684: URL: https://github.com/apache/iceberg-python/issues/1684#issuecomment-2814587549 Update: the Java codepath has changed on `main` with the `AuthManager` changes, and it looks like PyIceberg will follow suit (https://github.com/apache/iceberg-python

Re: [PR] Use assumeThat instead of assumeTrue [iceberg]

2025-04-17 Thread via GitHub
slfan1989 commented on code in PR #12822: URL: https://github.com/apache/iceberg/pull/12822#discussion_r2049753630 ## core/src/test/java/org/apache/iceberg/util/TestEnvironmentUtil.java: ## @@ -19,19 +19,20 @@ package org.apache.iceberg.util; import static org.assertj.core.a

Re: [PR] Spark 4.0 integration [iceberg]

2025-04-17 Thread via GitHub
pan3793 commented on code in PR #12494: URL: https://github.com/apache/iceberg/pull/12494#discussion_r2050081614 ## build.gradle: ## @@ -760,6 +763,7 @@ project(':iceberg-hive-metastore') { testImplementation project(path: ':iceberg-api', configuration: 'testArtifacts')

Re: [PR] Spark 4.0 integration [iceberg]

2025-04-17 Thread via GitHub
pan3793 commented on PR #12494: URL: https://github.com/apache/iceberg/pull/12494#issuecomment-2814605191 The git commits need to be tidied up before merging. Or, commit 1. mv spark-3.5 to spark-4.0 2. copy spark-4.0 spark-3.5 to main first, the spark-4.0 folder won't be recogniz

Re: [PR] Spark: SnapshotTableSparkAction add validation for non-overlapping source/dest table paths. [iceberg]

2025-04-17 Thread via GitHub
slfan1989 commented on code in PR #12779: URL: https://github.com/apache/iceberg/pull/12779#discussion_r2050104933 ## spark/v3.5/spark/src/main/java/org/apache/iceberg/spark/actions/SnapshotTableSparkAction.java: ## @@ -124,10 +124,12 @@ private SnapshotTable.Result doExecute()

Re: [PR] Spark 4.0 integration [iceberg]

2025-04-17 Thread via GitHub
pan3793 commented on PR #12494: URL: https://github.com/apache/iceberg/pull/12494#issuecomment-2814630566 BTW, I have compiled your yesterday's version and deployed it with Spark 4.0.0 RC4 in a small YARN cluster, run some TPC-H queries, and it works as expected.

Re: [PR] Spark 4.0 integration [iceberg]

2025-04-17 Thread via GitHub
huaxingao commented on PR #12494: URL: https://github.com/apache/iceberg/pull/12494#issuecomment-2814589062 @RussellSpitzer @pan3793 I have addressed the comments. Could you please take one more look when you have time?

Re: [PR] Flink: Maintenance - RewriteDataFiles [iceberg]

2025-04-17 Thread via GitHub
stevenzwu commented on code in PR #11497: URL: https://github.com/apache/iceberg/pull/11497#discussion_r2045041541 ## flink/v1.20/flink/src/main/java/org/apache/iceberg/flink/maintenance/operator/DataFileRewritePlanner.java: ## @@ -0,0 +1,209 @@ +/* + * Licensed to the Apache So

Re: [PR] Spark 4.0 integration [iceberg]

2025-04-17 Thread via GitHub
huaxingao commented on code in PR #12494: URL: https://github.com/apache/iceberg/pull/12494#discussion_r2050092005 ## build.gradle: ## @@ -760,6 +763,7 @@ project(':iceberg-hive-metastore') { testImplementation project(path: ':iceberg-api', configuration: 'testArtifacts')

Re: [PR] Spark: SnapshotTableSparkAction add validation for non-overlapping source/dest table paths. [iceberg]

2025-04-17 Thread via GitHub
slfan1989 commented on PR #12779: URL: https://github.com/apache/iceberg/pull/12779#issuecomment-2814624456 @sririshindra Thank you very much for your message! The information you provided is very detailed and insightful. From my personal understanding, I believe the original purpose of the

Re: [PR] feat: validate snapshot write compatibility [iceberg-python]

2025-04-17 Thread via GitHub
sungwy commented on code in PR #1772: URL: https://github.com/apache/iceberg-python/pull/1772#discussion_r2049778093 ## tests/integration/test_add_files.py: ## @@ -850,3 +850,70 @@ def test_add_files_that_referenced_by_current_snapshot_with_check_duplicate_file with pytest

Re: [PR] Core: Support first-row-id for manifests and manifest lists [iceberg]

2025-04-17 Thread via GitHub
RussellSpitzer commented on code in PR #12672: URL: https://github.com/apache/iceberg/pull/12672#discussion_r2049681573 ## spark/v3.4/spark/src/test/java/org/apache/iceberg/spark/data/TestHelpers.java: ## @@ -887,11 +887,14 @@ public static void asMetadataRecord(GenericData.Reco

Re: [PR] feat: refresh table when committing to support concurrent appends [iceberg-python]

2025-04-17 Thread via GitHub
sungwy commented on PR #1885: URL: https://github.com/apache/iceberg-python/pull/1885#issuecomment-2814295236 I've created some subtasks on https://github.com/apache/iceberg-python/issues/819 that will help us implement the required validation functions that we can invoke to check that no

Re: [PR] feat: implement initial MemoryCatalog functionality with namespace and table support [iceberg-cpp]

2025-04-17 Thread via GitHub
wgtmac commented on code in PR #80: URL: https://github.com/apache/iceberg-cpp/pull/80#discussion_r2049935332 ## src/iceberg/catalog/memory_catalog.h: ## @@ -0,0 +1,228 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreemen

Re: [PR] feat: introduce Scalar and StructLike interfaces for Transform evaluation [iceberg-cpp]

2025-04-17 Thread via GitHub
wgtmac commented on code in PR #77: URL: https://github.com/apache/iceberg-cpp/pull/77#discussion_r2049982557 ## src/iceberg/struct_like.h: ## @@ -0,0 +1,278 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See th

Re: [PR] [Spark]Add max files rewrite option for RewriteAction [iceberg]

2025-04-17 Thread via GitHub
coderfender commented on code in PR #12824: URL: https://github.com/apache/iceberg/pull/12824#discussion_r2049722429 ## spark/v3.5/spark/src/main/java/org/apache/iceberg/spark/actions/RewriteDataFilesSparkAction.java: ## @@ -407,15 +409,49 @@ private Builder doExecuteWithPartial

Re: [PR] [1.8.x] Add 1.8.x to protected branch [iceberg]

2025-04-17 Thread via GitHub
jbonofre commented on code in PR #12826: URL: https://github.com/apache/iceberg/pull/12826#discussion_r2048625732 ## .asf.yaml: ## @@ -34,7 +34,7 @@ github: rebase: true protected_branches: -main: +1.8.x: Review Comment: I don't think this change makes such

Re: [PR] refactor: add expression subdirectory [iceberg-cpp]

2025-04-17 Thread via GitHub
wgtmac commented on code in PR #81: URL: https://github.com/apache/iceberg-cpp/pull/81#discussion_r2049976718 ## src/iceberg/result.h: ## @@ -78,4 +78,12 @@ auto JsonParseError(const std::format_string fmt, Args&&... args) .message = std::format(fmt

Re: [PR] Build and test hive-metastore with Hive 2, 3 and 4 with a single source set [iceberg]

2025-04-17 Thread via GitHub
danielcweeks commented on PR #12721: URL: https://github.com/apache/iceberg/pull/12721#issuecomment-2814094524 @wypoon I was able to get this up and running based on your branch, so i think we can make it work. As for the comparison to the Spark distribution, I don't think we want t

[I] Support Concurrency Safety Validation: Implement `validateNoNewDeleteFiles` [iceberg-python]

2025-04-17 Thread via GitHub
sungwy opened a new issue, #1930: URL: https://github.com/apache/iceberg-python/issues/1930 (no comment)

Re: [PR] feat: introduce Scalar and StructLike interfaces for Transform evaluation [iceberg-cpp]

2025-04-17 Thread via GitHub
wgtmac commented on code in PR #77: URL: https://github.com/apache/iceberg-cpp/pull/77#discussion_r2049988018 ## src/iceberg/struct_like.h: ## @@ -0,0 +1,278 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See th

Re: [PR] Spec: Update row lineage requirements for upgrading tables [iceberg]

2025-04-17 Thread via GitHub
RussellSpitzer commented on code in PR #12781: URL: https://github.com/apache/iceberg/pull/12781#discussion_r2049407478 ## format/spec.md: ## @@ -450,21 +452,24 @@ Within `added1`, the first added manifest, each data file's `first_row_id` follo The `first_row_id` of the EXIS

Re: [PR] Flink: backport fix TriggerManager to unlock task execution when previous job left an orphaned lock for Flink 1.19 [iceberg]

2025-04-17 Thread via GitHub
tedyu commented on code in PR #12801: URL: https://github.com/apache/iceberg/pull/12801#discussion_r2049300254 ## flink/v1.19/flink/src/main/java/org/apache/iceberg/flink/maintenance/operator/TriggerManager.java: ## @@ -189,6 +189,9 @@ public void initializeState(FunctionInitial

Re: [PR] Spec: Update row lineage requirements for upgrading tables [iceberg]

2025-04-17 Thread via GitHub
RussellSpitzer commented on code in PR #12781: URL: https://github.com/apache/iceberg/pull/12781#discussion_r2049405392 ## format/spec.md: ## @@ -450,21 +452,24 @@ Within `added1`, the first added manifest, each data file's `first_row_id` follo The `first_row_id` of the EXIS

Re: [PR] WIP Parquet: Support reading/writing geometry and geography columns [iceberg]

2025-04-17 Thread via GitHub
szehon-ho commented on PR #12347: URL: https://github.com/apache/iceberg/pull/12347#issuecomment-2813681713 Hi, sorry @Kontinuation for the long delay (Iceberg summit and internal stuff). I wonder if we can rebase based on #12346 and also only have the Parquet part (not the expression par

Re: [PR] [1.8.x] Add 1.8.x to protected branch [iceberg]

2025-04-17 Thread via GitHub
manuzhang commented on code in PR #12826: URL: https://github.com/apache/iceberg/pull/12826#discussion_r2048876330 ## .asf.yaml: ## @@ -34,7 +34,7 @@ github: rebase: true protected_branches: -main: +1.8.x: Review Comment: Okay, let me open a PR against `mai

Re: [PR] Flink: backport fix TriggerManager to unlock task execution when previous job left an orphaned lock for Flink 1.19 [iceberg]

2025-04-17 Thread via GitHub
tedyu commented on code in PR #12801: URL: https://github.com/apache/iceberg/pull/12801#discussion_r2049362138 ## flink/v1.19/flink/src/main/java/org/apache/iceberg/flink/maintenance/operator/TriggerManager.java: ## @@ -189,6 +189,9 @@ public void initializeState(FunctionInitial

[PR] API: Use normalized JSON path to identify Variant fields [iceberg]

2025-04-17 Thread via GitHub
rdblue opened a new pull request, #12835: URL: https://github.com/apache/iceberg/pull/12835 This updates the Parquet metrics conversion to use normalized JSON paths instead of joining field names using `.`. A normalized path is a more reliable representation that preserves the distinction b

Re: [PR] Fixed force_virtual_addressing problem [iceberg-python]

2025-04-17 Thread via GitHub
helmiazizm commented on code in PR #1923: URL: https://github.com/apache/iceberg-python/pull/1923#discussion_r2049361189 ## pyiceberg/io/pyarrow.py: ## @@ -408,7 +408,7 @@ def _initialize_oss_fs(self) -> FileSystem: "access_key": get_first_property_value(self.proper

Re: [PR] [Spark]Add max files rewrite option for RewriteAction [iceberg]

2025-04-17 Thread via GitHub
sririshindra commented on code in PR #12824: URL: https://github.com/apache/iceberg/pull/12824#discussion_r2049351617 ## spark/v3.5/spark/src/test/java/org/apache/iceberg/spark/actions/TestRewriteDataFilesAction.java: ## @@ -2028,6 +2028,42 @@ protected List currentDataFiles(Tab

Re: [PR] Spec: Update row lineage requirements for upgrading tables [iceberg]

2025-04-17 Thread via GitHub
szehon-ho commented on code in PR #12781: URL: https://github.com/apache/iceberg/pull/12781#discussion_r2049405529 ## format/spec.md: ## @@ -732,12 +738,10 @@ Valid snapshots are stored as a list in table metadata. For serialization, see A Snapshot Row IDs -A snapshot

Re: [PR] Core: Test loading table/view with non-existing namespace [iceberg]

2025-04-17 Thread via GitHub
tedyu commented on code in PR #12812: URL: https://github.com/apache/iceberg/pull/12812#discussion_r2049360573 ## core/src/test/java/org/apache/iceberg/catalog/CatalogTests.java: ## @@ -917,6 +917,15 @@ public void testLoadTable() { .containsAll(properties.entrySet());

Re: [PR] Spec: Update row lineage requirements for upgrading tables [iceberg]

2025-04-17 Thread via GitHub
RussellSpitzer commented on code in PR #12781: URL: https://github.com/apache/iceberg/pull/12781#discussion_r2049420137 ## format/spec.md: ## @@ -450,21 +452,24 @@ Within `added1`, the first added manifest, each data file's `first_row_id` follo The `first_row_id` of the EXIS

Re: [PR] Flink: backport fix TriggerManager to unlock task execution when previous job left an orphaned lock for Flink 1.19 [iceberg]

2025-04-17 Thread via GitHub
Guosmilesmile commented on code in PR #12801: URL: https://github.com/apache/iceberg/pull/12801#discussion_r2049320785 ## flink/v1.19/flink/src/main/java/org/apache/iceberg/flink/maintenance/operator/TriggerManager.java: ## @@ -189,6 +189,9 @@ public void initializeState(Functio
