Re: [PR] feat: `validation_history` and `ancestors_between` [iceberg-python]

2025-04-23 Thread via GitHub
Fokko commented on code in PR #1935: URL: https://github.com/apache/iceberg-python/pull/1935#discussion_r2057670881 ## pyiceberg/table/update/validate.py: ## @@ -0,0 +1,72 @@ +# Licensed to the Apache Software Foundation (ASF) under one +# or more contributor license agreements.

Re: [PR] Flink: Add support for Flink 2.0 [iceberg]

2025-04-23 Thread via GitHub
mxm commented on PR #12527: URL: https://github.com/apache/iceberg/pull/12527#issuecomment-2826554921 Thanks for reviewing / merging @stevenzwu. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to t

Re: [PR] Flink: Add support for Flink 2.0 [iceberg]

2025-04-23 Thread via GitHub
mxm commented on PR #12527: URL: https://github.com/apache/iceberg/pull/12527#issuecomment-2826553216 @manuzhang Yes, there is.

Re: [PR] Core: Fix Kryo ser/de with StorageCredential config [iceberg]

2025-04-23 Thread via GitHub
nastra closed pull request #12882: Core: Fix Kryo ser/de with StorageCredential config URL: https://github.com/apache/iceberg/pull/12882

Re: [PR] Added ExpireSnapshots Feature [iceberg-python]

2025-04-23 Thread via GitHub
Fokko commented on code in PR #1880: URL: https://github.com/apache/iceberg-python/pull/1880#discussion_r2057650580 ## tests/integration/test_partition_evolution.py: ## @@ -140,6 +140,14 @@ def test_add_hour(catalog: Catalog) -> None: _validate_new_partition_fields(table, 1

Re: [PR] Added ExpireSnapshots Feature [iceberg-python]

2025-04-23 Thread via GitHub
Fokko commented on code in PR #1880: URL: https://github.com/apache/iceberg-python/pull/1880#discussion_r2057645522 ## tests/table/test_expire_snapshots.py: ## @@ -0,0 +1,43 @@ +from unittest.mock import MagicMock Review Comment: ```suggestion # Licensed to the Apache Sof

Re: [PR] Added ExpireSnapshots Feature [iceberg-python]

2025-04-23 Thread via GitHub
Fokko commented on code in PR #1880: URL: https://github.com/apache/iceberg-python/pull/1880#discussion_r2057644908 ## tests/table/test_expire_snapshots.py: ## @@ -0,0 +1,43 @@ +from unittest.mock import MagicMock Review Comment: The license is missing here

Re: [PR] Added ExpireSnapshots Feature [iceberg-python]

2025-04-23 Thread via GitHub
Fokko commented on code in PR #1880: URL: https://github.com/apache/iceberg-python/pull/1880#discussion_r2057643193 ## pyiceberg/table/update/snapshot.py: ## @@ -843,3 +849,64 @@ def remove_branch(self, branch_name: str) -> ManageSnapshots: This for method chaining

[I] Flaky test: TestRewriteDataFilesAction.testParallelPartialProgressWithMaxFailedCommitsLargerThanTotalFileGroup() [iceberg]

2025-04-23 Thread via GitHub
nastra opened a new issue, #12889: URL: https://github.com/apache/iceberg/issues/12889 ### Apache Iceberg version main (development) ### Query engine None ### Please describe the bug 🐞 ``` TestRewriteDataFilesAction > testParallelPartialProgressWithMaxFa

Re: [PR] Core, Data: File Format API interfaces [iceberg]

2025-04-23 Thread via GitHub
wgtmac commented on code in PR #12774: URL: https://github.com/apache/iceberg/pull/12774#discussion_r2057631093 ## data/src/main/java/org/apache/iceberg/data/AppenderBuilder.java: ## @@ -0,0 +1,43 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more

[I] Performance Issue: High Decompressor Allocation Due to doc_id Partitioning During User-Based Filtering [iceberg]

2025-04-23 Thread via GitHub
rameshkanna3 opened a new issue, #12888: URL: https://github.com/apache/iceberg/issues/12888 ### Query engine ### Description: Hi Iceberg Community, I’m facing a performance issue when querying an Iceberg table that is partitioned on a high-cardinality field (doc_id) while a

Re: [PR] test: Introduce datafusion engine for executing sqllogictest. [iceberg-rust]

2025-04-23 Thread via GitHub
liurenjie1024 closed pull request #895: test: Introduce datafusion engine for executing sqllogictest. URL: https://github.com/apache/iceberg-rust/pull/895

Re: [I] GlueCatalog name validation [iceberg]

2025-04-23 Thread via GitHub
andreiluca96 commented on issue #12185: URL: https://github.com/apache/iceberg/issues/12185#issuecomment-2826491245 `+1` to this, curious if there's a workaround for this

Re: [PR] feat: support rewrite manifest action [iceberg-rust]

2025-04-23 Thread via GitHub
vrd83 commented on code in PR #1237: URL: https://github.com/apache/iceberg-rust/pull/1237#discussion_r2057606767 ## crates/iceberg/src/transaction/rewrite_manifest.rs: ## @@ -0,0 +1,389 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor li

[PR] feat: support azure blob storage [iceberg-rust]

2025-04-23 Thread via GitHub
wcy-fdu opened a new pull request, #1242: URL: https://github.com/apache/iceberg-rust/pull/1242 ## Which issue does this PR close? - Closes #. ## What changes are included in this PR? This PR is similar to the previous one that supported GCS as storage, and it adds s

Re: [PR] Add table property to disable/enable parquet column statistics #12770 [iceberg]

2025-04-23 Thread via GitHub
huaxiangsun commented on code in PR #12771: URL: https://github.com/apache/iceberg/pull/12771#discussion_r2057563902 ## docs/docs/configuration.md: ## @@ -52,6 +52,8 @@ Iceberg tables support table properties to configure table behavior, like the de | write.parquet.bloom-filte

Re: [PR] Add table property to disable/enable parquet column statistics #12770 [iceberg]

2025-04-23 Thread via GitHub
huaxiangsun commented on code in PR #12771: URL: https://github.com/apache/iceberg/pull/12771#discussion_r2057556043 ## parquet/src/main/java/org/apache/iceberg/parquet/Parquet.java: ## @@ -306,33 +309,35 @@ private WriteBuilder createContextFunc( return this; } +

Re: [I] java.lang.UnsupportedOperationException: Unknown delete file content: DATA [iceberg]

2025-04-23 Thread via GitHub
coderfender commented on issue #11981: URL: https://github.com/apache/iceberg/issues/11981#issuecomment-2826373668 @wardlican, can you help me replicate the issue? I would like to work on and push a potential fix for this if it is a problem that can be solved from Iceberg's end

Re: [PR] Spark 3.5 row lineage [iceberg]

2025-04-23 Thread via GitHub
amogh-jahagirdar commented on PR #12736: URL: https://github.com/apache/iceberg/pull/12736#issuecomment-2826336284 It looks like tests are failing after I rebased off the recent inheritance changes, looking into it...

Re: [PR] feat(rest): support AWS SIGv4 [iceberg-rust]

2025-04-23 Thread via GitHub
xxchan commented on code in PR #1241: URL: https://github.com/apache/iceberg-rust/pull/1241#discussion_r2057458270 ## Cargo.toml: ## @@ -93,6 +93,7 @@ port_scanner = "0.1.5" pretty_assertions = "1.4" rand = "0.8.5" regex = "1.10.5" +reqsign = { version = "0.16.3" } Review Co

Re: [PR] Spark: Avoid closing deserialized copies of shared resources like FileIO [iceberg]

2025-04-23 Thread via GitHub
xiaoxuandev commented on code in PR #12868: URL: https://github.com/apache/iceberg/pull/12868#discussion_r2057404607 ## spark/v3.4/spark/src/main/java/org/apache/iceberg/spark/source/SerializableTableWithSize.java: ## @@ -65,8 +66,7 @@ public static Table copyOf(Table table) {

Re: [PR] feat(rest): support AWS SIGv4 [iceberg-rust]

2025-04-23 Thread via GitHub
ananthaksr commented on code in PR #1241: URL: https://github.com/apache/iceberg-rust/pull/1241#discussion_r2057448422 ## crates/catalog/rest/src/catalog.rs: ## @@ -306,6 +331,13 @@ impl RestCatalog { None => None, }; +if let Some(warehouse_path)

Re: [PR] feat(rest): support AWS SIGv4 [iceberg-rust]

2025-04-23 Thread via GitHub
ananthaksr commented on code in PR #1241: URL: https://github.com/apache/iceberg-rust/pull/1241#discussion_r2057445681 ## Cargo.toml: ## @@ -93,6 +93,7 @@ port_scanner = "0.1.5" pretty_assertions = "1.4" rand = "0.8.5" regex = "1.10.5" +reqsign = { version = "0.16.3" } Revie

Re: [PR] feat(rest): support AWS SIGv4 [iceberg-rust]

2025-04-23 Thread via GitHub
ananthaksr commented on code in PR #1241: URL: https://github.com/apache/iceberg-rust/pull/1241#discussion_r2057443166 ## crates/catalog/rest/src/client.rs: ## @@ -220,6 +225,39 @@ impl HttpClient { /// Executes the given `Request` and returns a `Response`. pub async f

Re: [PR] feat(rest): support AWS SIGv4 [iceberg-rust]

2025-04-23 Thread via GitHub
ananthaksr commented on code in PR #1241: URL: https://github.com/apache/iceberg-rust/pull/1241#discussion_r2057435044 ## crates/catalog/rest/src/client.rs: ## @@ -220,6 +225,39 @@ impl HttpClient { /// Executes the given `Request` and returns a `Response`. pub async f

Re: [PR] Flink: Add support for Flink 2.0 [iceberg]

2025-04-23 Thread via GitHub
manuzhang commented on PR #12527: URL: https://github.com/apache/iceberg/pull/12527#issuecomment-2826224493 Is there a fourth commit that removed 1.18 support?

Re: [PR] refactor(s3tables): avoid misleading FileIO::from_path [iceberg-rust]

2025-04-23 Thread via GitHub
flaneur2020 commented on PR #1240: URL: https://github.com/apache/iceberg-rust/pull/1240#issuecomment-2826193836 LGTM, thank you for this!

Re: [D] [Question] How to generate manifest files and manifest list [iceberg-rust]

2025-04-23 Thread via GitHub
GitHub user liurenjie1024 closed the discussion with a comment: [Question] How to generate manifest files and manifest list Yes, the REST catalog supports it. Support in other catalogs is underway. GitHub link: https://github.com/apache/iceberg-rust/discussions/1232#discussioncomment-12930073

Re: [PR] [SPARK] Fix add_files type conversion exception and incorrect partition value when handling null partitions [iceberg]

2025-04-23 Thread via GitHub
ebyhr commented on PR #12886: URL: https://github.com/apache/iceberg/pull/12886#issuecomment-2826165564 The indentation is wrong. Could you run the following command?
```sh
./gradlew :iceberg-spark:iceberg-spark-extensions-3.5_2.12:spotlessApply
```

[PR] [SPARK] Fix add_files type conversion exception and incorrect partition value when handling null partitions [iceberg]

2025-04-23 Thread via GitHub
hariuserx opened a new pull request, #12886: URL: https://github.com/apache/iceberg/pull/12886 Currently when performing add_files procedure through Apache Spark, we treat null partitions as `"null"` (String literal). This causes a `NumberFormatException` when adding null partitions when th

Re: [PR] Hive: Throw exception when listNamespaces takes non-empty namespace [iceberg]

2025-04-23 Thread via GitHub
ebyhr commented on PR #12884: URL: https://github.com/apache/iceberg/pull/12884#issuecomment-2826135433 `TestNamespaceSQL#testCreateNamespaceWithMetadata` starts failing with this change.

Re: [I] REST catalog: support AWS sigV4 [iceberg-rust]

2025-04-23 Thread via GitHub
xxchan commented on issue #1236: URL: https://github.com/apache/iceberg-rust/issues/1236#issuecomment-2826130051 @ananthaksr Hi, thanks for offering help, but I've already finished a working draft. #1241

[PR] feat(rest): support AWS SIGv4 [iceberg-rust]

2025-04-23 Thread via GitHub
xxchan opened a new pull request, #1241: URL: https://github.com/apache/iceberg-rust/pull/1241 Signed-off-by: xxchan ## Which issue does this PR close? - Closes #. ## What changes are included in this PR? ## Are these changes tested?

Re: [PR] Add `format/` to site-ci [iceberg]

2025-04-23 Thread via GitHub
manuzhang commented on PR #12869: URL: https://github.com/apache/iceberg/pull/12869#issuecomment-2826128682 We might want to add https://github.com/apache/iceberg/tree/main/open-api as well.

[PR] Spark 3.5: RewriteTablePath: filter content files by snapshotId [iceberg]

2025-04-23 Thread via GitHub
dramaticlly opened a new pull request, #12885: URL: https://github.com/apache/iceberg/pull/12885 Allow rewrite table path to use snapshot id to filter both 1. `added_snapshot_id` in ManifestFile 2. `snapshot_id` in ManifestEntry This PR helps add the 2nd filter; this helps avoid rep

Re: [PR] Added ExpireSnapshots Feature [iceberg-python]

2025-04-23 Thread via GitHub
ForeverAngry commented on code in PR #1880: URL: https://github.com/apache/iceberg-python/pull/1880#discussion_r2057102705 ## pyiceberg/table/update/snapshot.py: ## @@ -82,7 +84,11 @@ from pyiceberg.utils.properties import property_as_bool, property_as_int if TYPE_CHECKING:

Re: [PR] Table Scan Performance Tests [iceberg-rust]

2025-04-23 Thread via GitHub
sdd closed pull request #497: Table Scan Performance Tests URL: https://github.com/apache/iceberg-rust/pull/497

Re: [PR] Spark: Add 'skip_file_list' option to RewriteTablePathProcedure for optional file-list generation [iceberg]

2025-04-23 Thread via GitHub
slfan1989 commented on PR #12844: URL: https://github.com/apache/iceberg/pull/12844#issuecomment-2825908909 > @slfan1989 @szehon-ho I meant outputting to console if user doesn't want to save to file. If that's not possible when the plan is big, maybe add two more output values, "rewrite_del

Re: [PR] Docs: Fix links to javadoc [iceberg]

2025-04-23 Thread via GitHub
ebyhr commented on code in PR #12880: URL: https://github.com/apache/iceberg/pull/12880#discussion_r2057148261 ## docs/docs/hive.md: ## @@ -271,7 +271,7 @@ CREATE TABLE target LIKE source STORED BY ICEBERG; ### CREATE EXTERNAL TABLE overlaying an existing Iceberg table The `

Re: [PR] feat: implement initial MemoryCatalog functionality with namespace and table support [iceberg-cpp]

2025-04-23 Thread via GitHub
wgtmac commented on code in PR #80: URL: https://github.com/apache/iceberg-cpp/pull/80#discussion_r2057148192 ## src/iceberg/catalog/memory_catalog.h: ## @@ -0,0 +1,204 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreemen

Re: [PR] feat: implement initial MemoryCatalog functionality with namespace and table support [iceberg-cpp]

2025-04-23 Thread via GitHub
gty404 commented on code in PR #80: URL: https://github.com/apache/iceberg-cpp/pull/80#discussion_r2057116999 ## src/iceberg/catalog/memory_catalog.h: ## @@ -0,0 +1,204 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreemen

Re: [PR] Added ExpireSnapshots Feature [iceberg-python]

2025-04-23 Thread via GitHub
ForeverAngry commented on code in PR #1880: URL: https://github.com/apache/iceberg-python/pull/1880#discussion_r2057106903 ## pyiceberg/table/update/snapshot.py: ## @@ -843,3 +849,64 @@ def remove_branch(self, branch_name: str) -> ManageSnapshots: This for method c

Re: [PR] Flink: Add support for Flink 2.0 [iceberg]

2025-04-23 Thread via GitHub
mxm commented on PR #12527: URL: https://github.com/apache/iceberg/pull/12527#issuecomment-2825133056 (I renamed the Flink 2.0 commit to "Flink: Add support for Flink 2.0")

Re: [PR] Added ExpireSnapshots Feature [iceberg-python]

2025-04-23 Thread via GitHub
ForeverAngry commented on code in PR #1880: URL: https://github.com/apache/iceberg-python/pull/1880#discussion_r2057105366 ## pyiceberg/table/update/snapshot.py: ## @@ -843,3 +849,64 @@ def remove_branch(self, branch_name: str) -> ManageSnapshots: This for method c

Re: [D] [Question] How to generate manifest files and manifest list [iceberg-rust]

2025-04-23 Thread via GitHub
GitHub user dentiny closed a discussion: [Question] How to generate manifest files and manifest list Hi community, I'm trying to use iceberg-rust to write columns into parquet files, and generate iceberg table. From my understand (and what chatgpt told me), to generate manifest files and m

Re: [PR] Build: Specify -XX:-OmitStackTraceInFastThrow in tests [iceberg]

2025-04-23 Thread via GitHub
ebyhr closed pull request #12806: Build: Specify -XX:-OmitStackTraceInFastThrow in tests URL: https://github.com/apache/iceberg/pull/12806

Re: [PR] Spark: Add separate action to rewrite DVs [iceberg]

2025-04-23 Thread via GitHub
github-actions[bot] commented on PR #12403: URL: https://github.com/apache/iceberg/pull/12403#issuecomment-2825828489 This pull request has been marked as stale due to 30 days of inactivity. It will be closed in 1 week if no further activity occurs. If you think that’s incorrect or this pul

Re: [PR] Spark: Remove closing of IO in SerializableTable* [iceberg]

2025-04-23 Thread via GitHub
github-actions[bot] closed pull request #12129: Spark: Remove closing of IO in SerializableTable* URL: https://github.com/apache/iceberg/pull/12129

Re: [PR] Spark: Remove closing of IO in SerializableTable* [iceberg]

2025-04-23 Thread via GitHub
github-actions[bot] commented on PR #12129: URL: https://github.com/apache/iceberg/pull/12129#issuecomment-2825828439 This pull request has been closed due to lack of activity. This is not a judgement on the merit of the PR in any way. It is just a way of keeping the PR queue manageable. If

Re: [I] REST API path resolution is not general enough when using rest catalog [iceberg]

2025-04-23 Thread via GitHub
github-actions[bot] commented on issue #11391: URL: https://github.com/apache/iceberg/issues/11391#issuecomment-2825828351 This issue has been automatically marked as stale because it has been open for 180 days with no activity. It will be closed in next 14 days if no further activity occur

Re: [I] Parquet bloom filter doesn't work with nested fields [iceberg]

2025-04-23 Thread via GitHub
github-actions[bot] commented on issue #9898: URL: https://github.com/apache/iceberg/issues/9898#issuecomment-2825828258 This issue has been automatically marked as stale because it has been open for 180 days with no activity. It will be closed in next 14 days if no further activity occurs.

Re: [PR] Catalog: Add BigQuery Metastore Catalog Support [iceberg]

2025-04-23 Thread via GitHub
talatuyarer commented on code in PR #12808: URL: https://github.com/apache/iceberg/pull/12808#discussion_r2057079908 ## bigquery/src/main/java/org/apache/iceberg/gcp/bigquery/BigQueryMetastoreCatalog.java: ## @@ -0,0 +1,400 @@ +/* + * Licensed to the Apache Software Foundation (

Re: [D] [Question] How to generate manifest files and manifest list [iceberg-rust]

2025-04-23 Thread via GitHub
GitHub user dentiny closed the discussion with a comment: [Question] How to generate manifest files and manifest list I confirmed rest catalog works with transaction, which correctly generates manifest files and manifest list, example code ```rust let txn = Transaction::new(&iceberg_table); le

Re: [PR] AVRO: Support UUID logical type on string fields in Avro schema [iceberg]

2025-04-23 Thread via GitHub
vanshb03 commented on code in PR #12877: URL: https://github.com/apache/iceberg/pull/12877#discussion_r2056819304 ## core/src/main/java/org/apache/iceberg/avro/SchemaToType.java: ## @@ -208,6 +208,8 @@ public Type logicalType(Schema primitive, LogicalType logical) { } el

Re: [PR] SPARK: Remove dependency on hadoop's filesystem class from remove orphan files [iceberg]

2025-04-23 Thread via GitHub
fuzing commented on PR #12254: URL: https://github.com/apache/iceberg/pull/12254#issuecomment-2825652819 @RussellSpitzer - We've applied this PR and performed some cursory testing with a minio S3 compatible store. We scattered a number of random files inside and outside the table's s

Re: [PR] [wip] feat: `validate_deleted_data_files` [iceberg-python]

2025-04-23 Thread via GitHub
jayceslesar commented on code in PR #1938: URL: https://github.com/apache/iceberg-python/pull/1938#discussion_r2056909183 ## pyiceberg/table/update/validate.py: ## @@ -0,0 +1,150 @@ +# Licensed to the Apache Software Foundation (ASF) under one +# or more contributor license agre

Re: [PR] Scan Delete Support Part 4: Delete File Loading; Skeleton for Processing [iceberg-rust]

2025-04-23 Thread via GitHub
sdd commented on PR #982: URL: https://github.com/apache/iceberg-rust/pull/982#issuecomment-2825567957 Back to you @liurenjie1024 - I've made the changes around missing functionality. Still the open question of if you are ok to defer the structural / performance changes to a follow-up so th

Re: [PR] Scan Delete Support Part 4: Delete File Loading; Skeleton for Processing [iceberg-rust]

2025-04-23 Thread via GitHub
sdd commented on code in PR #982: URL: https://github.com/apache/iceberg-rust/pull/982#discussion_r2056910313 ## crates/iceberg/src/arrow/delete_file_manager.rs: ## @@ -47,47 +60,533 @@ impl DeleteFileManager for CachingDeleteFileManager { )) } } +// Equality dele

Re: [PR] feat: `validation_history` and `ancestors_between` [iceberg-python]

2025-04-23 Thread via GitHub
jayceslesar commented on code in PR #1935: URL: https://github.com/apache/iceberg-python/pull/1935#discussion_r2056902130 ## pyiceberg/table/update/validate.py: ## @@ -0,0 +1,72 @@ +# Licensed to the Apache Software Foundation (ASF) under one +# or more contributor license agree

Re: [PR] Scan Delete Support Part 4: Delete File Loading; Skeleton for Processing [iceberg-rust]

2025-04-23 Thread via GitHub
sdd commented on code in PR #982: URL: https://github.com/apache/iceberg-rust/pull/982#discussion_r2056855131 ## crates/iceberg/src/arrow/delete_file_manager.rs: ## @@ -47,47 +60,533 @@ impl DeleteFileManager for CachingDeleteFileManager { )) } } +// Equality dele

Re: [PR] feat: `validation_history` and `ancestors_between` [iceberg-python]

2025-04-23 Thread via GitHub
jayceslesar commented on code in PR #1935: URL: https://github.com/apache/iceberg-python/pull/1935#discussion_r2056894516 ## pyiceberg/table/update/validate.py: ## @@ -0,0 +1,70 @@ +# Licensed to the Apache Software Foundation (ASF) under one +# or more contributor license agree

Re: [I] restcatalog::namespace_exists always fails [iceberg-rust]

2025-04-23 Thread via GitHub
dentiny commented on issue #1234: URL: https://github.com/apache/iceberg-rust/issues/1234#issuecomment-2825097292 I seems to suffer the same issue, so temporarily I workaround with list + contains: ```rust let namespaces = catalog.list_namespaces(None).await?; let namespace_exists =

Re: [I] REST catalog: support AWS sigV4 [iceberg-rust]

2025-04-23 Thread via GitHub
ananthaksr commented on issue #1236: URL: https://github.com/apache/iceberg-rust/issues/1236#issuecomment-2825436562 @xxchan I'd like to work on this. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go

Re: [PR] Spark 4.0 integration [iceberg]

2025-04-23 Thread via GitHub
szehon-ho commented on code in PR #12494: URL: https://github.com/apache/iceberg/pull/12494#discussion_r2056830271 ## spark/v4.0/spark/src/main/java/org/apache/iceberg/spark/procedures/CreateChangelogViewProcedure.java: ## @@ -0,0 +1,321 @@ +/* + * Licensed to the Apache Softwar

Re: [PR] Spark 4.0 integration [iceberg]

2025-04-23 Thread via GitHub
szehon-ho commented on code in PR #12494: URL: https://github.com/apache/iceberg/pull/12494#discussion_r2056828692 ## spark/v4.0/spark/src/main/java/org/apache/iceberg/spark/procedures/CreateChangelogViewProcedure.java: ## @@ -0,0 +1,321 @@ +/* + * Licensed to the Apache Softwar

[I] Not exposing minio port in test script [iceberg-rust]

2025-04-23 Thread via GitHub
gfee-home opened a new issue, #1238: URL: https://github.com/apache/iceberg-rust/issues/1238 ### Apache Iceberg Rust version 0.4.0 (latest version) ### Describe the bug The docker-compose.yaml creates a minio container with ports 9000 and 9001, but only maps 9001. This c

Re: [PR] feat(playground): Add S3Tables catalog support (#1161) [iceberg-rust]

2025-04-23 Thread via GitHub
ananthaksr closed pull request #1229: feat(playground): Add S3Tables catalog support (#1161) URL: https://github.com/apache/iceberg-rust/pull/1229

Re: [PR] feat(playground): Add S3Tables catalog support (#1161) [iceberg-rust]

2025-04-23 Thread via GitHub
ananthaksr commented on PR #1229: URL: https://github.com/apache/iceberg-rust/pull/1229#issuecomment-2825426124 Sounds good to me.

Re: [PR] Update-schema: Add support for `initial-default` [iceberg-python]

2025-04-23 Thread via GitHub
Fokko commented on PR #1770: URL: https://github.com/apache/iceberg-python/pull/1770#issuecomment-2825423274 @kevinjqliu I did commit, but didn't push them 😀 Thanks for the reminder!

Re: [PR] Flink: Add support for Flink 2.0 [iceberg]

2025-04-23 Thread via GitHub
stevenzwu commented on PR #12527: URL: https://github.com/apache/iceberg/pull/12527#issuecomment-2825421769 thanks @mxm for the contribution and @pvary @ajantha-bhat for the reviews

Re: [PR] Flink: Add support for Flink 2.0 [iceberg]

2025-04-23 Thread via GitHub
stevenzwu merged PR #12527: URL: https://github.com/apache/iceberg/pull/12527

Re: [PR] AVRO: Support UUID logical type on string fields in Avro schema [iceberg]

2025-04-23 Thread via GitHub
raphaelauv commented on code in PR #12877: URL: https://github.com/apache/iceberg/pull/12877#discussion_r2056802939 ## core/src/main/java/org/apache/iceberg/avro/SchemaToType.java: ## @@ -208,6 +208,8 @@ public Type logicalType(Schema primitive, LogicalType logical) { }

Re: [PR] feat: add an AddFieldsAction to transaction [iceberg-rust]

2025-04-23 Thread via GitHub
cmcarthur closed pull request #1176: feat: add an AddFieldsAction to transaction URL: https://github.com/apache/iceberg-rust/pull/1176

Re: [PR] feat: add table metadata reader and writer [iceberg-cpp]

2025-04-23 Thread via GitHub
Fokko merged PR #85: URL: https://github.com/apache/iceberg-cpp/pull/85

Re: [PR] feat: add table metadata reader and writer [iceberg-cpp]

2025-04-23 Thread via GitHub
Fokko commented on PR #85: URL: https://github.com/apache/iceberg-cpp/pull/85#issuecomment-2825304539 Thanks @wgtmac for working on this, and thanks @lidavidm, @zhjwpku and @yingcai-cy for the review 🙌

Re: [PR] feat: add formatter specialization [iceberg-cpp]

2025-04-23 Thread via GitHub
Fokko commented on PR #86: URL: https://github.com/apache/iceberg-cpp/pull/86#issuecomment-2825301231 Thanks @wgtmac for working on this, and thanks @lidavidm, @zhjwpku and @yingcai-cy for the review 🙌

Re: [PR] feat: add formatter specialization [iceberg-cpp]

2025-04-23 Thread via GitHub
Fokko merged PR #86: URL: https://github.com/apache/iceberg-cpp/pull/86

Re: [PR] Spark: Add 'skip_file_list' option to RewriteTablePathProcedure for optional file-list generation [iceberg]

2025-04-23 Thread via GitHub
szehon-ho commented on PR #12844: URL: https://github.com/apache/iceberg/pull/12844#issuecomment-2825286985 Makes sense, i think adding a count sounds fine to me.

Re: [PR] Spec: Avoid struct field conflicts in default values [iceberg]

2025-04-23 Thread via GitHub
aokolnychyi commented on code in PR #12841: URL: https://github.com/apache/iceberg/pull/12841#discussion_r2056706718 ## format/spec.md: ## @@ -266,7 +266,9 @@ The `initial-default` is set only when a field is added to an existing schema. T The `initial-default` and `write-de

[I] repair_table (or similar) procedure call for iceberg/spark [iceberg]

2025-04-23 Thread via GitHub
fuzing opened a new issue, #12883: URL: https://github.com/apache/iceberg/issues/12883 ### Feature Request / Improvement At the moment, once a data file goes missing or becomes corrupted, table functionality is diminished or completely lost due to cascading errors as a result of

Re: [PR] Add timestamp_ns, time and UUID types for Variant [iceberg]

2025-04-23 Thread via GitHub
rdblue commented on code in PR #12682: URL: https://github.com/apache/iceberg/pull/12682#discussion_r2056668528 ## parquet/src/main/java/org/apache/iceberg/parquet/VariantWriterBuilder.java: ## @@ -229,19 +228,32 @@ public Optional> visit(DateLogicalTypeAnnotation ignored)

Re: [PR] Core: Fix a cast that is too narrow [iceberg]

2025-04-23 Thread via GitHub
angelo-DNAStack commented on PR #12743: URL: https://github.com/apache/iceberg/pull/12743#issuecomment-2825125101 Done. Apologies for the delay; it's been a busy few weeks.

Re: [PR] Flink: Add support for Flink 2.0 [iceberg]

2025-04-23 Thread via GitHub
mxm commented on PR #12527: URL: https://github.com/apache/iceberg/pull/12527#issuecomment-2825122500 @stevenzwu Thanks for the review! I squashed the commits.

Re: [PR] Spark 3.5 row lineage [iceberg]

2025-04-23 Thread via GitHub
amogh-jahagirdar commented on code in PR #12736: URL: https://github.com/apache/iceberg/pull/12736#discussion_r2056321860 ## spark/v3.5/spark-extensions/src/test/java/org/apache/iceberg/spark/extensions/TestRowLevelOperationsWithLineage.java: ## @@ -0,0 +1,443 @@ +/* + * License

Re: [PR] Spec: Avoid struct field conflicts in default values [iceberg]

2025-04-23 Thread via GitHub
rdblue commented on code in PR #12841: URL: https://github.com/apache/iceberg/pull/12841#discussion_r2056567032 ## format/spec.md: ## @@ -315,7 +317,7 @@ Struct evolution requires the following rules for default values: * The `write-default` must be set when a field is added a

Re: [PR] Spec: Avoid struct field conflicts in default values [iceberg]

2025-04-23 Thread via GitHub
rdblue commented on code in PR #12841: URL: https://github.com/apache/iceberg/pull/12841#discussion_r2056563664 ## format/spec.md: ## @@ -266,7 +266,9 @@ The `initial-default` is set only when a field is added to an existing schema. T The `initial-default` and `write-default

Re: [PR] Catalog: Add BigQuery Metastore Catalog Support [iceberg]

2025-04-23 Thread via GitHub
talatuyarer commented on code in PR #12808: URL: https://github.com/apache/iceberg/pull/12808#discussion_r2056513109 ## bigquery/src/test/java/org/apache/iceberg/gcp/bigquery/BigQueryTableOperationsTest.java: ## @@ -0,0 +1,228 @@ +/* + * Licensed to the Apache Software Foundatio

Re: [PR] Catalog: Add BigQuery Metastore Catalog Support [iceberg]

2025-04-23 Thread via GitHub
talatuyarer commented on code in PR #12808: URL: https://github.com/apache/iceberg/pull/12808#discussion_r2056511962 ## bigquery/src/test/java/org/apache/iceberg/gcp/bigquery/TestBigQueryCatalog.java: ## @@ -0,0 +1,673 @@ +/* + * Licensed to the Apache Software Foundation (ASF)

Re: [PR] Spark: Avoid closing deserialized copies of shared resources like FileIO [iceberg]

2025-04-23 Thread via GitHub
singhpk234 commented on code in PR #12868: URL: https://github.com/apache/iceberg/pull/12868#discussion_r2056413702 ## spark/v3.4/spark/src/main/java/org/apache/iceberg/spark/source/SerializableTableWithSize.java: ## @@ -65,8 +66,7 @@ public static Table copyOf(Table table) {

[PR] Core: Fix Kryo ser/de with StorageCredential config [iceberg]

2025-04-23 Thread via GitHub
nastra opened a new pull request, #12882: URL: https://github.com/apache/iceberg/pull/12882 (no comment)

[I] load_table showing old table schema [iceberg-python]

2025-04-23 Thread via GitHub
scottjarman opened a new issue, #1948: URL: https://github.com/apache/iceberg-python/issues/1948 ### Apache Iceberg version 0.8.1 ### Please describe the bug 🐞 I have an Iceberg table in AWS Glue Catalog that has numerous schema versions. When I run load_table the

Re: [PR] Spark 3.5 row lineage [iceberg]

2025-04-23 Thread via GitHub
amogh-jahagirdar commented on code in PR #12736: URL: https://github.com/apache/iceberg/pull/12736#discussion_r2056312051 ## core/src/main/java/org/apache/iceberg/TableUtil.java: ## @@ -60,4 +61,28 @@ public static String metadataFileLocation(Table table) { "%s do

[PR] feat: support rewrite manifest action [iceberg-rust]

2025-04-23 Thread via GitHub
ZENOTME opened a new pull request, #1237: URL: https://github.com/apache/iceberg-rust/pull/1237 ## Which issue does this PR close? - Closes #. ## What changes are included in this PR? ## Are these changes tested?

Re: [PR] Catalog: Add BigQuery Metastore Catalog Support [iceberg]

2025-04-23 Thread via GitHub
nastra commented on code in PR #12808: URL: https://github.com/apache/iceberg/pull/12808#discussion_r2055657656 ## bigquery/src/main/java/org/apache/iceberg/gcp/bigquery/BigQueryMetastoreCatalog.java: ## @@ -0,0 +1,400 @@ +/* + * Licensed to the Apache Software Foundation (ASF)

Re: [PR] Update-schema: Add support for `initial-default` [iceberg-python]

2025-04-23 Thread via GitHub
kevinjqliu commented on PR #1770: URL: https://github.com/apache/iceberg-python/pull/1770#issuecomment-2824673925 Did you push the new commits, @Fokko?

Re: [PR] refactor partition_summary_limit into SnapshotSummaryCollector constr… [iceberg-python]

2025-04-23 Thread via GitHub
stevie9868 commented on PR #1940: URL: https://github.com/apache/iceberg-python/pull/1940#issuecomment-2824642604 Thanks @Fokko! I agree and have updated my changes.

Re: [PR] refactor partition_summary_limit into SnapshotSummaryCollector constr… [iceberg-python]

2025-04-23 Thread via GitHub
kevinjqliu commented on code in PR #1940: URL: https://github.com/apache/iceberg-python/pull/1940#discussion_r2056298150 ## pyiceberg/table/snapshots.py: ## @@ -272,10 +272,10 @@ class SnapshotSummaryCollector: partition_metrics: DefaultDict[str, UpdateMetrics] max_cha
