Re: [PR] Flink: Supports RewriteManifests in TableMaintenance [iceberg]

2025-07-16 Thread via GitHub
Guosmilesmile commented on code in PR #13579: URL: https://github.com/apache/iceberg/pull/13579#discussion_r2212424380 ## flink/v2.0/flink/src/main/java/org/apache/iceberg/flink/maintenance/operator/WriteManifests.java: ## @@ -0,0 +1,236 @@ +/* + * Licensed to the Apache Softwar

Re: [I] Connect to S3 catalog [iceberg-python]

2025-07-16 Thread via GitHub
dingo4dev commented on issue #1683: URL: https://github.com/apache/iceberg-python/issues/1683#issuecomment-3082763363 @IanVlasov AWS has recently released S3 Tables, which supports Iceberg. Check out the documentation: https://docs.aws.amazon.com/AmazonS3/latest/userguide/s3-tables-integratin

[PR] Flink: Supports RewriteManifests in TableMaintenance [iceberg]

2025-07-16 Thread via GitHub
Guosmilesmile opened a new pull request, #13579: URL: https://github.com/apache/iceberg/pull/13579 This PR aims to add support for rewriting manifests in TableMaintenance for Flink. The relevant design document can be found at https://docs.google.com/document/d/16g3vR18mVBy8jbFaLjf

Re: [PR] [Spec]: Fix wrong type for snapshot-id in table statistics [iceberg]

2025-07-16 Thread via GitHub
ajantha-bhat commented on code in PR #13513: URL: https://github.com/apache/iceberg/pull/13513#discussion_r2212343376 ## format/spec.md: ## @@ -970,7 +970,7 @@ Statistics files metadata within `statistics` table metadata field is a struct w | v1 | v2 | Field name | Type | De

Re: [I] Cannot cast java.util.UUID to java.lang.CharSequence [iceberg]

2025-07-16 Thread via GitHub
aperture147 commented on issue #13077: URL: https://github.com/apache/iceberg/issues/13077#issuecomment-3082618280 I'm having the same problem when I run `catalog_name.system.rewrite_manifests`. When will the related PR be merged in the next release of iceberg/spark? Thanks.

Re: [PR] fix: coerce UUID to String in readable_metrics to avoid ClassCastException in Spark [iceberg]

2025-07-16 Thread via GitHub
aperture147 commented on PR #13087: URL: https://github.com/apache/iceberg/pull/13087#issuecomment-3082617300 I'm having the same problem when I run `catalog_name.system.rewrite_manifests`. When will this be merged in the next release of iceberg/spark? Thanks.

Re: [PR] kafka-connect: resolve CVE-2025-48734 [iceberg]

2025-07-16 Thread via GitHub
ajantha-bhat commented on code in PR #13561: URL: https://github.com/apache/iceberg/pull/13561#discussion_r2212319525 ## kafka-connect/build.gradle: ## @@ -262,4 +263,4 @@ project(':iceberg-kafka-connect:iceberg-kafka-connect-transforms') { test { useJUnitPlatform()

Re: [PR] feat: support merge append action [iceberg-rust]

2025-07-16 Thread via GitHub
CTTY commented on code in PR #902: URL: https://github.com/apache/iceberg-rust/pull/902#discussion_r2209181903 ## crates/iceberg/src/transaction/merge_append.rs: ## @@ -0,0 +1,332 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license a

Re: [PR] feat(transaction): Add retry logic to transaction [iceberg-rust]

2025-07-16 Thread via GitHub
liurenjie1024 commented on PR #1484: URL: https://github.com/apache/iceberg-rust/pull/1484#issuecomment-3082522642 Hi, @Fokko Thanks for the comments. > I assume that this focusses on the second point. Yes. > But to make this effective we need to check if we still can do

Re: [PR] Docs: Document compute_partition_stats procedure [iceberg]

2025-07-16 Thread via GitHub
ajantha-bhat commented on code in PR #13532: URL: https://github.com/apache/iceberg/pull/13532#discussion_r2212092271 ## docs/docs/spark-procedures.md: ## @@ -974,6 +974,38 @@ Collect statistics of the snapshot with id `snap1` of table `my_table` for colum CALL catalog_name.sy

Re: [PR] Bump version to 0.6.0 (Round 1) [iceberg-rust]

2025-07-16 Thread via GitHub
Xuanwo commented on PR #1506: URL: https://github.com/apache/iceberg-rust/pull/1506#issuecomment-3082339315 Oh, let me fix this. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comme

Re: [PR] feat: add manifest list reader [iceberg-cpp]

2025-07-16 Thread via GitHub
Xuanwo merged PR #143: URL: https://github.com/apache/iceberg-cpp/pull/143

Re: [PR] refactor: Add SchemaById and SnapshotById to TableMetadata [iceberg-cpp]

2025-07-16 Thread via GitHub
Xuanwo merged PR #144: URL: https://github.com/apache/iceberg-cpp/pull/144

Re: [I] Can Flink Write to Iceberg and Update Partial Columns [iceberg]

2025-07-16 Thread via GitHub
manuzhang commented on issue #13566: URL: https://github.com/apache/iceberg/issues/13566#issuecomment-3082307367 Nope, all changes are at row level.

Re: [I] BatchDataReader never closes CloseableIterable it creates [iceberg]

2025-07-16 Thread via GitHub
manuzhang commented on issue #13567: URL: https://github.com/apache/iceberg/issues/13567#issuecomment-3082306093 I think it's acceptable when `CloseableIterable` has no resources to close.

Re: [I] Update Table Error: UPDATE TABLE is not supported temporarily. [iceberg]

2025-07-16 Thread via GitHub
gunarevuri commented on issue #9960: URL: https://github.com/apache/iceberg/issues/9960#issuecomment-3082294265 I'm using EMR version 7.0.0, Spark 3.5.0, and Scala 2.12.17, and I'm facing the same error. ---ERROR in EMR--- org.apache.spark.SparkUnsupportedOperationException: UPDATE

Re: [I] Kafka Connect sink connector plugin consumer group prefix [iceberg]

2025-07-16 Thread via GitHub
Leonti commented on issue #13236: URL: https://github.com/apache/iceberg/issues/13236#issuecomment-3082281551 We have experienced the same issue. After trying to migrate from the Tabular/Databricks version, we noticed some data loss. It turns out the previous implementation used [config.con

Re: [PR] feat: `validate_no_new_added_delete_files` [iceberg-python]

2025-07-16 Thread via GitHub
sungwy commented on code in PR #2176: URL: https://github.com/apache/iceberg-python/pull/2176#discussion_r2212019384 ## pyiceberg/partitioning.py: ## @@ -272,6 +272,60 @@ def assign_fresh_partition_spec_ids(spec: PartitionSpec, old_schema: Schema, fre T = TypeVar("T") +cla

Re: [I] EPIC: Implement register_table for catalogs. [iceberg-rust]

2025-07-16 Thread via GitHub
liurenjie1024 commented on issue #1508: URL: https://github.com/apache/iceberg-rust/issues/1508#issuecomment-3082203677 > Hi [@liurenjie1024](https://github.com/liurenjie1024) , I've added a list of sub-issues in the description. it seems like I can convert the tasks to issues instead of su

Re: [PR] Make 'Drop table' behavior same in hive4 and spark3.5 when use hiveca… [iceberg]

2025-07-16 Thread via GitHub
hidataplus commented on PR #13497: URL: https://github.com/apache/iceberg/pull/13497#issuecomment-3082122115 @aokolnychyi @rdblue @nastra @RussellSpitzer could you help review?

Re: [PR] Docs: Document compute_partition_stats procedure [iceberg]

2025-07-16 Thread via GitHub
szehon-ho commented on code in PR #13532: URL: https://github.com/apache/iceberg/pull/13532#discussion_r2211933262 ## docs/docs/spark-procedures.md: ## @@ -974,6 +974,38 @@ Collect statistics of the snapshot with id `snap1` of table `my_table` for colum CALL catalog_name.syste

Re: [I] Improve `dev/Dockerfile` [iceberg-python]

2025-07-16 Thread via GitHub
github-actions[bot] commented on issue #1527: URL: https://github.com/apache/iceberg-python/issues/1527#issuecomment-3081925063 This issue has been automatically marked as stale because it has been open for 180 days with no activity. It will be closed in next 14 days if no further activity

Re: [PR] Spark: Show owner while describing views w/ extended info [iceberg]

2025-07-16 Thread via GitHub
github-actions[bot] commented on PR #13293: URL: https://github.com/apache/iceberg/pull/13293#issuecomment-3081919081 This pull request has been marked as stale due to 30 days of inactivity. It will be closed in 1 week if no further activity occurs. If you think that’s incorrect or this pul

Re: [PR] Core: Enhance logging messages for snapshot expiration [iceberg]

2025-07-16 Thread via GitHub
github-actions[bot] closed pull request #13279: Core: Enhance logging messages for snapshot expiration URL: https://github.com/apache/iceberg/pull/13279

Re: [PR] Core: Enhance logging messages for snapshot expiration [iceberg]

2025-07-16 Thread via GitHub
github-actions[bot] commented on PR #13279: URL: https://github.com/apache/iceberg/pull/13279#issuecomment-3081918996 This pull request has been closed due to lack of activity. This is not a judgement on the merit of the PR in any way. It is just a way of keeping the PR queue manageable. If

Re: [PR] BigQuery: Support loading credentials from JSON file [iceberg]

2025-07-16 Thread via GitHub
github-actions[bot] commented on PR #13052: URL: https://github.com/apache/iceberg/pull/13052#issuecomment-3081918767 This pull request has been marked as stale due to 30 days of inactivity. It will be closed in 1 week if no further activity occurs. If you think that’s incorrect or this pul

Re: [PR] Introduce MetricsMaxInferredColumnDefaultsStrategy [iceberg]

2025-07-16 Thread via GitHub
github-actions[bot] commented on PR #13039: URL: https://github.com/apache/iceberg/pull/13039#issuecomment-3081918679 This pull request has been marked as stale due to 30 days of inactivity. It will be closed in 1 week if no further activity occurs. If you think that’s incorrect or this pul

Re: [I] Iceberg Maven artifacts do not declare proper dependencies [iceberg]

2025-07-16 Thread via GitHub
github-actions[bot] commented on issue #11994: URL: https://github.com/apache/iceberg/issues/11994#issuecomment-3081918482 This issue has been automatically marked as stale because it has been open for 180 days with no activity. It will be closed in next 14 days if no further activity occur

Re: [I] Flink Iceberg Writer : To be able to use copy-on-write mode to write the iceberg tables for batch jobs [iceberg]

2025-07-16 Thread via GitHub
github-actions[bot] commented on issue #11893: URL: https://github.com/apache/iceberg/issues/11893#issuecomment-3081918007 This issue has been closed because it has not received any activity in the last 14 days since being marked as 'stale'

Re: [I] Support both adjust-to-utc and local-timestamp-micros in Avro Data Type Mappings [iceberg]

2025-07-16 Thread via GitHub
github-actions[bot] commented on issue #11903: URL: https://github.com/apache/iceberg/issues/11903#issuecomment-3081918082 This issue has been automatically marked as stale because it has been open for 180 days with no activity. It will be closed in next 14 days if no further activity occur

Re: [I] Field not found in source schema [iceberg]

2025-07-16 Thread via GitHub
github-actions[bot] closed issue #11843: Field not found in source schema URL: https://github.com/apache/iceberg/issues/11843

Re: [I] Field not found in source schema [iceberg]

2025-07-16 Thread via GitHub
github-actions[bot] commented on issue #11843: URL: https://github.com/apache/iceberg/issues/11843#issuecomment-3081917898 This issue has been closed because it has not received any activity in the last 14 days since being marked as 'stale'

Re: [I] Flink Iceberg Writer : To be able to use copy-on-write mode to write the iceberg tables for batch jobs [iceberg]

2025-07-16 Thread via GitHub
github-actions[bot] closed issue #11893: Flink Iceberg Writer : To be able to use copy-on-write mode to write the iceberg tables for batch jobs URL: https://github.com/apache/iceberg/issues/11893

Re: [I] when MERGE INTO a merge-on-read table got NoSuchMethodError [iceberg]

2025-07-16 Thread via GitHub
github-actions[bot] closed issue #11821: when MERGE INTO a merge-on-read table got NoSuchMethodError URL: https://github.com/apache/iceberg/issues/11821

Re: [I] when MERGE INTO a merge-on-read table got NoSuchMethodError [iceberg]

2025-07-16 Thread via GitHub
github-actions[bot] commented on issue #11821: URL: https://github.com/apache/iceberg/issues/11821#issuecomment-3081917781 This issue has been closed because it has not received any activity in the last 14 days since being marked as 'stale'

Re: [PR] INFRA: Add 1.9.2 to latest [iceberg]

2025-07-16 Thread via GitHub
stevenzwu merged PR #13577: URL: https://github.com/apache/iceberg/pull/13577

Re: [PR] feat(datafusion): Support insert_into in IcebergTableProvider [iceberg-rust]

2025-07-16 Thread via GitHub
CTTY commented on code in PR #1511: URL: https://github.com/apache/iceberg-rust/pull/1511#discussion_r2211833861 ## crates/integrations/datafusion/src/physical_plan/write.rs: ## @@ -0,0 +1,371 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contribu

Re: [PR] Core: Add Schema evolution test with partition transform on a field with default values [iceberg]

2025-07-16 Thread via GitHub
anoopj commented on code in PR #13570: URL: https://github.com/apache/iceberg/pull/13570#discussion_r2211791069 ## core/src/test/java/org/apache/iceberg/TestScansAndSchemaEvolution.java: ## @@ -182,4 +183,77 @@ public void testAddColumnWithDefaultValueAndQuery() throws IOExcept

Re: [PR] Core: Registering tables to nonexistent target namespace leads to metadata deletion in HiveCatalog [iceberg]

2025-07-16 Thread via GitHub
dramaticlly commented on PR #13434: URL: https://github.com/apache/iceberg/pull/13434#issuecomment-3081658347 > We're registering existing Iceberg tables to HiveCatalog and realize that the metadata.json files used are deleted when the target namespace doesn't exist. I think the othe

Re: [PR] Spark 4: Support Parquet dictionary encoded UUIDs [iceberg]

2025-07-16 Thread via GitHub
amogh-jahagirdar merged PR #13573: URL: https://github.com/apache/iceberg/pull/13573

Re: [PR] Site: Updates for 1.9.2 Release [iceberg]

2025-07-16 Thread via GitHub
stevenzwu merged PR #13578: URL: https://github.com/apache/iceberg/pull/13578

Re: [PR] feat(transaction): Add retry logic to transaction [iceberg-rust]

2025-07-16 Thread via GitHub
CTTY commented on PR #1484: URL: https://github.com/apache/iceberg-rust/pull/1484#issuecomment-3081245436 Hi @Fokko, on the question of whether to retry, I think you will find this [thread](https://github.com/apache/iceberg-rust/pull/1383) interesting :) --- tldr: It's controlled

[PR] Site: Updates for 1.9.2 Release [iceberg]

2025-07-16 Thread via GitHub
singhpk234 opened a new pull request, #13578: URL: https://github.com/apache/iceberg/pull/13578 (no comment)

Re: [PR] Docs: Adds 1.9.2 Versioned JavaDocs [iceberg]

2025-07-16 Thread via GitHub
stevenzwu merged PR #13576: URL: https://github.com/apache/iceberg/pull/13576

Re: [PR] Site: Add Versioned Docs for 1.9.2 [iceberg]

2025-07-16 Thread via GitHub
stevenzwu merged PR #13575: URL: https://github.com/apache/iceberg/pull/13575

[PR] INFRA: Add 1.9.2 to latest [iceberg]

2025-07-16 Thread via GitHub
singhpk234 opened a new pull request, #13577: URL: https://github.com/apache/iceberg/pull/13577 ### About changes - Add 1.9.2 for bug report - updates the doap file

[PR] Docs: Adds 1.9.2 Versioned JavaDocs [iceberg]

2025-07-16 Thread via GitHub
singhpk234 opened a new pull request, #13576: URL: https://github.com/apache/iceberg/pull/13576 ### About the change Add java versioned docs : Steps followed as mentioned here - https://iceberg.apache.org/how-to-release/#versioned-javadoc - downloaded 1.9.2 - https:/

Re: [PR] Core: Batch load new files when validating replaced partitions [iceberg]

2025-07-16 Thread via GitHub
gabeiglio commented on code in PR #13556: URL: https://github.com/apache/iceberg/pull/13556#discussion_r2211624736 ## core/src/main/java/org/apache/iceberg/util/SnapshotUtil.java: ## @@ -303,6 +309,32 @@ public static List newFiles( return newFiles; } + public static

Re: [PR] Metrics reporting [iceberg-rust]

2025-07-16 Thread via GitHub
DerGut commented on PR #1496: URL: https://github.com/apache/iceberg-rust/pull/1496#issuecomment-3080807218 Thanks for sharing! This is great 👏 I'm working on a doc to bring our discussion to a wider audience as it touches on points that are relevant for all the client implementations. Wi

[PR] Site: Add Versioned Docs for 1.9.2 [iceberg]

2025-07-16 Thread via GitHub
singhpk234 opened a new pull request, #13575: URL: https://github.com/apache/iceberg/pull/13575 ### About the change Adds versioned docs for 1.9.2 double verified the diff with 1.9.1, 1.9.0 ➜ iceberg git:(release-192/versioned-docs) ✗ diff 1.9.1 1.9.0 ``` Common s

Re: [PR] Core: Add Schema evolution test with partition transform on a field with default values [iceberg]

2025-07-16 Thread via GitHub
ebyhr commented on code in PR #13570: URL: https://github.com/apache/iceberg/pull/13570#discussion_r2211601364 ## core/src/test/java/org/apache/iceberg/TestScansAndSchemaEvolution.java: ## @@ -182,4 +183,77 @@ public void testAddColumnWithDefaultValueAndQuery() throws IOExcepti

Re: [PR] kafka-connect: resolve CVE-2025-48734 [iceberg]

2025-07-16 Thread via GitHub
stevenzwu commented on code in PR #13561: URL: https://github.com/apache/iceberg/pull/13561#discussion_r2211564032 ## kafka-connect/build.gradle: ## @@ -262,4 +263,4 @@ project(':iceberg-kafka-connect:iceberg-kafka-connect-transforms') { test { useJUnitPlatform() } -

Re: [PR] Metrics reporting [iceberg-rust]

2025-07-16 Thread via GitHub
sdd commented on PR #1496: URL: https://github.com/apache/iceberg-rust/pull/1496#issuecomment-3080046185 I'm in the process of adding an integration test that showcases how to export traces via OTEL OTLP to Jaeger running in a container alongside the rest of the integration test containers.

Re: [I] kafka-connect: non-hive build includes hive-related deps [iceberg]

2025-07-16 Thread via GitHub
liko9 commented on issue #13574: URL: https://github.com/apache/iceberg/issues/13574#issuecomment-3080034069 Specifically speaking, https://github.com/apache/iceberg/blob/main/kafka-connect/build.gradle#L96-L110 is used by both hive and non hive - probably have to reassess this section: ht

Re: [PR] Core: Batch load new files when validating replaced partitions [iceberg]

2025-07-16 Thread via GitHub
gabeiglio commented on code in PR #13556: URL: https://github.com/apache/iceberg/pull/13556#discussion_r2211382128 ## core/src/main/java/org/apache/iceberg/util/SnapshotUtil.java: ## @@ -303,6 +309,32 @@ public static List newFiles( return newFiles; } + public static

[I] kafka-connect: non-hive build includes hive-related deps [iceberg]

2025-07-16 Thread via GitHub
liko9 opened a new issue, #13574: URL: https://github.com/apache/iceberg/issues/13574 ### Apache Iceberg version main (development) ### Query engine Kafka Connect ### Please describe the bug 🐞 When investigating a CVE in the Kafka Connect build, I discovered

Re: [PR] Bump version to 0.6.0 (Round 1) [iceberg-rust]

2025-07-16 Thread via GitHub
Fokko commented on PR #1506: URL: https://github.com/apache/iceberg-rust/pull/1506#issuecomment-3080016179 Hey @Xuanwo I think we've missed the `pyiceberg-core` one: https://github.com/apache/iceberg-rust/blob/145afdf4b553e0f1b79d27b6cbf9b3af04f87e4e/bindings/python/Cargo.toml#L23 -

Re: [PR] refactor: consolidate snapshot expiration into MaintenanceTable [iceberg-python]

2025-07-16 Thread via GitHub
ForeverAngry commented on code in PR #2143: URL: https://github.com/apache/iceberg-python/pull/2143#discussion_r2211367952 ## pyiceberg/table/inspect.py: ## @@ -681,6 +681,32 @@ def all_manifests(self) -> "pa.Table": ) return pa.concat_tables(manifests_by_snaps

Re: [I] Merge snapshots into 1 under transaction of multiple operations [iceberg-python]

2025-07-16 Thread via GitHub
Fokko commented on issue #2201: URL: https://github.com/apache/iceberg-python/issues/2201#issuecomment-3079986442 To add to the above, the `upsert` produces both `DELETE` and `APPEND` operations; we could of course merge consecutive `APPEND` operations. The question here is, should the use

Re: [I] Merge snapshots into 1 under transaction of multiple operations [iceberg-python]

2025-07-16 Thread via GitHub
Fokko commented on issue #2201: URL: https://github.com/apache/iceberg-python/issues/2201#issuecomment-3079940333 This is very common for PyIceberg. For example, the `upsert` operation can easily produce four snapshots. Technically, you can squash them all in an `OVERWRITE` snapshot, but t

Re: [PR] add a `Makefile` to `vendor/` [iceberg-python]

2025-07-16 Thread via GitHub
Fokko commented on code in PR #2218: URL: https://github.com/apache/iceberg-python/pull/2218#discussion_r2211290683 ## vendor/Makefile: ## @@ -0,0 +1,40 @@ +# Licensed to the Apache Software Foundation (ASF) under one +# or more contributor license agreements. See the NOTICE fi

Re: [PR] chore: upgrade nanoarrow dependency [iceberg-cpp]

2025-07-16 Thread via GitHub
Fokko commented on PR #146: URL: https://github.com/apache/iceberg-cpp/pull/146#issuecomment-3079831628 Thanks @gty404 for bumping nanoarrow, and thanks @zhjwpku, @wgtmac and @lidavidm for the review 💪

[PR] Spark 4: Support Parquet dictionary encoded UUIDs [iceberg]

2025-07-16 Thread via GitHub
Fokko opened a new pull request, #13573: URL: https://github.com/apache/iceberg/pull/13573 https://github.com/apache/iceberg/pull/13324 for Spark 4.0

Re: [PR] chore: upgrade nanoarrow dependency [iceberg-cpp]

2025-07-16 Thread via GitHub
Fokko merged PR #146: URL: https://github.com/apache/iceberg-cpp/pull/146

Re: [I] org.apache.thrift.TApplicationException: Invalid method name: 'get_table' [iceberg]

2025-07-16 Thread via GitHub
opendoc-tree commented on issue #12878: URL: https://github.com/apache/iceberg/issues/12878#issuecomment-3079792165 > [@opendoc-tree](https://github.com/opendoc-tree) where does the `HiveMetaStoreClient` come from? It's not from `hive-standalone-metastore-common-4.0.1.jar` https://github.

Re: [PR] Spark: Support Parquet dictionary encoded UUIDs [iceberg]

2025-07-16 Thread via GitHub
Fokko commented on code in PR #13324: URL: https://github.com/apache/iceberg/pull/13324#discussion_r2211247258 ## spark/v3.5/spark/src/test/java/org/apache/iceberg/spark/data/vectorized/parquet/TestParquetVectorizedReads.java: ## @@ -400,4 +400,19 @@ public void testUnsupportedR

Re: [PR] Core: Batch load new files when validating replaced partitions [iceberg]

2025-07-16 Thread via GitHub
bryanck commented on code in PR #13556: URL: https://github.com/apache/iceberg/pull/13556#discussion_r221125 ## core/src/main/java/org/apache/iceberg/util/SnapshotUtil.java: ## @@ -303,6 +309,32 @@ public static List newFiles( return newFiles; } + public static Cl

Re: [PR] feat(transaction): Add retry logic to transaction [iceberg-rust]

2025-07-16 Thread via GitHub
Fokko commented on PR #1484: URL: https://github.com/apache/iceberg-rust/pull/1484#issuecomment-3079684007 This PR introduces retries, but I think there are two main reasons to retry: - *Due to network interruptions.* E.g., some network issue, or a malfunctioning load-balancer (5xx er

Re: [I] Implement register_table for rest catalog [iceberg-rust]

2025-07-16 Thread via GitHub
gabeiglio commented on issue #1516: URL: https://github.com/apache/iceberg-rust/issues/1516#issuecomment-3079677968 Working on this one; I should have a PR soon.

Re: [PR] Use short string in Variant when possible [iceberg]

2025-07-16 Thread via GitHub
manirajv06 commented on PR #13284: URL: https://github.com/apache/iceberg/pull/13284#issuecomment-3079643808 @RussellSpitzer Can we merge? I have follow-up changes covering multi-byte header tests in my local working copy, ready to raise as a PR.

Re: [PR] kafka-connect: resolve CVE-2025-48734 [iceberg]

2025-07-16 Thread via GitHub
ajantha-bhat commented on code in PR #13561: URL: https://github.com/apache/iceberg/pull/13561#discussion_r225485 ## kafka-connect/build.gradle: ## @@ -64,9 +64,14 @@ project(':iceberg-kafka-connect:iceberg-kafka-connect-runtime') { configurations { hive { ex

[PR] add a `Makefile` to `vendor/` [iceberg-python]

2025-07-16 Thread via GitHub
kevinjqliu opened a new pull request, #2218: URL: https://github.com/apache/iceberg-python/pull/2218 # Rationale for this change Add a Makefile to `vendor/`. This helps with running commands to regenerate `vendor/` ``` # Generate all vendor packages: make all

Re: [PR] Support for TIME, TIMESTAMPNTZ_NANO, UUID types in Inclusive Metrics Evaluator [iceberg]

2025-07-16 Thread via GitHub
manirajv06 commented on code in PR #13195: URL: https://github.com/apache/iceberg/pull/13195#discussion_r2211108889 ## api/src/main/java/org/apache/iceberg/expressions/VariantExpressionUtil.java: ## @@ -111,8 +113,19 @@ static T castTo(VariantValue value, Type type) {

Re: [I] Add register_table to Catalog trait [iceberg-rust]

2025-07-16 Thread via GitHub
CTTY commented on issue #1508: URL: https://github.com/apache/iceberg-rust/issues/1508#issuecomment-3079604761 Hi @gabeiglio , please go ahead, and thanks for jumping on this!

Re: [PR] feat: update pyiceberg/catalog/hive.py to support hive 4.x.x [iceberg-python]

2025-07-16 Thread via GitHub
kevinjqliu commented on PR #2206: URL: https://github.com/apache/iceberg-python/pull/2206#issuecomment-3079573198 ah this is the gift that keeps on giving... I've made a couple changes in https://github.com/apache/iceberg-python/pull/2217 to - use hive 4.0.1 in integration tests

Re: [I] Add register_table to Catalog trait [iceberg-rust]

2025-07-16 Thread via GitHub
gabeiglio commented on issue #1508: URL: https://github.com/apache/iceberg-rust/issues/1508#issuecomment-3079570093 Is somebody working on the REST catalog one? If not, I'll take it :)

Re: [PR] Adding promotion for UnknownType per V3+ spec [iceberg-python]

2025-07-16 Thread via GitHub
gabeiglio commented on PR #2155: URL: https://github.com/apache/iceberg-python/pull/2155#issuecomment-3079553626 Thanks for working on this! The changes LGTM. Tagging @Fokko for another pair of eyes :)

Re: [PR] refactor: consolidate snapshot expiration into MaintenanceTable [iceberg-python]

2025-07-16 Thread via GitHub
aammar5 commented on code in PR #2143: URL: https://github.com/apache/iceberg-python/pull/2143#discussion_r2211053844 ## pyiceberg/table/inspect.py: ## @@ -681,6 +681,32 @@ def all_manifests(self) -> "pa.Table": ) return pa.concat_tables(manifests_by_snapshots)

Re: [PR] refactor: consolidate snapshot expiration into MaintenanceTable [iceberg-python]

2025-07-16 Thread via GitHub
aammar5 commented on code in PR #2143: URL: https://github.com/apache/iceberg-python/pull/2143#discussion_r2211014819 ## pyiceberg/table/inspect.py: ## @@ -681,6 +681,32 @@ def all_manifests(self) -> "pa.Table": ) return pa.concat_tables(manifests_by_snapshots)

Re: [PR] kafka-connect: resolve CVE-2025-48734 [iceberg]

2025-07-16 Thread via GitHub
liko9 commented on code in PR #13561: URL: https://github.com/apache/iceberg/pull/13561#discussion_r2211034266 ## kafka-connect/build.gradle: ## @@ -64,9 +64,14 @@ project(':iceberg-kafka-connect:iceberg-kafka-connect-runtime') { configurations { hive { extendsFr

Re: [I] Proxy Settings for catalog REST API client [iceberg]

2025-07-16 Thread via GitHub
devonthomas35 commented on issue #12059: URL: https://github.com/apache/iceberg/issues/12059#issuecomment-3079523315 Has the above commit been released in a library version?

Re: [PR] Core: Batch load new files when validating replaced partitions [iceberg]

2025-07-16 Thread via GitHub
gabeiglio commented on code in PR #13556: URL: https://github.com/apache/iceberg/pull/13556#discussion_r2211022305 ## core/src/main/java/org/apache/iceberg/util/SnapshotUtil.java: ## @@ -281,6 +282,10 @@ private static Iterable toIds(Iterable snapshots) { return Iterables.

Re: [PR] refactor: consolidate snapshot expiration into MaintenanceTable [iceberg-python]

2025-07-16 Thread via GitHub
aammar5 commented on code in PR #2143: URL: https://github.com/apache/iceberg-python/pull/2143#discussion_r2211014819 ## pyiceberg/table/inspect.py: ## @@ -681,6 +681,32 @@ def all_manifests(self) -> "pa.Table": ) return pa.concat_tables(manifests_by_snapshots)

Re: [I] org.apache.thrift.TApplicationException: Invalid method name: 'get_table' [iceberg]

2025-07-16 Thread via GitHub
deniskuzZ commented on issue #12878: URL: https://github.com/apache/iceberg/issues/12878#issuecomment-3079439018 @opendoc-tree where does the `HiveMetaStoreClient` come from?

Re: [PR] kafka-connect: resolve CVE-2025-48734 [iceberg]

2025-07-16 Thread via GitHub
ajantha-bhat commented on code in PR #13561: URL: https://github.com/apache/iceberg/pull/13561#discussion_r2210960084 ## kafka-connect/build.gradle: ## @@ -64,9 +64,14 @@ project(':iceberg-kafka-connect:iceberg-kafka-connect-runtime') { configurations { hive { ex

Re: [I] Expiring snapshot can erroneously delete data files that are still referenced [iceberg]

2025-07-16 Thread via GitHub
sqd commented on issue #13568: URL: https://github.com/apache/iceberg/issues/13568#issuecomment-3079427630 @amogh-jahagirdar yep, [the safeguard](https://github.com/apache/iceberg/blob/apache-iceberg-1.9.1/core/src/main/java/org/apache/iceberg/RemoveSnapshots.java#L376) only checks whether

Re: [PR] Support for TIME, TIMESTAMPNTZ_NANO, UUID types in Inclusive Metrics Evaluator [iceberg]

2025-07-16 Thread via GitHub
manirajv06 commented on code in PR #13195: URL: https://github.com/apache/iceberg/pull/13195#discussion_r2210954103 ## api/src/main/java/org/apache/iceberg/expressions/VariantExpressionUtil.java: ## @@ -111,8 +113,19 @@ static T castTo(VariantValue value, Type type) {

[I] Iceberg spark runtime 4.0 can't support hive 4 after change metastore version in spark conf [iceberg]

2025-07-16 Thread via GitHub
opendoc-tree opened a new issue, #13572: URL: https://github.com/apache/iceberg/issues/13572 ### Apache Iceberg version None ### Query engine Spark ### Please describe the bug 🐞 **VERSION :** spark - 4.0.0 hive - 4.0.1 iceberg-spark-runtime-4.0_2.13

Re: [PR] Spark 4.0: Preserve row lineage information on compaction [iceberg]

2025-07-16 Thread via GitHub
amogh-jahagirdar commented on code in PR #13555: URL: https://github.com/apache/iceberg/pull/13555#discussion_r2210910659 ## spark/v4.0/spark-extensions/src/test/java/org/apache/iceberg/spark/extensions/TestRewriteDataFilesProcedure.java: ## @@ -976,19 +978,67 @@ public void tes

Re: [PR] feat(datafusion): Support insert_into in IcebergTableProvider [iceberg-rust]

2025-07-16 Thread via GitHub
CTTY commented on code in PR #1511: URL: https://github.com/apache/iceberg-rust/pull/1511#discussion_r2210912250 ## crates/integrations/datafusion/src/physical_plan/write.rs: ## @@ -0,0 +1,371 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contribu

Re: [PR] feat(datafusion): Support insert_into in IcebergTableProvider [iceberg-rust]

2025-07-16 Thread via GitHub
CTTY commented on code in PR #1511: URL: https://github.com/apache/iceberg-rust/pull/1511#discussion_r2210895355 ## crates/integrations/datafusion/src/table/mod.rs: ## @@ -46,6 +50,8 @@ pub struct IcebergTableProvider { snapshot_id: Option, /// A reference-counted arro

Re: [I] org.apache.thrift.TApplicationException: Invalid method name: 'get_table' [iceberg]

2025-07-16 Thread via GitHub
opendoc-tree commented on issue #12878: URL: https://github.com/apache/iceberg/issues/12878#issuecomment-3079346515 > This can let Spark work with Hive 4.0.1 (with HIVE-26537), although this is not an elegant solution. :( I tried it, but it is not working. spark 4.0.1 hive 4.0.1

Re: [PR] Spark 4.0: Preserve row lineage information on compaction [iceberg]

2025-07-16 Thread via GitHub
amogh-jahagirdar commented on PR #13555: URL: https://github.com/apache/iceberg/pull/13555#issuecomment-3079340267 I was talking to @aokolnychyi earlier and he brought up a promising approach of using a special identifier that's used internally in the spark compaction implementation, when l

Re: [PR] feat: support incremental scan between 2 snapshots [iceberg-rust]

2025-07-16 Thread via GitHub
CTTY commented on code in PR #1470: URL: https://github.com/apache/iceberg-rust/pull/1470#discussion_r2210875745 ## crates/iceberg/src/scan/context.rs: ## @@ -262,6 +346,61 @@ impl PlanContext { field_ids: self.field_ids.clone(), expression_evaluator_ca

Re: [PR] Core: Batch load new files when validating replaced partitions [iceberg]

2025-07-16 Thread via GitHub
bryanck commented on code in PR #13556: URL: https://github.com/apache/iceberg/pull/13556#discussion_r2210867745 ## core/src/main/java/org/apache/iceberg/util/SnapshotUtil.java: ## @@ -281,6 +282,10 @@ private static Iterable toIds(Iterable snapshots) { return Iterables.tr

Re: [I] Add register_table to Catalog trait [iceberg-rust]

2025-07-16 Thread via GitHub
CTTY commented on issue #1508: URL: https://github.com/apache/iceberg-rust/issues/1508#issuecomment-3079277675 Hi @liurenjie1024, I've added a list of sub-issues in the description. It seems like I can convert the tasks to issues instead of sub-issues, maybe it's a permission problem

[I] Spark runtime small packaging issues: service files & annotation dependencies. [iceberg]

2025-07-16 Thread via GitHub
LDVSOFT opened a new issue, #13571: URL: https://github.com/apache/iceberg/issues/13571 ## Description Packaged `iceberg-spark-runtime-‹sparkApi›_‹scalaAbi›` artifacts are shadow/fat jars with relocated dependencies to be used against Spark deployments without conflicts. Unfortunatel
