Re: [PR] Build: Bump pydantic from 2.10.6 to 2.11.1 [iceberg-python]

2025-04-03 Thread via GitHub
Fokko commented on PR #1869: URL: https://github.com/apache/iceberg-python/pull/1869#issuecomment-2777694368 @dependabot rebase -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific commen

Re: [PR] HIVE-28801 Iceberg: Refactor HMS table parameter setting to be able to reuse [iceberg]

2025-04-03 Thread via GitHub
zratkai commented on code in PR #12461: URL: https://github.com/apache/iceberg/pull/12461#discussion_r2028197038 ## hive-metastore/src/main/java/org/apache/iceberg/hive/HMSTablePropertyHelper.java: ## @@ -0,0 +1,264 @@ +/* + * Licensed to the Apache Software Foundation (ASF) und

Re: [PR] HIVE-28801 Iceberg: Refactor HMS table parameter setting to be able to reuse [iceberg]

2025-04-03 Thread via GitHub
zratkai commented on code in PR #12461: URL: https://github.com/apache/iceberg/pull/12461#discussion_r2028193995 ## hive-metastore/src/main/java/org/apache/iceberg/hive/HMSTablePropertyHelper.java: ## @@ -0,0 +1,264 @@ +/* + * Licensed to the Apache Software Foundation (ASF) und

Re: [PR] Fix creation of Bucket Transforms with pydantic>=2.11.0 [iceberg-python]

2025-04-03 Thread via GitHub
Fokko merged PR #1881: URL: https://github.com/apache/iceberg-python/pull/1881 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@iceber

Re: [PR] Spec: Allow the use of `source-id` in V3 [iceberg]

2025-04-03 Thread via GitHub
Fokko commented on code in PR #12644: URL: https://github.com/apache/iceberg/pull/12644#discussion_r2028181000 ## format/spec.md: ## @@ -1453,13 +1457,15 @@ Each sort field in the fields list is stored as an object with the following pro | V1 | V2 | V3 | Fi

Re: [PR] Core: Update deprecation msg [iceberg]

2025-04-03 Thread via GitHub
nastra commented on PR #12720: URL: https://github.com/apache/iceberg/pull/12720#issuecomment-2777678813 merging this, since test failures are unrelated -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to

Re: [PR] Core: Update deprecation msg [iceberg]

2025-04-03 Thread via GitHub
nastra merged PR #12720: URL: https://github.com/apache/iceberg/pull/12720 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@iceberg.ap

Re: [PR] Build and test hive-metastore with Hive 3 and Hive 4 [iceberg]

2025-04-03 Thread via GitHub
wypoon commented on PR #12681: URL: https://github.com/apache/iceberg/pull/12681#issuecomment-2777665104 @pvary see https://github.com/apache/iceberg/pull/12721. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL

[PR] Alternative implementation for building and testing hive-metastore with Hive 3 and Hive 4 [iceberg]

2025-04-03 Thread via GitHub
wypoon opened a new pull request, #12721: URL: https://github.com/apache/iceberg/pull/12721 ... where we attempt to have a single version of TestHiveMetastore.java. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the U

Re: [PR] Build and test hive-metastore with Hive 3 and Hive 4 [iceberg]

2025-04-03 Thread via GitHub
wypoon commented on PR #12681: URL: https://github.com/apache/iceberg/pull/12681#issuecomment-2777659880 @pvary I replied on the dev list. > I really would like to see HiveCatalog testing against Hive4, but Spark has constraints around testing newer Hive versions which we need to solv

Re: [PR] Spark: support rewrite on specified target branch [iceberg]

2025-04-03 Thread via GitHub
amitgilad3 commented on code in PR #12257: URL: https://github.com/apache/iceberg/pull/12257#discussion_r2028156549 ## core/src/main/java/org/apache/iceberg/actions/RewriteDataFilesCommitManager.java: ## @@ -51,7 +53,12 @@ public RewriteDataFilesCommitManager(Table table, long

[PR] Core: Update deprecation msg [iceberg]

2025-04-03 Thread via GitHub
nastra opened a new pull request, #12720: URL: https://github.com/apache/iceberg/pull/12720 (no comment) -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-m

Re: [PR] AWS: Add AWS integ tests to check task and enable tests based on required environment variables [iceberg]

2025-04-03 Thread via GitHub
nastra commented on code in PR #12671: URL: https://github.com/apache/iceberg/pull/12671#discussion_r2028127620 ## build.gradle: ## @@ -530,6 +530,7 @@ project(':iceberg-aws') { recommend.set(true) } check.dependsOn('validateS3SignerSpec') + check.dependsOn('integrat

[I] Ice berg sink gives error: Invalid method name: 'get_table` [iceberg]

2025-04-03 Thread via GitHub
Souldiv opened a new issue, #12719: URL: https://github.com/apache/iceberg/issues/12719 ### Query engine trino ### Question I have been playing around with kafka connect and iceberg with hms as catalog. I followed the 1.8.1 iceberg kafka-connect documentation with the s

[PR] AWS: Fix DynamoDB and Glue integration test failures [iceberg]

2025-04-03 Thread via GitHub
lliangyu-lin opened a new pull request, #12718: URL: https://github.com/apache/iceberg/pull/12718 ### Description * Fix integration test failures in aws dynamodb and glue * Some fixes are directly copied from this [closed PR](https://github.com/apache/iceberg/pull/7234) from @jackye199

Re: [PR] Added ExpireSnapshots Feature [iceberg-python]

2025-04-03 Thread via GitHub
ForeverAngry commented on code in PR #1880: URL: https://github.com/apache/iceberg-python/pull/1880#discussion_r2027638423 ## pyproject.toml: ## @@ -83,6 +83,7 @@ cachetools = "^5.5.0" pyiceberg-core = { version = "^0.4.0", optional = true } polars = { version = "^1.21.0", opt

Re: [PR] HIVE-28801 Iceberg: Refactor HMS table parameter setting to be able to reuse [iceberg]

2025-04-03 Thread via GitHub
wypoon commented on code in PR #12461: URL: https://github.com/apache/iceberg/pull/12461#discussion_r2027937054 ## hive-metastore/src/main/java/org/apache/iceberg/hive/HiveTableOperations.java: ## @@ -230,12 +190,14 @@ protected void doCommit(TableMetadata base, TableMetadata m

Re: [PR] feat: re-export name mapping [iceberg-rust]

2025-04-03 Thread via GitHub
jdockerty commented on code in PR #1116: URL: https://github.com/apache/iceberg-rust/pull/1116#discussion_r2025155007 ## crates/iceberg/src/spec/name_mapping.rs: ## @@ -33,12 +48,38 @@ pub struct NameMapping { #[serde(rename_all = "kebab-case")] pub struct MappedField { #

Re: [PR] Core: ability to inject an AuthManager in RESTCatalog [iceberg]

2025-04-03 Thread via GitHub
gh-yzou commented on code in PR #12655: URL: https://github.com/apache/iceberg/pull/12655#discussion_r2027459032 ## core/src/main/java/org/apache/iceberg/rest/RESTCatalog.java: ## @@ -65,7 +68,14 @@ public RESTCatalog(Function, RESTClient> clientBuilder) { public RESTCatalog

[PR] Use InputFile.location() Instead of Direct Object Reference in Error Messages [iceberg]

2025-04-03 Thread via GitHub
Jordano-Dremio opened a new pull request, #12716: URL: https://github.com/apache/iceberg/pull/12716 The InputFile Interface/Object is often used directly in error messages. [Example](https://github.com/apache/iceberg/blob/main/core/src/main/java/org/apache/iceberg/avro/AvroIterable.java#L104

Re: [PR] Doc: Add Hive 2.x/3.x support notes in hive.md [iceberg]

2025-04-03 Thread via GitHub
jackylee-ch commented on code in PR #12700: URL: https://github.com/apache/iceberg/pull/12700#discussion_r2027914368 ## docs/docs/hive.md: ## @@ -126,9 +90,6 @@ To enable Hive support globally for an application, set `iceberg.engine.hive.ena For example, setting this in the `h

Re: [PR] Doc: Add Hive 2.x/3.x support notes in hive.md [iceberg]

2025-04-03 Thread via GitHub
jackylee-ch commented on PR #12700: URL: https://github.com/apache/iceberg/pull/12700#issuecomment-2777263902 Done. I have checked the document again and didn't find any other problems. Please take another look at it. -- This is an automated message from the Apache Git Service. To respond

Re: [PR] AWS: Fix DynamoDB and Glue integration test failures [iceberg]

2025-04-03 Thread via GitHub
lliangyu-lin commented on code in PR #12718: URL: https://github.com/apache/iceberg/pull/12718#discussion_r2027881602 ## aws/src/integration/java/org/apache/iceberg/aws/glue/TestGlueCatalogTable.java: ## @@ -333,28 +332,33 @@ public void testRenameTableFailsToCreateNewTable() {

[PR] Fix creation of Bucket Transforms with pydantic>=2.11.0 [iceberg-python]

2025-04-03 Thread via GitHub
b-rick opened a new pull request, #1881: URL: https://github.com/apache/iceberg-python/pull/1881 When using pydantic>=2.11.0, we get an error when creating bucket transforms. In this version, it's illegal to access self before calling super. To fix this, we just need to ensure we call `supe

Re: [I] Implement expire snapshots maintenance operation [iceberg-go]

2025-04-03 Thread via GitHub
arnaudbriche commented on issue #367: URL: https://github.com/apache/iceberg-go/issues/367#issuecomment-2776562353 After digging a bit deeper, I noticed that most of the features I need have already been implemented. I can see this in the code: ```go MetadataDeleteAfterCommitEna

Re: [I] Adding deleteFile method to existing RowDelta operation [iceberg]

2025-04-03 Thread via GitHub
RussellSpitzer commented on issue #12717: URL: https://github.com/apache/iceberg/issues/12717#issuecomment-2777194037 I'm removing the Specification tag, because this is a Reference IMPL change and shouldn't affect the spec -- This is an automated message from the Apache Git Service. To r

Re: [I] Adding deleteFile method to existing RowDelta operation [iceberg]

2025-04-03 Thread via GitHub
RussellSpitzer commented on issue #12717: URL: https://github.com/apache/iceberg/issues/12717#issuecomment-2777192668 Thanks @sn2479 . I think this is a reasonable thing to do, I believe our original implementation is just set up that way because of Spark's limitations but I think widening

Re: [PR] Fix thrift client connection for Kerberos Hive Client [iceberg-python]

2025-04-03 Thread via GitHub
Fokko commented on PR #1747: URL: https://github.com/apache/iceberg-python/pull/1747#issuecomment-2776740059 Gentle ping @kevinjqliu. Any thoughts on the `cached_property` suggested by @hussein-awala? -- This is an automated message from the Apache Git Service. To respond to the message,

Re: [PR] Core: Enhance remove snapshots efficiency by executing them in bulk [iceberg]

2025-04-03 Thread via GitHub
amogh-jahagirdar commented on PR #12670: URL: https://github.com/apache/iceberg/pull/12670#issuecomment-2777098589 Thanks @ricardopereira33 and thanks @nastra for reviewing! -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub an

Re: [PR] Core: lazy init workerPool [iceberg]

2025-04-03 Thread via GitHub
abstractdog commented on code in PR #12427: URL: https://github.com/apache/iceberg/pull/12427#discussion_r2026705407 ## core/src/main/java/org/apache/iceberg/RemoveSnapshots.java: ## @@ -153,6 +153,13 @@ public ExpireSnapshots planWith(ExecutorService executorService) { re

Re: [PR] Spec: update to reflect lineage is required [iceberg]

2025-04-03 Thread via GitHub
amogh-jahagirdar merged PR #12580: URL: https://github.com/apache/iceberg/pull/12580 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@

[PR] feat (catalog/rest): Add create view for rest catalog [iceberg-go]

2025-04-03 Thread via GitHub
dttung2905 opened a new pull request, #376: URL: https://github.com/apache/iceberg-go/pull/376 (no comment) -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe,

[I] Adding deleteFile method to existing RowDelta operation [iceberg]

2025-04-03 Thread via GitHub
sn2479 opened a new issue, #12717: URL: https://github.com/apache/iceberg/issues/12717 ### Proposed Change Some engines can not only produce new data files during a row-level change to a table, but also eliminate entire data files in the same pending update. Alternativ

Re: [PR] Spec: update to reflect lineage is required [iceberg]

2025-04-03 Thread via GitHub
amogh-jahagirdar commented on PR #12580: URL: https://github.com/apache/iceberg/pull/12580#issuecomment-2777032298 Thanks @danielcweeks , and @RussellSpitzer @rdblue @aokolnychyi for reviewing, and [all others in the community](https://lists.apache.org/thread/61w6lj3tmnyv1gtvmz08twbp6lb8nzs

Re: [PR] Core: Enhance remove snapshots efficiency by executing them in bulk [iceberg]

2025-04-03 Thread via GitHub
nastra commented on code in PR #12670: URL: https://github.com/apache/iceberg/pull/12670#discussion_r2026825325 ## core/src/main/java/org/apache/iceberg/MetadataUpdateParser.java: ## @@ -417,11 +426,9 @@ private static void writeAddSnapshot(MetadataUpdate.AddSnapshot update, Js

Re: [PR] Core: Enhance remove snapshots efficiency by executing them in bulk [iceberg]

2025-04-03 Thread via GitHub
ricardopereira33 commented on PR #12670: URL: https://github.com/apache/iceberg/pull/12670#issuecomment-2776402107 @nastra @amogh-jahagirdar Pushed the change you requested. Thank you guys for the fast reviews and feedback! 💪🏽 -- This is an automated message from the Apache Git Service.

Re: [I] Support data file compaction on branch via Spark `rewrite_data_files` procedure [iceberg]

2025-04-03 Thread via GitHub
relentless-leader commented on issue #7272: URL: https://github.com/apache/iceberg/issues/7272#issuecomment-2776801561 Latest PR tracked by: [#12257](https://github.com/apache/iceberg/pull/12257) -- This is an automated message from the Apache Git Service. To respond to the message, pleas

Re: [PR] Spark: support rewrite on specified target branch [iceberg]

2025-04-03 Thread via GitHub
relentless-leader commented on PR #12257: URL: https://github.com/apache/iceberg/pull/12257#issuecomment-2776797664 Are we good to merge this PR? Is any work pending on this? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub a

Re: [PR] Added ExpireSnapshots Feature [iceberg-python]

2025-04-03 Thread via GitHub
ForeverAngry commented on code in PR #1880: URL: https://github.com/apache/iceberg-python/pull/1880#discussion_r2027645539 ## pyiceberg/table/update/snapshot.py: ## @@ -745,6 +763,8 @@ class ManageSnapshots(UpdateTableMetadata["ManageSnapshots"]): def _commit(self) -> Up

Re: [PR] AWS: Add AWS integ tests to check task and enable tests based on required environment variables [iceberg]

2025-04-03 Thread via GitHub
lliangyu-lin commented on code in PR #12671: URL: https://github.com/apache/iceberg/pull/12671#discussion_r2027343151 ## aws/src/integration/java/org/apache/iceberg/aws/TestAssumeRoleAwsClientFactory.java: ## @@ -47,6 +48,16 @@ import software.amazon.awssdk.services.iam.model.P

Re: [PR] Added ExpireSnapshots Feature [iceberg-python]

2025-04-03 Thread via GitHub
ForeverAngry commented on code in PR #1880: URL: https://github.com/apache/iceberg-python/pull/1880#discussion_r2027643598 ## pyiceberg/table/update/snapshot.py: ## @@ -239,7 +257,7 @@ def _summary(self, snapshot_properties: Dict[str, str] = EMPTY_DICT) -> Summary:

Re: [PR] Added ExpireSnapshots Feature [iceberg-python]

2025-04-03 Thread via GitHub
Fokko commented on code in PR #1880: URL: https://github.com/apache/iceberg-python/pull/1880#discussion_r2027605325 ## pyiceberg/table/update/snapshot.py: ## @@ -745,6 +763,8 @@ class ManageSnapshots(UpdateTableMetadata["ManageSnapshots"]): def _commit(self) -> UpdatesAn

Re: [PR] Flink: Backport RowConverter to Flink 1.19 and 1.18 [iceberg]

2025-04-03 Thread via GitHub
pvary commented on PR #12713: URL: https://github.com/apache/iceberg/pull/12713#issuecomment-2776533157 @Guosmilesmile: Please keep separate PRs for every backport. It is easier on developers maintaining their own release. Could you please create 2 different PRs? Thanks, Peter

Re: [PR] Doc: Add Hive 2.x/3.x support notes in hive.md [iceberg]

2025-04-03 Thread via GitHub
pvary commented on code in PR #12700: URL: https://github.com/apache/iceberg/pull/12700#discussion_r2027452101 ## docs/docs/hive.md: ## @@ -602,7 +567,6 @@ Here are the features highlights for Iceberg Hive read support: 1. **Predicate pushdown**: Pushdown of the Hive SQL `WHER

Re: [PR] Core: Enhance remove snapshots efficiency by executing them in bulk [iceberg]

2025-04-03 Thread via GitHub
nastra commented on code in PR #12670: URL: https://github.com/apache/iceberg/pull/12670#discussion_r2026838728 ## core/src/main/java/org/apache/iceberg/MetadataUpdateParser.java: ## @@ -150,7 +150,8 @@ private MetadataUpdateParser() {} .put(MetadataUpdate.SetPartitio

Re: [PR] Fix the snapshot summary of a partial overwrite [iceberg-python]

2025-04-03 Thread via GitHub
Fokko commented on code in PR #1879: URL: https://github.com/apache/iceberg-python/pull/1879#discussion_r2027379456 ## pyiceberg/table/update/snapshot.py: ## @@ -236,7 +236,6 @@ def _summary(self, snapshot_properties: Dict[str, str] = EMPTY_DICT) -> Summary: return upd

Re: [PR] Core: Enhance remove snapshots efficiency by executing them in bulk [iceberg]

2025-04-03 Thread via GitHub
nastra commented on code in PR #12670: URL: https://github.com/apache/iceberg/pull/12670#discussion_r2026831937 ## core/src/main/java/org/apache/iceberg/MetadataUpdateParser.java: ## @@ -557,11 +564,18 @@ private static MetadataUpdate readAddSnapshot(JsonNode node) { private

Re: [PR] AWS: Add AWS integ tests to check task and enable tests based on required environment variables [iceberg]

2025-04-03 Thread via GitHub
lliangyu-lin commented on code in PR #12671: URL: https://github.com/apache/iceberg/pull/12671#discussion_r2027542721 ## aws/src/integration/java/org/apache/iceberg/aws/TestAssumeRoleAwsClientFactory.java: ## @@ -47,6 +49,14 @@ import software.amazon.awssdk.services.iam.model.P

[I] Move current AWS integration test to separate group and add mock based tests in CI pipeline [iceberg]

2025-04-03 Thread via GitHub
lliangyu-lin opened a new issue, #12715: URL: https://github.com/apache/iceberg/issues/12715 ### Feature Request / Improvement The current [AWS integration tests](https://github.com/apache/iceberg/tree/main/aws/src/integration/java/org/apache/iceberg/aws) are not well maintained beca

Re: [PR] Fix the snapshot summary of a partial overwrite [iceberg-python]

2025-04-03 Thread via GitHub
Fokko commented on code in PR #1879: URL: https://github.com/apache/iceberg-python/pull/1879#discussion_r2027521950 ## pyiceberg/table/update/snapshot.py: ## @@ -236,7 +236,6 @@ def _summary(self, snapshot_properties: Dict[str, str] = EMPTY_DICT) -> Summary: return upd

Re: [PR] Fix the snapshot summary of a partial overwrite [iceberg-python]

2025-04-03 Thread via GitHub
Fokko commented on code in PR #1879: URL: https://github.com/apache/iceberg-python/pull/1879#discussion_r2027518680 ## pyiceberg/table/update/snapshot.py: ## @@ -236,7 +236,6 @@ def _summary(self, snapshot_properties: Dict[str, str] = EMPTY_DICT) -> Summary: return upd

Re: [PR] partial overwrite operation [iceberg-python]

2025-04-03 Thread via GitHub
Fokko commented on PR #1840: URL: https://github.com/apache/iceberg-python/pull/1840#issuecomment-2775312549 Ahh thanks @kevinjqliu I don't think we've really implemented this properly. ![image](https://github.com/user-attachments/assets/526fa4a4-d0d6-420a-9b92-eb7886ec2325) Ja

Re: [PR] Core: Simplify AuthManager API [iceberg]

2025-04-03 Thread via GitHub
danielcweeks commented on PR #12555: URL: https://github.com/apache/iceberg/pull/12555#issuecomment-2776494155 I'm more than a little hesitant to start changing this now. Pivoting the way we think about sessions/scopes doesn't really add a lot at this point and we've just completed a major

Re: [PR] Doc: Add Hive 2.x/3.x support notes in hive.md [iceberg]

2025-04-03 Thread via GitHub
pvary commented on code in PR #12700: URL: https://github.com/apache/iceberg/pull/12700#discussion_r2027454952 ## docs/docs/hive.md: ## @@ -26,28 +26,18 @@ a [StorageHandler](https://cwiki.apache.org/confluence/display/Hive/StorageHandl ## Feature support The following featur

Re: [PR] HIVE-28801 Iceberg: Refactor HMS table parameter setting to be able to reuse [iceberg]

2025-04-03 Thread via GitHub
pvary commented on code in PR #12461: URL: https://github.com/apache/iceberg/pull/12461#discussion_r2027439420 ## hive-metastore/src/main/java/org/apache/iceberg/hive/HMSTablePropertyHelper.java: ## @@ -0,0 +1,264 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under

Re: [PR] Spec: Allow the use of `source-id` in V3 [iceberg]

2025-04-03 Thread via GitHub
rdblue commented on code in PR #12644: URL: https://github.com/apache/iceberg/pull/12644#discussion_r2027429855 ## format/spec.md: ## @@ -1414,12 +1414,16 @@ Each partition field in `fields` is stored as a JSON object with the following p | V1 | V2 | V3 | F

Re: [PR] Spec: Allow the use of `source-id` in V3 [iceberg]

2025-04-03 Thread via GitHub
rdblue commented on code in PR #12644: URL: https://github.com/apache/iceberg/pull/12644#discussion_r2027415792 ## format/spec.md: ## @@ -1453,13 +1457,15 @@ Each sort field in the fields list is stored as an object with the following pro | V1 | V2 | V3 | F

Re: [PR] Fix the snapshot summary of a partial overwrite [iceberg-python]

2025-04-03 Thread via GitHub
kevinjqliu commented on code in PR #1879: URL: https://github.com/apache/iceberg-python/pull/1879#discussion_r2027386623 ## tests/integration/test_writes/test_writes.py: ## @@ -262,6 +262,100 @@ def test_summaries(spark: SparkSession, session_catalog: Catalog, arrow_table_wi

[PR] Hive: Support custom HMSClient [iceberg]

2025-04-03 Thread via GitHub
jackylee-ch opened a new pull request, #12712: URL: https://github.com/apache/iceberg/pull/12712 This PR introduces configurable HMSClient customization to address specific use cases where enterprises require custom Hive Metastore client implementations. -- This is an automated message f

Re: [PR] AWS: Add AWS integ tests to check task and enable tests based on required environment variables [iceberg]

2025-04-03 Thread via GitHub
xiaoxuandev commented on code in PR #12671: URL: https://github.com/apache/iceberg/pull/12671#discussion_r2027383951 ## aws/src/integration/java/org/apache/iceberg/aws/TestAssumeRoleAwsClientFactory.java: ## @@ -47,6 +49,14 @@ import software.amazon.awssdk.services.iam.model.Pu

Re: [PR] Core: Enhance remove snapshots efficiency by executing them in bulk [iceberg]

2025-04-03 Thread via GitHub
nastra commented on code in PR #12670: URL: https://github.com/apache/iceberg/pull/12670#discussion_r2026824248 ## core/src/main/java/org/apache/iceberg/MetadataUpdateParser.java: ## @@ -229,7 +230,15 @@ public static void toJson(MetadataUpdate metadataUpdate, JsonGenerator gen

Re: [PR] spec: Variant lower/upper bounds [iceberg]

2025-04-03 Thread via GitHub
danielcweeks commented on PR #12658: URL: https://github.com/apache/iceberg/pull/12658#issuecomment-2776344693 @aihuaxu and @rdblue is there a reason we need to explicitly restrict the lower/upper bounds to shredded fields? I would think that the stats pruning would be useful for any field

Re: [PR] Fix the snapshot summary of a partial overwrite [iceberg-python]

2025-04-03 Thread via GitHub
kevinjqliu commented on code in PR #1879: URL: https://github.com/apache/iceberg-python/pull/1879#discussion_r2027360226 ## tests/integration/test_writes/test_writes.py: ## @@ -262,6 +262,100 @@ def test_summaries(spark: SparkSession, session_catalog: Catalog, arrow_table_wi

Re: [PR] Build: Retry flaky test [iceberg]

2025-04-03 Thread via GitHub
manuzhang commented on PR #12707: URL: https://github.com/apache/iceberg/pull/12707#issuecomment-2776330349 Let me try with https://github.com/apache/iceberg/pull/12714 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use t

Re: [PR] Core: ability to inject an AuthManager in RESTCatalog [iceberg]

2025-04-03 Thread via GitHub
ajantha-bhat commented on PR #12655: URL: https://github.com/apache/iceberg/pull/12655#issuecomment-2776096706 Moving this out of the 1.9.0 milestone for now, as we don't have a conclusion on this and it is not really a release blocker. -- This is an automated message from the Apache Git Se

Re: [PR] Doc: Add doc for flink exec config [iceberg]

2025-04-03 Thread via GitHub
Guosmilesmile commented on PR #12691: URL: https://github.com/apache/iceberg/pull/12691#issuecomment-2776038624 @pvary As you mentioned, some configs are about to expire, and some are mistakes. I’ve summarized all the configs: **Source:** table.exec.iceberg.infer-source-paral

Re: [PR] Flink: Backport RowConverter to Flink 1.19 and 1.18 [iceberg]

2025-04-03 Thread via GitHub
Guosmilesmile commented on PR #12713: URL: https://github.com/apache/iceberg/pull/12713#issuecomment-2776042921 @pvary This is a clean backport. Please take some time to review it. Thanks! -- This is an automated message from the Apache Git Service. To respond to the message, please lo

Re: [PR] Core: Enhance remove snapshots efficiency by executing them in bulk [iceberg]

2025-04-03 Thread via GitHub
nastra commented on code in PR #12670: URL: https://github.com/apache/iceberg/pull/12670#discussion_r2027131882 ## core/src/main/java/org/apache/iceberg/MetadataUpdate.java: ## @@ -328,6 +328,11 @@ public void applyTo(TableMetadata.Builder metadataBuilder) { } } + /*

Re: [PR] Flink: Fix npe in SketchUtil when numPartitions bigger than length of samples [iceberg]

2025-04-03 Thread via GitHub
Guosmilesmile commented on PR #12703: URL: https://github.com/apache/iceberg/pull/12703#issuecomment-2775797791 @pvary The array produced by this method is used in SketchUtil.partition to find which part a SortKey falls in, so I think a shorter array is OK in this case. https://github.com/

Re: [PR] HIVE-28801 Iceberg: Refactor HMS table parameter setting to be able to reuse [iceberg]

2025-04-03 Thread via GitHub
zratkai commented on PR #12461: URL: https://github.com/apache/iceberg/pull/12461#issuecomment-2775788409 @pvary thanks for the review. Addressed your comments. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above

Re: [PR] HIVE-28801 Iceberg: Refactor HMS table parameter setting to be able to reuse [iceberg]

2025-04-03 Thread via GitHub
zratkai commented on code in PR #12461: URL: https://github.com/apache/iceberg/pull/12461#discussion_r2026993865 ## hive-metastore/src/main/java/org/apache/iceberg/hive/HiveViewOperations.java: ## @@ -175,11 +172,14 @@ public void doCommit(ViewMetadata base, ViewMetadata metada

Re: [PR] HIVE-28801 Iceberg: Refactor HMS table parameter setting to be able to reuse [iceberg]

2025-04-03 Thread via GitHub
zratkai commented on code in PR #12461: URL: https://github.com/apache/iceberg/pull/12461#discussion_r2026993015 ## hive-metastore/src/main/java/org/apache/iceberg/hive/HMSTablePropertyHelper.java: ## @@ -0,0 +1,261 @@ +/* + * Licensed to the Apache Software Foundation (ASF) und

Re: [PR] Doc: Add Hive 2.x/3.x support notes in hive.md [iceberg]

2025-04-03 Thread via GitHub
deniskuzZ commented on code in PR #12700: URL: https://github.com/apache/iceberg/pull/12700#discussion_r2026970779 ## docs/docs/hive.md: ## @@ -145,17 +130,14 @@ The table level configuration overrides the global Hadoop configuration. # Hive on Tez configuration -To us

Re: [PR] Doc: Add Hive 2.x/3.x support notes in hive.md [iceberg]

2025-04-03 Thread via GitHub
deniskuzZ commented on code in PR #12700: URL: https://github.com/apache/iceberg/pull/12700#discussion_r2026952361 ## docs/docs/hive.md: ## @@ -145,17 +130,14 @@ The table level configuration overrides the global Hadoop configuration. # Hive on Tez configuration -To us

Re: [PR] Doc: Add Hive 2.x/3.x support notes in hive.md [iceberg]

2025-04-03 Thread via GitHub
deniskuzZ commented on code in PR #12700: URL: https://github.com/apache/iceberg/pull/12700#discussion_r2026959066 ## docs/docs/hive.md: ## @@ -89,29 +89,14 @@ Hive supports the following additional features with Hive version 4.0.0 and abov ## Enabling Iceberg support in Hiv

Re: [PR] Backport #11702 to FLink1.19 and 1.18 [iceberg]

2025-04-03 Thread via GitHub
pvary merged PR #12080: URL: https://github.com/apache/iceberg/pull/12080 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@iceberg.apa

Re: [PR] Flink: Fix npe in SketchUtil when numPartitions bigger than length of samples [iceberg]

2025-04-03 Thread via GitHub
pvary commented on PR #12703: URL: https://github.com/apache/iceberg/pull/12703#issuecomment-2775654775 @Guosmilesmile: Based on my understanding, if we fix the issue in the way you proposed then we will have a shorter array than expected. This might cause issues later when some algorithm e

Re: [PR] Core: lazy init workerPool [iceberg]

2025-04-03 Thread via GitHub
pvary commented on code in PR #12427: URL: https://github.com/apache/iceberg/pull/12427#discussion_r2026883731 ## core/src/main/java/org/apache/iceberg/SnapshotProducer.java: ## @@ -197,7 +198,7 @@ protected String targetBranch() { } protected ExecutorService workerPool(

Re: [PR] Doc: Add Hive 2.x/3.x support notes in hive.md [iceberg]

2025-04-03 Thread via GitHub
pvary commented on code in PR #12700: URL: https://github.com/apache/iceberg/pull/12700#discussion_r2026879916 ## docs/docs/hive.md: ## @@ -213,7 +195,7 @@ SET iceberg.catalog.glue.lock.table=myGlueLockTable; ## DDL Commands -Not all the features below are supported with Hi

Re: [PR] Doc: Add Hive 2.x/3.x support notes in hive.md [iceberg]

2025-04-03 Thread via GitHub
pvary commented on code in PR #12700: URL: https://github.com/apache/iceberg/pull/12700#discussion_r2026876312 ## docs/docs/hive.md: ## @@ -145,17 +130,14 @@ The table level configuration overrides the global Hadoop configuration. # Hive on Tez configuration -To use th

Re: [PR] Doc: Add Hive 2.x/3.x support notes in hive.md [iceberg]

2025-04-03 Thread via GitHub
pvary commented on code in PR #12700: URL: https://github.com/apache/iceberg/pull/12700#discussion_r2026877554 ## docs/docs/hive.md: ## @@ -145,17 +130,14 @@ The table level configuration overrides the global Hadoop configuration. # Hive on Tez configuration -To use th

Re: [PR] Build: Retry flaky test [iceberg]

2025-04-03 Thread via GitHub
pvary commented on PR #12707: URL: https://github.com/apache/iceberg/pull/12707#issuecomment-2775527627 @manuzhang: Is it flaky because the waiting time is too short? Shall we just decrease the number of threads, or commits, or increase the timeout? I usually prefer to avoid retries a

Re: [PR] Flink: Fix npe in SketchUtil when numPartitions bigger than length of samples [iceberg]

2025-04-03 Thread via GitHub
Guosmilesmile commented on PR #12703: URL: https://github.com/apache/iceberg/pull/12703#issuecomment-2775513562 @pvary Yes, I just removed the null values from the array, and the order of the array stayed the same. This modification did not change the original behavior of the method.

Re: [PR] Core: Enhance remove snapshots efficiency by executing them in bulk [iceberg]

2025-04-03 Thread via GitHub
nastra commented on code in PR #12670: URL: https://github.com/apache/iceberg/pull/12670#discussion_r2026833933 ## core/src/test/java/org/apache/iceberg/TestMetadataUpdateParser.java: ## @@ -1018,9 +1041,16 @@ public void assertEquals( (MetadataUpdate.AddSnapshot) e

Re: [PR] Core: Enhance remove snapshots efficiency by executing them in bulk [iceberg]

2025-04-03 Thread via GitHub
nastra commented on code in PR #12670: URL: https://github.com/apache/iceberg/pull/12670#discussion_r2026832528 ## core/src/main/java/org/apache/iceberg/TableMetadata.java: ## @@ -1864,8 +1870,13 @@ private static List updateSnapshotLog( List changes) { Set inter

Re: [PR] Core: Enhance remove snapshots efficiency by executing them in bulk [iceberg]

2025-04-03 Thread via GitHub
nastra commented on code in PR #12670: URL: https://github.com/apache/iceberg/pull/12670#discussion_r2026828023 ## core/src/main/java/org/apache/iceberg/MetadataUpdateParser.java: ## @@ -557,11 +564,18 @@ private static MetadataUpdate readAddSnapshot(JsonNode node) { private

[I] [OpenAPI spec] Etag header with CommitTableResponse [iceberg]

2025-04-03 Thread via GitHub
drnta opened a new issue, #12711: URL: https://github.com/apache/iceberg/issues/12711 ### Query engine _No response_ ### Question Hi everyone. It seems that in https://github.com/apache/iceberg/blob/main/open-api/rest-catalog-open-api.yaml the `ETag` header is re

Re: [PR] Core: Enhance remove snapshots efficiency by executing them in bulk [iceberg]

2025-04-03 Thread via GitHub
nastra commented on code in PR #12670: URL: https://github.com/apache/iceberg/pull/12670#discussion_r2026821208 ## core/src/main/java/org/apache/iceberg/MetadataUpdate.java: ## @@ -328,6 +328,11 @@ public void applyTo(TableMetadata.Builder metadataBuilder) { } } + /*

Re: [PR] Spec: Allow the use of `source-id` in V3 [iceberg]

2025-04-03 Thread via GitHub
Fokko commented on PR #12644: URL: https://github.com/apache/iceberg/pull/12644#issuecomment-2775432019 @szehon-ho I think that's a good idea. Let me rework the PR 👍

Re: [PR] Spark 3.5: Support case sensitive in replace where statement [iceberg]

2025-04-03 Thread via GitHub
dolcino-li commented on code in PR #12706: URL: https://github.com/apache/iceberg/pull/12706#discussion_r2026070423 ## spark/v3.5/spark-extensions/src/test/java/org/apache/iceberg/spark/extensions/TestReplaceWhere.java: ## @@ -0,0 +1,60 @@ +/* + * Licensed to the Apache Software

[PR] Fix the snapshot summary of a partial overwrite [iceberg-python]

2025-04-03 Thread via GitHub
Fokko opened a new pull request, #1879: URL: https://github.com/apache/iceberg-python/pull/1879 # Rationale for this change @kevinjqliu PTAL. I took the liberty of providing a fix for this since I was curious where this was coming from, hope you don't mind! I've cherry-picked your co

Re: [PR] Flink: Fix npe in SketchUtil when numPartitions bigger than length of samples [iceberg]

2025-04-03 Thread via GitHub
pvary commented on PR #12703: URL: https://github.com/apache/iceberg/pull/12703#issuecomment-2775191541 @Guosmilesmile: Did we check that the shorter array will not cause issues later when the statistics are used?

Re: [PR] backport #11301(rowconverter) to Flink 1.19 and 1.18 [iceberg]

2025-04-03 Thread via GitHub
pvary commented on PR #11826: URL: https://github.com/apache/iceberg/pull/11826#issuecomment-2775170197 Thanks @Guosmilesmile! Please create the backport PR

Re: [PR] Doc: Add Hive 2.x/3.x support notes in hive.md [iceberg]

2025-04-03 Thread via GitHub
jackylee-ch commented on PR #12700: URL: https://github.com/apache/iceberg/pull/12700#issuecomment-2774933815 @manuzhang @pvary I have updated the PR. PTAL, thanks.

Re: [PR] Doc: Add doc for flink exec config [iceberg]

2025-04-03 Thread via GitHub
pvary commented on code in PR #12691: URL: https://github.com/apache/iceberg/pull/12691#discussion_r2026600147 ## docs/docs/flink-configuration.md: ## @@ -198,4 +198,42 @@ they are. This is only applicable to {@link StatisticsType#Map} for low-cardinality scenario. For {@link

Re: [PR] Doc: Add doc for flink exec config [iceberg]

2025-04-03 Thread via GitHub
pvary commented on code in PR #12691: URL: https://github.com/apache/iceberg/pull/12691#discussion_r2026602432 ## docs/docs/flink-configuration.md: ## @@ -198,4 +198,42 @@ they are. This is only applicable to {@link StatisticsType#Map} for low-cardinality scenario. For {@link

Re: [PR] Doc: Add doc for flink exec config [iceberg]

2025-04-03 Thread via GitHub
pvary commented on code in PR #12691: URL: https://github.com/apache/iceberg/pull/12691#discussion_r2026604541 ## docs/docs/flink-configuration.md: ## @@ -198,4 +198,42 @@ they are. This is only applicable to {@link StatisticsType#Map} for low-cardinality scenario. For {@link
