Re: [PR] S3: Disable strong integrity checksums [iceberg]

2025-02-15 Thread via GitHub
wendigo commented on code in PR #12264: URL: https://github.com/apache/iceberg/pull/12264#discussion_r1957256878 ## aws/src/main/java/org/apache/iceberg/aws/s3/S3RequestUtil.java: ## @@ -149,4 +151,10 @@ static void configurePermission( Function aclSetter) { aclSette

Re: [PR] Add upsert docs [iceberg-python]

2025-02-15 Thread via GitHub
kevinjqliu commented on PR #1665: URL: https://github.com/apache/iceberg-python/pull/1665#issuecomment-2661288902 > join_cols seems focused on the primary key. How do we specify the partition column to enable partition pruning? @ananthdurai the partition columns are part of the Iceberg

Re: [I] previous eq deletes handling on new write [iceberg]

2025-02-15 Thread via GitHub
eshishki commented on issue #12280: URL: https://github.com/apache/iceberg/issues/12280#issuecomment-2661287874 currently we use starrocks which plans the scan like ``` UNION ├── ICEBERG_SCAN (data files with only pos deletes) │ └── OutputRows: 73 │ ── HASH_JOIN (LEFT ANT

Re: [I] previous eq deletes handling on new write [iceberg]

2025-02-15 Thread via GitHub
singhpk234 commented on issue #12280: URL: https://github.com/apache/iceberg/issues/12280#issuecomment-2661252700 Sounds fair. Eq deletes are partition scoped; maybe we need to stack it either per write or as part of an async process like `rewrite_position_delete_files`. Side note: Can you p

[PR] Build: Bump io.netty:netty-buffer from 4.1.117.Final to 4.1.118.Final [iceberg]

2025-02-15 Thread via GitHub
dependabot[bot] opened a new pull request, #12287: URL: https://github.com/apache/iceberg/pull/12287 Bumps [io.netty:netty-buffer](https://github.com/netty/netty) from 4.1.117.Final to 4.1.118.Final. Commits https://github.com/netty/netty/commit/36f95cfaeed0c1313b21f1b5350c1943

[PR] Build: Bump software.amazon.awssdk:bom from 2.30.16 to 2.30.21 [iceberg]

2025-02-15 Thread via GitHub
dependabot[bot] opened a new pull request, #12286: URL: https://github.com/apache/iceberg/pull/12286 Bumps software.amazon.awssdk:bom from 2.30.16 to 2.30.21. [![Dependabot compatibility score](https://dependabot-badges.githubapp.com/badges/compatibility_score?dependency-name=soft

[PR] Build: Bump datamodel-code-generator from 0.27.2 to 0.28.1 [iceberg]

2025-02-15 Thread via GitHub
dependabot[bot] opened a new pull request, #12285: URL: https://github.com/apache/iceberg/pull/12285 Bumps [datamodel-code-generator](https://github.com/koxudaxi/datamodel-code-generator) from 0.27.2 to 0.28.1. Release notes Sourced from https://github.com/koxudaxi/datamodel-code-

[PR] Build: Bump mkdocs-material from 9.6.3 to 9.6.4 [iceberg]

2025-02-15 Thread via GitHub
dependabot[bot] opened a new pull request, #12284: URL: https://github.com/apache/iceberg/pull/12284 Bumps [mkdocs-material](https://github.com/squidfunk/mkdocs-material) from 9.6.3 to 9.6.4. Release notes Sourced from https://github.com/squidfunk/mkdocs-material/releases mkdocs-

[I] REST API responses with Spark return status code 200 instead of 204 [iceberg]

2025-02-15 Thread via GitHub
connortsui20 opened a new issue, #12283: URL: https://github.com/apache/iceberg/issues/12283 ### Apache Iceberg version 1.8.0 (latest release) ### Query engine Spark ### Please describe the bug 🐞 In the [REST API yaml specification](https://github.com/apach

Re: [PR] Data: Handle case where partition location is missing for `TableMigrationUtil` [iceberg]

2025-02-15 Thread via GitHub
jshmchenxi commented on PR #12212: URL: https://github.com/apache/iceberg/pull/12212#issuecomment-2661200379 > @jshmchenxi Can we add an end-to-end test in `TestSnapshotTableAction`? @manuzhang I've added the end-to-end test. Please take a look. -- This is an automated message from

Re: [PR] Add upsert docs [iceberg-python]

2025-02-15 Thread via GitHub
ananthdurai commented on PR #1665: URL: https://github.com/apache/iceberg-python/pull/1665#issuecomment-2661196077 join_cols seems focused on the primary key. How do we specify the partition column to enable partition pruning? -- This is an automated message from the Apache Git Service. T

Re: [PR] S3: Disable strong integrity checksums [iceberg]

2025-02-15 Thread via GitHub
ebyhr commented on code in PR #12264: URL: https://github.com/apache/iceberg/pull/12264#discussion_r1955642546 ## aws/src/main/java/org/apache/iceberg/aws/s3/S3RequestUtil.java: ## @@ -149,4 +151,10 @@ static void configurePermission( Function aclSetter) { aclSetter.

[PR] Spark 3.5: Fix job description of RewriteTablePathSparkAction [iceberg]

2025-02-15 Thread via GitHub
ebyhr opened a new pull request, #12282: URL: https://github.com/apache/iceberg/pull/12282 (no comment) -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-ma

Re: [PR] Add upsert docs [iceberg-python]

2025-02-15 Thread via GitHub
kevinjqliu commented on code in PR #1665: URL: https://github.com/apache/iceberg-python/pull/1665#discussion_r1957222927 ## pyiceberg/table/__init__.py: ## @@ -1148,6 +1148,15 @@ def upsert( """ from pyiceberg.table import upsert_util +if join_cols is

Re: [I] Does iceberg has plan to support Json Type? [iceberg]

2025-02-15 Thread via GitHub
github-actions[bot] commented on issue #6467: URL: https://github.com/apache/iceberg/issues/6467#issuecomment-2661158986 This issue has been automatically marked as stale because it has been open for 180 days with no activity. It will be closed in next 14 days if no further activity occurs.

Re: [I] Athena Iceberg does not delete orphan files [iceberg]

2025-02-15 Thread via GitHub
github-actions[bot] closed issue #10878: Athena Iceberg does not delete orphan files URL: https://github.com/apache/iceberg/issues/10878 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific

Re: [PR] Core,REST: extend httpClient builder to support tls factory [iceberg]

2025-02-15 Thread via GitHub
github-actions[bot] commented on PR #11979: URL: https://github.com/apache/iceberg/pull/11979#issuecomment-2661159077 This pull request has been marked as stale due to 30 days of inactivity. It will be closed in 1 week if no further activity occurs. If you think that’s incorrect or this pul

Re: [I] throw exception : InvalidOperationException(message:The following columns have types incompatible with the existing columns in their respective positions : idd1) when add column [iceberg]

2025-02-15 Thread via GitHub
github-actions[bot] commented on issue #3747: URL: https://github.com/apache/iceberg/issues/3747#issuecomment-2661158975 This issue has been closed because it has not received any activity in the last 14 days since being marked as 'stale' -- This is an automated message from the Apache Gi

Re: [PR] REST: Avoid deprecated execute without HttpClientResponseHandler [iceberg]

2025-02-15 Thread via GitHub
github-actions[bot] commented on PR #11870: URL: https://github.com/apache/iceberg/pull/11870#issuecomment-2661159059 This pull request has been closed due to lack of activity. This is not a judgement on the merit of the PR in any way. It is just a way of keeping the PR queue manageable. If

Re: [PR] [Core] Support Truncate(0) for metrics [iceberg]

2025-02-15 Thread via GitHub
github-actions[bot] commented on PR #11905: URL: https://github.com/apache/iceberg/pull/11905#issuecomment-2661159063 This pull request has been marked as stale due to 30 days of inactivity. It will be closed in 1 week if no further activity occurs. If you think that’s incorrect or this pul

Re: [PR] REST: Avoid deprecated execute without HttpClientResponseHandler [iceberg]

2025-02-15 Thread via GitHub
github-actions[bot] closed pull request #11870: REST: Avoid deprecated execute without HttpClientResponseHandler URL: https://github.com/apache/iceberg/pull/11870 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL ab

Re: [PR] Spark 3.5: Add query runner in test module [iceberg]

2025-02-15 Thread via GitHub
github-actions[bot] commented on PR #11758: URL: https://github.com/apache/iceberg/pull/11758#issuecomment-2661159047 This pull request has been marked as stale due to 30 days of inactivity. It will be closed in 1 week if no further activity occurs. If you think that’s incorrect or this pul

Re: [PR] Core, Rest: Enable useSystemProperties on RESTClient [iceberg]

2025-02-15 Thread via GitHub
github-actions[bot] commented on PR #11548: URL: https://github.com/apache/iceberg/pull/11548#issuecomment-2661159037 This pull request has been marked as stale due to 30 days of inactivity. It will be closed in 1 week if no further activity occurs. If you think that’s incorrect or this pul

Re: [I] Specify in lower/upper bounds in data_file struct are exact [iceberg]

2025-02-15 Thread via GitHub
github-actions[bot] commented on issue #10930: URL: https://github.com/apache/iceberg/issues/10930#issuecomment-2661159019 This issue has been automatically marked as stale because it has been open for 180 days with no activity. It will be closed in next 14 days if no further activity occur

Re: [I] Athena Iceberg does not delete orphan files [iceberg]

2025-02-15 Thread via GitHub
github-actions[bot] commented on issue #10878: URL: https://github.com/apache/iceberg/issues/10878#issuecomment-2661159000 This issue has been closed because it has not received any activity in the last 14 days since being marked as 'stale' -- This is an automated message from the Apache

Re: [I] MERGE INTO TABLE is not supported temporarily. [iceberg]

2025-02-15 Thread via GitHub
github-actions[bot] closed issue #10882: MERGE INTO TABLE is not supported temporarily. URL: https://github.com/apache/iceberg/issues/10882 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specif

Re: [I] throw exception : InvalidOperationException(message:The following columns have types incompatible with the existing columns in their respective positions : idd1) when add column [iceberg]

2025-02-15 Thread via GitHub
github-actions[bot] closed issue #3747: throw exception : InvalidOperationException(message:The following columns have types incompatible with the existing columns in their respective positions : idd1) when add column URL: https://github.com/apache/iceberg/issues/3747 -- This is an automated

Re: [PR] Add upsert docs [iceberg-python]

2025-02-15 Thread via GitHub
soumilshah1995 commented on PR #1665: URL: https://github.com/apache/iceberg-python/pull/1665#issuecomment-2661156798 lovely -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

Re: [PR] Add upsert docs [iceberg-python]

2025-02-15 Thread via GitHub
soumilshah1995 commented on PR #1665: URL: https://github.com/apache/iceberg-python/pull/1665#issuecomment-2661154101 Hi, I'm trying an example and I am getting "no method upsert". I'm using version 0.9.0 and I read the attached docs. Am I missing something? ``` import os import pyarrow as pa

Re: [PR] Feature: MERGE/Upsert Support [iceberg-python]

2025-02-15 Thread via GitHub
Fokko commented on PR #1534: URL: https://github.com/apache/iceberg-python/pull/1534#issuecomment-2661107134 @soumilshah1995 I've raised a PR here https://github.com/apache/iceberg-python/pull/1665 -- This is an automated message from the Apache Git Service. To respond to the message, ple

[PR] Add upsert docs [iceberg-python]

2025-02-15 Thread via GitHub
Fokko opened a new pull request, #1665: URL: https://github.com/apache/iceberg-python/pull/1665 And make the join-cols optional using the identifier fields. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above

Re: [I] Integrate pyiceberg with Dask [iceberg]

2025-02-15 Thread via GitHub
andoni-garcia-fgp commented on issue #5800: URL: https://github.com/apache/iceberg/issues/5800#issuecomment-2661089695 @Fokko I see you marked this as complete under the milestone PyIceberg 0.6.0 release. Is that accurate? I currently use a variant of grobgl's `DataFrameIOFunction` from abo

[I] Print un-pretty metadata JSON files without whitespace [iceberg]

2025-02-15 Thread via GitHub
istreeter opened a new issue, #12281: URL: https://github.com/apache/iceberg/issues/12281 ### Feature Request / Improvement Currently, metadata files are pretty-printed, with lots of new-lines and whitespace indentations. [This is the relevant line of code](https://github.com/apache

Re: [PR] Feature: MERGE/Upsert Support [iceberg-python]

2025-02-15 Thread via GitHub
soumilshah1995 commented on PR #1534: URL: https://github.com/apache/iceberg-python/pull/1534#issuecomment-2661067345 Hi, can you please help us with some example code? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use t

Re: [I] previous eq deletes handling on new write [iceberg]

2025-02-15 Thread via GitHub
eshishki commented on issue #12280: URL: https://github.com/apache/iceberg/issues/12280#issuecomment-2661063015 maybe we could have rewrite_position_delete_files but for eq -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and u

Re: [PR] Kafka Connect: Add SMTs for Debezium and AWS DMS [iceberg]

2025-02-15 Thread via GitHub
liko9 commented on PR #11936: URL: https://github.com/apache/iceberg/pull/11936#issuecomment-2661037022 Can someone please add this to the milestone for 1.9.0? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL ab

[I] previous eq deletes handling on new write [iceberg]

2025-02-15 Thread via GitHub
eshishki opened a new issue, #12280: URL: https://github.com/apache/iceberg/issues/12280 ### Feature Request / Improvement We ingest from Debezium into Iceberg via https://github.com/databricks/iceberg-kafka-connect/ Basically, it uses the Flink delta writer. Each batch of da

Re: [PR] feat(catalog/rest): Add support for view related operations [iceberg-go]

2025-02-15 Thread via GitHub
zeroshade merged PR #290: URL: https://github.com/apache/iceberg-go/pull/290 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@iceberg.

Re: [I] Enhance iceberg-go to Support Nessie API for All Catalog Operations [iceberg-go]

2025-02-15 Thread via GitHub
zeroshade commented on issue #291: URL: https://github.com/apache/iceberg-go/issues/291#issuecomment-2660993692 This URI seems wrong? `http://localhost:19120/iceberg/v1/main%7Cs3%3A%2F%2Fwarehouse/namespaces/my_namespace_1/tables` Looking at Nessie's API description, I'm not seeing `/

[PR] Spark: Remove Spark 3.3 support [iceberg]

2025-02-15 Thread via GitHub
manuzhang opened a new pull request, #12279: URL: https://github.com/apache/iceberg/pull/12279 (no comment) -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe,

Re: [PR] Spark: Detect dangling DVs properly [iceberg]

2025-02-15 Thread via GitHub
singhpk234 commented on code in PR #12270: URL: https://github.com/apache/iceberg/pull/12270#discussion_r1956530298 ## spark/v3.5/spark/src/main/java/org/apache/iceberg/spark/actions/RemoveDanglingDeletesSparkAction.java: ## @@ -156,7 +162,12 @@ private List findDanglingDeletes(

Re: [PR] Fix: `SqlCatalog` list_namespaces() should return only sub-namespaces [iceberg-python]

2025-02-15 Thread via GitHub
alessandro-nori commented on code in PR #1629: URL: https://github.com/apache/iceberg-python/pull/1629#discussion_r1957078977 ## tests/catalog/test_sql.py: ## @@ -1117,17 +1117,30 @@ def test_create_namespace_with_empty_identifier(catalog: SqlCatalog, empty_names lazy_

Re: [PR] Fix: `SqlCatalog` list_namespaces() should return only sub-namespaces [iceberg-python]

2025-02-15 Thread via GitHub
alessandro-nori commented on code in PR #1629: URL: https://github.com/apache/iceberg-python/pull/1629#discussion_r1957078613 ## tests/catalog/test_sql.py: ## @@ -1117,17 +1117,30 @@ def test_create_namespace_with_empty_identifier(catalog: SqlCatalog, empty_names lazy_

Re: [PR] Docs: Add rewrite-table-path in spark procedure [iceberg]

2025-02-15 Thread via GitHub
szehon-ho commented on code in PR #12115: URL: https://github.com/apache/iceberg/pull/12115#discussion_r1957072281 ## docs/docs/spark-procedures.md: ## @@ -972,4 +972,101 @@ CALL catalog_name.system.compute_table_stats(table => 'my_table', snapshot_id => Collect statistics of

Re: [PR] Docs: Add rewrite-table-path in spark procedure [iceberg]

2025-02-15 Thread via GitHub
szehon-ho commented on code in PR #12115: URL: https://github.com/apache/iceberg/pull/12115#discussion_r1957074174 ## docs/docs/spark-procedures.md: ## @@ -972,4 +972,101 @@ CALL catalog_name.system.compute_table_stats(table => 'my_table', snapshot_id => Collect statistics of

Re: [PR] Docs: Add rewrite-table-path in spark procedure [iceberg]

2025-02-15 Thread via GitHub
szehon-ho commented on code in PR #12115: URL: https://github.com/apache/iceberg/pull/12115#discussion_r1957073932 ## docs/docs/spark-procedures.md: ## @@ -972,4 +972,101 @@ CALL catalog_name.system.compute_table_stats(table => 'my_table', snapshot_id => Collect statistics of

Re: [PR] Docs: Add rewrite-table-path in spark procedure [iceberg]

2025-02-15 Thread via GitHub
szehon-ho commented on code in PR #12115: URL: https://github.com/apache/iceberg/pull/12115#discussion_r1957073691 ## docs/docs/spark-procedures.md: ## @@ -972,4 +972,101 @@ CALL catalog_name.system.compute_table_stats(table => 'my_table', snapshot_id => Collect statistics of

Re: [PR] Docs: Add rewrite-table-path in spark procedure [iceberg]

2025-02-15 Thread via GitHub
szehon-ho commented on code in PR #12115: URL: https://github.com/apache/iceberg/pull/12115#discussion_r1957073492 ## docs/docs/spark-procedures.md: ## @@ -972,4 +972,98 @@ CALL catalog_name.system.compute_table_stats(table => 'my_table', snapshot_id => Collect statistics of t

Re: [PR] Docs: Add rewrite-table-path in spark procedure [iceberg]

2025-02-15 Thread via GitHub
szehon-ho commented on code in PR #12115: URL: https://github.com/apache/iceberg/pull/12115#discussion_r1957073109 ## docs/docs/spark-procedures.md: ## @@ -972,4 +972,101 @@ CALL catalog_name.system.compute_table_stats(table => 'my_table', snapshot_id => Collect statistics of

Re: [PR] Docs: Add rewrite-table-path in spark procedure [iceberg]

2025-02-15 Thread via GitHub
dramaticlly commented on code in PR #12115: URL: https://github.com/apache/iceberg/pull/12115#discussion_r1957071171 ## docs/docs/spark-procedures.md: ## @@ -972,4 +972,98 @@ CALL catalog_name.system.compute_table_stats(table => 'my_table', snapshot_id => Collect statistics of

Re: [PR] Docs: Add rewrite-table-path in spark procedure [iceberg]

2025-02-15 Thread via GitHub
szehon-ho commented on code in PR #12115: URL: https://github.com/apache/iceberg/pull/12115#discussion_r1957066734 ## docs/docs/spark-procedures.md: ## @@ -972,4 +972,98 @@ CALL catalog_name.system.compute_table_stats(table => 'my_table', snapshot_id => Collect statistics of t