Re: [PR] Add pyiceberg DataFusion e2e test [iceberg-rust]

2025-01-04 Thread via GitHub
gruuya commented on code in PR #825: URL: https://github.com/apache/iceberg-rust/pull/825#discussion_r1903216351 ## crates/integration_tests/testdata/pyiceberg/provision.py: ## @@ -0,0 +1,87 @@ +# Licensed to the Apache Software Foundation (ASF) under one +# or more contributor

Re: [PR] Add pyiceberg DataFusion e2e test [iceberg-rust]

2025-01-04 Thread via GitHub
gruuya commented on code in PR #825: URL: https://github.com/apache/iceberg-rust/pull/825#discussion_r1903215769 ## crates/integration_tests/testdata/pyiceberg/provision.py: ## @@ -0,0 +1,87 @@ +# Licensed to the Apache Software Foundation (ASF) under one +# or more contributor

Re: [PR] Spark 3.5: Implement RewriteTablePath [iceberg]

2025-01-04 Thread via GitHub
szehon-ho commented on PR #11555: URL: https://github.com/apache/iceberg/pull/11555#issuecomment-2571515478 Rebased and addressed review comments, thanks a lot @flyrain and @dramaticlly for reviewing this big change. -- This is an automated message from the Apache Git Service. To respond

[PR] Build: Bump io.delta:delta-spark_2.12 from 3.2.1 to 3.3.0 [iceberg]

2025-01-04 Thread via GitHub
dependabot[bot] opened a new pull request, #11911: URL: https://github.com/apache/iceberg/pull/11911 Bumps [io.delta:delta-spark_2.12](https://github.com/delta-io/delta) from 3.2.1 to 3.3.0. Release notes Sourced from https://github.com/delta-io/delta/releases io.delta:delta-spar

[PR] Build: Bump software.amazon.awssdk:bom from 2.29.43 to 2.29.45 [iceberg]

2025-01-04 Thread via GitHub
dependabot[bot] opened a new pull request, #11910: URL: https://github.com/apache/iceberg/pull/11910 Bumps software.amazon.awssdk:bom from 2.29.43 to 2.29.45.

[PR] Build: Bump org.assertj:assertj-core from 3.27.0 to 3.27.2 [iceberg]

2025-01-04 Thread via GitHub
dependabot[bot] opened a new pull request, #11908: URL: https://github.com/apache/iceberg/pull/11908 Bumps [org.assertj:assertj-core](https://github.com/assertj/assertj) from 3.27.0 to 3.27.2. Release notes Sourced from https://github.com/assertj/assertj/releases org.assertj:asse

[PR] Build: Bump io.delta:delta-standalone_2.12 from 3.2.1 to 3.3.0 [iceberg]

2025-01-04 Thread via GitHub
dependabot[bot] opened a new pull request, #11909: URL: https://github.com/apache/iceberg/pull/11909 Bumps [io.delta:delta-standalone_2.12](https://github.com/delta-io/delta) from 3.2.1 to 3.3.0. Release notes Sourced from https://github.com/delta-io/delta/releases io.delta:delta

[PR] Build: Bump org.xerial:sqlite-jdbc from 3.47.1.0 to 3.47.2.0 [iceberg]

2025-01-04 Thread via GitHub
dependabot[bot] opened a new pull request, #11907: URL: https://github.com/apache/iceberg/pull/11907 Bumps [org.xerial:sqlite-jdbc](https://github.com/xerial/sqlite-jdbc) from 3.47.1.0 to 3.47.2.0. Release notes Sourced from https://github.com/xerial/sqlite-jdbc/releases org.xeri

Re: [PR] Hive: Add Hive 4 support and remove Hive runtime [iceberg]

2025-01-04 Thread via GitHub
manuzhang commented on PR #11750: URL: https://github.com/apache/iceberg/pull/11750#issuecomment-2571491572 Spark (and other modules) also depend on `TestHiveMetastore` from test modules of `iceberg-hive-metastore`. We cannot use old Hive dependency from Spark to run this class due to API c

Re: [PR] Core: Map methods should return immutable collections [iceberg]

2025-01-04 Thread via GitHub
github-actions[bot] commented on PR #11304: URL: https://github.com/apache/iceberg/pull/11304#issuecomment-2571448891 This pull request has been marked as stale due to 30 days of inactivity. It will be closed in 1 week if no further activity occurs. If you think that’s incorrect or this pul

Re: [PR] Core: Suppress exceptions in case of dropTableData [iceberg]

2025-01-04 Thread via GitHub
github-actions[bot] commented on PR #9184: URL: https://github.com/apache/iceberg/pull/9184#issuecomment-2571448843 This pull request has been marked as stale due to 30 days of inactivity. It will be closed in 1 week if no further activity occurs. If you think that’s incorrect or this pull

Re: [I] Allow adding synthetic partition for existing data in table [iceberg]

2025-01-04 Thread via GitHub
github-actions[bot] commented on issue #10658: URL: https://github.com/apache/iceberg/issues/10658#issuecomment-2571448887 This issue has been automatically marked as stale because it has been open for 180 days with no activity. It will be closed in next 14 days if no further activity occur

Re: [PR] Core: add variant type support [iceberg]

2025-01-04 Thread via GitHub
shohamyamin commented on PR #11831: URL: https://github.com/apache/iceberg/pull/11831#issuecomment-2571424964 @aihuaxu This is a very important feature that will allow a lot more options for Iceberg. Thank you for your contribution.

Re: [PR] Fix read from multiple s3 regions [iceberg-python]

2025-01-04 Thread via GitHub
jiakai-li commented on PR #1453: URL: https://github.com/apache/iceberg-python/pull/1453#issuecomment-2571396659 > something's going on with the GitHub runner; `make lint` works locally https://github.com/apache/iceberg-python/actions/runs/12612627362/job/35150820385?pr=1453 I saw that as

Re: [PR] Fix read from multiple s3 regions [iceberg-python]

2025-01-04 Thread via GitHub
kevinjqliu commented on PR #1453: URL: https://github.com/apache/iceberg-python/pull/1453#issuecomment-2571394282 Something's going on with the GitHub runner; `make lint` works locally https://github.com/apache/iceberg-python/actions/runs/12612627362/job/35150820385?pr=1453

Re: [PR] Use SupportsPrefixOperations for Remove OrphanFile Procedure [iceberg]

2025-01-04 Thread via GitHub
ismailsimsek commented on code in PR #11906: URL: https://github.com/apache/iceberg/pull/11906#discussion_r1903139413 ## spark/v3.5/spark/src/test/java/org/apache/iceberg/spark/actions/TestRemoveOrphanFilesAction.java: ## @@ -610,9 +613,12 @@ public void testHiddenPathsStarting

Re: [I] [bug] Cannot perform table scan on V1 table [iceberg-python]

2025-01-04 Thread via GitHub
kevinjqliu commented on issue #1194: URL: https://github.com/apache/iceberg-python/issues/1194#issuecomment-2571387324 Spark's V1 manifest list writer contains the optional `added_rows_count` field. https://github.com/apache/iceberg/blob/fcd5dd932a21066d6127c94c50f3de43e8c2d80c/cor

Re: [I] [bug] Cannot perform table scan on V1 table [iceberg-python]

2025-01-04 Thread via GitHub
kevinjqliu commented on issue #1194: URL: https://github.com/apache/iceberg-python/issues/1194#issuecomment-2571385510 #1484 is a better, isolated test. It's using the minimal required schema for a v1 table manifest list

[PR] Use SupportsPrefixOperations for Remove OrphanFile Procedure [iceberg]

2025-01-04 Thread via GitHub
ismailsimsek opened a new pull request, #11906: URL: https://github.com/apache/iceberg/pull/11906 Continuing #7914

Re: [PR] Kafka-connect-runtime: remove code duplications in integration tests [iceberg]

2025-01-04 Thread via GitHub
bryanck merged PR #11883: URL: https://github.com/apache/iceberg/pull/11883

Re: [PR] Kafka-connect-runtime: remove code duplications in integration tests [iceberg]

2025-01-04 Thread via GitHub
bryanck commented on PR #11883: URL: https://github.com/apache/iceberg/pull/11883#issuecomment-2571375637 Thanks for the cleanup @wombatu-kun !

Re: [PR] Fix read from multiple s3 regions [iceberg-python]

2025-01-04 Thread via GitHub
jiakai-li commented on code in PR #1453: URL: https://github.com/apache/iceberg-python/pull/1453#discussion_r1903135543 ## pyiceberg/io/pyarrow.py: ## @@ -352,7 +352,7 @@ def parse_location(location: str) -> Tuple[str, str, str]: def _initialize_fs(self, scheme: str, netl

Re: [PR] Add pyiceberg DataFusion e2e test [iceberg-rust]

2025-01-04 Thread via GitHub
kevinjqliu commented on code in PR #825: URL: https://github.com/apache/iceberg-rust/pull/825#discussion_r1903134096 ## crates/integration_tests/testdata/pyiceberg/Dockerfile: ## @@ -0,0 +1,22 @@ +# Licensed to the Apache Software Foundation (ASF) under one or more +# contributo

Re: [I] Better error messages when creating a table with unsupported types [iceberg-python]

2025-01-04 Thread via GitHub
kevinjqliu commented on issue #860: URL: https://github.com/apache/iceberg-python/issues/860#issuecomment-2571370127 @DevChrisCross assigned to you!

Re: [I] Data Loss in Flink Job with Iceberg Sink After Restart: How to Ensure Consistent Writes? [iceberg]

2025-01-04 Thread via GitHub
pvary commented on issue #11894: URL: https://github.com/apache/iceberg/issues/11894#issuecomment-2571365659 > it seems that the writers are not contributing to the checkpoint/savepoint data size, whereas the committer is. This is because the writers are stateless. The committers are

Re: [PR] Implement column projection [iceberg-python]

2025-01-04 Thread via GitHub
gabeiglio commented on code in PR #1443: URL: https://github.com/apache/iceberg-python/pull/1443#discussion_r1903118090 ## pyiceberg/io/pyarrow.py: ## @@ -1286,14 +1310,20 @@ def _task_to_record_batches( continue output_batches = arr

Re: [PR] Implement column projection [iceberg-python]

2025-01-04 Thread via GitHub
gabeiglio commented on code in PR #1443: URL: https://github.com/apache/iceberg-python/pull/1443#discussion_r1903117970 ## tests/io/test_pyarrow.py: ## @@ -1122,6 +1127,63 @@ def test_projection_concat_files(schema_int: Schema, file_int: str) -> None: assert repr(result_ta

Re: [PR] feat: support S3 Table Buckets with S3TablesCatalog [iceberg-python]

2025-01-04 Thread via GitHub
felixscherz commented on code in PR #1429: URL: https://github.com/apache/iceberg-python/pull/1429#discussion_r1903096481 ## tests/catalog/test_s3tables.py: ## @@ -0,0 +1,180 @@ +import uuid + +import boto3 +import pytest + +from pyiceberg.catalog.s3tables import S3TableCatalog

Re: [PR] feat: support S3 Table Buckets with S3TablesCatalog [iceberg-python]

2025-01-04 Thread via GitHub
felixscherz commented on code in PR #1429: URL: https://github.com/apache/iceberg-python/pull/1429#discussion_r1903092999 ## tests/catalog/test_s3tables.py: ## @@ -0,0 +1,180 @@ +import uuid + +import boto3 +import pytest + +from pyiceberg.catalog.s3tables import S3TableCatalog

Re: [I] Support for S3 catalog to work with S3 Tables [iceberg-python]

2025-01-04 Thread via GitHub
felixscherz commented on issue #1404: URL: https://github.com/apache/iceberg-python/issues/1404#issuecomment-2571275722 > @felixscherz thanks for catching this (and thanks to everyone who's interested in building S3 Tables support for PyIceberg!). We're working on an S3-side fix for the `x

Re: [PR] feat: support S3 Table Buckets with S3TablesCatalog [iceberg-python]

2025-01-04 Thread via GitHub
felixscherz commented on code in PR #1429: URL: https://github.com/apache/iceberg-python/pull/1429#discussion_r1903092532 ## pyiceberg/catalog/s3tables.py: ## @@ -0,0 +1,318 @@ +import re +from typing import TYPE_CHECKING, List, Optional, Set, Tuple, Union + +import boto3 + +fro

Re: [I] Better error messages when creating a table with unsupported types [iceberg-python]

2025-01-04 Thread via GitHub
DevChrisCross commented on issue #860: URL: https://github.com/apache/iceberg-python/issues/860#issuecomment-2571080332 Hi @kevinjqliu, I'm not sure if this issue is being worked on. If not, can you assign it to me? :)

Re: [PR] Add pyiceberg DataFusion e2e test [iceberg-rust]

2025-01-04 Thread via GitHub
gruuya commented on code in PR #825: URL: https://github.com/apache/iceberg-rust/pull/825#discussion_r1902931761 ## crates/integration_tests/testdata/pyiceberg/provision.py: ## @@ -0,0 +1,80 @@ +# Licensed to the Apache Software Foundation (ASF) under one +# or more contributor