Re: [PR] Modified exception objects being thrown when converting Pyarrow tables [iceberg-python]

2025-01-10 Thread via GitHub
DevChrisCross commented on code in PR #1498: URL: https://github.com/apache/iceberg-python/pull/1498#discussion_r1911918896 ## tests/io/test_pyarrow_visitor.py: ## @@ -583,46 +584,79 @@ def test_pyarrow_schema_to_schema_fresh_ids_nested_schema( assert visit_pyarrow(pyarrow_

Re: [PR] Modified exception objects being thrown when converting Pyarrow tables [iceberg-python]

2025-01-10 Thread via GitHub
DevChrisCross commented on code in PR #1498: URL: https://github.com/apache/iceberg-python/pull/1498#discussion_r1911914991 ## pyiceberg/exceptions.py: ## @@ -122,3 +125,19 @@ class CommitStateUnknownException(RESTError): class WaitingForLockException(Exception): """Need

Re: [PR] Modified exception objects being thrown when converting Pyarrow tables [iceberg-python]

2025-01-10 Thread via GitHub
DevChrisCross commented on code in PR #1498: URL: https://github.com/apache/iceberg-python/pull/1498#discussion_r1911913704 ## pyiceberg/io/pyarrow.py: ## @@ -1003,6 +1000,20 @@ def _(obj: pa.DictionaryType, visitor: PyArrowSchemaVisitor[T]) -> T: return visit_pyarrow(obj.

Re: [PR] Core: Parsing and Writing Tests for V3 Metadata [iceberg]

2025-01-10 Thread via GitHub
HonahX commented on code in PR #11947: URL: https://github.com/apache/iceberg/pull/11947#discussion_r1911913284 ## core/src/test/java/org/apache/iceberg/MetadataTestUtils.java: ## @@ -0,0 +1,336 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more co

Re: [PR] Core: Parsing and Writing Tests for V3 Metadata [iceberg]

2025-01-10 Thread via GitHub
HonahX commented on code in PR #11947: URL: https://github.com/apache/iceberg/pull/11947#discussion_r1911912918 ## core/src/main/java/org/apache/iceberg/TableMetadataParser.java: ## @@ -240,6 +242,13 @@ public static void toJson(TableMetadata metadata, JsonGenerator generator)

Re: [I] [feat] add missing metadata tables [iceberg-python]

2025-01-10 Thread via GitHub
soumya-ghosh commented on issue #1053: URL: https://github.com/apache/iceberg-python/issues/1053#issuecomment-2585084764 Yes I will start working on that soon, have been busy last few weeks so couldn't make any progress. -- This is an automated message from the Apache Git Service. To res

Re: [PR] [Docs] Update spark-getting-started docs page to make the example valid [iceberg]

2025-01-10 Thread via GitHub
kevinjqliu commented on PR #11923: URL: https://github.com/apache/iceberg/pull/11923#issuecomment-2585058777 There's an error in CI; it looks like you need to run the linter: ``` Execution failed for task ':iceberg-spark:iceberg-spark-runtime-3.3_2.12:spotlessJavaCheck'. ```

Re: [PR] [Docs] Update spark-getting-started docs page to make the example valid [iceberg]

2025-01-10 Thread via GitHub
kevinjqliu commented on code in PR #11923: URL: https://github.com/apache/iceberg/pull/11923#discussion_r1911874163 ## docs/docs/spark-getting-started.md: ## @@ -77,21 +77,24 @@ Once your table is created, insert data using [`INSERT INTO`](spark-writes.md#in ```sql INSERT I

Re: [PR] Update cmake instructions in README [iceberg-cpp]

2025-01-10 Thread via GitHub
wgtmac commented on code in PR #24: URL: https://github.com/apache/iceberg-cpp/pull/24#discussion_r1911861722 ## README.md: ## @@ -32,7 +32,7 @@ C++ implementation of [Apache Iceberg™](https://iceberg.apache.org/). ```bash cd iceberg-cpp -cmake -S . -B build -DCMAKE_INSTALL

Re: [PR] Update cmake instructions in README [iceberg-cpp]

2025-01-10 Thread via GitHub
zhjwpku commented on code in PR #24: URL: https://github.com/apache/iceberg-cpp/pull/24#discussion_r1911797657 ## .gitignore: ## @@ -16,6 +16,7 @@ # under the License. build/ +install/ Review Comment: Why this? Is this generated by some IDE? ## README.md:

[PR] Introduce Schema update APIs [iceberg-rust]

2025-01-10 Thread via GitHub
Lordworms opened a new pull request, #883: URL: https://github.com/apache/iceberg-rust/pull/883 closes #697

Re: [I] [feat] add missing metadata tables [iceberg-python]

2025-01-10 Thread via GitHub
kevinjqliu commented on issue #1053: URL: https://github.com/apache/iceberg-python/issues/1053#issuecomment-2584977445 Thanks for your contribution here @soumya-ghosh. I just merged #1241 for `all_manifests`. Are you still interested in adding `all_files`, `all_data_files` and `all_delete_

Re: [PR] Add `all_manifests` metadata table with tests [iceberg-python]

2025-01-10 Thread via GitHub
kevinjqliu commented on PR #1241: URL: https://github.com/apache/iceberg-python/pull/1241#issuecomment-2584976662 Thanks for the contribution @soumya-ghosh!

Re: [PR] Add `all_manifests` metadata table with tests [iceberg-python]

2025-01-10 Thread via GitHub
kevinjqliu merged PR #1241: URL: https://github.com/apache/iceberg-python/pull/1241

Re: [PR] Parquet: Add readers and writers for the internal object model [iceberg]

2025-01-10 Thread via GitHub
ajantha-bhat commented on code in PR #11904: URL: https://github.com/apache/iceberg/pull/11904#discussion_r1911784070 ## .palantir/revapi.yml: ## @@ -1171,6 +1171,28 @@ acceptedBreaks: \ java.util.function.Function, org.apache.iceberg.io.CloseableIterable,\ \

Re: [I] [discuss] `Transaction` API's `autocommit` [iceberg-python]

2025-01-10 Thread via GitHub
kevinjqliu commented on issue #1253: URL: https://github.com/apache/iceberg-python/issues/1253#issuecomment-2584974311 > The reason behind my head is, to my understanding, autocommit is still somewhat a related concept with transaction. I think it's fair to ask if we consider `UpdateS

Re: [I] create_changelog_view returns no record when end-timestamp is missing [iceberg]

2025-01-10 Thread via GitHub
vinitamaloo-asu commented on issue #11922: URL: https://github.com/apache/iceberg/issues/11922#issuecomment-2584973826 Hi @flyrain, should I add someone else to this thread?

Re: [PR] Modified exception objects being thrown when converting Pyarrow tables [iceberg-python]

2025-01-10 Thread via GitHub
kevinjqliu commented on code in PR #1498: URL: https://github.com/apache/iceberg-python/pull/1498#discussion_r1911622483 ## pyiceberg/io/pyarrow.py: ## @@ -1098,8 +1109,10 @@ class _ConvertToIceberg(PyArrowSchemaVisitor[Union[IcebergType, Schema]]): """Converts PyArrowSche

Re: [PR] AWS: Add integration with Glue catalog extensions for Amazon SageMaker Lakehouse [iceberg]

2025-01-10 Thread via GitHub
github-actions[bot] commented on PR #11692: URL: https://github.com/apache/iceberg/pull/11692#issuecomment-2584944357 This pull request has been marked as stale due to 30 days of inactivity. It will be closed in 1 week if no further activity occurs. If you think that’s incorrect or this pul

Re: [PR] Core: Fix failure when reading files table with branch [iceberg]

2025-01-10 Thread via GitHub
github-actions[bot] commented on PR #11719: URL: https://github.com/apache/iceberg/pull/11719#issuecomment-2584944399 This pull request has been marked as stale due to 30 days of inactivity. It will be closed in 1 week if no further activity occurs. If you think that’s incorrect or this pul

Re: [I] AWS Glue skip validation does not work while using table overrrides [iceberg]

2025-01-10 Thread via GitHub
github-actions[bot] commented on issue #10701: URL: https://github.com/apache/iceberg/issues/10701#issuecomment-2584944281 This issue has been automatically marked as stale because it has been open for 180 days with no activity. It will be closed in next 14 days if no further activity occur

Re: [PR] Spark: add property to disable client-side purging in spark [iceberg]

2025-01-10 Thread via GitHub
github-actions[bot] commented on PR #11317: URL: https://github.com/apache/iceberg/pull/11317#issuecomment-2584944327 This pull request has been marked as stale due to 30 days of inactivity. It will be closed in 1 week if no further activity occurs. If you think that’s incorrect or this pul

Re: [I] Accessing S3 Express one zone bucket from pyiceberg [iceberg]

2025-01-10 Thread via GitHub
github-actions[bot] commented on issue #10702: URL: https://github.com/apache/iceberg/issues/10702#issuecomment-2584944295 This issue has been automatically marked as stale because it has been open for 180 days with no activity. It will be closed in next 14 days if no further activity occur

Re: [PR] Fix ParallelIterable deadlock [iceberg]

2025-01-10 Thread via GitHub
stevenzwu commented on PR #11781: URL: https://github.com/apache/iceberg/pull/11781#issuecomment-2584586732 @findepi the deadlock issue is probably the worse of the two and probably needs to be addressed urgently.

Re: [I] [feature] Add support for `write.data.path` and `write.metadata.path` [iceberg-python]

2025-01-10 Thread via GitHub
kevinjqliu commented on issue #1492: URL: https://github.com/apache/iceberg-python/issues/1492#issuecomment-2584575339 #1452 is merged; we can now work on adding support for the above.

Re: [PR] Support Location Providers [iceberg-python]

2025-01-10 Thread via GitHub
kevinjqliu commented on PR #1452: URL: https://github.com/apache/iceberg-python/pull/1452#issuecomment-2584573283 Thanks @smaheshwar-pltr for working on this and @Fokko for the review :)

Re: [PR] Support Location Providers [iceberg-python]

2025-01-10 Thread via GitHub
kevinjqliu merged PR #1452: URL: https://github.com/apache/iceberg-python/pull/1452

Re: [I] Support LocationProviders like the Java Iceberg Reference Implementaiton [iceberg-python]

2025-01-10 Thread via GitHub
kevinjqliu closed issue #861: Support LocationProviders like the Java Iceberg Reference Implementaiton URL: https://github.com/apache/iceberg-python/issues/861

[PR] Build: Bump deptry from 0.21.2 to 0.22.0 [iceberg-python]

2025-01-10 Thread via GitHub
dependabot[bot] opened a new pull request, #1508: URL: https://github.com/apache/iceberg-python/pull/1508 Bumps [deptry](https://github.com/fpgmaas/deptry) from 0.21.2 to 0.22.0. Release notes Sourced from deptry's releases (https://github.com/fpgmaas/deptry/releases). 0.22.

Re: [PR] [Docs] Update spark-getting-started docs page to make the example valid [iceberg]

2025-01-10 Thread via GitHub
nickdelnano commented on code in PR #11923: URL: https://github.com/apache/iceberg/pull/11923#discussion_r1911517154 ## docs/docs/spark-getting-started.md: ## @@ -77,21 +77,24 @@ Once your table is created, insert data using [`INSERT INTO`](spark-writes.md#in ```sql INSERT

[PR] Metadata Row Lineage [iceberg]

2025-01-10 Thread via GitHub
RussellSpitzer opened a new pull request, #11948: URL: https://github.com/apache/iceberg/pull/11948 Just the Metadata Changes Required for Row Lineage. Skips out on actual implementation in Manifests and Datafiles but allows for proper creation of Snapshots and Metadata.json with row lineag

Re: [PR] Build: Bump sqlalchemy from 2.0.36 to 2.0.37 [iceberg-python]

2025-01-10 Thread via GitHub
kevinjqliu merged PR #1502: URL: https://github.com/apache/iceberg-python/pull/1502

Re: [PR] Build: Bump getdaft from 0.4.1 to 0.4.2 [iceberg-python]

2025-01-10 Thread via GitHub
kevinjqliu merged PR #1503: URL: https://github.com/apache/iceberg-python/pull/1503

Re: [PR] Build: Bump pydantic from 2.10.4 to 2.10.5 [iceberg-python]

2025-01-10 Thread via GitHub
kevinjqliu merged PR #1504: URL: https://github.com/apache/iceberg-python/pull/1504

Re: [PR] Core: Parsing and Writing Tests for V3 Metadata [iceberg]

2025-01-10 Thread via GitHub
flyrain commented on code in PR #11947: URL: https://github.com/apache/iceberg/pull/11947#discussion_r1911495617 ## core/src/main/java/org/apache/iceberg/TableMetadataParser.java: ## @@ -240,6 +242,13 @@ public static void toJson(TableMetadata metadata, JsonGenerator generator)

Re: [PR] Support Location Providers [iceberg-python]

2025-01-10 Thread via GitHub
smaheshwar-pltr commented on code in PR #1452: URL: https://github.com/apache/iceberg-python/pull/1452#discussion_r1911453761 ## pyiceberg/table/locations.py: ## @@ -0,0 +1,81 @@ +# Licensed to the Apache Software Foundation (ASF) under one +# or more contributor license agreeme

Re: [PR] Support Location Providers [iceberg-python]

2025-01-10 Thread via GitHub
smaheshwar-pltr commented on code in PR #1452: URL: https://github.com/apache/iceberg-python/pull/1452#discussion_r1911421207 ## pyiceberg/io/pyarrow.py: ## @@ -2622,13 +2631,15 @@ def _dataframe_to_data_files( property_name=TableProperties.WRITE_TARGET_FILE_SIZE_BYTES,

Re: [PR] Support Location Providers [iceberg-python]

2025-01-10 Thread via GitHub
smaheshwar-pltr commented on PR #1452: URL: https://github.com/apache/iceberg-python/pull/1452#issuecomment-2584223657 > I think we'd also want to add docs around this feature! Maybe similar to [FileIO](https://py.iceberg.apache.org/configuration/#fileio), we can add a new section about Loc

Re: [PR] Fix ParallelIterable deadlock [iceberg]

2025-01-10 Thread via GitHub
findepi commented on PR #11781: URL: https://github.com/apache/iceberg/pull/11781#issuecomment-2584202338 IIRC the OOM problem was a real production problem (cc @raunaqmorarka @dekimir @losipiuk), so I am not convinced it's OK to restore it.

Re: [PR] Support Location Providers [iceberg-python]

2025-01-10 Thread via GitHub
Fokko commented on code in PR #1452: URL: https://github.com/apache/iceberg-python/pull/1452#discussion_r1911379496 ## pyiceberg/table/locations.py: ## @@ -0,0 +1,82 @@ +# Licensed to the Apache Software Foundation (ASF) under one +# or more contributor license agreements. See

Re: [PR] Support Location Providers [iceberg-python]

2025-01-10 Thread via GitHub
Fokko commented on code in PR #1452: URL: https://github.com/apache/iceberg-python/pull/1452#discussion_r1911368008 ## mkdocs/docs/api.md: ## @@ -1077,6 +1077,7 @@ with table.update_schema() as update: with table.update_schema() as update: update.add_column(("details", "co

Re: [PR] bump version to 0.9.0 [iceberg-python]

2025-01-10 Thread via GitHub
Fokko merged PR #1489: URL: https://github.com/apache/iceberg-python/pull/1489

Re: [PR] Build: Bump getdaft from 0.4.1 to 0.4.2 [iceberg-python]

2025-01-10 Thread via GitHub
kevinjqliu commented on PR #1503: URL: https://github.com/apache/iceberg-python/pull/1503#issuecomment-2584099795 @dependabot rebase

Re: [PR] [ci] fix `make lint` [iceberg-python]

2025-01-10 Thread via GitHub
kevinjqliu commented on PR #1507: URL: https://github.com/apache/iceberg-python/pull/1507#issuecomment-2584096572 Addressed in #1499, closing

Re: [PR] [ci] fix `make lint` [iceberg-python]

2025-01-10 Thread via GitHub
kevinjqliu closed pull request #1507: [ci] fix `make lint` URL: https://github.com/apache/iceberg-python/pull/1507

Re: [PR] Build: Bump pydantic from 2.10.4 to 2.10.5 [iceberg-python]

2025-01-10 Thread via GitHub
kevinjqliu commented on PR #1504: URL: https://github.com/apache/iceberg-python/pull/1504#issuecomment-2584099867 @dependabot rebase

Re: [PR] Build: Bump sqlalchemy from 2.0.36 to 2.0.37 [iceberg-python]

2025-01-10 Thread via GitHub
kevinjqliu commented on PR #1502: URL: https://github.com/apache/iceberg-python/pull/1502#issuecomment-2584099712 @dependabot rebase

Re: [PR] Nit fixes to URL-encoding of partition field names [iceberg-python]

2025-01-10 Thread via GitHub
kevinjqliu commented on PR #1499: URL: https://github.com/apache/iceberg-python/pull/1499#issuecomment-2584095157 Thanks for following up on this @smaheshwar-pltr

Re: [PR] Nit fixes to URL-encoding of partition field names [iceberg-python]

2025-01-10 Thread via GitHub
kevinjqliu merged PR #1499: URL: https://github.com/apache/iceberg-python/pull/1499

Re: [PR] feat(catalog): Standardize Catalog create table function [iceberg-go]

2025-01-10 Thread via GitHub
zeroshade commented on code in PR #245: URL: https://github.com/apache/iceberg-go/pull/245#discussion_r1911296722 ## table/metadata_internal_test.go: ## @@ -491,3 +491,118 @@ func TestV1WriteMetadataToV2(t *testing.T) { assert.NotContains(t, rawData, "schema") as

Re: [PR] feat(catalog): Standardize Catalog create table function [iceberg-go]

2025-01-10 Thread via GitHub
Fokko commented on code in PR #245: URL: https://github.com/apache/iceberg-go/pull/245#discussion_r1911289679 ## table/metadata_internal_test.go: ## @@ -491,3 +491,118 @@ func TestV1WriteMetadataToV2(t *testing.T) { assert.NotContains(t, rawData, "schema") assert

Re: [PR] Core: Parsing and Writing Tests for V3 Metadata [iceberg]

2025-01-10 Thread via GitHub
HonahX commented on PR #11947: URL: https://github.com/apache/iceberg/pull/11947#issuecomment-2584041893 cc @RussellSpitzer @flyrain May I ask for your help to review this?

Re: [PR] Core: Parsing and Writing Tests for V3 Metadata [iceberg]

2025-01-10 Thread via GitHub
HonahX commented on code in PR #11947: URL: https://github.com/apache/iceberg/pull/11947#discussion_r1911280517 ## core/src/main/java/org/apache/iceberg/TableMetadata.java: ## @@ -949,6 +972,8 @@ private Builder(int formatVersion) { this.schemasById = Maps.newHashMap();

Re: [PR] Fix ParallelIterable deadlock [iceberg]

2025-01-10 Thread via GitHub
RussellSpitzer commented on PR #11781: URL: https://github.com/apache/iceberg/pull/11781#issuecomment-2584026138 > > As noted in my comment to Piotr, I think this is a fix to the deadlock but I think it may be better to just remove the yielding behavior altogether > > Would this res

Re: [PR] feat(catalog): Add Catalog Registry [iceberg-go]

2025-01-10 Thread via GitHub
Fokko commented on code in PR #244: URL: https://github.com/apache/iceberg-go/pull/244#discussion_r1911238543 ## catalog/glue.go: ## @@ -47,13 +50,65 @@ const ( // The ID of the Glue Data Catalog where the tables reside. If none is provided, Glue // automaticall

Re: [PR] Support Location Providers [iceberg-python]

2025-01-10 Thread via GitHub
smaheshwar-pltr commented on code in PR #1452: URL: https://github.com/apache/iceberg-python/pull/1452#discussion_r1911271050 ## pyiceberg/table/locations.py: ## @@ -0,0 +1,82 @@ +# Licensed to the Apache Software Foundation (ASF) under one +# or more contributor license agreeme

Re: [PR] feat(catalog): Add Catalog Registry [iceberg-go]

2025-01-10 Thread via GitHub
zeroshade merged PR #244: URL: https://github.com/apache/iceberg-go/pull/244

Re: [PR] Nit fixes to URL-encoding of partition field names [iceberg-python]

2025-01-10 Thread via GitHub
smaheshwar-pltr commented on code in PR #1499: URL: https://github.com/apache/iceberg-python/pull/1499#discussion_r1911204940 ## mkdocs/docs/api.md: ## @@ -1077,6 +1077,7 @@ with table.update_schema() as update: with table.update_schema() as update: update.add_column(("det

Re: [PR] Support Location Providers [iceberg-python]

2025-01-10 Thread via GitHub
smaheshwar-pltr commented on code in PR #1452: URL: https://github.com/apache/iceberg-python/pull/1452#discussion_r1911203978 ## mkdocs/docs/api.md: ## @@ -1077,6 +1077,7 @@ with table.update_schema() as update: with table.update_schema() as update: update.add_column(("det

Re: [PR] Support Location Providers [iceberg-python]

2025-01-10 Thread via GitHub
smaheshwar-pltr commented on code in PR #1452: URL: https://github.com/apache/iceberg-python/pull/1452#discussion_r1911198932 ## pyiceberg/io/pyarrow.py: ## @@ -2234,7 +2235,9 @@ def data_file_statistics_from_parquet_metadata( ) -def write_file(io: FileIO, table_metadat

Re: [PR] Fix ParallelIterable deadlock [iceberg]

2025-01-10 Thread via GitHub
findepi commented on PR #11781: URL: https://github.com/apache/iceberg/pull/11781#issuecomment-2583924725 > As noted in my comment to Piotr, I think this is a fix to the deadlock but I think it may be better to just remove the yielding behavior altogether Would this restore OOM prob

Re: [PR] Nit fixes to URL-encoding of partition field names [iceberg-python]

2025-01-10 Thread via GitHub
smaheshwar-pltr commented on code in PR #1499: URL: https://github.com/apache/iceberg-python/pull/1499#discussion_r1911187965 ## tests/integration/test_partitioning_key.py: ## @@ -823,11 +789,6 @@ def test_partition_key( snapshot.manifests(iceberg_table.io)[0].fetc

Re: [PR] Support Location Providers [iceberg-python]

2025-01-10 Thread via GitHub
kevinjqliu commented on code in PR #1452: URL: https://github.com/apache/iceberg-python/pull/1452#discussion_r1911172676 ## pyiceberg/io/pyarrow.py: ## @@ -2234,7 +2235,9 @@ def data_file_statistics_from_parquet_metadata( ) -def write_file(io: FileIO, table_metadata: Ta

Re: [I] [discuss] `Transaction` API's `autocommit` [iceberg-python]

2025-01-10 Thread via GitHub
jiakai-li commented on issue #1253: URL: https://github.com/apache/iceberg-python/issues/1253#issuecomment-2583876079 Adding an example for the issue: ```python from pyiceberg.catalog.sql import SqlCatalog from pyiceberg.schema import Schema from pyiceberg.types import IntegerTyp

Re: [PR] Materialized View Spec [iceberg]

2025-01-10 Thread via GitHub
stevenzwu commented on code in PR #11041: URL: https://github.com/apache/iceberg/pull/11041#discussion_r1911163141 ## format/view-spec.md: ## @@ -82,9 +98,12 @@ Each version in `versions` is a struct with the following fields: | _required_ | `representations` | A list of [

Re: [PR] Materialized View Spec [iceberg]

2025-01-10 Thread via GitHub
JanKaul commented on code in PR #11041: URL: https://github.com/apache/iceberg/pull/11041#discussion_r1911104542 ## format/view-spec.md: ## @@ -82,9 +98,12 @@ Each version in `versions` is a struct with the following fields: | _required_ | `representations` | A list of [re

Re: [I] [discuss] `Transaction` API's `autocommit` [iceberg-python]

2025-01-10 Thread via GitHub
jiakai-li commented on issue #1253: URL: https://github.com/apache/iceberg-python/issues/1253#issuecomment-2583844373 Just raising another possibility for discussion: we could also modify `_transaction._autocommit` when calling the `UpdateSchema.__enter__` method, which indicates multiple up

Re: [PR] Support Location Providers [iceberg-python]

2025-01-10 Thread via GitHub
smaheshwar-pltr commented on code in PR #1452: URL: https://github.com/apache/iceberg-python/pull/1452#discussion_r1911149088 ## pyiceberg/table/locations.py: ## @@ -0,0 +1,81 @@ +# Licensed to the Apache Software Foundation (ASF) under one +# or more contributor license agreeme

Re: [PR] Support Location Providers [iceberg-python]

2025-01-10 Thread via GitHub
smaheshwar-pltr commented on code in PR #1452: URL: https://github.com/apache/iceberg-python/pull/1452#discussion_r1911146551 ## pyiceberg/io/pyarrow.py: ## @@ -2234,7 +2235,9 @@ def data_file_statistics_from_parquet_metadata( ) -def write_file(io: FileIO, table_metadat

[PR] Update cmake instructions in README [iceberg-cpp]

2025-01-10 Thread via GitHub
zuyu opened a new pull request, #24: URL: https://github.com/apache/iceberg-cpp/pull/24 Build `Core Libraries` without `Arrow`, as there is another section about building `Iceberg Arrow Library` explicitly.

Re: [I] `table.update_schema()` continues to commit when exist with an exception [iceberg-python]

2025-01-10 Thread via GitHub
kevinjqliu commented on issue #1505: URL: https://github.com/apache/iceberg-python/issues/1505#issuecomment-2583824957 I like the example you have above! Do you mind transferring it to #1253?

Re: [I] `table.update_schema()` continues to commit when exist with an exception [iceberg-python]

2025-01-10 Thread via GitHub
jiakai-li closed issue #1505: `table.update_schema()` continues to commit when exist with an exception URL: https://github.com/apache/iceberg-python/issues/1505

Re: [I] `table.update_schema()` continues to commit when exist with an exception [iceberg-python]

2025-01-10 Thread via GitHub
jiakai-li commented on issue #1505: URL: https://github.com/apache/iceberg-python/issues/1505#issuecomment-2583805429 Thank you @kevinjqliu, I'm closing this issue to avoid duplicates then. Let's keep the discussion at #1253.

Re: [PR] Nit fixes to URL-encoding of partition field names [iceberg-python]

2025-01-10 Thread via GitHub
kevinjqliu commented on code in PR #1499: URL: https://github.com/apache/iceberg-python/pull/1499#discussion_r1911101051 ## tests/integration/test_partitioning_key.py: ## @@ -823,11 +789,6 @@ def test_partition_key( snapshot.manifests(iceberg_table.io)[0].fetch_man

Re: [PR] Support Location Providers [iceberg-python]

2025-01-10 Thread via GitHub
kevinjqliu commented on code in PR #1452: URL: https://github.com/apache/iceberg-python/pull/1452#discussion_r1911063518 ## pyiceberg/table/locations.py: ## @@ -0,0 +1,82 @@ +# Licensed to the Apache Software Foundation (ASF) under one +# or more contributor license agreements.

Re: [I] Add REST catalog integration tests [iceberg-python]

2025-01-10 Thread via GitHub
kevinjqliu commented on issue #1439: URL: https://github.com/apache/iceberg-python/issues/1439#issuecomment-2583675951 > have we ever considered taking an approach similar (at a high-level, of course - some details don't transfer over to Python) to the Java-side that has [CatalogTests](htt

Re: [PR] Materialized View Spec [iceberg]

2025-01-10 Thread via GitHub
danielcweeks commented on code in PR #11041: URL: https://github.com/apache/iceberg/pull/11041#discussion_r1910988506 ## format/view-spec.md: ## @@ -160,6 +179,56 @@ Each entry in `version-log` is a struct with the following fields: | _required_ | `timestamp-ms` | Timestamp w

Re: [PR] Materialized View Spec [iceberg]

2025-01-10 Thread via GitHub
danielcweeks commented on code in PR #11041: URL: https://github.com/apache/iceberg/pull/11041#discussion_r1910985763 ## format/view-spec.md: ## @@ -160,6 +179,56 @@ Each entry in `version-log` is a struct with the following fields: | _required_ | `timestamp-ms` | Timestamp w

[PR] [ci] fix `make lint` [iceberg-python]

2025-01-10 Thread via GitHub
kevinjqliu opened a new pull request, #1507: URL: https://github.com/apache/iceberg-python/pull/1507 Ran `make lint` locally

Re: [I] `table.update_schema()` continues to commit when exist with an exception [iceberg-python]

2025-01-10 Thread via GitHub
kevinjqliu commented on issue #1505: URL: https://github.com/apache/iceberg-python/issues/1505#issuecomment-2583565807 Hey @jiakai-li, thanks for raising this issue! The root cause is the same as the one I raised in #1253 and #1497. Specifically, when using UpdateSchema as a "transa
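A minimal sketch of the behaviour named in the issue title (a `with` block that still commits when it exits with an exception), assuming hypothetical stand-in classes rather than pyiceberg's actual `UpdateSchema`. `CommitAlways`, `CommitOnSuccess` and `run` are illustrative names only; the second class shows one possible guard, not the fix chosen in the thread.

```python
class CommitAlways:
    """Hypothetical update object: commits on every exit, even after an error."""

    def __init__(self, committed):
        self.committed = committed  # stands in for the table's committed state
        self.pending = []

    def add_column(self, name):
        self.pending.append(name)

    def commit(self):
        self.committed.extend(self.pending)
        self.pending = []

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        self.commit()  # runs regardless of whether the block raised
        return False   # do not swallow the exception


class CommitOnSuccess(CommitAlways):
    """Hypothetical variant: commits only when the block exits cleanly."""

    def __exit__(self, exc_type, exc, tb):
        if exc_type is None:
            self.commit()
        return False


def run(update_cls):
    """Stage one change, raise inside the block, and report what was committed."""
    committed = []
    try:
        with update_cls(committed) as update:
            update.add_column("new_col")
            raise ValueError("failure inside the with block")
    except ValueError:
        pass
    return committed


print(run(CommitAlways))     # the staged change is committed despite the error
print(run(CommitOnSuccess))  # nothing is committed
```

Checking `exc_type` in `__exit__` is the standard Python way for a context manager to tell a clean exit from a failing one; whether pyiceberg should adopt that or adjust autocommit instead is the open question in #1253.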

Re: [I] Explore potential issue with `scan` returning the incorrect results [iceberg-python]

2025-01-10 Thread via GitHub
kevinjqliu commented on issue #1506: URL: https://github.com/apache/iceberg-python/issues/1506#issuecomment-2583543119 In the second example, `total_row_count` is `99126`, which means we didn't hit the `10_000` limit and exit early.

Re: [I] Handling of Source Id => Source Ids [iceberg]

2025-01-10 Thread via GitHub
HonahX commented on issue #10762: URL: https://github.com/apache/iceberg/issues/10762#issuecomment-2583544593 Hi @RussellSpitzer, may I work on this if no one has started?

Re: [PR] Avro: Add internal writer [iceberg]

2025-01-10 Thread via GitHub
rdblue merged PR #11919: URL: https://github.com/apache/iceberg/pull/11919

Re: [PR] Avro: Add internal writer [iceberg]

2025-01-10 Thread via GitHub
rdblue commented on PR #11919: URL: https://github.com/apache/iceberg/pull/11919#issuecomment-2583543250 Would have been nice to fix the nit from the last review, but it isn't a blocker. Thanks, @ajantha-bhat! I'll merge.

[PR] Core: Parsing and Writing Tests for V3 Metadata [iceberg]

2025-01-10 Thread via GitHub
HonahX opened a new pull request, #11947: URL: https://github.com/apache/iceberg/pull/11947 Fixes #10764 - Add new fields: `row-lineage` and `next-row-id` to TableMetadata - Update the parser to parse v3 new fields - Add new unit tests for V3 metadata - Refactor TableMetadata

Re: [PR] Core: Unimplement Map from CharSequenceMap to obey contract [iceberg]

2025-01-10 Thread via GitHub
findepi commented on PR #11704: URL: https://github.com/apache/iceberg/pull/11704#issuecomment-2583537159 I see the benefit of having CharSequenceMap implement the Map contract, e.g. getting all the rich Map interface. In my opinion it's not worth the risk though -- Map interface default implement

Re: [I] cannot load table thru glue catalog [iceberg-python]

2025-01-10 Thread via GitHub
kevinjqliu commented on issue #1501: URL: https://github.com/apache/iceberg-python/issues/1501#issuecomment-2583527417 PyIceberg passes the configuration to the underlying FileIO implementation. Can you try to run that piece of code in isolation to see what the issue is? For example, you

Re: [PR] Spec: Document Snapshot Summary Optional Fields for Standardization [iceberg]

2025-01-10 Thread via GitHub
HonahX commented on code in PR #11660: URL: https://github.com/apache/iceberg/pull/11660#discussion_r1910897271 ## format/spec.md: ## @@ -1633,3 +1633,57 @@ might indicate different snapshot IDs for a specific timestamp. The discrepancie When processing point in time queries

Re: [I] Explore potential issue with `scan` returning the incorrect results [iceberg-python]

2025-01-10 Thread via GitHub
kevinjqliu commented on issue #1506: URL: https://github.com/apache/iceberg-python/issues/1506#issuecomment-2583467365 From the log above, presumably ran with the same table, snapshot, and filter. ``` total_row_count: 10, len(completed_futures): 114 total_row_count: 99126, len

Re: [I] Explore potential issue with `scan` returning the incorrect results [iceberg-python]

2025-01-10 Thread via GitHub
kevinjqliu commented on issue #1506: URL: https://github.com/apache/iceberg-python/issues/1506#issuecomment-2583459732 It looks like the results are sample data. It would be great if you could provide the source table so I can try to reproduce it.

Re: [PR] Materialized View Spec [iceberg]

2025-01-10 Thread via GitHub
danielcweeks commented on code in PR #11041: URL: https://github.com/apache/iceberg/pull/11041#discussion_r1910784199 ## format/view-spec.md: ## @@ -42,12 +42,28 @@ An atomic swap of one view metadata file for another provides the basis for maki Writers create view metadata

[I] Explore potential issue with `scan` returning the incorrect results [iceberg-python]

2025-01-10 Thread via GitHub
kevinjqliu opened a new issue, #1506: URL: https://github.com/apache/iceberg-python/issues/1506 ### Apache Iceberg version None ### Please describe the bug 🐞 From slack, """ Hi team, There have been occasional reports from internal users that the number of records

Re: [PR] Auth Manager API part 3: OAuth2 Manager [iceberg]

2025-01-10 Thread via GitHub
adutra commented on code in PR #11844: URL: https://github.com/apache/iceberg/pull/11844#discussion_r1910786713 ## core/src/main/java/org/apache/iceberg/rest/auth/AuthSessionCache.java: ## @@ -0,0 +1,130 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * o

Re: [PR] Auth Manager API part 3: OAuth2 Manager [iceberg]

2025-01-10 Thread via GitHub
adutra commented on code in PR #11844: URL: https://github.com/apache/iceberg/pull/11844#discussion_r1910776037 ## core/src/main/java/org/apache/iceberg/rest/auth/AuthSessionCache.java: ## @@ -0,0 +1,130 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * o

Re: [PR] Materialized View Spec [iceberg]

2025-01-10 Thread via GitHub
danielcweeks commented on code in PR #11041: URL: https://github.com/apache/iceberg/pull/11041#discussion_r1910774327 ## format/view-spec.md: ## @@ -42,12 +42,28 @@ An atomic swap of one view metadata file for another provides the basis for maki Writers create view metadata

Re: [PR] Core: Allow adding files to multiple partition specs in FastAppend [iceberg]

2025-01-10 Thread via GitHub
aokolnychyi commented on PR #11771: URL: https://github.com/apache/iceberg/pull/11771#issuecomment-2583369545 Thanks, @anuragmantri!

Re: [PR] Core: Allow adding files to multiple partition specs in FastAppend [iceberg]

2025-01-10 Thread via GitHub
aokolnychyi merged PR #11771: URL: https://github.com/apache/iceberg/pull/11771

[I] `table.update_schema()` commits when context manager exist with an error [iceberg-python]

2025-01-10 Thread via GitHub
jiakai-li opened a new issue, #1505: URL: https://github.com/apache/iceberg-python/issues/1505 ### Apache Iceberg version 0.8.1 (latest release) ### Please describe the bug 🐞 While working on issue #1493, I noticed that currently, when running the code below, the field `field
