[PR] Add Column Name to the Error Message in StatsAggregator [iceberg-python]

2025-07-08 Thread via GitHub
james5418 opened a new pull request, #2190: URL: https://github.com/apache/iceberg-python/pull/2190 Closes #2017 # Rationale for this change Include the column name in the error message to make it more descriptive. # Are these changes tested? # Are there an

Re: [PR] chore: bump C++ standard to 23 [iceberg-cpp]

2025-07-08 Thread via GitHub
zhjwpku commented on PR #139: URL: https://github.com/apache/iceberg-cpp/pull/139#issuecomment-3051264088 > > I just changed to a more recent arrow commit which contains the Apache Thrift update[1], which should fix our frequent CI failure. > > [1] [apache/arrow#46912](https://github.com/

Re: [PR] Spark 4.0: Row Lineage support [iceberg]

2025-07-08 Thread via GitHub
amogh-jahagirdar commented on code in PR #13310: URL: https://github.com/apache/iceberg/pull/13310#discussion_r2194072984 ## spark/v4.0/spark/src/test/java/org/apache/iceberg/spark/data/TestSparkAvroReader.java: ## @@ -40,35 +41,51 @@ protected void writeAndValidate(Schema schem

Re: [PR] Spark-3.5: Add procedure to compute partition stats [iceberg]

2025-07-08 Thread via GitHub
ajantha-bhat commented on code in PR #13480: URL: https://github.com/apache/iceberg/pull/13480#discussion_r2194090686 ## spark/v3.5/spark/src/main/java/org/apache/iceberg/spark/procedures/ComputePartitionStatsProcedure.java: ## @@ -0,0 +1,130 @@ +/* + * Licensed to the Apache So

Re: [PR] Spec: Update common v2,v3 table headers as v2+ [iceberg]

2025-07-08 Thread via GitHub
ajantha-bhat commented on PR #13181: URL: https://github.com/apache/iceberg/pull/13181#issuecomment-3051214168 @nastra: Can you please help review and merge this? I have addressed the comments long back and it went to stale after that. -- This is an automated message from the Apache Git

[PR] Spark 4.0: Port Avro lineage reader test changes from #13070 [iceberg]

2025-07-08 Thread via GitHub
amogh-jahagirdar opened a new pull request, #13496: URL: https://github.com/apache/iceberg/pull/13496 This change ports the TestSparkAvroReader changes from #13070 to Spark 4.0, I'm breaking this away from #13310

Re: [PR] Parquet: Refactor parquet schema handling for variant type [iceberg]

2025-07-08 Thread via GitHub
xxubai closed pull request #12916: Parquet: Refactor parquet schema handling for variant type URL: https://github.com/apache/iceberg/pull/12916

Re: [PR] Spark-3.5: Add procedure to compute partition stats [iceberg]

2025-07-08 Thread via GitHub
szehon-ho commented on code in PR #13480: URL: https://github.com/apache/iceberg/pull/13480#discussion_r2194003460 ## spark/v3.5/spark/src/main/java/org/apache/iceberg/spark/procedures/ComputePartitionStatsProcedure.java: ## @@ -0,0 +1,130 @@ +/* + * Licensed to the Apache Softw

Re: [PR] update daft links [iceberg-python]

2025-07-08 Thread via GitHub
kevinjqliu merged PR #2169: URL: https://github.com/apache/iceberg-python/pull/2169

Re: [PR] fix: correct `UUIDType` partition representation for `BucketTransform` [iceberg-python]

2025-07-08 Thread via GitHub
dingo4dev closed pull request #2003: fix: correct `UUIDType` partition representation for `BucketTransform` URL: https://github.com/apache/iceberg-python/pull/2003

Re: [PR] fix: correct `UUIDType` partition representation for `BucketTransform` [iceberg-python]

2025-07-08 Thread via GitHub
dingo4dev commented on PR #2003: URL: https://github.com/apache/iceberg-python/pull/2003#issuecomment-3050859024 Closing this PR as the issue it addresses has been resolved in another https://github.com/apache/iceberg-python/issues/2002

Re: [PR] Spec: Add DV information in overview [iceberg]

2025-07-08 Thread via GitHub
stevenzwu commented on PR #13189: URL: https://github.com/apache/iceberg/pull/13189#issuecomment-3050787228 thanks @ajantha-bhat for the fix and @nastra for the review

Re: [PR] Spec: Add DV information in overview [iceberg]

2025-07-08 Thread via GitHub
stevenzwu merged PR #13189: URL: https://github.com/apache/iceberg/pull/13189

Re: [PR] Spec: Add DV information in overview [iceberg]

2025-07-08 Thread via GitHub
stevenzwu commented on PR #13189: URL: https://github.com/apache/iceberg/pull/13189#issuecomment-3050783474 agree that voting is not needed in this case. let me merge this

Re: [PR] Add BigQuery Dependencies for Iceberg GCP Bundle [iceberg]

2025-07-08 Thread via GitHub
stevenzwu merged PR #13111: URL: https://github.com/apache/iceberg/pull/13111

Re: [PR] Spark-3.5: Add procedure to compute partition stats [iceberg]

2025-07-08 Thread via GitHub
ajantha-bhat commented on PR #13480: URL: https://github.com/apache/iceberg/pull/13480#issuecomment-3050694248 Thanks @nastra and @hussein-awala for the review. @amogh-jahagirdar or @RussellSpitzer or @szehon-ho: Anyone of you also wants to do a review? If not, we will go ahead with

Re: [PR] Spec: Add DV information in overview [iceberg]

2025-07-08 Thread via GitHub
ajantha-bhat commented on PR #13189: URL: https://github.com/apache/iceberg/pull/13189#issuecomment-3050686236 I don't think we need a discussion for this trivial change. IMO voting is to notify many people about new things added or change in behavior. This is kind of typo.

Re: [PR] Add support for Bodo DataFrame [iceberg-python]

2025-07-08 Thread via GitHub
ehsantn commented on code in PR #2167: URL: https://github.com/apache/iceberg-python/pull/2167#discussion_r2193638945 ## tests/integration/test_writes/test_partitioned_writes.py: ## @@ -451,6 +451,11 @@ def test_dynamic_partition_overwrite_unpartitioned_evolve_to_identity_trans

Re: [I] Support Adding File Metadata Directly [iceberg-python]

2025-07-08 Thread via GitHub
github-actions[bot] commented on issue #1470: URL: https://github.com/apache/iceberg-python/issues/1470#issuecomment-3050652017 This issue has been closed because it has not received any activity in the last 14 days since being marked as 'stale'

Re: [I] Support Adding File Metadata Directly [iceberg-python]

2025-07-08 Thread via GitHub
github-actions[bot] closed issue #1470: Support Adding File Metadata Directly URL: https://github.com/apache/iceberg-python/issues/1470

Re: [PR] Parquet: Refactor parquet schema handling for variant type [iceberg]

2025-07-08 Thread via GitHub
github-actions[bot] commented on PR #12916: URL: https://github.com/apache/iceberg/pull/12916#issuecomment-3050647571 This pull request has been marked as stale due to 30 days of inactivity. It will be closed in 1 week if no further activity occurs. If you think that’s incorrect or this pul

Re: [PR] AWS: Refactor DynamoDB and Glue properties into separated properties classes [iceberg]

2025-07-08 Thread via GitHub
github-actions[bot] commented on PR #12722: URL: https://github.com/apache/iceberg/pull/12722#issuecomment-3050647514 This pull request has been closed due to lack of activity. This is not a judgement on the merit of the PR in any way. It is just a way of keeping the PR queue manageable. If

Re: [PR] AWS: Refactor DynamoDB and Glue properties into separated properties classes [iceberg]

2025-07-08 Thread via GitHub
github-actions[bot] closed pull request #12722: AWS: Refactor DynamoDB and Glue properties into separated properties classes URL: https://github.com/apache/iceberg/pull/12722

Re: [PR] AWS: Fix DynamoDB and Glue integration test failures [iceberg]

2025-07-08 Thread via GitHub
github-actions[bot] commented on PR #12718: URL: https://github.com/apache/iceberg/pull/12718#issuecomment-3050647478 This pull request has been closed due to lack of activity. This is not a judgement on the merit of the PR in any way. It is just a way of keeping the PR queue manageable. If

Re: [PR] AWS: Fix DynamoDB and Glue integration test failures [iceberg]

2025-07-08 Thread via GitHub
github-actions[bot] closed pull request #12718: AWS: Fix DynamoDB and Glue integration test failures URL: https://github.com/apache/iceberg/pull/12718

Re: [I] CatalogUtil:dropTableData method doesn't remove old Puffin files [iceberg]

2025-07-08 Thread via GitHub
github-actions[bot] commented on issue #11876: URL: https://github.com/apache/iceberg/issues/11876#issuecomment-3050647167 This issue has been closed because it has not received any activity in the last 14 days since being marked as 'stale'

Re: [I] CatalogUtil:dropTableData method doesn't remove old Puffin files [iceberg]

2025-07-08 Thread via GitHub
github-actions[bot] closed issue #11876: CatalogUtil:dropTableData method doesn't remove old Puffin files URL: https://github.com/apache/iceberg/issues/11876

Re: [I] Wrong name for parquet page row count min and max stats [iceberg]

2025-07-08 Thread via GitHub
github-actions[bot] commented on issue #11770: URL: https://github.com/apache/iceberg/issues/11770#issuecomment-3050647091 This issue has been automatically marked as stale because it has been open for 180 days with no activity. It will be closed in next 14 days if no further activity occur

Re: [PR] Read ManifestList V1 with V2 projection. [iceberg-rust]

2025-07-08 Thread via GitHub
rambleraptor commented on code in PR #1482: URL: https://github.com/apache/iceberg-rust/pull/1482#discussion_r2193644577 ## crates/iceberg/src/avro/schema.rs: ## @@ -43,6 +43,41 @@ const MAP_LOGICAL_TYPE: &str = "map"; // This const may better to maintain in avro-rs. const LOG

Re: [PR] Read ManifestList V1 with V2 projection. [iceberg-rust]

2025-07-08 Thread via GitHub
rambleraptor commented on code in PR #1482: URL: https://github.com/apache/iceberg-rust/pull/1482#discussion_r2193644424 ## crates/iceberg/src/avro/schema.rs: ## @@ -81,20 +81,33 @@ impl SchemaVisitor for SchemaToAvroSchema { field_schema = avro_optional(field_schem

Re: [PR] Spark 4.0: Row Lineage support [iceberg]

2025-07-08 Thread via GitHub
stevenzwu commented on code in PR #13310: URL: https://github.com/apache/iceberg/pull/13310#discussion_r2193545442 ## spark/v4.0/spark/src/main/java/org/apache/iceberg/spark/source/SparkPositionDeltaWrite.java: ## @@ -736,7 +769,14 @@ public void update(InternalRow meta, Interna

Re: [PR] Spark 4.0: Row Lineage support [iceberg]

2025-07-08 Thread via GitHub
amogh-jahagirdar commented on code in PR #13310: URL: https://github.com/apache/iceberg/pull/13310#discussion_r2193614621 ## spark/v4.0/spark/src/test/java/org/apache/iceberg/spark/data/TestSparkAvroReader.java: ## @@ -40,35 +41,51 @@ protected void writeAndValidate(Schema schem

Re: [PR] Spark 4.0: Row Lineage support [iceberg]

2025-07-08 Thread via GitHub
amogh-jahagirdar commented on code in PR #13310: URL: https://github.com/apache/iceberg/pull/13310#discussion_r2193601358 ## spark/v4.0/spark/src/main/java/org/apache/iceberg/spark/source/SparkPositionDeltaWrite.java: ## @@ -736,7 +769,14 @@ public void update(InternalRow meta,

Re: [PR] Spark 4.0: Row Lineage support [iceberg]

2025-07-08 Thread via GitHub
amogh-jahagirdar commented on code in PR #13310: URL: https://github.com/apache/iceberg/pull/13310#discussion_r2193334597 ## spark/v4.0/spark/src/main/java/org/apache/iceberg/spark/source/SparkWrite.java: ## @@ -699,29 +708,41 @@ public DataWriter createWriter(int partitionId,

Re: [PR] fix(cli/rest) Support Glue REST operations with Iceberg-Go CLI [iceberg-go]

2025-07-08 Thread via GitHub
zeroshade commented on code in PR #459: URL: https://github.com/apache/iceberg-go/pull/459#discussion_r2193568773 ## config/config_test.go: ## @@ -78,6 +78,27 @@ catalog: Warehouse: "catalog_name", }, }, + // catalog with r

Re: [PR] typo: FILED_ID_PROP -> FIELD_ID_PROP [iceberg-rust]

2025-07-08 Thread via GitHub
liurenjie1024 merged PR #1497: URL: https://github.com/apache/iceberg-rust/pull/1497

Re: [PR] Spark 4.0: Row Lineage support [iceberg]

2025-07-08 Thread via GitHub
stevenzwu commented on code in PR #13310: URL: https://github.com/apache/iceberg/pull/13310#discussion_r2193559119 ## spark/v4.0/spark/src/main/java/org/apache/iceberg/spark/source/SparkTable.java: ## @@ -260,18 +260,52 @@ public MetadataColumn[] metadataColumns() { DataTyp

Re: [PR] Add support for Bodo DataFrame [iceberg-python]

2025-07-08 Thread via GitHub
kevinjqliu commented on code in PR #2167: URL: https://github.com/apache/iceberg-python/pull/2167#discussion_r2193523198 ## tests/integration/test_writes/test_partitioned_writes.py: ## @@ -451,6 +451,11 @@ def test_dynamic_partition_overwrite_unpartitioned_evolve_to_identity_tr

Re: [PR] Spec: Add DV information in overview [iceberg]

2025-07-08 Thread via GitHub
stevenzwu commented on PR #13189: URL: https://github.com/apache/iceberg/pull/13189#issuecomment-3050427938 Do we need a dev ML vote for this one? It is not a spec change. just clarifying or fixing the wording.

Re: [PR] Partition statistics metadata reading [iceberg-python]

2025-07-08 Thread via GitHub
kevinjqliu commented on code in PR #2146: URL: https://github.com/apache/iceberg-python/pull/2146#discussion_r2193517014 ## tests/table/test_metadata.py: ## @@ -173,13 +173,13 @@ def test_updating_metadata(example_table_metadata_v2: Dict[str, Any]) -> None: def test_serialize_

Re: [PR] Partition statistics metadata reading [iceberg-python]

2025-07-08 Thread via GitHub
kevinjqliu merged PR #2146: URL: https://github.com/apache/iceberg-python/pull/2146

Re: [PR] Merge-on-Read Write Support [iceberg-python]

2025-07-08 Thread via GitHub
rutb327 closed pull request #2189: Merge-on-Read Write Support URL: https://github.com/apache/iceberg-python/pull/2189

[PR] feat(table): Support Dynamic Partition Overwrite [iceberg-go]

2025-07-08 Thread via GitHub
dttung2905 opened a new pull request, #482: URL: https://github.com/apache/iceberg-go/pull/482 (no comment)

Re: [PR] Add support for Bodo DataFrame [iceberg-python]

2025-07-08 Thread via GitHub
ehsantn commented on PR #2167: URL: https://github.com/apache/iceberg-python/pull/2167#issuecomment-3050348312 All tests are passing locally for me now. Hopefully the CI works too.

Re: [PR] Spark 4.0: Row Lineage support [iceberg]

2025-07-08 Thread via GitHub
stevenzwu commented on code in PR #13310: URL: https://github.com/apache/iceberg/pull/13310#discussion_r2193459841 ## spark/v4.0/spark/src/main/java/org/apache/iceberg/spark/source/ExtractRowLineageFromMetadata.java: ## @@ -0,0 +1,69 @@ +/* + * Licensed to the Apache Software Fo

[PR] Docs: Add Bodo to the docs sidebar [iceberg]

2025-07-08 Thread via GitHub
IsaacWarren opened a new pull request, #13495: URL: https://github.com/apache/iceberg/pull/13495 Link to Bodo docs in the sidebar.

[I] It will be nice if redundant partitions will be allowed [iceberg]

2025-07-08 Thread via GitHub
alsugiliazova opened a new issue, #13494: URL: https://github.com/apache/iceberg/issues/13494 ### Proposed Change Since spec allows specifying one column several time in partition spec, it will be nice to have this possibility in PyIceberg as well. Spark also allows it. In my ca

[PR] docs: remove mentiones of dynamodb [iceberg-go]

2025-07-08 Thread via GitHub
laskoviymishka opened a new pull request, #481: URL: https://github.com/apache/iceberg-go/pull/481 see: apache/iceberg#9783 related: https://github.com/apache/iceberg-go/issues/477

Re: [PR] Spark: Support Parquet dictionary encoded UUIDs [iceberg]

2025-07-08 Thread via GitHub
RussellSpitzer commented on code in PR #13324: URL: https://github.com/apache/iceberg/pull/13324#discussion_r2193421243 ## spark/v3.5/spark/src/test/java/org/apache/iceberg/spark/data/vectorized/parquet/TestParquetVectorizedReads.java: ## @@ -400,4 +400,19 @@ public void testUns

Re: [PR] Partition statistics metadata reading [iceberg-python]

2025-07-08 Thread via GitHub
Fokko commented on code in PR #2146: URL: https://github.com/apache/iceberg-python/pull/2146#discussion_r2193411814 ## tests/table/test_metadata.py: ## @@ -173,13 +173,13 @@ def test_updating_metadata(example_table_metadata_v2: Dict[str, Any]) -> None: def test_serialize_v1(ex

Re: [PR] Fix UUID support [iceberg-python]

2025-07-08 Thread via GitHub
Fokko commented on PR #2007: URL: https://github.com/apache/iceberg-python/pull/2007#issuecomment-3050242622 Thanks @kevinjqliu

Re: [PR] Fix UUID support [iceberg-python]

2025-07-08 Thread via GitHub
Fokko merged PR #2007: URL: https://github.com/apache/iceberg-python/pull/2007

Re: [I] UUIDType with BucketTransform incorrectly converts int to str in PartitionKey [iceberg-python]

2025-07-08 Thread via GitHub
Fokko closed issue #2002: UUIDType with BucketTransform incorrectly converts int to str in PartitionKey URL: https://github.com/apache/iceberg-python/issues/2002

Re: [I] Error creating table from pyarrow schema with pa.uuid() [iceberg-python]

2025-07-08 Thread via GitHub
Fokko closed issue #1986: Error creating table from pyarrow schema with pa.uuid() URL: https://github.com/apache/iceberg-python/issues/1986

Re: [PR] docs: add ugi back to hive catalog config [iceberg-python]

2025-07-08 Thread via GitHub
Fokko merged PR #2188: URL: https://github.com/apache/iceberg-python/pull/2188

Re: [PR] Use Iceberg-Rust for parsing the ManifestList and Manifests [iceberg-python]

2025-07-08 Thread via GitHub
Fokko commented on PR #2004: URL: https://github.com/apache/iceberg-python/pull/2004#issuecomment-3050192327 @yogevyuval Thanks for asking, and yes, I do expect performance impact since just a part of the deserialization is cythonized. With this change, much more is pushed into Rust. This w

Re: [PR] fix(catalog/glue): case insensitive type match [iceberg-go]

2025-07-08 Thread via GitHub
vbekiaris commented on PR #480: URL: https://github.com/apache/iceberg-go/pull/480#issuecomment-3050126325 thanks @zeroshade for the quick review!

Re: [I] "table X.Y is not an iceberg table" on a valid iceberg table [iceberg-go]

2025-07-08 Thread via GitHub
zeroshade closed issue #479: "table X.Y is not an iceberg table" on a valid iceberg table URL: https://github.com/apache/iceberg-go/issues/479

Re: [PR] fix(catalog/glue): case insensitive type match [iceberg-go]

2025-07-08 Thread via GitHub
zeroshade merged PR #480: URL: https://github.com/apache/iceberg-go/pull/480

Re: [PR] Use Iceberg-Rust for parsing the ManifestList and Manifests [iceberg-python]

2025-07-08 Thread via GitHub
yogevyuval commented on PR #2004: URL: https://github.com/apache/iceberg-python/pull/2004#issuecomment-3050120401 @Fokko This is awesome! thanks for your work on this one. Do we expect any performance/memory differences with reading it with the rust module?

Re: [PR] update daft links [iceberg-python]

2025-07-08 Thread via GitHub
kevinjqliu commented on PR #2169: URL: https://github.com/apache/iceberg-python/pull/2169#issuecomment-3050095128 Looks like something changed in the newer version This test failed https://github.com/apache/iceberg-python/blob/e33cf5ac1adf47131d4992bdb686f0e58f4e4669/tests/integrat

Re: [PR] fix(catalog/glue): case insensitive type match [iceberg-go]

2025-07-08 Thread via GitHub
vbekiaris commented on code in PR #480: URL: https://github.com/apache/iceberg-go/pull/480#discussion_r2193300692 ## catalog/glue/glue.go: ## @@ -694,7 +695,7 @@ func (c *Catalog) getTable(ctx context.Context, database, tableName string) (*ty return nil, fmt.Err

[PR] Merge-on-Read Write Support [iceberg-python]

2025-07-08 Thread via GitHub
rutb327 opened a new pull request, #2189: URL: https://github.com/apache/iceberg-python/pull/2189 PR for writing deletes in Merge-on-read mode

Re: [PR] update daft links [iceberg-python]

2025-07-08 Thread via GitHub
ccmao1130 commented on PR #2169: URL: https://github.com/apache/iceberg-python/pull/2169#issuecomment-3050036221 @kevinjqliu Okay I think it should be okay now? (sorry not a developer hahaha)

Re: [PR] Spark 4.0: Row Lineage support [iceberg]

2025-07-08 Thread via GitHub
amogh-jahagirdar commented on code in PR #13310: URL: https://github.com/apache/iceberg/pull/13310#discussion_r2193104760 ## spark/v4.0/spark-extensions/src/test/java/org/apache/iceberg/spark/extensions/TestRowLevelOperationsWithLineage.java: ## @@ -0,0 +1,491 @@ +/* + * License

Re: [PR] fix(catalog/glue): case insensitive type match [iceberg-go]

2025-07-08 Thread via GitHub
zeroshade commented on code in PR #480: URL: https://github.com/apache/iceberg-go/pull/480#discussion_r2193079278 ## catalog/glue/glue.go: ## @@ -694,7 +695,7 @@ func (c *Catalog) getTable(ctx context.Context, database, tableName string) (*ty return nil, fmt.Err

Re: [PR] Use short string in Variant when possible [iceberg]

2025-07-08 Thread via GitHub
RussellSpitzer commented on code in PR #13284: URL: https://github.com/apache/iceberg/pull/13284#discussion_r2193056673 ## api/src/test/java/org/apache/iceberg/variants/TestSerializedObject.java: ## @@ -257,13 +246,39 @@ public void testLargeObject(boolean sortFieldNames) {

Re: [PR] Use short string in Variant when possible [iceberg]

2025-07-08 Thread via GitHub
RussellSpitzer commented on code in PR #13284: URL: https://github.com/apache/iceberg/pull/13284#discussion_r2193055956 ## api/src/test/java/org/apache/iceberg/variants/TestSerializedObject.java: ## @@ -257,13 +246,39 @@ public void testLargeObject(boolean sortFieldNames) {

[PR] [docs] Add two Iceberg blogs [iceberg]

2025-07-08 Thread via GitHub
rmoff opened a new pull request, #13493: URL: https://github.com/apache/iceberg/pull/13493 (no comment)

Re: [PR] AWS: Prevent excessive creation of auth sessions in S3V4RestSignerClient [iceberg]

2025-07-08 Thread via GitHub
danielcweeks commented on code in PR #13215: URL: https://github.com/apache/iceberg/pull/13215#discussion_r2192968586 ## aws/src/main/java/org/apache/iceberg/aws/s3/signer/S3V4RestSignerClient.java: ## @@ -76,15 +80,25 @@ public abstract class S3V4RestSignerClient private s

Re: [PR] Core/REST: generify AuthSessionCache [iceberg]

2025-07-08 Thread via GitHub
danielcweeks commented on PR #12562: URL: https://github.com/apache/iceberg/pull/12562#issuecomment-3049574873 @adutra I'm not sure I fully understand the value of this change at this point. We're making a breaking change that's going to take multiple releases to finalize, but I'm not conv

Re: [PR] Add BigQuery Dependencies for Iceberg GCP Bundle [iceberg]

2025-07-08 Thread via GitHub
jbonofre commented on PR #13111: URL: https://github.com/apache/iceberg/pull/13111#issuecomment-3049552964 @talatuyarer it's good for me (just a couple of minor things).

Re: [PR] Add BigQuery Dependencies for Iceberg GCP Bundle [iceberg]

2025-07-08 Thread via GitHub
jbonofre commented on code in PR #13111: URL: https://github.com/apache/iceberg/pull/13111#discussion_r2192925788 ## gcp-bundle/LICENSE: ## @@ -210,37 +210,160 @@ This binary artifact contains: Group: com.fasterxml.jackson.core Name: jackson-core Version: 2.18.2 Project URL:

Re: [PR] Add BigQuery Dependencies for Iceberg GCP Bundle [iceberg]

2025-07-08 Thread via GitHub
jbonofre commented on code in PR #13111: URL: https://github.com/apache/iceberg/pull/13111#discussion_r2192927669 ## gcp-bundle/LICENSE: ## @@ -540,7 +1044,36 @@ License: Apache 2 - https://www.apache.org/licenses/LICENSE-2.0 Group: org.threeten Name: threetenbp Version: 1.7

[I] [docs] Default to the latest stable release [iceberg]

2025-07-08 Thread via GitHub
rmoff opened a new issue, #13492: URL: https://github.com/apache/iceberg/issues/13492 ### Feature Request / Improvement At the moment docs default to `nightly`. I would like to suggest that it defaults to the latest stable release. For an end user this is a more common default, a

[PR] typo: FILED_ID_PROP -> FIELD_ID_PROP [iceberg-rust]

2025-07-08 Thread via GitHub
kevinjqliu opened a new pull request, #1497: URL: https://github.com/apache/iceberg-rust/pull/1497 ## Which issue does this PR close? - Closes #. ## What changes are included in this PR? LLMs are great at catching typos. Found this while reviewing #1482

Re: [PR] [#13278] : Upgrade junit 5 to 5.13.x [iceberg]

2025-07-08 Thread via GitHub
manuzhang commented on code in PR #13280: URL: https://github.com/apache/iceberg/pull/13280#discussion_r2192890381 ## gradle/libs.versions.toml: ## @@ -65,8 +65,8 @@ jakarta-servlet-api = "6.1.0" jaxb-api = "2.3.1" jaxb-runtime = "2.3.9" jetty = "11.0.25" -junit = "5.12.2" -j

Re: [PR] Build: Bump nessie from 0.104.1 to 0.104.2 [iceberg]

2025-07-08 Thread via GitHub
manuzhang commented on PR #13314: URL: https://github.com/apache/iceberg/pull/13314#issuecomment-3049483284 @snazy Please help review https://github.com/apache/iceberg/pull/13490

Re: [PR] Spark: Use native table FileIO instead of Hadoop to save file list in RewriteTablePath [iceberg]

2025-07-08 Thread via GitHub
kevinjqliu commented on PR #13459: URL: https://github.com/apache/iceberg/pull/13459#issuecomment-3049462165 @szehon-ho since we made changes to spark 3.4, 3.5, and 4.0, are there any special deployment steps we need to go through?

Re: [PR] chore: bump C++ standard to 23 [iceberg-cpp]

2025-07-08 Thread via GitHub
wgtmac commented on PR #139: URL: https://github.com/apache/iceberg-cpp/pull/139#issuecomment-3049425482 > I just changed to a more recent arrow commit which contains the Apache Thrift update[1], which should fix our frequent CI failure. > > [1] [apache/arrow#46912](https://github.com

Re: [PR] Spark: Use native table FileIO instead of Hadoop to save file list in RewriteTablePath [iceberg]

2025-07-08 Thread via GitHub
kevinjqliu commented on code in PR #13459: URL: https://github.com/apache/iceberg/pull/13459#discussion_r2192845891 ## spark/v3.5/spark/src/main/java/org/apache/iceberg/spark/actions/RewriteTablePathSparkAction.java: ## @@ -312,22 +313,25 @@ private String rebuildMetadata() {

Re: [PR] Read ManifestList V1 with V2 projection. [iceberg-rust]

2025-07-08 Thread via GitHub
kevinjqliu commented on code in PR #1482: URL: https://github.com/apache/iceberg-rust/pull/1482#discussion_r2192842873 ## crates/iceberg/src/avro/schema.rs: ## @@ -81,20 +81,33 @@ impl SchemaVisitor for SchemaToAvroSchema { field_schema = avro_optional(field_schema)

[PR] [docs] Tidy up left-hand navigation [iceberg]

2025-07-08 Thread via GitHub
rmoff opened a new pull request, #13491: URL: https://github.com/apache/iceberg/pull/13491 (no comment)

Re: [PR] Spark: Use native table FileIO instead of Hadoop to save file list in RewriteTablePath [iceberg]

2025-07-08 Thread via GitHub
NikitaMatskevich commented on code in PR #13459: URL: https://github.com/apache/iceberg/pull/13459#discussion_r2192822188 ## spark/v3.5/spark/src/main/java/org/apache/iceberg/spark/actions/RewriteTablePathSparkAction.java: ## @@ -312,22 +313,25 @@ private String rebuildMetadata(

Re: [I] Incorrect type definition for identifier-field-ids [iceberg-rust]

2025-07-08 Thread via GitHub
ZENOTME commented on issue #1487: URL: https://github.com/apache/iceberg-rust/issues/1487#issuecomment-3049384002 It looks like identifier-field-ids being none and identifier-field-ids being empty can be considered to have the same effect. And representing both of them as empty identifier-field-ids makes us use

Re: [PR] Spark 4.0: Row Lineage support [iceberg]

2025-07-08 Thread via GitHub
amogh-jahagirdar commented on code in PR #13310: URL: https://github.com/apache/iceberg/pull/13310#discussion_r2192814913 ## spark/v4.0/spark-extensions/src/test/java/org/apache/iceberg/spark/extensions/TestRowLevelOperationsWithLineage.java: ## @@ -0,0 +1,491 @@ +/* + * License

[PR] Build: Bump nessie to 0.104.2 skipping tests in JDK 11 [iceberg]

2025-07-08 Thread via GitHub
manuzhang opened a new pull request, #13490: URL: https://github.com/apache/iceberg/pull/13490 (no comment)

Re: [PR] Spark 4.0: Row Lineage support [iceberg]

2025-07-08 Thread via GitHub
amogh-jahagirdar commented on code in PR #13310: URL: https://github.com/apache/iceberg/pull/13310#discussion_r2192811369 ## spark/v4.0/spark/src/main/java/org/apache/iceberg/spark/source/SparkWrite.java: ## @@ -699,29 +708,41 @@ public DataWriter createWriter(int partitionId,

Re: [PR] Spec: Add DV information in overview [iceberg]

2025-07-08 Thread via GitHub
ajantha-bhat commented on code in PR #13189: URL: https://github.com/apache/iceberg/pull/13189#discussion_r2192790327 ## format/spec.md: ## @@ -1106,7 +1105,7 @@ Notes: This section details how to encode row-level deletes in Iceberg delete files. Row-level deletes are added

Re: [PR] Feat: replace sort order [iceberg-python]

2025-07-08 Thread via GitHub
mwa28 commented on PR #1500: URL: https://github.com/apache/iceberg-python/pull/1500#issuecomment-3049337981 Hello, any update on this PR?

Re: [I] [feat] Support update table's sort order [iceberg-python]

2025-07-08 Thread via GitHub
mwa28 commented on issue #1245: URL: https://github.com/apache/iceberg-python/issues/1245#issuecomment-3049335993 Hello, any update on the PR?

Re: [PR] Spark: Use native table FileIO instead of Hadoop to save file list in RewriteTablePath [iceberg]

2025-07-08 Thread via GitHub
kevinjqliu commented on PR #13459: URL: https://github.com/apache/iceberg/pull/13459#issuecomment-3049302014 > > The following files had format violations: CI failed on formatting

[PR] fix(catalog/glue): case insensitive type match [iceberg-go]

2025-07-08 Thread via GitHub
vbekiaris opened a new pull request, #480: URL: https://github.com/apache/iceberg-go/pull/480 table_type and database_type key values are now compared against the "ICEBERG" string in a case-insensitive way (similar to how the PyIceberg and Java implementations work). Fixes #479
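
The case-insensitive comparison described in the entry above might look roughly like the following. This is a minimal Python sketch for illustration only, not the code from the iceberg-go PR (which is written in Go) and not PyIceberg's implementation; the function name `is_iceberg_type` and the shape of the `parameters` dict are hypothetical.

```python
# Hypothetical helper: not taken from iceberg-go, PyIceberg, or the Java implementation.
ICEBERG_TYPE = "ICEBERG"


def is_iceberg_type(parameters: dict, key: str = "table_type") -> bool:
    """Return True if parameters[key] equals "ICEBERG", ignoring case.

    `parameters` stands in for the key/value properties a catalog returns
    for a table or database (an assumption made for this sketch).
    """
    value = parameters.get(key)
    if value is None:
        # A missing key is treated as "not an Iceberg table/database".
        return False
    # casefold() gives a case-insensitive comparison for Unicode strings.
    return value.casefold() == ICEBERG_TYPE.casefold()
```

With a helper like this, `{"table_type": "iceberg"}` and `{"table_type": "ICEBERG"}` would both be accepted, while `{"table_type": "hive"}` or a missing key would not.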

[PR] docs: add ugi back to hive catalog config [iceberg-python]

2025-07-08 Thread via GitHub
kevinjqliu opened a new pull request, #2188: URL: https://github.com/apache/iceberg-python/pull/2188 # Rationale for this change [`ugi` is a Hive Catalog property](https://github.com/apache/iceberg-python/blob/e33cf5ac1adf47131d4992bdb686f0e58f4e4669/pyiceberg/catalog/
