Re: [I] ALTER TABLE... WRITE LOCALLY ORDERED BY not properly setting the write.distribution-mode table property [iceberg]

2025-07-11 Thread via GitHub
ssandona commented on issue #13526: URL: https://github.com/apache/iceberg/issues/13526#issuecomment-3064784548 The previous behavior was setting the distribution mode to none. Do you know from which version this new behavior applies? Can you point to the related PR? I was actually looking for

Re: [PR] fix: coerce UUID to String in readable_metrics to avoid ClassCastException in Spark [iceberg]

2025-07-11 Thread via GitHub
kadai0308 commented on PR #13087: URL: https://github.com/apache/iceberg/pull/13087#issuecomment-3064737732 @Fokko Hi, I am not sure what the next step for this PR is. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub an

Re: [I] Support HMAC for GCS authentication [iceberg-rust]

2025-07-11 Thread via GitHub
dentiny commented on issue #1500: URL: https://github.com/apache/iceberg-rust/issues/1500#issuecomment-3064716118 > Sorry I missed a "can" here 🫣 Cool, thanks again for the quick and helpful answer! I will check s3 sdk then. I will check how hard it is to implement.

Re: [I] Support HMAC for GCS authentication [iceberg-rust]

2025-07-11 Thread via GitHub
Xuanwo commented on issue #1500: URL: https://github.com/apache/iceberg-rust/issues/1500#issuecomment-3064698540 Sorry I missed a "can" here 🫣

Re: [PR] [docs] Tidy up left-hand navigation [iceberg]

2025-07-11 Thread via GitHub
stevenzwu commented on code in PR #13491: URL: https://github.com/apache/iceberg/pull/13491#discussion_r2202294312 ## docs/mkdocs.yml: ## @@ -22,69 +22,79 @@ plugins: nav: - index.md - - Tables: -- branching.md -- configuration.md -- evolution.md -- mainte

Re: [PR] Docs: Document compute_partition_stats procedure [iceberg]

2025-07-11 Thread via GitHub
ajantha-bhat commented on PR #13532: URL: https://github.com/apache/iceberg/pull/13532#issuecomment-3064613757 @szehon-ho: Thanks for the review. I have updated the document.

Re: [I] ALTER TABLE... WRITE LOCALLY ORDERED BY not properly setting the write.distribution-mode table property [iceberg]

2025-07-11 Thread via GitHub
manuzhang commented on issue #13526: URL: https://github.com/apache/iceberg/issues/13526#issuecomment-3064481468 Yes, because we agree it was a bug to implicitly change it to 'none' without setting distribution mode.

Re: [I] Can't partition by nested field [iceberg-python]

2025-07-11 Thread via GitHub
geruh commented on issue #2095: URL: https://github.com/apache/iceberg-python/issues/2095#issuecomment-3064474306 I was incorrect about this; it looks like the Java implementation allows primitives that are referenced by structs. However, the determine partitioning in pyiceberg doesn't have

Re: [I] Merge snapshots into 1 under transaction of multiple operations [iceberg-python]

2025-07-11 Thread via GitHub
ForeverAngry commented on issue #2201: URL: https://github.com/apache/iceberg-python/issues/2201#issuecomment-3064460915 > ### Question > For my use case, I have a daily cron job that batch-processes and appends data, but I only want a single snapshot record after the whole process. I tried

Re: [PR] Metrics reporting [iceberg-rust]

2025-07-11 Thread via GitHub
sdd commented on PR #1496: URL: https://github.com/apache/iceberg-rust/pull/1496#issuecomment-3064451392 I put something together off the back of my earlier comment and opened a draft PR. Here's the relevant commit: https://github.com/apache/iceberg-rust/pull/1502/commits/2a02e559a71c30ebd9

[PR] Traces & Metrics: Scan Plan [iceberg-rust]

2025-07-11 Thread via GitHub
sdd opened a new pull request, #1502: URL: https://github.com/apache/iceberg-rust/pull/1502 ## Which issue does this PR close? - Follows on from https://github.com/apache/iceberg-rust/issues/482 - builds on https://github.com/apache/iceberg-rust/pull/1486 - Related to https://git

Re: [PR] Spark 4.0: Add procedure to compute partition stats [iceberg]

2025-07-11 Thread via GitHub
ajantha-bhat commented on PR #13523: URL: https://github.com/apache/iceberg/pull/13523#issuecomment-3064394746 > I guess not an exact clean backport as it's using the new Spark 4.0 procedure framework :) but late lgtm Yes. I have mentioned that after a clean backport, adapted to 4.0 procedu

Re: [PR] This hacky fix shows an improvement in memory usage for iceberg#13297 [iceberg]

2025-07-11 Thread via GitHub
github-actions[bot] commented on PR #13298: URL: https://github.com/apache/iceberg/pull/13298#issuecomment-3064360294 This pull request has been marked as stale due to 30 days of inactivity. It will be closed in 1 week if no further activity occurs. If you think that’s incorrect or this pul

Re: [PR] Prevent driver from overwhelming during orphan file removal [iceberg]

2025-07-11 Thread via GitHub
github-actions[bot] commented on PR #13084: URL: https://github.com/apache/iceberg/pull/13084#issuecomment-3064359817 This pull request has been marked as stale due to 30 days of inactivity. It will be closed in 1 week if no further activity occurs. If you think that’s incorrect or this pul

Re: [I] Column statistics are not collected when the columns name is escaped [iceberg]

2025-07-11 Thread via GitHub
github-actions[bot] commented on issue #11950: URL: https://github.com/apache/iceberg/issues/11950#issuecomment-3064359640 This issue has been automatically marked as stale because it has been open for 180 days with no activity. It will be closed in next 14 days if no further activity occur

Re: [PR] Metrics reporting [iceberg-rust]

2025-07-11 Thread via GitHub
sdd commented on PR #1496: URL: https://github.com/apache/iceberg-rust/pull/1496#issuecomment-3064213003 Thanks for this contribution - I can see that you've put a lot of work into this and it is appreciated! I do have some concerns with this approach though - I think that by mimicki

Re: [PR] Metrics reporting [iceberg-rust]

2025-07-11 Thread via GitHub
sdd commented on code in PR #1496: URL: https://github.com/apache/iceberg-rust/pull/1496#discussion_r2202042699 ## crates/iceberg/Cargo.toml: ## @@ -90,6 +90,7 @@ typed-builder = { workspace = true } url = { workspace = true } uuid = { workspace = true } zstd = { workspace =

Re: [PR] Core: Implement close() method in CompositeMetricsReporter [iceberg]

2025-07-11 Thread via GitHub
amogh-jahagirdar merged PR #13535: URL: https://github.com/apache/iceberg/pull/13535

Re: [PR] Metrics reporting [iceberg-rust]

2025-07-11 Thread via GitHub
sdd commented on code in PR #1496: URL: https://github.com/apache/iceberg-rust/pull/1496#discussion_r2202035793 ## crates/iceberg/src/delete_file_index.rs: ## Review Comment: Makes sense. If I recall, when `delete_file_index.rs` was added, the scan module didn't exist as a

Re: [PR] Implement missing close() method in CompositeMetricsReporter [iceberg]

2025-07-11 Thread via GitHub
amogh-jahagirdar commented on PR #13535: URL: https://github.com/apache/iceberg/pull/13535#issuecomment-3064160154 Thanks @anoopj and @nandorKollar for reviewing.

Re: [PR] refactor: TableScan file plan generation now implemented purely in streams rather than channels [iceberg-rust]

2025-07-11 Thread via GitHub
sdd commented on PR #1486: URL: https://github.com/apache/iceberg-rust/pull/1486#issuecomment-3064117170 @liurenjie1024 / @Xuanwo It would be great to get a review of this if you could find the time. This fully streaming approach is much cleaner and easier to follow, with better error handl

Re: [PR] Scan Delete Support Part 6: Equality Delete Parsing [iceberg-rust]

2025-07-11 Thread via GitHub
sdd commented on code in PR #1017: URL: https://github.com/apache/iceberg-rust/pull/1017#discussion_r2201925656 ## crates/iceberg/src/arrow/caching_delete_file_loader.rs: ## @@ -308,28 +319,233 @@ impl CachingDeleteFileLoader { Ok(result) } -/// Parses record

[PR] add PARTITION_SUMMARY_PROP [iceberg-python]

2025-07-11 Thread via GitHub
gtrettenero opened a new pull request, #2202: URL: https://github.com/apache/iceberg-python/pull/2202 # Rationale for this change # Are these changes tested? # Are there any user-facing changes?

Re: [I] Support HMAC for GCS authentication [iceberg-rust]

2025-07-11 Thread via GitHub
dentiny commented on issue #1500: URL: https://github.com/apache/iceberg-rust/issues/1500#issuecomment-3063641648 > We add this support in gcs Sorry for the dumb question, does that mean it's already supported but I'm unaware of it? 😢

Re: [PR] feat: bump datafusion to 48 [iceberg-rust]

2025-07-11 Thread via GitHub
colinmarc commented on PR #1501: URL: https://github.com/apache/iceberg-rust/pull/1501#issuecomment-3063600973 Famous last words! The CI issues seem to mostly revolve around everything needing `protoc` to build now (because of datafusion-substrait).

[PR] Implement missing close() method in CompositeMetricsReporter [iceberg]

2025-07-11 Thread via GitHub
anoopj opened a new pull request, #13535: URL: https://github.com/apache/iceberg/pull/13535 This ensures that when a CompositeMetricsReporter is closed, all of its underlying reporters are properly closed as well, preventing resource leaks.

[PR] Spark: Reordering partitions via partition evolution can disable SPJ [iceberg]

2025-07-11 Thread via GitHub
jbewing opened a new pull request, #13534: URL: https://github.com/apache/iceberg/pull/13534 This PR fixes a case where tables with currently identical partitioning structures, where one table was created directly with this structure and the other was created and had its partitions evolved in a wa

Re: [PR] API, Spark 4.0: Expose cleanExpiredMetadata in expire_snapshots Spark procedure [iceberg]

2025-07-11 Thread via GitHub
nastra merged PR #13509: URL: https://github.com/apache/iceberg/pull/13509

Re: [PR] Make metrics reporting asynchronous [iceberg]

2025-07-11 Thread via GitHub
nastra merged PR #13507: URL: https://github.com/apache/iceberg/pull/13507

Re: [PR] Docs: Document compute_partition_stats procedure [iceberg]

2025-07-11 Thread via GitHub
szehon-ho commented on code in PR #13532: URL: https://github.com/apache/iceberg/pull/13532#discussion_r2201602850 ## docs/docs/spark-procedures.md: ## @@ -974,6 +974,38 @@ Collect statistics of the snapshot with id `snap1` of table `my_table` for colum CALL catalog_name.syste

[PR] feat: bump datafusion to 48 [iceberg-rust]

2025-07-11 Thread via GitHub
colinmarc opened a new pull request, #1501: URL: https://github.com/apache/iceberg-rust/pull/1501 This did not require very many changes!

Re: [PR] maint: catalog implementation roundtripping tests [iceberg-python]

2025-07-11 Thread via GitHub
jayceslesar commented on PR #2090: URL: https://github.com/apache/iceberg-python/pull/2090#issuecomment-3063355842 @kevinjqliu can you please trigger tests again? Just want to see everything hopefully work

Re: [I] Custom Credential Refresh Client (fsspec, s3fs) [iceberg-python]

2025-07-11 Thread via GitHub
snowman2 commented on issue #2018: URL: https://github.com/apache/iceberg-python/issues/2018#issuecomment-3063310830 > Just to confirm, is this the result you’re looking for? Not quite. It has to be an aiobotocore session. I would like it to look like: ```python catalog = lo

Re: [I] Delegate `delete` to JUnit [iceberg]

2025-07-11 Thread via GitHub
saumyapandey1998 commented on issue #13506: URL: https://github.com/apache/iceberg/issues/13506#issuecomment-3063293040 Hi @ebyhr! I keep running into the following error while trying to build the project: `A problem occurred configuring root project 'iceberg'. > Could not resolve all

Re: [I] Support HMAC for GCS authentication [iceberg-rust]

2025-07-11 Thread via GitHub
Xuanwo commented on issue #1500: URL: https://github.com/apache/iceberg-rust/issues/1500#issuecomment-3063292765 We add this support in gcs. BTW, you can just use those HMAC keys to access GCS via the S3 XML API.

[I] Support HMAC for GCS authentication [iceberg-rust]

2025-07-11 Thread via GitHub
dentiny opened a new issue, #1500: URL: https://github.com/apache/iceberg-rust/issues/1500 ### Is your feature request related to a problem or challenge? duckdb leverages HMAC keys for authentication: https://duckdb.org/docs/stable/guides/network_cloud_storage/gcs_import.html I

Re: [PR] Add support for Bodo DataFrame [iceberg-python]

2025-07-11 Thread via GitHub
ehsantn commented on PR #2167: URL: https://github.com/apache/iceberg-python/pull/2167#issuecomment-3063211683 @kevinjqliu please advise on next steps. This PR looks ready to merge to me and the flakiness of existing unit tests should be addressed separately (would be happy to contribute if

Re: [PR] Use short string in Variant when possible [iceberg]

2025-07-11 Thread via GitHub
aihuaxu commented on code in PR #13284: URL: https://github.com/apache/iceberg/pull/13284#discussion_r2201422332 ## api/src/test/java/org/apache/iceberg/variants/TestSerializedObject.java: ## @@ -18,7 +18,7 @@ */ package org.apache.iceberg.variants; -import static org.asser

Re: [PR] feat: add schema conversion from avro timestamp-millis [iceberg-python]

2025-07-11 Thread via GitHub
matthias-Q commented on PR #2173: URL: https://github.com/apache/iceberg-python/pull/2173#issuecomment-3063182750 @kevinjqliu now that #2007 is merged should I add UUID conversion here or in a separate PR?

Re: [PR] Spark 4.0: Add procedure to compute partition stats [iceberg]

2025-07-11 Thread via GitHub
szehon-ho commented on PR #13523: URL: https://github.com/apache/iceberg/pull/13523#issuecomment-3063155979 I guess not an exact clean backport as it's using the new Spark 4.0 procedure framework :) but late lgtm

Re: [PR] pluggable routers [iceberg]

2025-07-11 Thread via GitHub
kumarpritam863 commented on PR #12859: URL: https://github.com/apache/iceberg/pull/12859#issuecomment-3063071747 @bryanck for review.

Re: [PR] Backward Compatibility Support with Field Deletion Handling [iceberg]

2025-07-11 Thread via GitHub
kumarpritam863 commented on PR #12954: URL: https://github.com/apache/iceberg/pull/12954#issuecomment-3063069564 @bryanck can we please review this?

[PR] Docker: Publish standalone iceberg docker containers for testings and demo [iceberg]

2025-07-11 Thread via GitHub
lliangyu-lin opened a new pull request, #13533: URL: https://github.com/apache/iceberg/pull/13533 ### Description * Add support to build and publish iceberg docker images with bundled spark, iceberg spark runtime, minio, and rest catalog. * It is bundled because Docker compose cannot b

[PR] Docs: Document compute_partition_stats procedure [iceberg]

2025-07-11 Thread via GitHub
ajantha-bhat opened a new pull request, #13532: URL: https://github.com/apache/iceberg/pull/13532 Follow up from https://github.com/apache/iceberg/pull/13523

Re: [I] ALTER TABLE... WRITE LOCALLY ORDERED BY not properly setting the write.distribution-mode table property [iceberg]

2025-07-11 Thread via GitHub
ssandona commented on issue #13526: URL: https://github.com/apache/iceberg/issues/13526#issuecomment-3062973461 Agree that the absence of write.distribution-mode does not mean it is none. What I'm trying to say here is that ALTER TABLE... WRITE LOCALLY ORDERED BY ideally should update tha

Re: [I] Publish Iceberg kafka connect runtime to Confluent hub [iceberg]

2025-07-11 Thread via GitHub
rmoff commented on issue #10745: URL: https://github.com/apache/iceberg/issues/10745#issuecomment-3062941117 ack - I've passed this on.

Re: [PR] AWS: feat Turning AAL default On [iceberg]

2025-07-11 Thread via GitHub
sullis commented on PR #13527: URL: https://github.com/apache/iceberg/pull/13527#issuecomment-3062931700 fyi: I mentioned S3 Analytics Accelerator in my NYC Meetup talk on July 10th https://docs.google.com/presentation/d/1GgBWJwxP_rZLMt4Kixv697CCBO_7vm1VM9YENs2MwdA/edit

Re: [I] ALTER TABLE... WRITE LOCALLY ORDERED BY not properly setting the write.distribution-mode table property [iceberg]

2025-07-11 Thread via GitHub
manuzhang commented on issue #13526: URL: https://github.com/apache/iceberg/issues/13526#issuecomment-3062926421 @ssandona The absence of distribution mode in the SQL doesn't mean the distribution mode is `none`. `ALTER TABLE... WRITE LOCALLY ORDERED BY` should only change the sort order wh

[PR] Kafka Connect: Add manifests for the transformations [iceberg]

2025-07-11 Thread via GitHub
mimaison opened a new pull request, #13531: URL: https://github.com/apache/iceberg/pull/13531 Iceberg has a service loader manifest for `IcebergSinkConnector` but does not have a manifest for the Kafka Connect transformations. Transformations also need a manifest, otherwise by default, wi

Re: [I] Publish Iceberg kafka connect runtime to Confluent hub [iceberg]

2025-07-11 Thread via GitHub
manuzhang commented on issue #10745: URL: https://github.com/apache/iceberg/issues/10745#issuecomment-3062849055 I downloaded the zip, but it only contained the iceberg-kafka-connector jar, not the same as that built following https://iceberg.apache.org/docs/nightly/kafka-connect/#installat

[I] Reordering partitions via partition evolution can disable Spark Storage Partitioned Joins in some cases [iceberg]

2025-07-11 Thread via GitHub
jbewing opened a new issue, #13530: URL: https://github.com/apache/iceberg/issues/13530 ### Apache Iceberg version 1.9.0 ### Query engine Spark ### Please describe the bug 🐞 ### What If partitions are evolved for an Iceberg table, there is a potential to

Re: [PR] Enforce that test classes start with "Test" instead of suffix [iceberg]

2025-07-11 Thread via GitHub
amogh-jahagirdar commented on code in PR #13466: URL: https://github.com/apache/iceberg/pull/13466#discussion_r2201039545 ## spark/v4.0/spark-runtime/src/integration/java/org/apache/iceberg/spark/TestSmoke.java: ## @@ -31,7 +31,7 @@ import org.junit.jupiter.api.extension.Extend

Re: [PR] Docs: Meetup Guidelines [iceberg]

2025-07-11 Thread via GitHub
RussellSpitzer commented on code in PR #13520: URL: https://github.com/apache/iceberg/pull/13520#discussion_r2200946079 ## site/docs/community.md: ## @@ -38,37 +62,31 @@ Issues are tracked in GitHub: [open-issues]: https://github.com/apache/iceberg/issues [new-issue]: https://

[I] Make customizable the number of Task retries when refreshing the metastore location in BaseMetastoreTableOperations [iceberg]

2025-07-11 Thread via GitHub
astarrr opened a new issue, #13529: URL: https://github.com/apache/iceberg/issues/13529 ### Feature Request / Improvement Hello, I think this feature is missing. In `BaseMetastoreTableOperations#refreshFromMetadataLocation(String)` there is a hardcoded value for Iceberg Tas

Re: [PR] Docs: Meetup Guidelines [iceberg]

2025-07-11 Thread via GitHub
RussellSpitzer commented on code in PR #13520: URL: https://github.com/apache/iceberg/pull/13520#discussion_r2200939675 ## site/docs/community.md: ## @@ -20,16 +20,40 @@ title: "Community" # Welcome! -Apache Iceberg tracks issues in GitHub and prefers to receive contributio

Re: [PR] AWS: Add support to run all integration tests when S3 Analytics Accelerator is enabled [iceberg]

2025-07-11 Thread via GitHub
stubz151 commented on code in PR #13347: URL: https://github.com/apache/iceberg/pull/13347#discussion_r2200933124 ## aws/src/integration/java/org/apache/iceberg/aws/s3/TestS3InputStream.java: ## @@ -160,28 +187,36 @@ private void readAndCheckRanges( .isEqualTo(Arrays.co

Re: [I] Publish Iceberg kafka connect runtime to Confluent hub [iceberg]

2025-07-11 Thread via GitHub
rmoff commented on issue #10745: URL: https://github.com/apache/iceberg/issues/10745#issuecomment-3062618564 By coincidence, and through no effort of my own, the connector is now live: https://www.confluent.io/hub/iceberg/iceberg-kafka-connect

Re: [I] iceberg Kafka-connect - 1.9.1 - avro timestampNanos NoSuchMethodError [iceberg]

2025-07-11 Thread via GitHub
raphaelauv commented on issue #13481: URL: https://github.com/apache/iceberg/issues/13481#issuecomment-3062572183 That was also my first idea, but Iceberg 1.8.1 is already using version 1.12.0 of Avro https://github.com/apache/iceberg/blob/9ce0fcf0af7becf25ad9fc996c3bad2afdcfd33d/gradle/l

[I] Inconsistent handling of non-existent object locations across FileIO implementations [iceberg]

2025-07-11 Thread via GitHub
alessandro-nori opened a new issue, #13528: URL: https://github.com/apache/iceberg/issues/13528 ### Feature Request / Improvement Currently, S3InputStream throws a NotFoundException when getObject is called on a location that doesn’t exist. This behavior ensures that BaseMetastoreTab

[PR] AWS: feat Turning AAL default On [iceberg]

2025-07-11 Thread via GitHub
stubz151 opened a new pull request, #13527: URL: https://github.com/apache/iceberg/pull/13527 ### What Am I doing We are turning on AAL to be the default stream for S3FileIo. You can read more about our optimizations in our [readme](https://github.com/awslabs/analytics-accelerator-s3?tab

Re: [PR] Make metrics reporting asynchronous [iceberg]

2025-07-11 Thread via GitHub
amogh-jahagirdar commented on code in PR #13507: URL: https://github.com/apache/iceberg/pull/13507#discussion_r2200847375 ## core/src/main/java/org/apache/iceberg/rest/RESTMetricsReporter.java: ## @@ -51,15 +57,20 @@ public void report(MetricsReport report) { return;

Re: [I] Transient AWS Connection Issues [iceberg]

2025-07-11 Thread via GitHub
troy-curtis commented on issue #11412: URL: https://github.com/apache/iceberg/issues/11412#issuecomment-3062458414 I was also encountering this issue, and I also ended up creating longer lived status credentials up front in the Spark driver and then passing them into the catalog configurati

Re: [PR] Make metrics reporting asynchronous [iceberg]

2025-07-11 Thread via GitHub
anoopj commented on code in PR #13507: URL: https://github.com/apache/iceberg/pull/13507#discussion_r2200805021 ## core/src/main/java/org/apache/iceberg/rest/RESTMetricsReporter.java: ## @@ -51,15 +57,20 @@ public void report(MetricsReport report) { return; } -

Re: [I] iceberg Kafka-connect - 1.9.1 - avro timestampNanos NoSuchMethodError [iceberg]

2025-07-11 Thread via GitHub
paolo-cristofanelli commented on issue #13481: URL: https://github.com/apache/iceberg/issues/13481#issuecomment-3062177184 Could it be that kafka-connect-avro-converter 7.9.2 uses avro-11.4 ( I am looking [here](https://mvnrepository.com/artifact/io.confluent/kafka-connect-avro-converter/7.

Re: [I] Pyiceberg allows dropping the sort order column and causes table corruption on AWS Glue Catalog [iceberg-python]

2025-07-11 Thread via GitHub
mwa28 commented on issue #2166: URL: https://github.com/apache/iceberg-python/issues/2166#issuecomment-3062014422 > Hi [@mwa28](https://github.com/mwa28), have you started working on this? It would be great to get it in before the next release. If not, I can take it up. Hello @geruh

Re: [PR] API, Spark: Expose cleanExpiredMetadata in expire_snapshots Spark procedure [iceberg]

2025-07-11 Thread via GitHub
gaborkaszab commented on code in PR #13509: URL: https://github.com/apache/iceberg/pull/13509#discussion_r2200345749 ## spark/v4.0/spark/src/main/java/org/apache/iceberg/spark/actions/ExpireSnapshotsSparkAction.java: ## @@ -79,6 +79,7 @@ public class ExpireSnapshotsSparkAction e

Re: [PR] AWS: Support similar S3 Sync Client configurations for S3 Async Clients [iceberg]

2025-07-11 Thread via GitHub
stubz151 commented on code in PR #13387: URL: https://github.com/apache/iceberg/pull/13387#discussion_r2200312436 ## aws/src/integration/java/org/apache/iceberg/aws/s3/TestFlakyS3InputStream.java: ## @@ -138,6 +230,14 @@ private S3ClientWrapper flakyStreamClient(AtomicInteger c

Re: [I] Nested field IDs in user-defined schema are reassigned during table creation, causing query failures in external engines [iceberg]

2025-07-11 Thread via GitHub
sclee01 commented on issue #13164: URL: https://github.com/apache/iceberg/issues/13164#issuecomment-3061627475 Great. I will take a look at the catalog side based on your point.

[I] ALTER TABLE... WRITE LOCALLY ORDERED BY not properly setting the write.distribution-mode table property [iceberg]

2025-07-11 Thread via GitHub
ssandona opened a new issue, #13526: URL: https://github.com/apache/iceberg/issues/13526 ### Apache Iceberg version 1.7.1 ### Query engine Spark ### Please describe the bug 🐞 When running a `ALTER TABLE... WRITE LOCALLY ORDERED BY` statement it does not set

Re: [PR] Metrics reporting [iceberg-rust]

2025-07-11 Thread via GitHub
DerGut commented on code in PR #1496: URL: https://github.com/apache/iceberg-rust/pull/1496#discussion_r2200246689 ## crates/iceberg/src/metrics.rs: ## @@ -0,0 +1,154 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. S

Re: [I] Nested field IDs in user-defined schema are reassigned during table creation, causing query failures in external engines [iceberg]

2025-07-11 Thread via GitHub
phil-schreiber commented on issue #13164: URL: https://github.com/apache/iceberg/issues/13164#issuecomment-3061491338 I just dove through the table creation and it seems the given IDs are sent to the catalog REST endpoint, but it returns with the IDs overwritten by sequential numbers. I thin

Re: [I] How to avoid partition key sorting when inserting data into a partitioned Iceberg table? [iceberg]

2025-07-11 Thread via GitHub
jiqiujia commented on issue #10181: URL: https://github.com/apache/iceberg/issues/10181#issuecomment-306188 > From my side solved with use-table-distribution-and-ordering=false. Thank you very much [@eubnara](https://github.com/eubnara) @lrpt Are there any documentation illustrati

Re: [PR] API, Spark: Expose cleanExpiredMetadata in expire_snapshots Spark procedure [iceberg]

2025-07-11 Thread via GitHub
nastra commented on code in PR #13509: URL: https://github.com/apache/iceberg/pull/13509#discussion_r2200150993 ## spark/v4.0/spark/src/main/java/org/apache/iceberg/spark/actions/ExpireSnapshotsSparkAction.java: ## @@ -79,6 +79,7 @@ public class ExpireSnapshotsSparkAction extend

Re: [PR] readme: add link to irc spec [iceberg]

2025-07-11 Thread via GitHub
manuzhang commented on PR #13521: URL: https://github.com/apache/iceberg/pull/13521#issuecomment-3061366201 +1 to have a real page for REST spec under "Specification" tab.

Re: [I] Nested field IDs in user-defined schema are reassigned during table creation, causing query failures in external engines [iceberg]

2025-07-11 Thread via GitHub
sclee01 commented on issue #13164: URL: https://github.com/apache/iceberg/issues/13164#issuecomment-3061350212 I think it is somewhere in the Java lib. I tested it using the REST Polaris catalog and Glue, but they have the same issue.

Re: [PR] [docs] Tidy up left-hand navigation [iceberg]

2025-07-11 Thread via GitHub
rmoff commented on code in PR #13491: URL: https://github.com/apache/iceberg/pull/13491#discussion_r2200128695 ## docs/mkdocs.yml: ## @@ -22,69 +22,79 @@ plugins: nav: - index.md - - Tables: -- branching.md -- configuration.md -- evolution.md -- maintenanc

Re: [PR] [docs] Tidy up left-hand navigation [iceberg]

2025-07-11 Thread via GitHub
rmoff commented on code in PR #13491: URL: https://github.com/apache/iceberg/pull/13491#discussion_r2200125780 ## docs/mkdocs.yml: ## @@ -22,69 +22,79 @@ plugins: nav: - index.md - - Tables: -- branching.md -- configuration.md -- evolution.md -- maintenanc

Re: [I] Nested field IDs in user-defined schema are reassigned during table creation, causing query failures in external engines [iceberg]

2025-07-11 Thread via GitHub
phil-schreiber commented on issue #13164: URL: https://github.com/apache/iceberg/issues/13164#issuecomment-3061312480 Awesome, I can confirm this is happening in the Tabulario REST catalog. Do you already have any pointers to where this is going wrong? Is it on the catalog side or are the ids reset so

Re: [I] Nested field IDs in user-defined schema are reassigned during table creation, causing query failures in external engines [iceberg]

2025-07-11 Thread via GitHub
sclee01 commented on issue #13164: URL: https://github.com/apache/iceberg/issues/13164#issuecomment-3061268323 @phil-schreiber Thanks for your comment. I will work on this and raise the PR asap!

Re: [I] Nested field IDs in user-defined schema are reassigned during table creation, causing query failures in external engines [iceberg]

2025-07-11 Thread via GitHub
phil-schreiber commented on issue #13164: URL: https://github.com/apache/iceberg/issues/13164#issuecomment-3061256178 Still running into this issue in 1.9.1. Any work on this would be much appreciated. As @sclee01 mentioned for schema evolution to be viable it's crucial to be able to specif

Re: [PR] [docs] Add Confluent to vendors page [iceberg]

2025-07-11 Thread via GitHub
rmoff commented on code in PR #13512: URL: https://github.com/apache/iceberg/pull/13512#discussion_r2200068278 ## site/docs/vendors.md: ## @@ -61,6 +61,10 @@ the same copy of data using Spark and run analytics or AI with our [Machine Learning](https://www.cloudera.com/products

Re: [PR] [docs] Remove Tabular from Vendors page [iceberg]

2025-07-11 Thread via GitHub
rmoff commented on code in PR #13511: URL: https://github.com/apache/iceberg/pull/13511#discussion_r2200066519 ## site/docs/vendors.md: ## @@ -132,10 +132,6 @@ The Stackable Data Platform is completely open source, providing maximum portabi Starburst is a commercial offering

[PR] AWS: use aws client region for StsClient if available [iceberg]

2025-07-11 Thread via GitHub
keejon opened a new pull request, #13525: URL: https://github.com/apache/iceberg/pull/13525 Fixes https://github.com/apache/iceberg/issues/13524

Re: [I] Custom Credential Refresh Client (fsspec, s3fs) [iceberg-python]

2025-07-11 Thread via GitHub
james5418 commented on issue #2018: URL: https://github.com/apache/iceberg-python/issues/2018#issuecomment-3061101620 Hi @snowman2, I’m interested in implementing this and will start with **Option 1**! Just to confirm, is this the result you’re looking for? ```python my_sessio

Re: [PR] API, Spark: Expose cleanExpiredMetadata in expire_snapshots Spark procedure [iceberg]

2025-07-11 Thread via GitHub
gaborkaszab commented on PR #13509: URL: https://github.com/apache/iceberg/pull/13509#issuecomment-3061084396 Thanks for taking a look @nastra @amogh-jahagirdar !

Re: [PR] API, Spark: Expose cleanExpiredMetadata in expire_snapshots Spark procedure [iceberg]

2025-07-11 Thread via GitHub
gaborkaszab commented on code in PR #13509: URL: https://github.com/apache/iceberg/pull/13509#discussion_r2199943583 ## spark/v4.0/spark/src/test/java/org/apache/iceberg/spark/actions/TestExpireSnapshotsAction.java: ## @@ -1342,4 +1342,62 @@ public void testExpireSomeCheckFilesD

[I] AssumeRoleAwsClientFactory does not pick up client.region [iceberg]

2025-07-11 Thread via GitHub
keejon opened a new issue, #13524: URL: https://github.com/apache/iceberg/issues/13524 ### Apache Iceberg version 1.9.1 (latest release) ### Query engine Kafka Connect ### Please describe the bug 🐞 If `iceberg.catalog.client.region` is configured in connecto

Re: [PR] API, Spark: Expose cleanExpiredMetadata in expire_snapshots Spark procedure [iceberg]

2025-07-11 Thread via GitHub
gaborkaszab commented on code in PR #13509: URL: https://github.com/apache/iceberg/pull/13509#discussion_r2199849129 ## spark/v4.0/spark/src/main/java/org/apache/iceberg/spark/actions/ExpireSnapshotsSparkAction.java: ## @@ -158,6 +165,10 @@ public Dataset expireFiles() {

Re: [PR] API, Spark: Expose cleanExpiredMetadata in expire_snapshots Spark procedure [iceberg]

2025-07-11 Thread via GitHub
gaborkaszab commented on code in PR #13509: URL: https://github.com/apache/iceberg/pull/13509#discussion_r2199846533 ## spark/v4.0/spark/src/main/java/org/apache/iceberg/spark/actions/ExpireSnapshotsSparkAction.java: ## @@ -79,6 +79,7 @@ public class ExpireSnapshotsSparkAction e

Re: [PR] Spark 4.0: Add procedure to compute partition stats [iceberg]

2025-07-11 Thread via GitHub
nastra merged PR #13523: URL: https://github.com/apache/iceberg/pull/13523