Re: [PR] Core: Fix filed ids of partition stats file [iceberg]

2025-06-19 Thread via GitHub
lirui-apache commented on code in PR #13329: URL: https://github.com/apache/iceberg/pull/13329#discussion_r2158197890 ## core/src/main/java/org/apache/iceberg/PartitionStatsHandler.java: ## @@ -280,6 +273,8 @@ private static Collection computeAndMergeStatsIncremental( ol

[PR] Build: Bump third party deps [iceberg-python]

2025-06-19 Thread via GitHub
Fokko opened a new pull request, #2127: URL: https://github.com/apache/iceberg-python/pull/2127 In https://github.com/apache/iceberg-python/pull/2125 we got a warning from the google-auth package, and we're on `1.6.3` while `2.40.3` is out there. # Rationale for th

Re: [PR] feat: add or expression [iceberg-cpp]

2025-06-19 Thread via GitHub
wgtmac commented on code in PR #120: URL: https://github.com/apache/iceberg-cpp/pull/120#discussion_r2158169511 ## src/iceberg/expression/expression.h: ## @@ -67,8 +66,8 @@ class ICEBERG_EXPORT Expression { virtual Operation op() const = 0; /// \brief Returns the negatio

Re: [PR] chore(deps): bump cpp-linter/cpp-linter-action from 2.13.3 to 2.15.0 [iceberg-cpp]

2025-06-19 Thread via GitHub
dependabot[bot] commented on PR #114: URL: https://github.com/apache/iceberg-cpp/pull/114#issuecomment-2989984370 Sorry, only users with push access can use that command. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use

Re: [PR] chore(deps): bump cpp-linter/cpp-linter-action from 2.13.3 to 2.15.0 [iceberg-cpp]

2025-06-19 Thread via GitHub
wgtmac commented on PR #114: URL: https://github.com/apache/iceberg-cpp/pull/114#issuecomment-2989984210 @dependabot rebase

Re: [PR] chore(deps): bump cpp-linter/cpp-linter-action from 2.13.3 to 2.15.0 [iceberg-cpp]

2025-06-19 Thread via GitHub
wgtmac commented on code in PR #114: URL: https://github.com/apache/iceberg-cpp/pull/114#discussion_r2158181393 ## .github/workflows/cpp-linter.yml: ## @@ -39,7 +39,7 @@ jobs: mkdir build && cd build cmake .. -DCMAKE_EXPORT_COMPILE_COMMANDS=ON cm

Re: [PR] Core: Fix filed ids of partition stats file [iceberg]

2025-06-19 Thread via GitHub
ajantha-bhat commented on code in PR #13329: URL: https://github.com/apache/iceberg/pull/13329#discussion_r2158168420 ## core/src/main/java/org/apache/iceberg/PartitionStatsHandler.java: ## @@ -280,6 +304,8 @@ private static Collection computeAndMergeStatsIncremental( ol

Re: [PR] CI: run FileIO integration tests (s3/adls/gcs) [iceberg-python]

2025-06-19 Thread via GitHub
Fokko commented on code in PR #2125: URL: https://github.com/apache/iceberg-python/pull/2125#discussion_r2158150237 ## .github/workflows/python-ci.yml: ## @@ -84,3 +84,21 @@ jobs: - name: Show debug logs if: ${{ failure() }} run: docker compose -f dev/docker-c

Re: [PR] CI: run FileIO integration tests (s3/adls/gcs) [iceberg-python]

2025-06-19 Thread via GitHub
Fokko commented on code in PR #2125: URL: https://github.com/apache/iceberg-python/pull/2125#discussion_r2158172725 ## Makefile: ## @@ -62,44 +58,51 @@ test-integration-setup: # Prepare the environment for integration docker compose -f dev/docker-compose-integration.yml

[I] Kafka Connector should initialize the Catalog before starting tasks [iceberg]

2025-06-19 Thread via GitHub
Claudenw opened a new issue, #13356: URL: https://github.com/apache/iceberg/issues/13356 ### Apache Iceberg version None ### Query engine None ### Please describe the bug 🐞 At least one catalog type (JDBC) should be initialized before use. Initializing th

Re: [PR] CI: run FileIO integration tests (s3/adls/gcs) [iceberg-python]

2025-06-19 Thread via GitHub
Fokko commented on code in PR #2125: URL: https://github.com/apache/iceberg-python/pull/2125#discussion_r2158158308 ## Makefile: ## @@ -62,44 +58,51 @@ test-integration-setup: # Prepare the environment for integration docker compose -f dev/docker-compose-integration.yml

Re: [PR] feat: implement PrimitiveLiteral [iceberg-cpp]

2025-06-19 Thread via GitHub
wgtmac commented on code in PR #117: URL: https://github.com/apache/iceberg-cpp/pull/117#discussion_r2131227449 ## src/iceberg/literal.h: ## @@ -0,0 +1,137 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the

Re: [PR] CI: run FileIO integration tests (s3/adls/gcs) [iceberg-python]

2025-06-19 Thread via GitHub
Fokko commented on code in PR #2125: URL: https://github.com/apache/iceberg-python/pull/2125#discussion_r2158157062 ## .github/workflows/python-ci.yml: ## @@ -84,3 +84,21 @@ jobs: - name: Show debug logs if: ${{ failure() }} run: docker compose -f dev/docker-c

Re: [I] Creating 2 or more Iceburg Kafka Sink Connecturs using slow JDBC catalog causes a startup failure on first run. [iceberg]

2025-06-19 Thread via GitHub
Claudenw commented on issue #13343: URL: https://github.com/apache/iceberg/issues/13343#issuecomment-2989941880 Fix in https://github.com/apache/iceberg/pull/13345

Re: [PR] Optimise RowData evolution [iceberg]

2025-06-19 Thread via GitHub
aiborodin commented on code in PR #13340: URL: https://github.com/apache/iceberg/pull/13340#discussion_r2158148371 ## flink/v2.0/flink/src/main/java/org/apache/iceberg/flink/sink/dynamic/convert/RowDataConverter.java: ## @@ -0,0 +1,89 @@ +/* + * Licensed to the Apache Software F

Re: [PR] Optimise RowData evolution [iceberg]

2025-06-19 Thread via GitHub
aiborodin commented on code in PR #13340: URL: https://github.com/apache/iceberg/pull/13340#discussion_r2158147346 ## flink/v2.0/flink/src/main/java/org/apache/iceberg/flink/sink/dynamic/convert/RowDataConverter.java: ## @@ -0,0 +1,89 @@ +/* + * Licensed to the Apache Software F

Re: [PR] Optimise RowData evolution [iceberg]

2025-06-19 Thread via GitHub
aiborodin commented on code in PR #13340: URL: https://github.com/apache/iceberg/pull/13340#discussion_r2158145848 ## flink/v2.0/flink/src/main/java/org/apache/iceberg/flink/sink/dynamic/convert/DataConverter.java: ## @@ -0,0 +1,102 @@ +/* + * Licensed to the Apache Software Fou

Re: [PR] CI: run FileIO integration tests (s3/adls/gcs) [iceberg-python]

2025-06-19 Thread via GitHub
Fokko commented on code in PR #2125: URL: https://github.com/apache/iceberg-python/pull/2125#discussion_r2158142826 ## Makefile: ## @@ -62,44 +58,51 @@ test-integration-setup: # Prepare the environment for integration docker compose -f dev/docker-compose-integration.yml

Re: [PR] CI: run FileIO integration tests (s3/adls/gcs) [iceberg-python]

2025-06-19 Thread via GitHub
Fokko commented on code in PR #2125: URL: https://github.com/apache/iceberg-python/pull/2125#discussion_r2158141870 ## pyproject.toml: ## @@ -75,6 +75,7 @@ boto3 = { version = ">=1.24.59", optional = true } s3fs = { version = ">=2023.1.0", optional = true } adlfs = { version =

Re: [PR] Optimise RowData evolution [iceberg]

2025-06-19 Thread via GitHub
aiborodin commented on code in PR #13340: URL: https://github.com/apache/iceberg/pull/13340#discussion_r2158138926 ## flink/v2.0/flink/src/main/java/org/apache/iceberg/flink/sink/dynamic/convert/ArrayConverter.java: ## @@ -0,0 +1,45 @@ +/* + * Licensed to the Apache Software Fou

Re: [PR] Build: Bump com.azure:azure-sdk-bom from 1.2.31 to 1.2.35 [iceberg]

2025-06-19 Thread via GitHub
Fokko commented on PR #13201: URL: https://github.com/apache/iceberg/pull/13201#issuecomment-2989907962 @dependabot rebase

Re: [PR] Core: Fix filed ids of partition stats file [iceberg]

2025-06-19 Thread via GitHub
ajantha-bhat commented on code in PR #13329: URL: https://github.com/apache/iceberg/pull/13329#discussion_r2158117893 ## core/src/main/java/org/apache/iceberg/PartitionStatsHandler.java: ## @@ -280,6 +273,8 @@ private static Collection computeAndMergeStatsIncremental( ol

Re: [PR] Spark: Make maxRecordPerMicrobatch a soft limit [iceberg]

2025-06-19 Thread via GitHub
singhpk234 commented on code in PR #12988: URL: https://github.com/apache/iceberg/pull/12988#discussion_r2158110714 ## spark/v3.5/spark/src/main/java/org/apache/iceberg/spark/source/SparkMicroBatchStream.java: ## @@ -387,7 +387,7 @@ public Offset latestOffset(Offset startOffset,

Re: [PR] feat: metadata access support for table [iceberg-cpp]

2025-06-19 Thread via GitHub
wgtmac commented on code in PR #111: URL: https://github.com/apache/iceberg-cpp/pull/111#discussion_r2156289324 ## src/iceberg/type_fwd.h: ## @@ -99,6 +99,9 @@ class TransformFunction; struct PartitionStatisticsFile; struct Snapshot; struct SnapshotRef; +struct SnapshotLogEnt

Re: [PR] Optimise RowData evolution [iceberg]

2025-06-19 Thread via GitHub
aiborodin commented on code in PR #13340: URL: https://github.com/apache/iceberg/pull/13340#discussion_r2158058305 ## flink/v2.0/flink/src/main/java/org/apache/iceberg/flink/sink/dynamic/convert/ArrayConverter.java: ## @@ -0,0 +1,45 @@ +/* + * Licensed to the Apache Software Fou

Re: [PR] Optimise RowData evolution [iceberg]

2025-06-19 Thread via GitHub
aiborodin commented on code in PR #13340: URL: https://github.com/apache/iceberg/pull/13340#discussion_r2158055293 ## flink/v2.0/flink/src/main/java/org/apache/iceberg/flink/sink/dynamic/DynamicRecordProcessor.java: ## @@ -142,10 +151,18 @@ private void emit( Schema schem

Re: [I] Table metadata corruption during parallel upsert operations [iceberg-python]

2025-06-19 Thread via GitHub
arul-cc commented on issue #2120: URL: https://github.com/apache/iceberg-python/issues/2120#issuecomment-2989759477 > > Subsequent operations fail with CommitFailedException (expected due to ACID constraints) > > With more parallel calls, the table becomes inaccessible with "table not f

Re: [I] Table metadata corruption during parallel upsert operations [iceberg-python]

2025-06-19 Thread via GitHub
arul-cc commented on issue #2120: URL: https://github.com/apache/iceberg-python/issues/2120#issuecomment-2989751909 > > Suggestions for safe parallel upsert patterns in Iceberg > > maybe a good approach is to start a transaction, and only commit at the very end after all the upsert

Re: [PR] Bump `AzuriteContainer` to 3.34.0 [iceberg]

2025-06-19 Thread via GitHub
amogh-jahagirdar merged PR #13321: URL: https://github.com/apache/iceberg/pull/13321

Re: [I] Google BigLake Metastore Catalog issue [iceberg-python]

2025-06-19 Thread via GitHub
unfrgivn commented on issue #2122: URL: https://github.com/apache/iceberg-python/issues/2122#issuecomment-2989718202 I just mentioned this over here https://github.com/apache/iceberg-python/issues/1524#issuecomment-2989705907 (since that's the top hit on Google when searching this error).

Re: [I] Validation Error in ConfigResponse Model When connecting Nessie with PyIceberg using RestCatalog [iceberg-python]

2025-06-19 Thread via GitHub
unfrgivn commented on issue #1524: URL: https://github.com/apache/iceberg-python/issues/1524#issuecomment-2989705907 FWIW in case others run into this thread while searching this error. The new [BigLake Iceberg REST catalog](https://cloud.google.com/bigquery/docs/blms-rest-catalog) that ju

[PR] feat: add support for avro to arrow data conversion [iceberg-cpp]

2025-06-19 Thread via GitHub
wgtmac opened a new pull request, #124: URL: https://github.com/apache/iceberg-cpp/pull/124 (no comment)

Re: [PR] Core: Fix filed ids of partition stats file [iceberg]

2025-06-19 Thread via GitHub
lirui-apache commented on code in PR #13329: URL: https://github.com/apache/iceberg/pull/13329#discussion_r2157959956 ## core/src/main/java/org/apache/iceberg/PartitionStatsHandler.java: ## @@ -280,6 +273,8 @@ private static Collection computeAndMergeStatsIncremental( ol

Re: [PR] Azure: Support multiple storage credential prefixes [iceberg]

2025-06-19 Thread via GitHub
amogh-jahagirdar commented on code in PR #13241: URL: https://github.com/apache/iceberg/pull/13241#discussion_r2157937322 ## build.gradle: ## @@ -558,6 +558,12 @@ project(':iceberg-azure') { testImplementation libs.testcontainers testImplementation libs.mockserver.nett

Re: [PR] Support ADLS with Pyarrow file IO [iceberg-python]

2025-06-19 Thread via GitHub
kevinjqliu commented on code in PR #2111: URL: https://github.com/apache/iceberg-python/pull/2111#discussion_r2157932408 ## pyiceberg/io/pyarrow.py: ## @@ -197,6 +204,7 @@ MAP_VALUE_NAME = "value" DOC = "doc" UTC_ALIASES = {"UTC", "+00:00", "Etc/UTC", "Z"} +MIN_PYARROW_VERSIO

Re: [PR] Core: Remove redundant check in SizeBasedFileRewritePlanner#enoughInputFiles [iceberg]

2025-06-19 Thread via GitHub
manuzhang closed pull request #13281: Core: Remove redundant check in SizeBasedFileRewritePlanner#enoughInputFiles URL: https://github.com/apache/iceberg/pull/13281

Re: [PR] Core: Remove redundant check in SizeBasedFileRewritePlanner#enoughInputFiles [iceberg]

2025-06-19 Thread via GitHub
manuzhang commented on PR #13281: URL: https://github.com/apache/iceberg/pull/13281#issuecomment-2989609756 Closing in favor of #13355

Re: [PR] Spark 3.4: Backport UPDATE/MERGE logic for row lineage [iceberg]

2025-06-19 Thread via GitHub
amogh-jahagirdar merged PR #13344: URL: https://github.com/apache/iceberg/pull/13344

Re: [PR] Flink: Fix flaky test in testTwoSinksInDisjointedDAG [iceberg]

2025-06-19 Thread via GitHub
Guosmilesmile commented on code in PR #13349: URL: https://github.com/apache/iceberg/pull/13349#discussion_r2157876669 ## flink/v1.19/flink/src/test/java/org/apache/iceberg/flink/sink/TestFlinkIcebergSinkExtended.java: ## @@ -151,12 +151,13 @@ public void testTwoSinksInDisjointe

Re: [PR] Incremental Append Scan [iceberg-python]

2025-06-19 Thread via GitHub
jayceslesar commented on code in PR #2031: URL: https://github.com/apache/iceberg-python/pull/2031#discussion_r2157846278 ## pyiceberg/table/__init__.py: ## @@ -1688,102 +1887,252 @@ def _match_deletes_to_data_file(data_entry: ManifestEntry, positional_delete_ent retur

Re: [PR] feature: expire snapshots action [iceberg-rust]

2025-06-19 Thread via GitHub
cmcarthur commented on PR #1455: URL: https://github.com/apache/iceberg-rust/pull/1455#issuecomment-2989467490 appreciate the feedback @CTTY -- I'll incorporate these changes and work on adding further tests. thanks!

Re: [PR] Spark 3.5: Add support for creating and altering tables with default values [iceberg]

2025-06-19 Thread via GitHub
github-actions[bot] commented on PR #13107: URL: https://github.com/apache/iceberg/pull/13107#issuecomment-2989456739 This pull request has been marked as stale due to 30 days of inactivity. It will be closed in 1 week if no further activity occurs. If you think that’s incorrect or this pul

Re: [I] Failed to get `files` metadata from specified branch [iceberg]

2025-06-19 Thread via GitHub
github-actions[bot] commented on issue #11701: URL: https://github.com/apache/iceberg/issues/11701#issuecomment-2989456492 This issue has been closed because it has not received any activity in the last 14 days since being marked as 'stale'

Re: [I] Failed to get `files` metadata from specified branch [iceberg]

2025-06-19 Thread via GitHub
github-actions[bot] closed issue #11701: Failed to get `files` metadata from specified branch URL: https://github.com/apache/iceberg/issues/11701

Re: [PR] CI: run FileIO integration tests (s3/adls/gcs) [iceberg-python]

2025-06-19 Thread via GitHub
kevinjqliu commented on code in PR #2125: URL: https://github.com/apache/iceberg-python/pull/2125#discussion_r2157794828 ## Makefile: ## @@ -34,25 +34,21 @@ install-poetry: ## Ensure Poetry is installed and the correct version is being fi \ fi -instal

Re: [PR] CI: run FileIO integration tests (s3/adls/gcs) [iceberg-python]

2025-06-19 Thread via GitHub
kevinjqliu commented on code in PR #2125: URL: https://github.com/apache/iceberg-python/pull/2125#discussion_r2157794565 ## Makefile: ## @@ -62,44 +58,51 @@ test-integration-setup: # Prepare the environment for integration docker compose -f dev/docker-compose-integratio

Re: [PR] feature: expire snapshots action [iceberg-rust]

2025-06-19 Thread via GitHub
CTTY commented on code in PR #1455: URL: https://github.com/apache/iceberg-rust/pull/1455#discussion_r215908 ## crates/iceberg/src/maintenance/expire_snapshots.rs: ## @@ -0,0 +1,1144 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor li

Re: [PR] feature: expire snapshots action [iceberg-rust]

2025-06-19 Thread via GitHub
CTTY commented on code in PR #1455: URL: https://github.com/apache/iceberg-rust/pull/1455#discussion_r2157765709 ## crates/iceberg/src/maintenance/expire_snapshots.rs: ## @@ -0,0 +1,1144 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor li

Re: [PR] fix: include sqlite-required features in sqlx dep [iceberg-rust]

2025-06-19 Thread via GitHub
kyteware commented on PR #1457: URL: https://github.com/apache/iceberg-rust/pull/1457#issuecomment-2989387696 I would appreciate a re-run of the test CI, all tests pass but it looks like the VM running it lost connection.

Re: [PR] Core: Fix numeric overflow of timestamp nano literal [iceberg]

2025-06-19 Thread via GitHub
ebyhr commented on code in PR #11775: URL: https://github.com/apache/iceberg/pull/11775#discussion_r2157765372 ## api/src/main/java/org/apache/iceberg/expressions/Literals.java: ## @@ -300,8 +300,7 @@ public Literal to(Type type) { case TIMESTAMP: return (Li

Re: [PR] spark 4.0: SPJ: add bucket reducer using gcd [iceberg]

2025-06-19 Thread via GitHub
huaxingao commented on PR #13167: URL: https://github.com/apache/iceberg/pull/13167#issuecomment-2989342451 Merged. Thanks @himadripal for the PR! Thanks @szehon-ho for the review!

Re: [PR] spark 4.0: SPJ: add bucket reducer using gcd [iceberg]

2025-06-19 Thread via GitHub
huaxingao merged PR #13167: URL: https://github.com/apache/iceberg/pull/13167

Re: [I] Proposal: Implement table maintenance operations [iceberg-rust]

2025-06-19 Thread via GitHub
CTTY commented on issue #1453: URL: https://github.com/apache/iceberg-rust/issues/1453#issuecomment-2989329094 Hi @cmcarthur , thanks for the detailed proposal! I have some questions regarding the design principles: > Follow the API and implementation convention set by Spark operati

Re: [PR] Core: Don't copy stats of delete files in DeleteFileIndex [iceberg]

2025-06-19 Thread via GitHub
ebyhr commented on PR #13161: URL: https://github.com/apache/iceberg/pull/13161#issuecomment-2989326581 @aokolnychyi Could you review this PR when you have time?

Re: [PR] Spark 4.0: Add query runner in test module [iceberg]

2025-06-19 Thread via GitHub
ebyhr commented on PR #11758: URL: https://github.com/apache/iceberg/pull/11758#issuecomment-2989319650 Closing as there has been no review for a while. I will manage this branch locally.

Re: [PR] Spark 4.0: Add query runner in test module [iceberg]

2025-06-19 Thread via GitHub
ebyhr closed pull request #11758: Spark 4.0: Add query runner in test module URL: https://github.com/apache/iceberg/pull/11758

Re: [I] [Fix warning] I found some warning when build code fix them？ [iceberg-cpp]

2025-06-19 Thread via GitHub
MisterRaindrop closed issue #118: [Fix warning] I found some warning when build code fix them？ URL: https://github.com/apache/iceberg-cpp/issues/118

Re: [PR] Added ExpireSnapshots Feature [iceberg-python]

2025-06-19 Thread via GitHub
ForeverAngry commented on PR #1880: URL: https://github.com/apache/iceberg-python/pull/1880#issuecomment-2989230858 > @ForeverAngry I think we can move this one forward. Before the release, we need to follow up on two things: > > - Add a new Maintenance doc section with a subsection t

Re: [PR] Added ExpireSnapshots Feature [iceberg-python]

2025-06-19 Thread via GitHub
ForeverAngry commented on PR #1880: URL: https://github.com/apache/iceberg-python/pull/1880#issuecomment-2989228382 > Thanks again @ForeverAngry for working on this 🚀 Thank you, for being such a supportive and inspiring member to work with!

Re: [PR] fix: include sqlite-required features in sqlx dep [iceberg-rust]

2025-06-19 Thread via GitHub
kyteware commented on PR #1457: URL: https://github.com/apache/iceberg-rust/pull/1457#issuecomment-2989204156 I haven't fully investigated what the other features in `[dev-dependencies]` for `sqlx` are needed for, but it's possible that the rest of them should also be included in the regular

[PR] fix: include sqlite-required features in sqlx dep [iceberg-rust]

2025-06-19 Thread via GitHub
kyteware opened a new pull request, #1457: URL: https://github.com/apache/iceberg-rust/pull/1457 ## Which issue does this PR close? - Closes #1456. ## What changes are included in this PR? - Included the needed dependencies to the sql catalog crate

[I] Sqlite sql catalogs work in unit tests, but not in prod because of config mistake [iceberg-rust]

2025-06-19 Thread via GitHub
kyteware opened a new issue, #1456: URL: https://github.com/apache/iceberg-rust/issues/1456 ### Apache Iceberg Rust version None ### Describe the bug When you try to create a sqlcatalog using a sqlite database, `sqlx` reports an error that you are missing the required fea

[I] Add s3.anon as configurable property for S3 FileIO [iceberg-python]

2025-06-19 Thread via GitHub
gmweaver opened a new issue, #2126: URL: https://github.com/apache/iceberg-python/issues/2126 ### Feature Request / Improvement It is not currently possible to configure `anon` in underlying `s3fs` client, which prohibits use in cases where auth is not required or done in a different

Re: [PR] fix(transforms): Add can transform method to transform interface [iceberg-go]

2025-06-19 Thread via GitHub
zeroshade merged PR #463: URL: https://github.com/apache/iceberg-go/pull/463

Re: [PR] CI: run FileIO integration tests (s3/adls/gcs) [iceberg-python]

2025-06-19 Thread via GitHub
kevinjqliu commented on PR #2125: URL: https://github.com/apache/iceberg-python/pull/2125#issuecomment-2989066334 ``` test_fsspec_new_input_file_gcs fsspec_fileio_gcs = @pytest.mark.gcs def test_fsspec_new_input_f

Re: [I] Google BigLake Metastore Catalog issue [iceberg-python]

2025-06-19 Thread via GitHub
jayceslesar commented on issue #2122: URL: https://github.com/apache/iceberg-python/issues/2122#issuecomment-2989059846 Not unique to BigLake but rather the rest implementation here is definitely behind the upstream definition https://github.com/apache/iceberg/blob/main/open-api/rest-catal

Re: [PR] Added ExpireSnapshots Feature [iceberg-python]

2025-06-19 Thread via GitHub
Fokko commented on PR #1880: URL: https://github.com/apache/iceberg-python/pull/1880#issuecomment-2989032298 @ForeverAngry I think we can move this one forward. Before the release, we need to follow up on two things: - Add a new Maintenance doc section with a subsection that explains

Re: [PR] Added ExpireSnapshots Feature [iceberg-python]

2025-06-19 Thread via GitHub
Fokko commented on PR #1880: URL: https://github.com/apache/iceberg-python/pull/1880#issuecomment-2989033112 Thanks again @ForeverAngry for working on this 🚀

Re: [PR] Added ExpireSnapshots Feature [iceberg-python]

2025-06-19 Thread via GitHub
Fokko merged PR #1880: URL: https://github.com/apache/iceberg-python/pull/1880

Re: [PR] Support ADLS with Pyarrow file IO [iceberg-python]

2025-06-19 Thread via GitHub
NikitaMatskevich commented on code in PR #2111: URL: https://github.com/apache/iceberg-python/pull/2111#discussion_r2157577217 ## tests/io/test_pyarrow.py: ## @@ -1670,9 +1678,8 @@ def test_new_output_file_gcs(pyarrow_fileio_gcs: PyArrowFileIO) -> None: @pytest.mark.gcs -@

Re: [PR] Spark-3.5, 4.0: Add unit tests for ColumnarBatchUtil [iceberg]

2025-06-19 Thread via GitHub
anuragmantri commented on PR #12275: URL: https://github.com/apache/iceberg/pull/12275#issuecomment-2988998517 Thanks for the review @huaxingao. I added the suggested tests and also added the test in Spark 4.0 which was added since the original PR. Please take a look.

Re: [PR] Spark-3.5, 4.0: Add unit tests for ColumnarBatchUtil [iceberg]

2025-06-19 Thread via GitHub
anuragmantri commented on code in PR #12275: URL: https://github.com/apache/iceberg/pull/12275#discussion_r2157567692 ## spark/v3.5/spark/src/test/java/org/apache/iceberg/spark/data/vectorized/TestColumnarBatchUtil.java: ## @@ -0,0 +1,140 @@ +/* + * Licensed to the Apache Softwa

Re: [PR] Support ADLS with Pyarrow file IO [iceberg-python]

2025-06-19 Thread via GitHub
kevinjqliu commented on code in PR #2111: URL: https://github.com/apache/iceberg-python/pull/2111#discussion_r2157468740 ## tests/io/test_pyarrow.py: ## @@ -1670,9 +1678,8 @@ def test_new_output_file_gcs(pyarrow_fileio_gcs: PyArrowFileIO) -> None: @pytest.mark.gcs -@pytest

[PR] CI: run FileIO integration tests (s3/adls/gcs) [iceberg-python]

2025-06-19 Thread via GitHub
kevinjqliu opened a new pull request, #2125: URL: https://github.com/apache/iceberg-python/pull/2125 Closes #2124 # Rationale for this change # Are these changes tested? # Are there any user-facing changes?

[I] CI: run FileIO integration tests (s3/adls/gcs) [iceberg-python]

2025-06-19 Thread via GitHub
kevinjqliu opened a new issue, #2124: URL: https://github.com/apache/iceberg-python/issues/2124 ### Apache Iceberg version None ### Please describe the bug 🐞 Found out that we don't run s3/adls/gcs integration tests in CI https://github.com/apache/iceberg-python/pull/2

Re: [PR] Support ADLS with Pyarrow file IO [iceberg-python]

2025-06-19 Thread via GitHub
kevinjqliu commented on code in PR #2111: URL: https://github.com/apache/iceberg-python/pull/2111#discussion_r2157465863 ## tests/io/test_pyarrow.py: ## @@ -1670,9 +1678,8 @@ def test_new_output_file_gcs(pyarrow_fileio_gcs: PyArrowFileIO) -> None: @pytest.mark.gcs -@pytest

[PR] feature: expire snapshots action [iceberg-rust]

2025-06-19 Thread via GitHub
cmcarthur opened a new pull request, #1455: URL: https://github.com/apache/iceberg-rust/pull/1455 ## Which issue does this PR close? - Closes #1454. ## What changes are included in this PR? - Adds `ExpireSnapshotsAction` with extensive test coverage

Re: [PR] Core: Fix filed ids of partition stats file [iceberg]

2025-06-19 Thread via GitHub
pvary commented on code in PR #13329: URL: https://github.com/apache/iceberg/pull/13329#discussion_r2157446396 ## core/src/main/java/org/apache/iceberg/PartitionStatsHandler.java: ## @@ -280,6 +304,8 @@ private static Collection computeAndMergeStatsIncremental( oldStats.

Re: [PR] Core: Fix filed ids of partition stats file [iceberg]

2025-06-19 Thread via GitHub
pvary commented on code in PR #13329: URL: https://github.com/apache/iceberg/pull/13329#discussion_r2157445956 ## core/src/main/java/org/apache/iceberg/PartitionStatsHandler.java: ## @@ -280,6 +304,8 @@ private static Collection computeAndMergeStatsIncremental( oldStats.

[I] Maintenance: Expire Snapshots Action [iceberg-rust]

2025-06-19 Thread via GitHub
cmcarthur opened a new issue, #1454: URL: https://github.com/apache/iceberg-rust/issues/1454 ### Is your feature request related to a problem or challenge? Implement an ExpireSnapshotsAction with similar configuration options and outputs to the Spark expire snapshots procedure. Part o

Re: [PR] Flink: Migrate Flink `TableSchema` to `Schema`/`ResolvedSchema` [iceberg]

2025-06-19 Thread via GitHub
pvary commented on PR #13072: URL: https://github.com/apache/iceberg/pull/13072#issuecomment-2988781005 @liamzwbao: This is quite a big change, but seems like a good direction to me. If I can have 2 requests, I would like to ask you: - It is good that we validated that the Flink 1.20,

Re: [PR] Flink: Fix flaky test in testTwoSinksInDisjointedDAG [iceberg]

2025-06-19 Thread via GitHub
pvary commented on PR #13349: URL: https://github.com/apache/iceberg/pull/13349#issuecomment-2988761463 The error message is `Expecting empty but was: [Record(1, left-aaa)...` So we are in this code: ``` if (snapshot == null) { assertThat(expected).isEmpty();

Re: [PR] Flink: Fix flaky test in testTwoSinksInDisjointedDAG [iceberg]

2025-06-19 Thread via GitHub
pvary commented on code in PR #13349: URL: https://github.com/apache/iceberg/pull/13349#discussion_r2157426088 ## flink/v1.19/flink/src/test/java/org/apache/iceberg/flink/sink/TestFlinkIcebergSinkExtended.java: ## @@ -151,12 +151,13 @@ public void testTwoSinksInDisjointedDAG() t

[I] pyiceberg produces invalid avro if a partition name has an emoji (any non-alphanumeric character I guess, including dots or starting with digits) [iceberg-python]

2025-06-19 Thread via GitHub
nvartolomei opened a new issue, #2123: URL: https://github.com/apache/iceberg-python/issues/2123 ### Apache Iceberg version None ### Please describe the bug 🐞 Example schema: ```python # Define schema with nested structure and timestamp schema = Schema(

Re: [PR] Docs: Fix description of min-input-files option of Spark rewrite_data_files procedure [iceberg]

2025-06-19 Thread via GitHub
pvary commented on code in PR #13355: URL: https://github.com/apache/iceberg/pull/13355#discussion_r2157413509 ## docs/docs/spark-procedures.md: ## @@ -406,7 +406,7 @@ Iceberg can compact data files in parallel using Spark with the `rewriteDataFile | `target-file-size-bytes` |

Re: [PR] AWS: Support metrics tracking when using Analytics Accelerator stream [iceberg]

2025-06-19 Thread via GitHub
SanjayMarreddi commented on code in PR #13348: URL: https://github.com/apache/iceberg/pull/13348#discussion_r2157377866 ## aws/src/integration/java/org/apache/iceberg/aws/s3/TestS3MultipartUpload.java: ## @@ -59,12 +61,13 @@ public class TestS3MultipartUpload { @BeforeAll

Re: [PR] Azure: Support multiple storage credential prefixes [iceberg]

2025-06-19 Thread via GitHub
ChaladiMohanVamsi commented on PR #13241: URL: https://github.com/apache/iceberg/pull/13241#issuecomment-2988658401 @nastra @amogh-jahagirdar Can you please help with the review.

[I] Google BigLake Metastore Catalog issue [iceberg-python]

2025-06-19 Thread via GitHub
ccancellieri opened a new issue, #2122: URL: https://github.com/apache/iceberg-python/issues/2122 Dear all, I'm working on a GCP environment and I'm configuring pyIceberg to work over the BigLake API Metastore catalog. I'm pretty satisfied of the result (it almost works!) but I've

Re: [PR] Added ExpireSnapshots Feature [iceberg-python]

2025-06-19 Thread via GitHub
ForeverAngry commented on PR #1880: URL: https://github.com/apache/iceberg-python/pull/1880#issuecomment-2988615287 > Thanks @ForeverAngry for working on this, and I think it is ready to go 👍 Great! I think @kevinjqliu is still listed as needing approval. @kevinjqliu can you put you

Re: [PR] Support ADLS with Pyarrow file IO [iceberg-python]

2025-06-19 Thread via GitHub
NikitaMatskevich commented on code in PR #2111: URL: https://github.com/apache/iceberg-python/pull/2111#discussion_r2157322524 ## pyiceberg/io/pyarrow.py: ## @@ -475,6 +486,42 @@ def _initialize_s3_fs(self, netloc: Optional[str]) -> FileSystem: return S3FileSystem(**

Re: [PR] Support ADLS with Pyarrow file IO [iceberg-python]

2025-06-19 Thread via GitHub
NikitaMatskevich commented on code in PR #2111: URL: https://github.com/apache/iceberg-python/pull/2111#discussion_r2157328970 ## pyiceberg/io/pyarrow.py: ## @@ -197,6 +204,7 @@ MAP_VALUE_NAME = "value" DOC = "doc" UTC_ALIASES = {"UTC", "+00:00", "Etc/UTC", "Z"} +MIN_PYARROW_

Re: [PR] Support ADLS with Pyarrow file IO [iceberg-python]

2025-06-19 Thread via GitHub
NikitaMatskevich commented on code in PR #2111: URL: https://github.com/apache/iceberg-python/pull/2111#discussion_r2157321373 ## tests/io/test_pyarrow.py: ## @@ -1670,9 +1678,8 @@ def test_new_output_file_gcs(pyarrow_fileio_gcs: PyArrowFileIO) -> None: @pytest.mark.gcs -@

Re: [PR] Support ADLS with Pyarrow file IO [iceberg-python]

2025-06-19 Thread via GitHub
kevinjqliu commented on code in PR #2111: URL: https://github.com/apache/iceberg-python/pull/2111#discussion_r2157299424 ## tests/io/test_pyarrow.py: ## @@ -1670,9 +1678,8 @@ def test_new_output_file_gcs(pyarrow_fileio_gcs: PyArrowFileIO) -> None: @pytest.mark.gcs -@pytest

Re: [PR] Support ADLS with Pyarrow file IO [iceberg-python]

2025-06-19 Thread via GitHub
kevinjqliu commented on code in PR #2111: URL: https://github.com/apache/iceberg-python/pull/2111#discussion_r2157298498 ## tests/io/test_pyarrow.py: ## @@ -1670,9 +1678,8 @@ def test_new_output_file_gcs(pyarrow_fileio_gcs: PyArrowFileIO) -> None: @pytest.mark.gcs -@pytest

Re: [PR] AWS: Support metrics tracking when using Analytics Accelerator stream [iceberg]

2025-06-19 Thread via GitHub
SanjayMarreddi commented on code in PR #13348: URL: https://github.com/apache/iceberg/pull/13348#discussion_r2157285379 ## aws/src/integration/java/org/apache/iceberg/aws/s3/TestS3FileIOIntegration.java: ## @@ -234,7 +239,12 @@ public void testCrossRegionAccessEnabled() throws

Re: [PR] Support ADLS with Pyarrow file IO [iceberg-python]

2025-06-19 Thread via GitHub
kevinjqliu commented on code in PR #2111: URL: https://github.com/apache/iceberg-python/pull/2111#discussion_r2157259714 ## pyiceberg/io/pyarrow.py: ## @@ -394,6 +402,9 @@ def _initialize_fs(self, scheme: str, netloc: Optional[str] = None) -> FileSyste elif scheme in {
