[PR] Core: Ensure current and newly added view versions are retained in ViewMetadata build [iceberg]

2025-02-25 Thread via GitHub
lliangyu-lin opened a new pull request, #12401: URL: https://github.com/apache/iceberg/pull/12401 ### Description Before this change, ```ViewMetadata.Builder``` did not always retain all view versions added in the current build. This issue was caused by ```int numVersionsToKeep = Math.ma

Re: [PR] feat: Add `StrictMetricsEvaluator` [iceberg-rust]

2025-02-25 Thread via GitHub
liurenjie1024 merged PR #963: URL: https://github.com/apache/iceberg-rust/pull/963 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@ic

Re: [PR] feat: Add existing parquet files [iceberg-rust]

2025-02-25 Thread via GitHub
liurenjie1024 commented on code in PR #960: URL: https://github.com/apache/iceberg-rust/pull/960#discussion_r1970839596 ## crates/iceberg/src/transaction.rs: ## @@ -169,6 +176,48 @@ impl<'a> Transaction<'a> { catalog.update_table(table_commit).await } + +asyn

Re: [PR] feat: Pull Request Template [iceberg-rust]

2025-02-25 Thread via GitHub
liurenjie1024 commented on PR #1009: URL: https://github.com/apache/iceberg-rust/pull/1009#issuecomment-2683967059 Let's wait for a moment to have more eyes reviewing it.

[PR] Add overwrite method to schema on schema update [iceberg-python]

2025-02-25 Thread via GitHub
mariotaddeucci opened a new pull request, #1727: URL: https://github.com/apache/iceberg-python/pull/1727 (no comment)

Re: [PR] Kafka Connect: Add mechanisms for routing records by topic name [iceberg]

2025-02-25 Thread via GitHub
bb-dikshantsharma17012025 commented on PR #11623: URL: https://github.com/apache/iceberg/pull/11623#issuecomment-2683928085 Hi @jbonofre @bryanck, any update on this PR? For our use case we want to route all tables from a source MySQL into S3 (we are using Debezium as the source connector,

[I] [Feature] - Allow Schema Overwrite [iceberg-python]

2025-02-25 Thread via GitHub
mariotaddeucci opened a new issue, #1726: URL: https://github.com/apache/iceberg-python/issues/1726 ### Feature Request / Improvement It would be beneficial to introduce an option in pyiceberg to completely overwrite the schema as part of the table update process. By allowing users to

Re: [I] Validation Error in ConfigResponse Model with RestCatalog in PyIceberg using Nessie REST API [iceberg]

2025-02-25 Thread via GitHub
heman026 commented on issue #11255: URL: https://github.com/apache/iceberg/issues/11255#issuecomment-2683849718 @lars-sorgalla Check https://github.com/apache/iceberg-python/issues/1524#issuecomment-2596881736

Re: [I] Validation Error in ConfigResponse Model When connecting Nessie with PyIceberg using RestCatalog [iceberg-python]

2025-02-25 Thread via GitHub
heman026 commented on issue #1524: URL: https://github.com/apache/iceberg-python/issues/1524#issuecomment-2683847774 > Thanks [@heman026](https://github.com/heman026) for sharing the solution > > Would you mind sharing your configuration and how you load it? > > I assume you ar

Re: [PR] feat: Pull Request Template [iceberg-rust]

2025-02-25 Thread via GitHub
liurenjie1024 commented on PR #1009: URL: https://github.com/apache/iceberg-rust/pull/1009#issuecomment-2683764046 cc @Xuanwo @Fokko @sdd @c-thiel PTAL

Re: [PR] feat: Pull Request Template [iceberg-rust]

2025-02-25 Thread via GitHub
liurenjie1024 commented on code in PR #1009: URL: https://github.com/apache/iceberg-rust/pull/1009#discussion_r1970821886 ## .github/PULL_REQUEST_TEMPLATE.md: ## @@ -0,0 +1,47 @@ + + +## Which issue does this PR close? + + + +- Closes #. + +## Why is this change needed? Review

Re: [PR] feat: Pull Request Template [iceberg-rust]

2025-02-25 Thread via GitHub
liurenjie1024 commented on PR #1009: URL: https://github.com/apache/iceberg-rust/pull/1009#issuecomment-2683761507 > thanks for the PR @jonathanc-n what do you think about copying over existing templates? [#1007 (comment)](https://github.com/apache/iceberg-rust/issues/1007#issuecomment-2682

Re: [PR] feat: Add Issue Template [iceberg-rust]

2025-02-25 Thread via GitHub
liurenjie1024 commented on code in PR #1008: URL: https://github.com/apache/iceberg-rust/pull/1008#discussion_r1970817058 ## .github/ISSUE_TEMPLATE/iceberg_question.yml: ## Review Comment: For question, we should ask user to navigate to https://github.com/apache/iceberg-ru

Re: [PR] fix loading `in-memory` catalog [iceberg-python]

2025-02-25 Thread via GitHub
kevinjqliu merged PR #1725: URL: https://github.com/apache/iceberg-python/pull/1725

Re: [PR] OpenAPI: Use more clear language in recommending error responses [iceberg]

2025-02-25 Thread via GitHub
sungwy commented on PR #12376: URL: https://github.com/apache/iceberg/pull/12376#issuecomment-2683707993 > Thanks for the spec clarification, @sungwy ! The changes LGTM 👍 All thanks to you for the helpful reviews @dimas-b 💯

Re: [PR] Update docs to reflect default location provider [iceberg-python]

2025-02-25 Thread via GitHub
kevinjqliu merged PR #1724: URL: https://github.com/apache/iceberg-python/pull/1724

Re: [PR] Update docs to reflect default location provider [iceberg-python]

2025-02-25 Thread via GitHub
kevinjqliu commented on PR #1724: URL: https://github.com/apache/iceberg-python/pull/1724#issuecomment-2683661782 Thanks for the PR @geruh and thanks for the review @smaheshwar-pltr

Re: [PR] Turn `ObjectStoreLocationProvider` off by default [iceberg-python]

2025-02-25 Thread via GitHub
smaheshwar-pltr commented on PR #1722: URL: https://github.com/apache/iceberg-python/pull/1722#issuecomment-2683649867 Thanks @kevinjqliu for the explanation. I said something similar in https://github.com/apache/iceberg-python/pull/1509#issuecomment-2585317026 / https://github.com/apach

Re: [PR] Turn `ObjectStoreLocationProvider` off by default [iceberg-python]

2025-02-25 Thread via GitHub
kevinjqliu commented on PR #1722: URL: https://github.com/apache/iceberg-python/pull/1722#issuecomment-2683653037 :) Sorry for the back and forth! Another thing is that having `data/` as the default data location is more friendly to new users.

Re: [PR] Turn `ObjectStoreLocationProvider` off by default [iceberg-python]

2025-02-25 Thread via GitHub
kevinjqliu commented on PR #1722: URL: https://github.com/apache/iceberg-python/pull/1722#issuecomment-2683639454 > Is changing locations just too dramatic for this release? My main concern was around the default behavior change. For example, if I had a pipeline running with 0.8.1 and

Re: [PR] API, Core: Update inclusive metrics evaluator for extract and transforms [iceberg]

2025-02-25 Thread via GitHub
rdblue commented on PR #12311: URL: https://github.com/apache/iceberg/pull/12311#issuecomment-2683615946 Thanks for reviewing, @danielcweeks!

Re: [PR] API, Core: Update inclusive metrics evaluator for extract and transforms [iceberg]

2025-02-25 Thread via GitHub
rdblue merged PR #12311: URL: https://github.com/apache/iceberg/pull/12311

Re: [PR] Core,Api: Add overwrite option when register external table to catalog [iceberg]

2025-02-25 Thread via GitHub
dramaticlly commented on code in PR #12228: URL: https://github.com/apache/iceberg/pull/12228#discussion_r1970726948 ## core/src/test/java/org/apache/iceberg/catalog/CatalogTests.java: ## @@ -2871,6 +2872,33 @@ public void testRegisterExistingTable() { assertThat(catalog.dr

Re: [PR] Core,Api: Add overwrite option when register external table to catalog [iceberg]

2025-02-25 Thread via GitHub
dramaticlly commented on code in PR #12228: URL: https://github.com/apache/iceberg/pull/12228#discussion_r1970711569 ## core/src/main/java/org/apache/iceberg/BaseMetastoreCatalog.java: ## @@ -71,23 +77,53 @@ public Table loadTable(TableIdentifier identifier) { } @Overrid

Re: [PR] Core,Api: Add overwrite option when register external table to catalog [iceberg]

2025-02-25 Thread via GitHub
stevenzwu commented on code in PR #12228: URL: https://github.com/apache/iceberg/pull/12228#discussion_r1970684354 ## core/src/test/java/org/apache/iceberg/catalog/CatalogTests.java: ## @@ -2871,6 +2872,33 @@ public void testRegisterExistingTable() { assertThat(catalog.drop

Re: [I] Support queries all branches and tags java api [iceberg]

2025-02-25 Thread via GitHub
github-actions[bot] commented on issue #11042: URL: https://github.com/apache/iceberg/issues/11042#issuecomment-2683576322 This issue has been automatically marked as stale because it has been open for 180 days with no activity. It will be closed in next 14 days if no further activity occur

Re: [PR] Core,Api: Add overwrite option when register external table to catalog [iceberg]

2025-02-25 Thread via GitHub
stevenzwu commented on code in PR #12228: URL: https://github.com/apache/iceberg/pull/12228#discussion_r1970355745 ## aws/src/integration/java/org/apache/iceberg/aws/dynamodb/TestDynamoDbCatalog.java: ## @@ -395,6 +396,31 @@ public void testRegisterExistingTable() { assertT

Re: [PR] Turn `ObjectStoreLocationProvider` off by default [iceberg-python]

2025-02-25 Thread via GitHub
smaheshwar-pltr commented on PR #1722: URL: https://github.com/apache/iceberg-python/pull/1722#issuecomment-2683519024 Thanks for the ping @kevinjqliu. (Feel free to respond later) Curious what the problem here was. I'd have thought that old tables would still be readable by new PyIce

Re: [PR] Update docs to reflect default location provider [iceberg-python]

2025-02-25 Thread via GitHub
kevinjqliu commented on code in PR #1724: URL: https://github.com/apache/iceberg-python/pull/1724#discussion_r1970666991 ## mkdocs/docs/configuration.md: ## Review Comment: can you also update the table? https://github.com/apache/iceberg-python/blob/main/mkdocs/docs/con

[PR] [1.7.x] Fix Kafka-connect `LICENSE` and `NOTICE` [iceberg]

2025-02-25 Thread via GitHub
Fokko opened a new pull request, #12400: URL: https://github.com/apache/iceberg/pull/12400 Equivalent of https://github.com/apache/iceberg/pull/12364 for `1.7.x`

Re: [PR] Update docs to reflect default location provider [iceberg-python]

2025-02-25 Thread via GitHub
smaheshwar-pltr commented on code in PR #1724: URL: https://github.com/apache/iceberg-python/pull/1724#discussion_r1970671110 ## mkdocs/docs/configuration.md: ## Review Comment: Also > Note: the default value of True differs from Iceberg's Java implementation shou

Re: [PR] Update docs to reflect default location provider [iceberg-python]

2025-02-25 Thread via GitHub
kevinjqliu commented on code in PR #1724: URL: https://github.com/apache/iceberg-python/pull/1724#discussion_r1970669793 ## mkdocs/docs/configuration.md: ## Review Comment: also in SimpleLocationProvider too, the part ``` The `SimpleLocationProvider` is enabled f

Re: [PR] Update docs to reflect default location provider [iceberg-python]

2025-02-25 Thread via GitHub
kevinjqliu commented on code in PR #1724: URL: https://github.com/apache/iceberg-python/pull/1724#discussion_r1970666991 ## mkdocs/docs/configuration.md: ## Review Comment: can you also update the table's default value for `write.object-storage.enabled`? https://github

Re: [PR] Update docs to reflect default location provider [iceberg-python]

2025-02-25 Thread via GitHub
kevinjqliu commented on code in PR #1724: URL: https://github.com/apache/iceberg-python/pull/1724#discussion_r1970668787 ## mkdocs/docs/configuration.md: ## Review Comment: and the "It is used by default." part

[PR] Build: Bump mkdocstrings from 0.28.1 to 0.28.2 [iceberg-python]

2025-02-25 Thread via GitHub
dependabot[bot] opened a new pull request, #1723: URL: https://github.com/apache/iceberg-python/pull/1723 Bumps [mkdocstrings](https://github.com/mkdocstrings/mkdocstrings) from 0.28.1 to 0.28.2. Release notes Sourced from https://github.com/mkdocstrings/mkdocstrings/releases mkd

Re: [PR] ci(golangci-lint) :Add golangci linter [iceberg-go]

2025-02-25 Thread via GitHub
dttung2905 commented on PR #315: URL: https://github.com/apache/iceberg-go/pull/315#issuecomment-2683456060 @zeroshade It seems the Golang CI lint does work :smile: . https://github.com/apache/iceberg-go/actions/runs/13532254394/job/37817030681?pr=315 I will go ahead and fix the highlighted

Re: [PR] Turn `ObjectStoreLocationProvider` off by default [iceberg-python]

2025-02-25 Thread via GitHub
kevinjqliu commented on PR #1722: URL: https://github.com/apache/iceberg-python/pull/1722#issuecomment-2683450696 reverts #1509, cc @smaheshwar-pltr

Re: [I] turn `ObjectStoreLocationProvider` off by default [iceberg-python]

2025-02-25 Thread via GitHub
kevinjqliu closed issue #1721: turn `ObjectStoreLocationProvider` off by default URL: https://github.com/apache/iceberg-python/issues/1721

Re: [PR] Turn `ObjectStoreLocationProvider` off by default [iceberg-python]

2025-02-25 Thread via GitHub
kevinjqliu merged PR #1722: URL: https://github.com/apache/iceberg-python/pull/1722

Re: [PR] Fix versions in LICENSE/NOTICE [iceberg]

2025-02-25 Thread via GitHub
Fokko commented on code in PR #12364: URL: https://github.com/apache/iceberg/pull/12364#discussion_r1970604229 ## kafka-connect/kafka-connect-runtime/main/LICENSE: ## @@ -1448,3 +1448,51 @@ Project URL (from POM): https://github.com/awslabs/aws-eventstream-java License (from P

[PR] Turn `ObjectStoreLocationProvider` off by default [iceberg-python]

2025-02-25 Thread via GitHub
kevinjqliu opened a new pull request, #1722: URL: https://github.com/apache/iceberg-python/pull/1722 Closes #1721

[I] turn `ObjectStoreLocationProvider` off by default [iceberg-python]

2025-02-25 Thread via GitHub
kevinjqliu opened a new issue, #1721: URL: https://github.com/apache/iceberg-python/issues/1721 ### Apache Iceberg version main (development) ### Please describe the bug 🐞 As part of implementing LocationProvider, `ObjectStoreLocationProvider` was enabled by default.

Re: [I] Wrong Avro schema for ManifestEntry [iceberg-go]

2025-02-25 Thread via GitHub
arnaudbriche commented on issue #305: URL: https://github.com/apache/iceberg-go/issues/305#issuecomment-2683326930 Fixed by https://github.com/apache/iceberg-go/pull/307

Re: [I] Wrong Avro schema for ManifestEntry [iceberg-go]

2025-02-25 Thread via GitHub
arnaudbriche closed issue #305: Wrong Avro schema for ManifestEntry URL: https://github.com/apache/iceberg-go/issues/305

Re: [PR] API, Core: Update inclusive metrics evaluator for extract and transforms [iceberg]

2025-02-25 Thread via GitHub
rdblue commented on code in PR #12311: URL: https://github.com/apache/iceberg/pull/12311#discussion_r1970487043 ## api/src/main/java/org/apache/iceberg/variants/VariantUtil.java: ## @@ -18,21 +18,112 @@ */ package org.apache.iceberg.variants; +import java.math.BigDecimal;

Re: [PR] API, Core: Update inclusive metrics evaluator for extract and transforms [iceberg]

2025-02-25 Thread via GitHub
rdblue commented on code in PR #12311: URL: https://github.com/apache/iceberg/pull/12311#discussion_r1970492212 ## api/src/main/java/org/apache/iceberg/variants/VariantUtil.java: ## @@ -18,21 +18,112 @@ */ package org.apache.iceberg.variants; +import java.math.BigDecimal;

Re: [PR] Use delimited column names in `CreateChangelogViewProcedure` [iceberg]

2025-02-25 Thread via GitHub
dramaticlly commented on PR #12322: URL: https://github.com/apache/iceberg/pull/12322#issuecomment-2682975213 @andyglow Looks like CI failed for spotless formatting. I would recommend running spotlessApply to check and fix all formatting issues ```shell ./gradlew -DallVersions spotlessAp

Re: [PR] Spark: Infer partition spec in ADD_FILES procedure for FileTables than taking latest table spec [iceberg]

2025-02-25 Thread via GitHub
RussellSpitzer commented on PR #12327: URL: https://github.com/apache/iceberg/pull/12327#issuecomment-2682938175 * What went wrong: Execution failed for task ':iceberg-spark:iceberg-spark-3.5_2.12:checkstyleMain'. > A failure occurred while executing org.gradle.api.plugins.quality.int

Re: [I] Validation Error in ConfigResponse Model with RestCatalog in PyIceberg using Nessie REST API [iceberg]

2025-02-25 Thread via GitHub
lars-sorgalla commented on issue #11255: URL: https://github.com/apache/iceberg/issues/11255#issuecomment-2682929555 I am encountering the same issue. I tried to comment out the "defaults" and "overrides" properties in the ConfigResponse model but then it failed elsewhere. I'm wonde

Re: [PR] feat: Pull Request Template [iceberg-rust]

2025-02-25 Thread via GitHub
kevinjqliu commented on PR #1009: URL: https://github.com/apache/iceberg-rust/pull/1009#issuecomment-2682911311 Thanks for the PR, @jonathanc-n. What do you think about copying over existing templates? https://github.com/apache/iceberg-rust/issues/1007#issuecomment-2682485499

Re: [PR] Spark: Infer partition spec in ADD_FILES procedure for FileTables than taking latest table spec [iceberg]

2025-02-25 Thread via GitHub
RussellSpitzer commented on code in PR #12327: URL: https://github.com/apache/iceberg/pull/12327#discussion_r1970309065 ## spark/v3.5/spark/src/main/java/org/apache/iceberg/spark/SparkTableUtil.java: ## @@ -1124,7 +1115,67 @@ private static PartitionSpec findCompatibleSpec(

Re: [PR] Spark: Infer partition spec in ADD_FILES procedure for FileTables than taking latest table spec [iceberg]

2025-02-25 Thread via GitHub
RussellSpitzer commented on code in PR #12327: URL: https://github.com/apache/iceberg/pull/12327#discussion_r1970303712 ## spark/v3.5/spark-extensions/src/test/java/org/apache/iceberg/spark/extensions/TestAddFilesProcedure.java: ## @@ -635,6 +635,67 @@ public void addFilteredPa

Re: [PR] API, Core: Update inclusive metrics evaluator for extract and transforms [iceberg]

2025-02-25 Thread via GitHub
danielcweeks commented on code in PR #12311: URL: https://github.com/apache/iceberg/pull/12311#discussion_r1970288476 ## api/src/main/java/org/apache/iceberg/variants/VariantUtil.java: ## @@ -18,21 +18,112 @@ */ package org.apache.iceberg.variants; +import java.math.BigDeci

Re: [I] 0.8 Patch Release for Arrow 19 Support [iceberg-python]

2025-02-25 Thread via GitHub
kevinjqliu commented on issue #1672: URL: https://github.com/apache/iceberg-python/issues/1672#issuecomment-2682726651 @srilman thanks! if you have the time, please verify the release and post on the [devlist thread](https://lists.apache.org/thread/9v38dntmbt91gfyq372j6fy12yn0ojfj) :)

Re: [PR] Kafka Connect: Add SMTs for Debezium and AWS DMS [iceberg]

2025-02-25 Thread via GitHub
bryanck commented on PR #11936: URL: https://github.com/apache/iceberg/pull/11936#issuecomment-2682720586 Thanks @ismailsimsek for porting this over!

Re: [PR] Kafka Connect: Add SMTs for Debezium and AWS DMS [iceberg]

2025-02-25 Thread via GitHub
bryanck merged PR #11936: URL: https://github.com/apache/iceberg/pull/11936

Re: [I] Official iceberg kafka-connect is missing SMTs from original Databricks/Tabular repository [iceberg]

2025-02-25 Thread via GitHub
bryanck closed issue #11914: Official iceberg kafka-connect is missing SMTs from original Databricks/Tabular repository URL: https://github.com/apache/iceberg/issues/11914

Re: [I] 0.8 Patch Release for Arrow 19 Support [iceberg-python]

2025-02-25 Thread via GitHub
srilman closed issue #1672: 0.8 Patch Release for Arrow 19 Support URL: https://github.com/apache/iceberg-python/issues/1672

Re: [I] 0.8 Patch Release for Arrow 19 Support [iceberg-python]

2025-02-25 Thread via GitHub
srilman commented on issue #1672: URL: https://github.com/apache/iceberg-python/issues/1672#issuecomment-2682707834 Ok, closing this issue since 0.9.0rc2 was created.

Re: [PR] parallelize `add_files` [iceberg-python]

2025-02-25 Thread via GitHub
Fokko commented on code in PR #1717: URL: https://github.com/apache/iceberg-python/pull/1717#discussion_r1970024778 ## pyiceberg/io/pyarrow.py: ## @@ -2464,38 +2464,37 @@ def _check_pyarrow_schema_compatible( _check_schema_compatible(requested_schema, provided_schema) -

[PR] feat: Pull Request Template [iceberg-rust]

2025-02-25 Thread via GitHub
jonathanc-n opened a new pull request, #1009: URL: https://github.com/apache/iceberg-rust/pull/1009 Add a pull request template to make it easier for reviewers

Re: [PR] AWS: Integrate S3 analytics accelerator library [iceberg]

2025-02-25 Thread via GitHub
SanjayMarreddi commented on code in PR #12299: URL: https://github.com/apache/iceberg/pull/12299#discussion_r1970198866 ## aws/src/main/java/org/apache/iceberg/aws/s3/S3FileIOProperties.java: ## @@ -72,6 +72,25 @@ public class S3FileIOProperties implements Serializable { pu

Re: [PR] Spark: Detect dangling DVs properly [iceberg]

2025-02-25 Thread via GitHub
nastra commented on code in PR #12270: URL: https://github.com/apache/iceberg/pull/12270#discussion_r1970189404 ## spark/v3.5/spark/src/main/java/org/apache/iceberg/spark/actions/RemoveDanglingDeletesSparkAction.java: ## @@ -156,7 +162,12 @@ private List findDanglingDeletes() {

[PR] feat: Add Issue Template [iceberg-rust]

2025-02-25 Thread via GitHub
jonathanc-n opened a new pull request, #1008: URL: https://github.com/apache/iceberg-rust/pull/1008 Closes #1007. Adds issue templates for bugs, features, and questions. Some of the ideas are from datafusion as well as iceberg.

Re: [PR] Build: Upgrade to Gradle 8.13 [iceberg]

2025-02-25 Thread via GitHub
nastra merged PR #12398: URL: https://github.com/apache/iceberg/pull/12398

Re: [PR] ci(catalog/rest): initial framework for rest catalog integration tests [iceberg-go]

2025-02-25 Thread via GitHub
kevinjqliu commented on code in PR #310: URL: https://github.com/apache/iceberg-go/pull/310#discussion_r1970116534 ## .github/workflows/go-integration.yml: ## @@ -66,6 +66,7 @@ jobs: run: | go test -tags integration -v -run="^TestScanner" ./table

Re: [PR] fix(table/scanner): Fix nested field scan [iceberg-go]

2025-02-25 Thread via GitHub
kevinjqliu commented on code in PR #311: URL: https://github.com/apache/iceberg-go/pull/311#discussion_r1970105364 ## table/scanner_test.go: ## @@ -458,6 +458,23 @@ func (s *ScannerSuite) TestPartitionedTables() { } } +func (s *ScannerSuite) TestNestedColumns() { +

Re: [PR] Data: Add partition stats writer and reader [iceberg]

2025-02-25 Thread via GitHub
jbonofre commented on code in PR #11216: URL: https://github.com/apache/iceberg/pull/11216#discussion_r1970109787 ## data/src/main/java/org/apache/iceberg/data/PartitionStatsHandler.java: ## @@ -0,0 +1,286 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + *

Re: [PR] Data: Add partition stats writer and reader [iceberg]

2025-02-25 Thread via GitHub
jbonofre commented on code in PR #11216: URL: https://github.com/apache/iceberg/pull/11216#discussion_r1970108920 ## data/src/main/java/org/apache/iceberg/data/PartitionStatsHandler.java: ## @@ -0,0 +1,286 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + *

Re: [PR] Data: Add partition stats writer and reader [iceberg]

2025-02-25 Thread via GitHub
jbonofre commented on PR #11216: URL: https://github.com/apache/iceberg/pull/11216#issuecomment-2682520892 I did the review and it looks good to me. Thanks!

Re: [PR] parallelize `add_files` [iceberg-python]

2025-02-25 Thread via GitHub
Fokko commented on code in PR #1717: URL: https://github.com/apache/iceberg-python/pull/1717#discussion_r1970035297 ## pyiceberg/io/pyarrow.py: ## @@ -2464,38 +2464,37 @@ def _check_pyarrow_schema_compatible( _check_schema_compatible(requested_schema, provided_schema) -

Re: [I] support pyarrow recordbatch as a valid data source for writing Iceberg table [iceberg-python]

2025-02-25 Thread via GitHub
kevinjqliu commented on issue #1004: URL: https://github.com/apache/iceberg-python/issues/1004#issuecomment-2682383747 Looking at the above, the 2 critical parts are: ``` df=final.arrow() ... tbl.append(df) ``` I'm surprised that the `.arrow()` part didn't cause the OOM

Re: [I] metadata count [iceberg-python]

2025-02-25 Thread via GitHub
kevinjqliu commented on issue #1718: URL: https://github.com/apache/iceberg-python/issues/1718#issuecomment-2682370796 Depends on your layout. I think at this point reading metadata files takes a big chunk of the time.

Re: [PR] feat: View Metadata Builder [iceberg-rust]

2025-02-25 Thread via GitHub
Fokko merged PR #908: URL: https://github.com/apache/iceberg-rust/pull/908

[PR] feat: make output file name of write task consistent with java api [iceberg-python]

2025-02-25 Thread via GitHub
sharkdtu opened a new pull request, #1720: URL: https://github.com/apache/iceberg-python/pull/1720 Resolves: #1719

[I] Inconsistent output file name with java api [iceberg-python]

2025-02-25 Thread via GitHub
sharkdtu opened a new issue, #1719: URL: https://github.com/apache/iceberg-python/issues/1719 ### Feature Request / Improvement The output file name of python api: f"0-{self.counter_id}-{self.write_uuid}.{extension}" The output file name of java api: f"{partitionId:05d}-{taskI

Re: [PR] Data: Add partition stats writer and reader [iceberg]

2025-02-25 Thread via GitHub
ajantha-bhat commented on code in PR #11216: URL: https://github.com/apache/iceberg/pull/11216#discussion_r1969601484 ## core/src/main/java/org/apache/iceberg/SetPartitionStatistics.java: ## @@ -36,7 +35,10 @@ public SetPartitionStatistics(TableOperations ops) { @Override

Re: [I] Flink Table Maintenance [iceberg]

2025-02-25 Thread via GitHub
akshat0395 commented on issue #10264: URL: https://github.com/apache/iceberg/issues/10264#issuecomment-2681635549 @pvary Could you please share any new developments regarding this issue?

Re: [PR] Docs: Add Stackable to the Vendors page [iceberg]

2025-02-25 Thread via GitHub
nastra merged PR #12344: URL: https://github.com/apache/iceberg/pull/12344

Re: [PR] parallelize `add_files` [iceberg-python]

2025-02-25 Thread via GitHub
amitgilad3 commented on code in PR #1717: URL: https://github.com/apache/iceberg-python/pull/1717#discussion_r1969511896 ## tests/integration/test_add_files.py: ## @@ -229,6 +229,35 @@ def test_add_files_to_unpartitioned_table_raises_has_field_ids( tbl.add_files(file_p

Re: [PR] fix: fix version of mechete [iceberg-rust]

2025-02-25 Thread via GitHub
liurenjie1024 merged PR #1006: URL: https://github.com/apache/iceberg-rust/pull/1006

Re: [I] metadata count [iceberg-python]

2025-02-25 Thread via GitHub
djouallah commented on issue #1718: URL: https://github.com/apache/iceberg-python/issues/1718#issuecomment-2681223878 it does !! indeed

Re: [PR] Data: Add partition stats writer and reader [iceberg]

2025-02-25 Thread via GitHub
gaborkaszab commented on code in PR #11216: URL: https://github.com/apache/iceberg/pull/11216#discussion_r1969209511 ## core/src/main/java/org/apache/iceberg/SetPartitionStatistics.java: ## @@ -36,7 +35,10 @@ public SetPartitionStatistics(TableOperations ops) { @Override

Re: [I] metadata count [iceberg-python]

2025-02-25 Thread via GitHub
djouallah commented on issue #1718: URL: https://github.com/apache/iceberg-python/issues/1718#issuecomment-2681240314 yeah, I was expecting subsecond to be honest :) https://github.com/user-attachments/assets/fcb78f9f-438e-40ba-a54f-e38a1a70459e

Re: [I] pyiceberg always return false for catalog.table_exists when used with Polaris catalog [iceberg-python]

2025-02-25 Thread via GitHub
djouallah commented on issue #1006: URL: https://github.com/apache/iceberg-python/issues/1006#issuecomment-2681222780 fwiw, it is working as expected, I presume polaris did fix it

Re: [PR] Manifest list encryption [iceberg]

2025-02-25 Thread via GitHub
ggershinsky commented on PR #7770: URL: https://github.com/apache/iceberg/pull/7770#issuecomment-2681222789 This PR is rebased and synced with the spec [patch](https://github.com/apache/iceberg/pull/12162).

Re: [I] metadata count [iceberg-python]

2025-02-25 Thread via GitHub
Fokko commented on issue #1718: URL: https://github.com/apache/iceberg-python/issues/1718#issuecomment-2681215626 @djouallah No worries, please let us know if it works for you since it is a new feature :)

Re: [PR] Spec additions for encryption [iceberg]

2025-02-25 Thread via GitHub
ggershinsky commented on PR #12162: URL: https://github.com/apache/iceberg/pull/12162#issuecomment-2681211642 @rdblue @RussellSpitzer I've implemented the spec changes in e2e code, and everything works OK. This PR is ready for a new review round.

Re: [I] metadata count [iceberg-python]

2025-02-25 Thread via GitHub
djouallah commented on issue #1718: URL: https://github.com/apache/iceberg-python/issues/1718#issuecomment-2681195220 sorry, I should know better :)

Re: [I] metadata count [iceberg-python]

2025-02-25 Thread via GitHub
djouallah closed issue #1718: metadata count URL: https://github.com/apache/iceberg-python/issues/1718

Re: [I] metadata count [iceberg-python]

2025-02-25 Thread via GitHub
Fokko commented on issue #1718: URL: https://github.com/apache/iceberg-python/issues/1718#issuecomment-2681185689 @djouallah There is: `tbl.scan().count()`. This will require installing from GitHub or `0.9.0rc2`, which is currently being voted on.

Re: [I] Update Table Error: UPDATE TABLE is not supported temporarily. [iceberg]

2025-02-25 Thread via GitHub
nastra commented on issue #9960: URL: https://github.com/apache/iceberg/issues/9960#issuecomment-2681166064 I looked into this with a fresh Spark 3.5.4 installation and using Iceberg 1.5.0 against a REST catalog and I'm not able to reproduce this issue. [SPARK-43324](https://github.co

[I] metadata count [iceberg-python]

2025-02-25 Thread via GitHub
djouallah opened a new issue, #1718: URL: https://github.com/apache/iceberg-python/issues/1718 ### Question Is there a way to count without scanning a table, just using Iceberg metadata?

Re: [PR] Core: Add "volatile" to HadoopFileIO#hadoopConf [iceberg]

2025-02-25 Thread via GitHub
okumin commented on PR #12388: URL: https://github.com/apache/iceberg/pull/12388#issuecomment-2681152050 Thank you!

Re: [PR] fix: refine doc for write support [iceberg-rust]

2025-02-25 Thread via GitHub
liurenjie1024 commented on code in PR #999: URL: https://github.com/apache/iceberg-rust/pull/999#discussion_r1968783422 ## crates/iceberg/src/lib.rs: ## @@ -50,6 +50,87 @@ //! Ok(()) //! } //! ``` +//! +//! ## Fast append data to table +//! +//! ```rust, no_run Review Co

Re: [PR] fix: fix version of mechete [iceberg-rust]

2025-02-25 Thread via GitHub
ZENOTME commented on PR #1006: URL: https://github.com/apache/iceberg-rust/pull/1006#issuecomment-2681042108 cc @liurenjie1024 @Xuanwo