Re: [PR] Core, Flink, Spark: Verify maintenance actions with DVs [iceberg]

2024-11-11 Thread via GitHub
nastra commented on code in PR #11485: URL: https://github.com/apache/iceberg/pull/11485#discussion_r1837611245 ## core/src/main/java/org/apache/iceberg/BaseContentScanTask.java: ## @@ -82,7 +83,7 @@ public long start() { @Override public long length() { -return file

Re: [PR] Core, Flink, Spark: Verify maintenance actions with DVs [iceberg]

2024-11-11 Thread via GitHub
nastra commented on code in PR #11485: URL: https://github.com/apache/iceberg/pull/11485#discussion_r1837607694 ## core/src/main/java/org/apache/iceberg/BaseFileScanTask.java: ## @@ -176,7 +176,7 @@ public boolean canMerge(ScanTask other) { @Override public SplitScanTa

Re: [PR] Core, Flink, Spark: Verify maintenance actions with DVs [iceberg]

2024-11-11 Thread via GitHub
nastra commented on code in PR #11485: URL: https://github.com/apache/iceberg/pull/11485#discussion_r1837593058 ## spark/v3.5/spark/src/test/java/org/apache/iceberg/spark/actions/TestRemoveOrphanFilesAction3.java: ## @@ -21,38 +21,37 @@ import static org.assertj.core.api.Assert

Re: [I] Adjust the "table_exists" behavior in the REST Catalog [iceberg-python]

2024-11-11 Thread via GitHub
djouallah commented on issue #1018: URL: https://github.com/apache/iceberg-python/issues/1018#issuecomment-2469713958 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To u

Re: [PR] API: Support removeUnusedSpecs in ExpireSnapshots [iceberg]

2024-11-11 Thread via GitHub
advancedxy commented on code in PR #10755: URL: https://github.com/apache/iceberg/pull/10755#discussion_r1837531744 ## api/src/main/java/org/apache/iceberg/ExpireSnapshots.java: ## @@ -118,4 +118,16 @@ public interface ExpireSnapshots extends PendingUpdate<List<Snapshot>> { * @return this

Re: [I] Enhance `catalog.create_table` API to enable creation of table with matching `field_ids` to provided Schema [iceberg-python]

2024-11-11 Thread via GitHub
kevinjqliu commented on issue #1284: URL: https://github.com/apache/iceberg-python/issues/1284#issuecomment-2469668954 looks like in `new_table_metadata` the ID assignment is propagated to the `partition_spec` and `sort_order` as well https://github.com/apache/iceberg-python/pull/13

Re: [I] Enhance `catalog.create_table` API to enable creation of table with matching `field_ids` to provided Schema [iceberg-python]

2024-11-11 Thread via GitHub
kevinjqliu commented on issue #1284: URL: https://github.com/apache/iceberg-python/issues/1284#issuecomment-2469667659 > In general, I think the philosophy should be; that people don't have to worry about field IDs, and this should be hidden away. +1 > In the case of the tab

Re: [PR] Add table statistics [iceberg-python]

2024-11-11 Thread via GitHub
kevinjqliu commented on code in PR #1285: URL: https://github.com/apache/iceberg-python/pull/1285#discussion_r1837496028 ## mkdocs/docs/api.md: ## @@ -1129,6 +1129,28 @@ with table.manage_snapshots() as ms: ms.create_branch(snapshot_id1, "Branch_A").create_tag(snapshot_id2,

Re: [I] Adjust the "table_exists" behavior in the REST Catalog [iceberg-python]

2024-11-11 Thread via GitHub
kevinjqliu commented on issue #1018: URL: https://github.com/apache/iceberg-python/issues/1018#issuecomment-2469629719 something like this, following the `table_exists` implementation https://github.com/apache/iceberg-python/blob/b7942a85dfb74ce3736c5088995e7bd0b996d56b/pyiceberg/catalog

Re: [PR] use KEYS file from `https://downloads.apache.org/iceberg` [iceberg-python]

2024-11-11 Thread via GitHub
kevinjqliu commented on code in PR #1315: URL: https://github.com/apache/iceberg-python/pull/1315#discussion_r1837460391 ## mkdocs/docs/how-to-release.md: ## @@ -82,10 +82,10 @@ export LAST_COMMIT_ID=$(git rev-list ${GIT_TAG} 2> /dev/null | head -n 1) ``` The `-s` option wi

Re: [PR] Parquet: Use native getRowIndexOffset support instead of calculating it [iceberg]

2024-11-11 Thread via GitHub
wypoon commented on PR #11520: URL: https://github.com/apache/iceberg/pull/11520#issuecomment-2469477663 @szehon-ho @flyrain can you please review? cc @huaxingao

Re: [PR] Spark 3.5: Iceberg parser should passthrough unsupported procedure to delegate [iceberg]

2024-11-11 Thread via GitHub
pan3793 commented on PR #11480: URL: https://github.com/apache/iceberg/pull/11480#issuecomment-2469462415 > Do we have an example of someone else using the CALL syntax? @RussellSpitzer for now I have seen the following projects support `CALL` syntax too. - Apache Hudi https://

Re: [PR] use KEYS file from `https://downloads.apache.org/iceberg` [iceberg-python]

2024-11-11 Thread via GitHub
Xuanwo commented on code in PR #1315: URL: https://github.com/apache/iceberg-python/pull/1315#discussion_r1837394989 ## mkdocs/docs/how-to-release.md: ## @@ -82,10 +82,10 @@ export LAST_COMMIT_ID=$(git rev-list ${GIT_TAG} 2> /dev/null | head -n 1) ``` The `-s` option will s

Re: [PR] use KEYS file from `https://downloads.apache.org/iceberg` [iceberg-python]

2024-11-11 Thread via GitHub
kevinjqliu commented on code in PR #1315: URL: https://github.com/apache/iceberg-python/pull/1315#discussion_r1837390392 ## mkdocs/docs/how-to-release.md: ## @@ -82,10 +82,10 @@ export LAST_COMMIT_ID=$(git rev-list ${GIT_TAG} 2> /dev/null | head -n 1) ``` The `-s` option wi

[PR] use KEYS file from `https://downloads.apache.org/iceberg` [iceberg-python]

2024-11-11 Thread via GitHub
kevinjqliu opened a new pull request, #1315: URL: https://github.com/apache/iceberg-python/pull/1315 Deprecate the use of - https://dist.apache.org/repos/dist/dev/iceberg/KEYS - https://dist.apache.org/repos/dist/release/iceberg/KEYS in favor of - https://downloads.apache.or

Re: [I] Why shouldn't we return an `UnboundPartitionSpec` instead? [iceberg-rust]

2024-11-11 Thread via GitHub
liurenjie1024 commented on issue #694: URL: https://github.com/apache/iceberg-rust/issues/694#issuecomment-2469423843 Here is the reason why I accept it: `SchemalessPartitionSpec` is used when we load table metadata and build partition spec from table metadata. Since there is no schem

Re: [I] Why shouldn't we return an `UnboundPartitionSpec` instead? [iceberg-rust]

2024-11-11 Thread via GitHub
liurenjie1024 commented on issue #694: URL: https://github.com/apache/iceberg-rust/issues/694#issuecomment-2469416092 I asked the same question, and here is the answer from @c-thiel: https://github.com/apache/iceberg-rust/pull/645#issuecomment-2431923524 And the reason I this is reason

Re: [I] Adjust the "table_exists" behavior in the REST Catalog [iceberg-python]

2024-11-11 Thread via GitHub
djouallah commented on issue #1018: URL: https://github.com/apache/iceberg-python/issues/1018#issuecomment-2469375855 Sorry, how do I do that?

Re: [PR] Docs: Change to Flink directory for instructions [iceberg]

2024-11-11 Thread via GitHub
liuml07 commented on PR #11031: URL: https://github.com/apache/iceberg/pull/11031#issuecomment-2469391182 @szehon-ho Could you help review? This is a trivial change IMHO. Thanks,

Re: [PR] Parquet: Use native getRowIndexOffset support instead of calculating it [iceberg]

2024-11-11 Thread via GitHub
wypoon commented on PR #10107: URL: https://github.com/apache/iceberg/pull/10107#issuecomment-2469384488 I don't see how to reopen this. I have opened a new PR, https://github.com/apache/iceberg/pull/11520.

Re: [I] Missing `woodstox-core` transitive dependency results in `ClassNotFoundException: com.ctc.wstx.io.InputBootstrapper` in kafka connector distribution artifact [iceberg]

2024-11-11 Thread via GitHub
josepanguera commented on issue #11489: URL: https://github.com/apache/iceberg/issues/11489#issuecomment-2469354170 Fixed with #11516

Re: [I] Missing `woodstox-core` transitive dependency results in `ClassNotFoundException: com.ctc.wstx.io.InputBootstrapper` in kafka connector distribution artifact [iceberg]

2024-11-11 Thread via GitHub
josepanguera closed issue #11489: Missing `woodstox-core` transitive dependency results in `ClassNotFoundException: com.ctc.wstx.io.InputBootstrapper` in kafka connector distribution artifact URL: https://github.com/apache/iceberg/issues/11489

Re: [PR] Materialized View Spec [iceberg]

2024-11-11 Thread via GitHub
github-actions[bot] commented on PR #11041: URL: https://github.com/apache/iceberg/pull/11041#issuecomment-2469334725 This pull request has been marked as stale due to 30 days of inactivity. It will be closed in 1 week if no further activity occurs. If you think that’s incorrect or this pul

Re: [PR] GCP: Add Iceberg Catalog for GCP BigQuery Metastore [iceberg]

2024-11-11 Thread via GitHub
github-actions[bot] commented on PR #11039: URL: https://github.com/apache/iceberg/pull/11039#issuecomment-2469334657 This pull request has been marked as stale due to 30 days of inactivity. It will be closed in 1 week if no further activity occurs. If you think that’s incorrect or this pul

Re: [PR] Docs: Change to Flink directory for instructions [iceberg]

2024-11-11 Thread via GitHub
github-actions[bot] commented on PR #11031: URL: https://github.com/apache/iceberg/pull/11031#issuecomment-2469334644 This pull request has been marked as stale due to 30 days of inactivity. It will be closed in 1 week if no further activity occurs. If you think that’s incorrect or this pul

Re: [PR] Bump Palentir gradle baseline [iceberg]

2024-11-11 Thread via GitHub
github-actions[bot] commented on PR #11012: URL: https://github.com/apache/iceberg/pull/11012#issuecomment-2469334631 This pull request has been marked as stale due to 30 days of inactivity. It will be closed in 1 week if no further activity occurs. If you think that’s incorrect or this pul

Re: [PR] Core, API, Arrow: Type promotion for int/long to string for V3 tables [iceberg]

2024-11-11 Thread via GitHub
github-actions[bot] commented on PR #10991: URL: https://github.com/apache/iceberg/pull/10991#issuecomment-2469334610 This pull request has been marked as stale due to 30 days of inactivity. It will be closed in 1 week if no further activity occurs. If you think that’s incorrect or this pul

Re: [PR] Parquet: page skipping using filtered row groups (vectorized and non-vectorized read) [iceberg]

2024-11-11 Thread via GitHub
github-actions[bot] commented on PR #10399: URL: https://github.com/apache/iceberg/pull/10399#issuecomment-2469334332 This pull request has been closed due to lack of activity. This is not a judgement on the merit of the PR in any way. It is just a way of keeping the PR queue manageable. If

Re: [PR] Use Snapshot's statistics file in SparkScan [iceberg]

2024-11-11 Thread via GitHub
github-actions[bot] commented on PR #11040: URL: https://github.com/apache/iceberg/pull/11040#issuecomment-2469334682 This pull request has been marked as stale due to 30 days of inactivity. It will be closed in 1 week if no further activity occurs. If you think that’s incorrect or this pul

Re: [PR] Build: Bump antlr from 4.9.3 to 4.13.2 [iceberg]

2024-11-11 Thread via GitHub
github-actions[bot] commented on PR #10867: URL: https://github.com/apache/iceberg/pull/10867#issuecomment-2469334527 This pull request has been marked as stale due to 30 days of inactivity. It will be closed in 1 week if no further activity occurs. If you think that’s incorrect or this pul

Re: [PR] add aliyun bundle jar [iceberg]

2024-11-11 Thread via GitHub
github-actions[bot] commented on PR #10971: URL: https://github.com/apache/iceberg/pull/10971#issuecomment-2469334596 This pull request has been marked as stale due to 30 days of inactivity. It will be closed in 1 week if no further activity occurs. If you think that’s incorrect or this pul

Re: [PR] Flink: Iceberg flink multi table sink and runtime table discoverability [iceberg]

2024-11-11 Thread via GitHub
github-actions[bot] closed pull request #10376: Flink: Iceberg flink multi table sink and runtime table discoverability URL: https://github.com/apache/iceberg/pull/10376

Re: [PR] Parquet: page skipping using filtered row groups (vectorized and non-vectorized read) [iceberg]

2024-11-11 Thread via GitHub
github-actions[bot] closed pull request #10399: Parquet: page skipping using filtered row groups (vectorized and non-vectorized read) URL: https://github.com/apache/iceberg/pull/10399

Re: [PR] Remove tokens-endpoint from REST spec [iceberg]

2024-11-11 Thread via GitHub
github-actions[bot] commented on PR #10398: URL: https://github.com/apache/iceberg/pull/10398#issuecomment-2469334309 This pull request has been closed due to lack of activity. This is not a judgement on the merit of the PR in any way. It is just a way of keeping the PR queue manageable. If

Re: [PR] Remove tokens-endpoint from REST spec [iceberg]

2024-11-11 Thread via GitHub
github-actions[bot] closed pull request #10398: Remove tokens-endpoint from REST spec URL: https://github.com/apache/iceberg/pull/10398

Re: [PR] Flink: Iceberg flink multi table sink and runtime table discoverability [iceberg]

2024-11-11 Thread via GitHub
github-actions[bot] commented on PR #10376: URL: https://github.com/apache/iceberg/pull/10376#issuecomment-2469334290 This pull request has been closed due to lack of activity. This is not a judgement on the merit of the PR in any way. It is just a way of keeping the PR queue manageable. If

Re: [I] Allow to configure thread-pool while using Iceberg to read the data (plan files/tasks) [iceberg]

2024-11-11 Thread via GitHub
github-actions[bot] commented on issue #10335: URL: https://github.com/apache/iceberg/issues/10335#issuecomment-2469334260 This issue has been automatically marked as stale because it has been open for 180 days with no activity. It will be closed in next 14 days if no further activity occur

Re: [I] REST Catalog to support custom-catalog name like HMS/Glue [iceberg]

2024-11-11 Thread via GitHub
github-actions[bot] commented on issue #10205: URL: https://github.com/apache/iceberg/issues/10205#issuecomment-2469334211 This issue has been automatically marked as stale because it has been open for 180 days with no activity. It will be closed in next 14 days if no further activity occur

Re: [PR] Kafka Connect: fix Hadoop dependency exclusion [iceberg]

2024-11-11 Thread via GitHub
bryanck merged PR #11516: URL: https://github.com/apache/iceberg/pull/11516

Re: [PR] Kafka Connect: fix Hadoop dependency exclusion [iceberg]

2024-11-11 Thread via GitHub
bryanck commented on PR #11516: URL: https://github.com/apache/iceberg/pull/11516#issuecomment-2469306787 Awesome, thanks @josepanguera for reporting this and testing the fix, and thanks @nastra @Fokko @RussellSpitzer @ajantha-bhat and @singhpk234 for the review!

[PR] Local changes to `verify_rc.sh` [iceberg-go]

2024-11-11 Thread via GitHub
kevinjqliu opened a new pull request, #199: URL: https://github.com/apache/iceberg-go/pull/199 (no comment)

[PR] Bump deptry from 0.20.0 to 0.21.0 [iceberg-python]

2024-11-11 Thread via GitHub
dependabot[bot] opened a new pull request, #1313: URL: https://github.com/apache/iceberg-python/pull/1313 Bumps [deptry](https://github.com/fpgmaas/deptry) from 0.20.0 to 0.21.0. Release notes Sourced from [deptry's releases](https://github.com/fpgmaas/deptry/releases). 0.21.

[PR] Add Option to Configure Max Concurrency for Table Scanning and Planning Operations [iceberg-go]

2024-11-11 Thread via GitHub
glkz opened a new pull request, #198: URL: https://github.com/apache/iceberg-go/pull/198 This PR introduces an option to set `MaxConcurrency` for the `Scanner`, allowing users to control the level of concurrent downloads in Scanner. This configuration can be beneficial for workloads running

[PR] Bump moto from 5.0.18 to 5.0.20 [iceberg-python]

2024-11-11 Thread via GitHub
dependabot[bot] opened a new pull request, #1314: URL: https://github.com/apache/iceberg-python/pull/1314 Bumps [moto](https://github.com/getmoto/moto) from 5.0.18 to 5.0.20. Changelog Sourced from [moto's changelog](https://github.com/getmoto/moto/blob/master/CHANGELOG.md).

Re: [PR] Support WASB scheme in ADLSFileIO [iceberg]

2024-11-11 Thread via GitHub
RussellSpitzer commented on code in PR #11504: URL: https://github.com/apache/iceberg/pull/11504#discussion_r1837256131 ## azure/src/main/java/org/apache/iceberg/azure/AzureProperties.java: ## @@ -93,7 +93,7 @@ public void applyClientConfiguration(String account, DataLakeFileSy

Re: [PR] Spark 3.5: Iceberg parser should passthrough unsupported procedure to delegate [iceberg]

2024-11-11 Thread via GitHub
RussellSpitzer commented on PR #11480: URL: https://github.com/apache/iceberg/pull/11480#issuecomment-2469050720 > Spark allows users to configure multiple extensions and each extension is allowed to inject its own SQL parser, there is a chance that the user configures multiple extensions t

Re: [PR] Core: Support commits with DVs [iceberg]

2024-11-11 Thread via GitHub
aokolnychyi commented on PR #11495: URL: https://github.com/apache/iceberg/pull/11495#issuecomment-2469024997 Thanks for reviewing, @nastra @danielcweeks @jbonofre @amogh-jahagirdar!

Re: [PR] Core: Support commits with DVs [iceberg]

2024-11-11 Thread via GitHub
aokolnychyi merged PR #11495: URL: https://github.com/apache/iceberg/pull/11495

Re: [PR] Spark partial limit push down [iceberg]

2024-11-11 Thread via GitHub
huaxingao commented on code in PR #10943: URL: https://github.com/apache/iceberg/pull/10943#discussion_r1837201737 ## spark/v3.5/spark/src/test/java/org/apache/iceberg/spark/source/TestLimitPushDown.java: ## @@ -0,0 +1,339 @@ +/* + * Licensed to the Apache Software Foundation (A

Re: [PR] Kafka Connect: fix Hadoop dependency exclusion [iceberg]

2024-11-11 Thread via GitHub
bryanck commented on PR #11516: URL: https://github.com/apache/iceberg/pull/11516#issuecomment-2468939132 To some extent we are at their mercy. But in this case, it is really our bug, because we were relying on a library to be on the classpath that we shouldn't have relied on. We got "lucky

Re: [PR] Core, Flink, Spark: Verify maintenance actions with DVs [iceberg]

2024-11-11 Thread via GitHub
aokolnychyi commented on code in PR #11485: URL: https://github.com/apache/iceberg/pull/11485#discussion_r1837135723 ## spark/v3.5/spark/src/test/java/org/apache/iceberg/spark/actions/TestRemoveOrphanFilesAction3.java: ## @@ -21,38 +21,37 @@ import static org.assertj.core.api.A

Re: [PR] Core, Flink, Spark: Verify maintenance actions with DVs [iceberg]

2024-11-11 Thread via GitHub
aokolnychyi commented on code in PR #11485: URL: https://github.com/apache/iceberg/pull/11485#discussion_r1837132399 ## core/src/main/java/org/apache/iceberg/BaseFileScanTask.java: ## @@ -176,7 +176,7 @@ public boolean canMerge(ScanTask other) { @Override public SplitS

[PR] Add @override [iceberg-python]

2024-11-11 Thread via GitHub
cosmastech opened a new pull request, #1312: URL: https://github.com/apache/iceberg-python/pull/1312 Resolves 1310

Re: [PR] Kafka Connect: fix Hadoop dependency exclusion [iceberg]

2024-11-11 Thread via GitHub
RussellSpitzer commented on PR #11516: URL: https://github.com/apache/iceberg/pull/11516#issuecomment-2468918890 > The full classpath depends on the Connect framework being used to run the connector (MSK, Strimzi, Confluent, etc), so in this case we need to deploy to MSK to really test this

Re: [PR] Core: Support commits with DVs [iceberg]

2024-11-11 Thread via GitHub
danielcweeks commented on code in PR #11495: URL: https://github.com/apache/iceberg/pull/11495#discussion_r1837127586 ## core/src/main/java/org/apache/iceberg/MergingSnapshotProducer.java: ## @@ -268,9 +274,32 @@ private void add(PendingDeleteFile file) { if (deleteFiles.ad

Re: [PR] Core, Flink, Spark: Verify maintenance actions with DVs [iceberg]

2024-11-11 Thread via GitHub
aokolnychyi commented on code in PR #11485: URL: https://github.com/apache/iceberg/pull/11485#discussion_r1837123187 ## core/src/main/java/org/apache/iceberg/BaseContentScanTask.java: ## @@ -82,7 +83,7 @@ public long start() { @Override public long length() { -return

Re: [PR] Core: Support commits with DVs [iceberg]

2024-11-11 Thread via GitHub
aokolnychyi commented on code in PR #11495: URL: https://github.com/apache/iceberg/pull/11495#discussion_r1837117968 ## spark/v3.5/spark/src/test/java/org/apache/iceberg/spark/actions/TestRewriteManifestsAction.java: ## @@ -956,6 +956,62 @@ public void testRewriteLargeDeleteMan

Re: [PR] Core: Support commits with DVs [iceberg]

2024-11-11 Thread via GitHub
aokolnychyi commented on code in PR #11495: URL: https://github.com/apache/iceberg/pull/11495#discussion_r1837104323 ## core/src/test/java/org/apache/iceberg/TestBase.java: ## @@ -643,6 +666,22 @@ protected DataFile newDataFile(String partitionPath) { .build(); } +

Re: [PR] Add REST Catalog tests to Spark 3.5 integration test [iceberg]

2024-11-11 Thread via GitHub
haizhou-zhao commented on PR #11093: URL: https://github.com/apache/iceberg/pull/11093#issuecomment-2468874987 @danielcweeks All comments adopted, feel free to take another look

Re: [PR] Core: Support commits with DVs [iceberg]

2024-11-11 Thread via GitHub
aokolnychyi commented on code in PR #11495: URL: https://github.com/apache/iceberg/pull/11495#discussion_r1837092881 ## core/src/main/java/org/apache/iceberg/MergingSnapshotProducer.java: ## @@ -268,9 +274,32 @@ private void add(PendingDeleteFile file) { if (deleteFiles.add

Re: [PR] Kafka Connect: fix Hadoop dependency exclusion [iceberg]

2024-11-11 Thread via GitHub
bryanck commented on PR #11516: URL: https://github.com/apache/iceberg/pull/11516#issuecomment-2468857385 The full classpath depends on the Connect framework being used to run the connector (MSK, Strimzi, Confluent, etc), so in this case we need to deploy to MSK to really test this. Josep w

Re: [PR] Core: Support commits with DVs [iceberg]

2024-11-11 Thread via GitHub
aokolnychyi commented on code in PR #11495: URL: https://github.com/apache/iceberg/pull/11495#discussion_r1837091098 ## spark/v3.5/spark/src/test/java/org/apache/iceberg/spark/actions/TestRewriteManifestsAction.java: ## @@ -956,6 +956,62 @@ public void testRewriteLargeDeleteMan

Re: [PR] collect min-max from manifest(Draft) [iceberg]

2024-11-11 Thread via GitHub
saitharun15 closed pull request #11519: collect min-max from manifest(Draft) URL: https://github.com/apache/iceberg/pull/11519

Re: [PR] Spark 3.5: Iceberg parser should passthrough unsupported procedure to delegate [iceberg]

2024-11-11 Thread via GitHub
pan3793 commented on code in PR #11480: URL: https://github.com/apache/iceberg/pull/11480#discussion_r1837078343 ## spark/v3.5/spark-extensions/src/main/scala/org/apache/spark/sql/catalyst/parser/extensions/IcebergSparkSqlExtensionsParser.scala: ## @@ -151,6 +155,11 @@ class Ice

Re: [PR] Kafka Connect: fix Hadoop dependency exclusion [iceberg]

2024-11-11 Thread via GitHub
RussellSpitzer commented on PR #11516: URL: https://github.com/apache/iceberg/pull/11516#issuecomment-2468821496 I'm a little lost on this, why is this difficult on MSK? If we don't know the classpath is there ever a way we can really test things?

Re: [PR] Core: Support commits with DVs [iceberg]

2024-11-11 Thread via GitHub
danielcweeks commented on code in PR #11495: URL: https://github.com/apache/iceberg/pull/11495#discussion_r1837076611 ## spark/v3.5/spark/src/test/java/org/apache/iceberg/spark/actions/TestRewriteManifestsAction.java: ## @@ -956,6 +956,62 @@ public void testRewriteLargeDeleteMa

Re: [PR] Core,Open-API: Don't expose the `last-column-id` [iceberg]

2024-11-11 Thread via GitHub
RussellSpitzer commented on PR #11514: URL: https://github.com/apache/iceberg/pull/11514#issuecomment-2468828703 I did not understand why this was there before. Do we have anyone or any implementations which benefit from having it there?

Re: [PR] Core: Support commits with DVs [iceberg]

2024-11-11 Thread via GitHub
danielcweeks commented on code in PR #11495: URL: https://github.com/apache/iceberg/pull/11495#discussion_r1837051720 ## core/src/main/java/org/apache/iceberg/MergingSnapshotProducer.java: ## @@ -268,9 +274,32 @@ private void add(PendingDeleteFile file) { if (deleteFiles.ad

Re: [PR] Spark 3.5: Iceberg parser should passthrough unsupported procedure to delegate [iceberg]

2024-11-11 Thread via GitHub
RussellSpitzer commented on code in PR #11480: URL: https://github.com/apache/iceberg/pull/11480#discussion_r1837044108 ## spark/v3.5/spark-extensions/src/main/scala/org/apache/spark/sql/catalyst/parser/extensions/IcebergSparkSqlExtensionsParser.scala: ## @@ -151,6 +155,11 @@ cl

Re: [PR] Core: Change Delete granularity to file for new tables [iceberg]

2024-11-11 Thread via GitHub
amogh-jahagirdar commented on code in PR #11478: URL: https://github.com/apache/iceberg/pull/11478#discussion_r1837005354 ## spark/v3.5/spark-extensions/src/test/java/org/apache/iceberg/spark/extensions/TestRewritePositionDeleteFilesProcedure.java: ## @@ -49,7 +49,7 @@ private v

Re: [PR] Parquet: Use native getRowIndexOffset support instead of calculating it [iceberg]

2024-11-11 Thread via GitHub
wypoon commented on PR #10107: URL: https://github.com/apache/iceberg/pull/10107#issuecomment-2468749053 @huaxingao yes, I'll be happy to reopen it. The versions in the deprecation comments need to be updated. I'll update the PR soon.

[I] Update `KEYS` file reference [iceberg-python]

2024-11-11 Thread via GitHub
kevinjqliu opened a new issue, #1311: URL: https://github.com/apache/iceberg-python/issues/1311 ### Apache Iceberg version None ### Please describe the bug 🐞 https://lists.apache.org/thread/8j41w4y2jx6r3ybj0o82bfyn0npmhgx2 Update references to `https://dist.apa

Re: [PR] Kafka Connect: fix Hadoop dependency exclusion [iceberg]

2024-11-11 Thread via GitHub
josepanguera commented on PR #11516: URL: https://github.com/apache/iceberg/pull/11516#issuecomment-2468738705 > The unfortunate fact is many Hadoop/Hive dependencies have security vulnerabilities, which is why we have 2 connector distributions (one with Hive, one without). I pushed an upda

Re: [PR] Kafka Connect: fix Hadoop dependency exclusion [iceberg]

2024-11-11 Thread via GitHub
bryanck commented on PR #11516: URL: https://github.com/apache/iceberg/pull/11516#issuecomment-2468734643 The unfortunate fact is many Hadoop/Hive dependencies have security vulnerabilities, which is why we have 2 connector distributions (one with Hive, one without). I pushed an update to f

Re: [I] Add `@typing.override` to functions [iceberg-python]

2024-11-11 Thread via GitHub
kevinjqliu commented on issue #1310: URL: https://github.com/apache/iceberg-python/issues/1310#issuecomment-2468725040 @cosmastech I don't think we need to do the override. For Python versions less than 3.12, I think this will just be a no-op https://docs.python.org/3/library/typing.html#
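For context on the comment above: `typing.override` is part of the standard library from Python 3.12 onward, and the decorator is not enforced at runtime (it is a hint for static type checkers). The following is only a minimal sketch of one way a codebase could use the decorator across interpreter versions; the fallback function and the class names `Catalog` and `RestCatalog` are illustrative assumptions, not code from the issue or from pyiceberg.

```python
try:
    # Available in the standard library on Python 3.12 and newer.
    from typing import override
except ImportError:
    # Assumed fallback for older interpreters: a decorator that
    # returns the method unchanged, so it has no runtime effect.
    def override(method):
        return method


class Catalog:
    """Hypothetical base class, used only for illustration."""

    def name(self) -> str:
        return "base"


class RestCatalog(Catalog):
    """Hypothetical subclass marking an overridden method."""

    @override
    def name(self) -> str:
        return "rest"
```

Either way the decorated method behaves like an ordinary method; only a static type checker would flag a mismatch with the base class.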

Re: [I] Adjust the "table_exists" behavior in the REST Catalog [iceberg-python]

2024-11-11 Thread via GitHub
kevinjqliu commented on issue #1018: URL: https://github.com/apache/iceberg-python/issues/1018#issuecomment-2468721923 I think the issue here is that `catalog.table_exists(tbl)` returns `False` when the table already exists. @djouallah can you check what the response/status code
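To illustrate why the status code matters in the comment above, here is a minimal sketch of how an existence check might map an HTTP status code to a boolean. It assumes the check is an HTTP request whose success status may be either 200 or 204 depending on the server; the helper name `table_exists_from_status` is hypothetical and is not the pyiceberg implementation.

```python
def table_exists_from_status(status_code: int) -> bool:
    """Map an HTTP status code from a table-exists request to a boolean.

    Assumption: some servers answer a successful check with 200 and
    others with 204, so both are treated as "the table exists".
    """
    if status_code in (200, 204):
        return True
    if status_code == 404:
        return False
    # Anything else (auth failure, server error) is not an answer to
    # "does the table exist?", so surface it instead of guessing.
    raise RuntimeError(f"Unexpected status code: {status_code}")
```

A check that accepts only one of the two success codes would report `False` for an existing table whenever the server sends the other one, which is the kind of symptom described in the issue.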

Re: [PR] Spark 3.5: Iceberg parser should passthrough unsupported procedure to delegate [iceberg]

2024-11-11 Thread via GitHub
pan3793 commented on code in PR #11480: URL: https://github.com/apache/iceberg/pull/11480#discussion_r1836983760 ## spark/v3.5/spark/src/main/java/org/apache/iceberg/spark/procedures/SparkProcedures.java: ## @@ -37,6 +38,10 @@ public static ProcedureBuilder newBuilder(String nam

Re: [PR] Kafka Connect: fix Hadoop dependency exclusion [iceberg]

2024-11-11 Thread via GitHub
bryanck commented on code in PR #11516: URL: https://github.com/apache/iceberg/pull/11516#discussion_r1837002014 ## kafka-connect/build.gradle: ## @@ -96,7 +96,6 @@ project(':iceberg-kafka-connect:iceberg-kafka-connect-runtime') { exclude group: 'org.slf4j' exclud

Re: [PR] Kafka Connect: fix Hadoop dependency exclusion [iceberg]

2024-11-11 Thread via GitHub
bryanck commented on PR #11516: URL: https://github.com/apache/iceberg/pull/11516#issuecomment-2468679671 Thanks for testing @josepanguera, any chance you could test one more time? I removed another exclude. If that doesn't work I'll debug on MSK myself.

Re: [PR] Spark 3.5: Iceberg parser should passthrough unsupported procedure to delegate [iceberg]

2024-11-11 Thread via GitHub
pan3793 commented on code in PR #11480: URL: https://github.com/apache/iceberg/pull/11480#discussion_r1836979682 ## spark/v3.5/spark-extensions/src/main/scala/org/apache/spark/sql/catalyst/parser/extensions/IcebergSparkSqlExtensionsParser.scala: ## @@ -151,6 +155,11 @@ class Ice

Re: [PR] Spark: Fix typo in spark ddl document [iceberg]

2024-11-11 Thread via GitHub
hantangwangd commented on PR #11517: URL: https://github.com/apache/iceberg/pull/11517#issuecomment-2468674770 My pleasure! @Fokko

Re: [PR] Spark 3.5: Iceberg parser should passthrough unsupported procedure to delegate [iceberg]

2024-11-11 Thread via GitHub
pan3793 commented on code in PR #11480: URL: https://github.com/apache/iceberg/pull/11480#discussion_r1836985076 ## spark/v3.5/spark-extensions/src/test/java/org/apache/iceberg/spark/extensions/TestCallStatementParser.java: ## @@ -68,11 +68,29 @@ public static void stopSpark() {

Re: [PR] Kafka Connect: fix Hadoop dependency exclusion [iceberg]

2024-11-11 Thread via GitHub
josepanguera commented on PR #11516: URL: https://github.com/apache/iceberg/pull/11516#issuecomment-2468671617 I'm including my build scripts here, which make it work for now, in case they are useful. `Makefile` ```Makefile SHELL=/bin/bash SCHEMA_REGISTRY_CONVERTER_V

Re: [PR] Ignore schema merge updates from long -> int [iceberg]

2024-11-11 Thread via GitHub
RussellSpitzer commented on PR #11419: URL: https://github.com/apache/iceberg/pull/11419#issuecomment-2468649457 I think we may need to change the name from compatibleType ... I think the check here is more like "ignoreTypeChange" since we basically let invalid type changes go through t

Re: [PR] Ignore schema merge updates from long -> int [iceberg]

2024-11-11 Thread via GitHub
RussellSpitzer commented on PR #11419: URL: https://github.com/apache/iceberg/pull/11419#issuecomment-2468641393 Didn't understand your comment @rocco408. We were just missing that any NonPrimitive -> NonPrimitive change is considered compatible. https://github.com/rocco408/iceberg/pull/3/

Re: [PR] Spark: Fix typo in spark ddl document [iceberg]

2024-11-11 Thread via GitHub
Fokko merged PR #11517: URL: https://github.com/apache/iceberg/pull/11517

Re: [PR] Kafka Connect: fix Hadoop dependency exclusion [iceberg]

2024-11-11 Thread via GitHub
ajantha-bhat commented on code in PR #11516: URL: https://github.com/apache/iceberg/pull/11516#discussion_r1836965419 ## kafka-connect/build.gradle: ## @@ -96,7 +96,6 @@ project(':iceberg-kafka-connect:iceberg-kafka-connect-runtime') { exclude group: 'org.slf4j' e

Re: [PR] Spark: Fix typo in spark ddl document [iceberg]

2024-11-11 Thread via GitHub
Fokko commented on PR #11517: URL: https://github.com/apache/iceberg/pull/11517#issuecomment-2468631383 Thanks for spotting this @hantangwangd

Re: [I] Serialization of the org.apache.iceberg.io.WriteResult class. [iceberg]

2024-11-11 Thread via GitHub
pvary commented on issue #10710: URL: https://github.com/apache/iceberg/issues/10710#issuecomment-2468632001 I don't think that the config solution would be something which we would like to support in the Iceberg connector. Setting it when creating the operators and connecting the during

Re: [I] Why shouldn't we return an `UnboundPartitionSpec` instead? [iceberg-rust]

2024-11-11 Thread via GitHub
Fokko commented on issue #694: URL: https://github.com/apache/iceberg-rust/issues/694#issuecomment-2468628966 @Xuanwo Thanks, appreciate it. I'm also preparing a PR for Java that illustrates the solution more clearly.

Re: [PR] Test out Apache Parquet 1.14.4 RC2 [iceberg]

2024-11-11 Thread via GitHub
singhpk234 commented on code in PR #11502: URL: https://github.com/apache/iceberg/pull/11502#discussion_r1836956448 ## flink/v1.18/flink/src/test/java/org/apache/iceberg/flink/source/TestMetadataTableReadableMetrics.java: ## @@ -217,27 +217,27 @@ public void testPrimitiveColumns

Re: [PR] Kafka Connect: fix Hadoop dependency exclusion [iceberg]

2024-11-11 Thread via GitHub
bryanck commented on PR #11516: URL: https://github.com/apache/iceberg/pull/11516#issuecomment-2468600064 I asked the user to test this PR on MSK (where the issue happens), so just waiting on that.

Re: [PR] Support WASB scheme in ADLSFileIO [iceberg]

2024-11-11 Thread via GitHub
mrcnc commented on code in PR #11504: URL: https://github.com/apache/iceberg/pull/11504#discussion_r1836917624 ## azure/src/main/java/org/apache/iceberg/azure/AzureProperties.java: ## @@ -93,7 +93,7 @@ public void applyClientConfiguration(String account, DataLakeFileSystemClien

[PR] Kafka Connect: fix Hadoop dependency exclusion [iceberg]

2024-11-11 Thread via GitHub
bryanck opened a new pull request, #11516: URL: https://github.com/apache/iceberg/pull/11516 This PR removes the exclusion of group `com.fasterxml.woodstox` from the Hadoop transitive dependencies when building the Kafka Connect distribution, as the library is needed to load Hadoop's `Confi
