Re: [PR] Flink: Fix duplicate data in Flink's upsert writer for format V2 [iceberg]

2024-06-21 Thread via GitHub
zhongqishang commented on code in PR #10526: URL: https://github.com/apache/iceberg/pull/10526#discussion_r1648478721 ## flink/v1.19/flink/src/main/java/org/apache/iceberg/flink/sink/IcebergFilesCommitter.java: ## @@ -259,28 +268,28 @@ private void commitUpToCheckpoint( l

Re: [PR] Flink: Fix duplicate data in Flink's upsert writer for format V2 [iceberg]

2024-06-21 Thread via GitHub
zhongqishang commented on code in PR #10526: URL: https://github.com/apache/iceberg/pull/10526#discussion_r1648452582 ## flink/v1.19/flink/src/test/java/org/apache/iceberg/flink/sink/TestIcebergFilesCommitter.java: ## @@ -887,6 +921,55 @@ public void testCommitTwoCheckpointsInSi

Re: [PR] Kafka Connect: Add kerberos authentication option [iceberg]

2024-06-21 Thread via GitHub
Dawnpool commented on PR #10173: URL: https://github.com/apache/iceberg/pull/10173#issuecomment-2182024037 @bryanck If I got it right, you’re asking if we can add those configs for Kerberos authentication via Hadoop config. I found there are Kerberos options for service principals like

Re: [PR] Flink: Fix duplicate data in Flink's upsert writer for format V2 [iceberg]

2024-06-21 Thread via GitHub
pvary commented on code in PR #10526: URL: https://github.com/apache/iceberg/pull/10526#discussion_r1648420475 ## flink/v1.19/flink/src/test/java/org/apache/iceberg/flink/sink/TestIcebergFilesCommitter.java: ## @@ -887,6 +921,55 @@ public void testCommitTwoCheckpointsInSingleTxn

Re: [PR] Spark: Add SparkSQLProperty to control split-size [iceberg]

2024-06-21 Thread via GitHub
szehon-ho commented on PR #10336: URL: https://github.com/apache/iceberg/pull/10336#issuecomment-2182038718 The problem with this is that it does lead to some ambiguity as to what table the config is applying to (many queries read from several tables, for example) -- This is an automated

Re: [PR] Data: Add a util to read write partition stats [iceberg]

2024-06-21 Thread via GitHub
ajantha-bhat commented on code in PR #10176: URL: https://github.com/apache/iceberg/pull/10176#discussion_r1647514744 ## core/src/main/java/org/apache/iceberg/PartitionStatsUtil.java: ## @@ -0,0 +1,213 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or

Re: [PR] Flink: Fix duplicate data in Flink's upsert writer for format V2 [iceberg]

2024-06-21 Thread via GitHub
pvary commented on code in PR #10526: URL: https://github.com/apache/iceberg/pull/10526#discussion_r1647508483 ## flink/v1.19/flink/src/test/java/org/apache/iceberg/flink/SimpleDataUtil.java: ## @@ -82,6 +83,13 @@ private SimpleDataUtil() {} Types.NestedField.optional

Re: [PR] Flink: Fix duplicate data in Flink's upsert writer for format V2 [iceberg]

2024-06-21 Thread via GitHub
pvary commented on code in PR #10526: URL: https://github.com/apache/iceberg/pull/10526#discussion_r1647501589 ## flink/v1.19/flink/src/main/java/org/apache/iceberg/flink/sink/IcebergFilesCommitter.java: ## @@ -448,8 +452,8 @@ private byte[] writeToManifest(long checkpointId) th

Re: [PR] Core: add a new task-type field to task JSON serialization. add data task JSON serialization implementation. [iceberg]

2024-06-21 Thread via GitHub
pvary commented on code in PR #9728: URL: https://github.com/apache/iceberg/pull/9728#discussion_r1647487236 ## core/src/main/java/org/apache/iceberg/DataTaskParser.java: ## @@ -0,0 +1,81 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contribut

[PR] Make connect compatable with kafka plugin.discovery [iceberg]

2024-06-21 Thread via GitHub
joswlv opened a new pull request, #10536: URL: https://github.com/apache/iceberg/pull/10536 What is [Plugin Discovery](https://kafka.apache.org/documentation.html#connect_plugindiscovery). When connector is used with kafka version above 3.6 with default plugin.discovery worker config

Re: [PR] feat: make BoundPredicate,Datum serializable [iceberg-rust]

2024-06-21 Thread via GitHub
liurenjie1024 commented on PR #406: URL: https://github.com/apache/iceberg-rust/pull/406#issuecomment-2177358785 Thanks @ZENOTME for this effort, and @sdd for review

Re: [I] Filter predicate is duplicated in `TableScanBuilder` and `TableScan` 🤦🏼‍♂️ [iceberg-rust]

2024-06-21 Thread via GitHub
liurenjie1024 commented on issue #407: URL: https://github.com/apache/iceberg-rust/issues/407#issuecomment-2177359460 > @liurenjie1024 can you add the `good first issue` label please? Sure, done.

Re: [PR] Fix code depending on JVM default charset [iceberg]

2024-06-21 Thread via GitHub
liurenjie1024 merged PR #10529: URL: https://github.com/apache/iceberg/pull/10529

Re: [PR] feat: make BoundPredicate,Datum serializable [iceberg-rust]

2024-06-21 Thread via GitHub
liurenjie1024 merged PR #406: URL: https://github.com/apache/iceberg-rust/pull/406

Re: [I] ValueError: Mismatch in fields: ? [iceberg-python]

2024-06-21 Thread via GitHub
kevinjqliu commented on issue #674: URL: https://github.com/apache/iceberg-python/issues/674#issuecomment-2177344307 @Fokko, coming back to this. I think your first comment is already addressed in #807 (thanks @syun64). Your second comment is implemented in #829. Please take a look!

Re: [PR] Support `Table.to_arrow_batch_reader` to return RecordBatchReader instead of a fully materialized Arrow Table [iceberg-python]

2024-06-21 Thread via GitHub
syun64 commented on PR #786: URL: https://github.com/apache/iceberg-python/pull/786#issuecomment-2177312341 @HonahX @Fokko - could I ask your review on this PR?

Re: [PR] Core: add a new task-type field to task JSON serialization. add data task JSON serialization implementation. [iceberg]

2024-06-21 Thread via GitHub
stevenzwu commented on code in PR #9728: URL: https://github.com/apache/iceberg/pull/9728#discussion_r1645245721 ## core/src/main/java/org/apache/iceberg/SnapshotsTable.java: ## @@ -27,7 +28,8 @@ * This does not include snapshots that have been expired using {@link ExpireSnap

Re: [I] Does MERGE INTO operations support hidden partition on timestamp columns? [iceberg]

2024-06-21 Thread via GitHub
github-actions[bot] commented on issue #2765: URL: https://github.com/apache/iceberg/issues/2765#issuecomment-2177284816 This issue has been automatically marked as stale because it has been open for 180 days with no activity. It will be closed in next 14 days if no further activity occurs.

Re: [I] The job of writing iceberg v2 table threw a validException when testing the merge of iceberg v2 table [iceberg]

2024-06-21 Thread via GitHub
github-actions[bot] commented on issue #2773: URL: https://github.com/apache/iceberg/issues/2773#issuecomment-2177284842 This issue has been automatically marked as stale because it has been open for 180 days with no activity. It will be closed in next 14 days if no further activity occurs.

Re: [PR] Compare `Schema` and `StructType` fields irrespective of ordering [iceberg-python]

2024-06-21 Thread via GitHub
kevinjqliu commented on PR #700: URL: https://github.com/apache/iceberg-python/pull/700#issuecomment-2177259331 Closing in favor of #829

Re: [PR] Add PrePlanTable and PlanTable Endpoints to open api spec [iceberg]

2024-06-21 Thread via GitHub
rdblue commented on PR #9695: URL: https://github.com/apache/iceberg/pull/9695#issuecomment-2177254299 > > https://github.com/apache apache deleted a comment from [rahil-c](https://github.com/rahil-c) ([#9695 (comment)](https://github.com/apache/iceberg/pull/9695#event-13139602437)) > >

Re: [I] [DeleteManifest] Making file validation optional [iceberg]

2024-06-21 Thread via GitHub
szehon-ho commented on issue #10535: URL: https://github.com/apache/iceberg/issues/10535#issuecomment-2177253344 Well, that was my thought as its parallel with appendManifest. But now I see https://github.com/apache/iceberg/pull/10396 try to deprecate that, so like to get @Fokko thought he

Re: [PR] OpenAPI: Express server capabilities via /config endpoint [iceberg]

2024-06-21 Thread via GitHub
rdblue commented on PR #9940: URL: https://github.com/apache/iceberg/pull/9940#issuecomment-2177231902 > How is this tag-based approach backwards compatible? @jackye1995, I think what we're realizing is that we need to be able to signal that capabilities are supported. Future addition

Re: [PR] Add PrePlanTable and PlanTable Endpoints to open api spec [iceberg]

2024-06-21 Thread via GitHub
rdblue commented on code in PR #9695: URL: https://github.com/apache/iceberg/pull/9695#discussion_r1645215603 ## open-api/rest-catalog-open-api.yaml: ## @@ -2838,6 +2988,69 @@ components: additionalProperties: type: string +PreplanTableRequest: +

Re: [PR] Flink: handle rescale properly and refactor statistics [iceberg]

2024-06-21 Thread via GitHub
stevenzwu commented on code in PR #10457: URL: https://github.com/apache/iceberg/pull/10457#discussion_r1645212218 ## flink/v1.19/flink/src/main/java/org/apache/iceberg/flink/sink/shuffle/RangePartitioner.java: ## @@ -0,0 +1,110 @@ +/* + * Licensed to the Apache Software Foundat

Re: [PR] Compare `Schema` and `StructType` fields irrespective of ordering [iceberg-python]

2024-06-21 Thread via GitHub
kevinjqliu closed pull request #700: Compare `Schema` and `StructType` fields irrespective of ordering URL: https://github.com/apache/iceberg-python/pull/700

[PR] Bump getdaft from 0.2.27 to 0.2.28 [iceberg-python]

2024-06-21 Thread via GitHub
dependabot[bot] opened a new pull request, #834: URL: https://github.com/apache/iceberg-python/pull/834 Bumps [getdaft](https://github.com/Eventual-Inc/Daft) from 0.2.27 to 0.2.28. Release notes Sourced from [getdaft's releases](https://github.com/Eventual-Inc/Daft/releases).

Re: [PR] OpenAPI: Express server capabilities via /config endpoint [iceberg]

2024-06-21 Thread via GitHub
rdblue commented on code in PR #9940: URL: https://github.com/apache/iceberg/pull/9940#discussion_r1645193788 ## open-api/rest-catalog-open-api.yaml: ## @@ -61,6 +61,14 @@ security: - OAuth2: [catalog] - BearerAuth: [] +tags: Review Comment: @nastra, looks like this

Re: [PR] OpenAPI: Express server capabilities via /config endpoint [iceberg]

2024-06-21 Thread via GitHub
rdblue commented on PR #9940: URL: https://github.com/apache/iceberg/pull/9940#issuecomment-2177235732 I think this is close to a point where we can open a mailing list thread for it, but we still have some open items: * Add a `tables` tag * Remove tags that are not capabilities * L

[PR] Bump mkdocstrings-python from 1.10.3 to 1.10.4 [iceberg-python]

2024-06-21 Thread via GitHub
dependabot[bot] opened a new pull request, #832: URL: https://github.com/apache/iceberg-python/pull/832 Bumps [mkdocstrings-python](https://github.com/mkdocstrings/python) from 1.10.3 to 1.10.4. Release notes Sourced from https://github.com/mkdocstrings/python/releases mkdocstrin

[PR] Bump tenacity from 8.3.0 to 8.4.1 [iceberg-python]

2024-06-21 Thread via GitHub
dependabot[bot] opened a new pull request, #833: URL: https://github.com/apache/iceberg-python/pull/833 Bumps [tenacity](https://github.com/jd/tenacity) from 8.3.0 to 8.4.1. Release notes Sourced from [tenacity's releases](https://github.com/jd/tenacity/releases). tenacity 8.4

[PR] Bump griffe from 0.46.1 to 0.47.0 [iceberg-python]

2024-06-21 Thread via GitHub
dependabot[bot] opened a new pull request, #831: URL: https://github.com/apache/iceberg-python/pull/831 Bumps [griffe](https://github.com/mkdocstrings/griffe) from 0.46.1 to 0.47.0. Release notes Sourced from [griffe's releases](https://github.com/mkdocstrings/griffe/releases).