Re: [I] Support validate and related logic in Snapshot [iceberg-rust]

2025-05-16 Thread via GitHub
CTTY commented on issue #1344: URL: https://github.com/apache/iceberg-rust/issues/1344#issuecomment-2888123707 predicate/residual evaluator and manifest filter are also important (to reduce conflict rate) but they could be completed separately, wdyt? @Xuanwo @liurenjie1024 @jonathanc-n -

Re: [I] Support RowDeltaAction [iceberg-rust]

2025-05-16 Thread via GitHub
CTTY commented on issue #1104: URL: https://github.com/apache/iceberg-rust/issues/1104#issuecomment-2888118740 I've created https://github.com/apache/iceberg-rust/issues/1344 to add validate logic, which should be a prerequisite of this issue -- This is an automated message from the Apach

[I] Support validate and related logic in Snapshot [iceberg-rust]

2025-05-16 Thread via GitHub
CTTY opened a new issue, #1344: URL: https://github.com/apache/iceberg-rust/issues/1344 ### Is your feature request related to a problem or challenge? To detect conflicts, we need to have `validate` and related logic to scan manifests and validate against history. References:

Re: [PR] Spark: Structured Streaming read limit support follow-up [iceberg]

2025-05-16 Thread via GitHub
huaxingao commented on PR #12260: URL: https://github.com/apache/iceberg/pull/12260#issuecomment-2888092503 @wypoon Thanks for the PR - the changes look good to me. I have a question about the tests. It seems that a test like testReadStreamWithMaxRows2() would pass with both the original im

Re: [PR] Core: Enhance remove snapshots efficiency by executing them in bulk [iceberg]

2025-05-16 Thread via GitHub
sfc-gh-aixu commented on code in PR #12670: URL: https://github.com/apache/iceberg/pull/12670#discussion_r209396 ## core/src/main/java/org/apache/iceberg/TableMetadata.java: ## @@ -1450,6 +1452,10 @@ private Builder rewriteSnapshotsInternal(Collection idsToRemove, boolean s

Re: [PR] feat: Introduce snapshot summary properties [iceberg-rust]

2025-05-16 Thread via GitHub
jonathanc-n commented on code in PR #1336: URL: https://github.com/apache/iceberg-rust/pull/1336#discussion_r2093880705 ## crates/iceberg/src/transaction/mod.rs: ## @@ -128,6 +128,18 @@ impl<'a> Transaction<'a> { Ok(self) } +/// Add snapshot summary propertie

Re: [PR] feat: Add `IndexByName` and `IndexById` to Namemapping [iceberg-rust]

2025-05-16 Thread via GitHub
jonathanc-n commented on code in PR #1299: URL: https://github.com/apache/iceberg-rust/pull/1299#discussion_r2093873639 ## crates/iceberg/src/spec/name_mapping/mod.rs: ## @@ -84,9 +133,187 @@ impl MappedField { } } +/// Recursively visits the entire name mapping using vi

Re: [PR] Flink 2.0: Remove the JUnit4 dependency [iceberg]

2025-05-16 Thread via GitHub
JeonDaehong commented on code in PR #13021: URL: https://github.com/apache/iceberg/pull/13021#discussion_r2093858283 ## flink/v2.0/flink/src/test/java/org/apache/iceberg/flink/source/TestIcebergSourceFailover.java: ## @@ -314,20 +313,6 @@ private void createBoundedStreams(Strea

Re: [PR] Core: Remove deprecated MetricsConfig.fromProperties method [iceberg]

2025-05-16 Thread via GitHub
xxubai commented on code in PR #13056: URL: https://github.com/apache/iceberg/pull/13056#discussion_r2093832048 ## core/src/main/java/org/apache/iceberg/MetricsConfig.java: ## @@ -212,6 +212,26 @@ public void validateReferencedColumns(Schema schema) { } } + /** + *

Re: [PR] Spark-3.5: Add spark action to compute partition stats [iceberg]

2025-05-16 Thread via GitHub
ajantha-bhat commented on code in PR #12450: URL: https://github.com/apache/iceberg/pull/12450#discussion_r2093813728 ## api/src/main/java/org/apache/iceberg/actions/ComputePartitionStats.java: ## @@ -0,0 +1,43 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under on

Re: [PR] Flink:Backport fix watermark no pass in TaskResultAggregator to Flink 1.19 and 1.20 [iceberg]

2025-05-16 Thread via GitHub
Guosmilesmile commented on PR #13085: URL: https://github.com/apache/iceberg/pull/13085#issuecomment-2887898862 @pvary Hi Peter, please take a look and review it if you have time. Thanks!

Re: [PR] Core, Data: File Format API interfaces [iceberg]

2025-05-16 Thread via GitHub
stevenzwu commented on code in PR #12774: URL: https://github.com/apache/iceberg/pull/12774#discussion_r2093779618 ## data/src/main/java/org/apache/iceberg/data/FileWriteBuilderBase.java: ## @@ -0,0 +1,54 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + *

Re: [PR] Flink: Fix npe in TaskResultAggregator when job recovery [iceberg]

2025-05-16 Thread via GitHub
Guosmilesmile commented on PR #13086: URL: https://github.com/apache/iceberg/pull/13086#issuecomment-2887897972 Here is the error: `java.lang.NullPointerException at org.apache.iceberg.flink.maintenance.operator.TaskResultAggregator.processWatermark(TaskResultAggregator.java:90)

[PR] Flink: Fix npe in TaskResultAggregator when job recovery [iceberg]

2025-05-16 Thread via GitHub
Guosmilesmile opened a new pull request, #13086: URL: https://github.com/apache/iceberg/pull/13086 Now the `startTime` in `TaskResultAggregator` is a transient `Long` that is initialized when the object is created, not in the `open` method. When the job recovers from a failure, it becomes n

Re: [PR] Core: use ReachableFileCleanup when table has discontinuous snapshots [iceberg]

2025-05-16 Thread via GitHub
github-actions[bot] closed pull request #12261: Core: use ReachableFileCleanup when table has discontinuous snapshots URL: https://github.com/apache/iceberg/pull/12261

Re: [PR] add_docs_and_backport_max_files_rewrite_option [iceberg]

2025-05-16 Thread via GitHub
coderfender commented on PR #13082: URL: https://github.com/apache/iceberg/pull/13082#issuecomment-2887896407 @pvary, @RussellSpitzer, please take a look whenever you get a chance.

Re: [PR] Core: use ReachableFileCleanup when table has discontinuous snapshots [iceberg]

2025-05-16 Thread via GitHub
github-actions[bot] commented on PR #12261: URL: https://github.com/apache/iceberg/pull/12261#issuecomment-2887876916 This pull request has been closed due to lack of activity. This is not a judgement on the merit of the PR in any way. It is just a way of keeping the PR queue manageable. If

Re: [PR] Core, Data: File Format API interfaces [iceberg]

2025-05-16 Thread via GitHub
stevenzwu commented on code in PR #12774: URL: https://github.com/apache/iceberg/pull/12774#discussion_r2093626096 ## core/src/main/java/org/apache/iceberg/io/ObjectModel.java: ## @@ -0,0 +1,106 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more co

Re: [PR] Core, Data: File Format API interfaces [iceberg]

2025-05-16 Thread via GitHub
stevenzwu commented on code in PR #12774: URL: https://github.com/apache/iceberg/pull/12774#discussion_r2093651265 ## core/src/main/java/org/apache/iceberg/io/AppenderBuilder.java: ## @@ -0,0 +1,109 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or mor

Re: [PR] GCP: Add Iceberg Catalog for GCP BigQuery Metastore [iceberg]

2025-05-16 Thread via GitHub
github-actions[bot] commented on PR #11039: URL: https://github.com/apache/iceberg/pull/11039#issuecomment-2887876741 This pull request has been marked as stale due to 30 days of inactivity. It will be closed in 1 week if no further activity occurs. If you think that's incorrect or this pul

Re: [PR] Flink: Revise the display of the task name in TableMaintenance to show the specific task name. [iceberg]

2025-05-16 Thread via GitHub
Guosmilesmile commented on code in PR #13024: URL: https://github.com/apache/iceberg/pull/13024#discussion_r2093790184 ## flink/v2.0/flink/src/main/java/org/apache/iceberg/flink/maintenance/api/ExpireSnapshots.java: ## @@ -47,6 +47,11 @@ public static class Builder extends Main

Re: [I] Support to optimize, analyze tables and expire snapshots, remove orphan files [iceberg-python]

2025-05-16 Thread via GitHub
github-actions[bot] commented on issue #31: URL: https://github.com/apache/iceberg-python/issues/31#issuecomment-2887878869 This issue has been automatically marked as stale because it has been open for 180 days with no activity. It will be closed in next 14 days if no further activity occu

Re: [PR] Added support for evolving the partition of the table [iceberg]

2025-05-16 Thread via GitHub
github-actions[bot] closed pull request #12723: Added support for evolving the partition of the table URL: https://github.com/apache/iceberg/pull/12723

Re: [PR] Added support for evolving the partition of the table [iceberg]

2025-05-16 Thread via GitHub
github-actions[bot] commented on PR #12723: URL: https://github.com/apache/iceberg/pull/12723#issuecomment-2887876955 This pull request has been closed due to lack of activity. This is not a judgement on the merit of the PR in any way. It is just a way of keeping the PR queue manageable. If

Re: [PR] Ignore partition fields that are dropped from the current-schema [iceberg]

2025-05-16 Thread via GitHub
github-actions[bot] commented on PR #11868: URL: https://github.com/apache/iceberg/pull/11868#issuecomment-2887876896 This pull request has been marked as stale due to 30 days of inactivity. It will be closed in 1 week if no further activity occurs. If you think that's incorrect or this pul

Re: [I] Remove Dependency on Hadoop's Filesystem Class from Remove Orphan Files [iceberg]

2025-05-16 Thread via GitHub
github-actions[bot] commented on issue #11541: URL: https://github.com/apache/iceberg/issues/11541#issuecomment-2887876830 This issue has been automatically marked as stale because it has been open for 180 days with no activity. It will be closed in next 14 days if no further activity occur

[PR] Flink:Backport fix watermark no pass in TaskResultAggregator to Flink 1.19 and 1.20 [iceberg]

2025-05-16 Thread via GitHub
Guosmilesmile opened a new pull request, #13085: URL: https://github.com/apache/iceberg/pull/13085 This PR is a backport of https://github.com/apache/iceberg/pull/13069. It is a clean PR.

Re: [PR] Spark-3.5: Add spark action to compute partition stats [iceberg]

2025-05-16 Thread via GitHub
amogh-jahagirdar commented on code in PR #12450: URL: https://github.com/apache/iceberg/pull/12450#discussion_r2093745842 ## api/src/main/java/org/apache/iceberg/actions/ComputePartitionStats.java: ## @@ -0,0 +1,43 @@ +/* + * Licensed to the Apache Software Foundation (ASF) unde

Re: [PR] Prevent driver from overwhelimg during orphan file removal [iceberg]

2025-05-16 Thread via GitHub
karuppayya commented on PR #13084: URL: https://github.com/apache/iceberg/pull/13084#issuecomment-2887786202 @RussellSpitzer @aokolnychyi @flyrain @szehon-ho @huaxingao @flyrain @anuragmantri when you get a chance

Re: [PR] Feat: replace sort order [iceberg-python]

2025-05-16 Thread via GitHub
Fokko commented on code in PR #1500: URL: https://github.com/apache/iceberg-python/pull/1500#discussion_r2093695075 ## pyiceberg/table/__init__.py: ## @@ -1113,6 +1128,14 @@ def update_schema(self, allow_incompatible_changes: bool = False, case_sensitive name_mappi

Re: [PR] Feat: replace sort order [iceberg-python]

2025-05-16 Thread via GitHub
Fokko commented on PR #1500: URL: https://github.com/apache/iceberg-python/pull/1500#issuecomment-2887701178 Hey @JasperHG90 sorry for the late reply, the notification got buried in my mailbox. I'll definitely review this, it would be great to get this in 👍

Re: [PR] Rewrite manifests [iceberg-python]

2025-05-16 Thread via GitHub
Fokko commented on code in PR #1661: URL: https://github.com/apache/iceberg-python/pull/1661#discussion_r2093688570 ## pyiceberg/table/update/snapshot.py: ## @@ -524,6 +531,153 @@ def _process_manifests(self, manifests: List[ManifestFile]) -> List[ManifestFile return

Re: [PR] Rewrite manifests [iceberg-python]

2025-05-16 Thread via GitHub
Fokko commented on PR #1661: URL: https://github.com/apache/iceberg-python/pull/1661#issuecomment-2887691980 Looks like the CI is sad 😞 ``` tests/integration/test_writes/test_rewrite_manifests.py:154: error: "rewrite_manifests" of "Table" does not return a value (it only ever retu

Re: [PR] Added ExpireSnapshots Feature [iceberg-python]

2025-05-16 Thread via GitHub
Fokko commented on PR #1880: URL: https://github.com/apache/iceberg-python/pull/1880#issuecomment-2887674853 @ForeverAngry Sorry for the late reply, it looks like there is a test failing now 👀

Re: [PR] fix: correct `UUIDType` partition representation for `BucketTransform` [iceberg-python]

2025-05-16 Thread via GitHub
Fokko commented on PR #2003: URL: https://github.com/apache/iceberg-python/pull/2003#issuecomment-2887624900 @DinGo4DEV Good news, it looks like this is fixed in the next release of Arrow: https://github.com/apache/arrow/pull/45866

[PR] backport_max_files_rewrite_option [iceberg]

2025-05-16 Thread via GitHub
coderfender opened a new pull request, #13082: URL: https://github.com/apache/iceberg/pull/13082 Backport changes to Flink 1.19, 2.0 and Spark 4.0. Original PR: https://github.com/apache/iceberg/pull/12824

Re: [PR] feat: delete orphaned files [iceberg-python]

2025-05-16 Thread via GitHub
Fokko commented on code in PR #1958: URL: https://github.com/apache/iceberg-python/pull/1958#discussion_r2093644910 ## pyiceberg/table/inspect.py: ## @@ -657,3 +665,62 @@ def all_manifests(self) -> "pa.Table": lambda args: self._generate_manifests_table(*args), [(sn

Re: [PR] feat: delete orphaned files [iceberg-python]

2025-05-16 Thread via GitHub
Fokko commented on code in PR #1958: URL: https://github.com/apache/iceberg-python/pull/1958#discussion_r2093643789 ## pyiceberg/table/inspect.py: ## @@ -678,6 +685,28 @@ def all_manifests(self) -> "pa.Table": ) return pa.concat_tables(manifests_by_snapshots)

Re: [PR] 1.9.1 release fixes [iceberg]

2025-05-16 Thread via GitHub
RussellSpitzer commented on PR #13081: URL: https://github.com/apache/iceberg/pull/13081#issuecomment-2887567333 Merged. Thanks @nastra, @amogh-jahagirdar, @stevenzwu, @kevinjqliu for the review. I'll start an RC in a bit.

Re: [PR] 1.9.1 release fixes [iceberg]

2025-05-16 Thread via GitHub
RussellSpitzer merged PR #13081: URL: https://github.com/apache/iceberg/pull/13081

Re: [PR] 1.9.1 release fixes [iceberg]

2025-05-16 Thread via GitHub
stevenzwu commented on code in PR #13081: URL: https://github.com/apache/iceberg/pull/13081#discussion_r2093599959 ## api/src/test/java/org/apache/iceberg/TestIcebergBuild.java: ## @@ -37,6 +41,34 @@ public void testFullVersion() { + ")"); } + @Test + pub

Re: [PR] 1.9.1 release fixes [iceberg]

2025-05-16 Thread via GitHub
RussellSpitzer commented on code in PR #13081: URL: https://github.com/apache/iceberg/pull/13081#discussion_r2093600793 ## api/src/test/java/org/apache/iceberg/TestIcebergBuild.java: ## @@ -37,6 +41,34 @@ public void testFullVersion() { + ")"); } + @Test +

Re: [PR] API: Compute truncate decimal result precision based on lowest value bound [iceberg]

2025-05-16 Thread via GitHub
nandorKollar commented on code in PR #12969: URL: https://github.com/apache/iceberg/pull/12969#discussion_r2093589080 ## api/src/main/java/org/apache/iceberg/transforms/Truncate.java: ## @@ -513,5 +516,35 @@ public UnboundPredicate projectStrict( } return null;

Re: [PR] [Core] Add max files rewrite option for RewriteAction [iceberg]

2025-05-16 Thread via GitHub
coderfender commented on PR #12824: URL: https://github.com/apache/iceberg/pull/12824#issuecomment-2887511976 Sure I will create a new PR just for the porting changes

Re: [PR] [Core] Add max files rewrite option for RewriteAction [iceberg]

2025-05-16 Thread via GitHub
RussellSpitzer commented on PR #12824: URL: https://github.com/apache/iceberg/pull/12824#issuecomment-2887505169 I have no problem with doing all the other changes at once, I just don't like having them all in the original PR because it's harder to track changes

Re: [PR] [Core] Add max files rewrite option for RewriteAction [iceberg]

2025-05-16 Thread via GitHub
coderfender commented on PR #12824: URL: https://github.com/apache/iceberg/pull/12824#issuecomment-2887503481 Sure, I will create another PR to backport / forward port this functionality. @RussellSpitzer, @pvary Spark v3.4 and 3.5 are already done so I will only have to mak

Re: [PR] [Core] Add max files rewrite option for RewriteAction [iceberg]

2025-05-16 Thread via GitHub
RussellSpitzer commented on PR #12824: URL: https://github.com/apache/iceberg/pull/12824#issuecomment-2887490967 Remember to "forward port" too now that we have a 4.0 Module :)

Re: [PR] [Core] Add max files rewrite option for RewriteAction [iceberg]

2025-05-16 Thread via GitHub
coderfender commented on PR #12824: URL: https://github.com/apache/iceberg/pull/12824#issuecomment-2887485441 Thank you @pvary @RussellSpitzer @anuragmantri. I will start working on the documentation changes and raise a PR soon.

Re: [PR] [Core] Add max files rewrite option for RewriteAction [iceberg]

2025-05-16 Thread via GitHub
pvary commented on PR #12824: URL: https://github.com/apache/iceberg/pull/12824#issuecomment-2887484891 Merged to main. Thanks for all the work @coderfender on the PR, and @RussellSpitzer for the review! @coderfender: Could you please create the backport PRs for Spark and Flink?

Re: [PR] [Core] Add max files rewrite option for RewriteAction [iceberg]

2025-05-16 Thread via GitHub
pvary merged PR #12824: URL: https://github.com/apache/iceberg/pull/12824

Re: [PR] Flink: Fix watermark not passing in TaskResultAggregator [iceberg]

2025-05-16 Thread via GitHub
pvary commented on PR #13069: URL: https://github.com/apache/iceberg/pull/13069#issuecomment-2887474197 Merged to main. Good catch. Thanks for the fix and the PR @Guosmilesmile and @mxm for the review!

Re: [PR] Flink: Fix watermark not passing in TaskResultAggregator [iceberg]

2025-05-16 Thread via GitHub
pvary merged PR #13069: URL: https://github.com/apache/iceberg/pull/13069

Re: [PR] 1.9.1 release fixes [iceberg]

2025-05-16 Thread via GitHub
RussellSpitzer commented on code in PR #13081: URL: https://github.com/apache/iceberg/pull/13081#discussion_r2093539632 ## api/src/test/java/org/apache/iceberg/TestIcebergBuild.java: ## @@ -37,6 +41,34 @@ public void testFullVersion() { + ")"); } + @Test +

Re: [PR] Core: Remove deprecated MetricsConfig.fromProperties method [iceberg]

2025-05-16 Thread via GitHub
RussellSpitzer commented on code in PR #13056: URL: https://github.com/apache/iceberg/pull/13056#discussion_r2093538740 ## core/src/main/java/org/apache/iceberg/MetricsConfig.java: ## @@ -212,6 +212,26 @@ public void validateReferencedColumns(Schema schema) { } } + /*

Re: [PR] 1.9.1 release fixes [iceberg]

2025-05-16 Thread via GitHub
stevenzwu commented on code in PR #13081: URL: https://github.com/apache/iceberg/pull/13081#discussion_r2093523511 ## api/src/test/java/org/apache/iceberg/TestIcebergBuild.java: ## @@ -37,6 +41,34 @@ public void testFullVersion() { + ")"); } + @Test + pub

Re: [PR] Kafka Connect: Add BigQuery Metastore catalog [iceberg]

2025-05-16 Thread via GitHub
RussellSpitzer merged PR #13041: URL: https://github.com/apache/iceberg/pull/13041

Re: [PR] Kafka Connect: Add BigQuery Metastore catalog [iceberg]

2025-05-16 Thread via GitHub
RussellSpitzer commented on PR #13041: URL: https://github.com/apache/iceberg/pull/13041#issuecomment-2887451701 Thanks @juldrixx for the pr! Thanks @talatuyarer for reviewing!

Re: [PR] Core: Implement source-ids to deal with multi arguments transforms [iceberg]

2025-05-16 Thread via GitHub
jbonofre commented on PR #12897: URL: https://github.com/apache/iceberg/pull/12897#issuecomment-2887451639 I did a first update to introduce multi-args in `Transform`. I will check/update the tests too.

Re: [PR] Core: Implement source-ids to deal with multi arguments transforms [iceberg]

2025-05-16 Thread via GitHub
jbonofre commented on code in PR #12897: URL: https://github.com/apache/iceberg/pull/12897#discussion_r2093531700 ## api/src/main/java/org/apache/iceberg/PartitionSpec.java: ## @@ -625,7 +634,7 @@ PartitionSpec buildUnchecked() { static void checkCompatibility(PartitionSpec

Re: [PR] Core: Implement source-ids to deal with multi arguments transforms [iceberg]

2025-05-16 Thread via GitHub
jbonofre commented on code in PR #12897: URL: https://github.com/apache/iceberg/pull/12897#discussion_r2093530131 ## api/src/main/java/org/apache/iceberg/UnboundPartitionSpec.java: ## @@ -118,7 +128,12 @@ public String transformAsString() { } public int sourceId() {

Re: [PR] Flink: Backport support zookeeper lock in TableMaintenance to Flink 1.19 and 2.0 [iceberg]

2025-05-16 Thread via GitHub
pvary commented on PR #13063: URL: https://github.com/apache/iceberg/pull/13063#issuecomment-2887443943 Merged to main. Thanks for the backport @Guosmilesmile and @mxm for the review!

Re: [PR] Flink: Backport support zookeeper lock in TableMaintenance to Flink 1.19 and 2.0 [iceberg]

2025-05-16 Thread via GitHub
pvary merged PR #13063: URL: https://github.com/apache/iceberg/pull/13063

Re: [PR] Core: Remove deprecated MetricsConfig.fromProperties method [iceberg]

2025-05-16 Thread via GitHub
xxubai commented on PR #13056: URL: https://github.com/apache/iceberg/pull/13056#issuecomment-2887418459 Could someone please help review this? cc @RussellSpitzer @ajantha-bhat @nastra Thanks!

Re: [PR] Adding VSCode and Devcontainer Configs [iceberg]

2025-05-16 Thread via GitHub
bpkroth closed pull request #13034: Adding VSCode and Devcontainer Configs URL: https://github.com/apache/iceberg/pull/13034

Re: [PR] Adding VSCode and Devcontainer Configs [iceberg]

2025-05-16 Thread via GitHub
bpkroth commented on PR #13034: URL: https://github.com/apache/iceberg/pull/13034#issuecomment-2887401983 I spent a bit more hacking time on this just because: https://github.com/bpkroth/iceberg/commits/add-devcontainer-configs-vscode-tweaks-attemp-2?since=2025-05-14 I found that i

Re: [PR] [Core] Add max files rewrite option for RewriteAction [iceberg]

2025-05-16 Thread via GitHub
coderfender commented on code in PR #12824: URL: https://github.com/apache/iceberg/pull/12824#discussion_r2093469352 ## flink/v1.20/flink/src/main/java/org/apache/iceberg/flink/maintenance/api/RewriteDataFiles.java: ## @@ -170,6 +170,12 @@ public Builder maxFileGroupSizeBytes(lo

Re: [PR] fix: correct `UUIDType` partition representation for `BucketTransform` [iceberg-python]

2025-05-16 Thread via GitHub
Fokko commented on PR #2003: URL: https://github.com/apache/iceberg-python/pull/2003#issuecomment-2887335585 @DinGo4DEV Yes, please do. My biggest concern is that we produce Parquet files that will not be supported by other implementations because of the missing logical annotation. Arrow re

Re: [PR] feat: Introduce C FFI for iceberg rust [iceberg-rust]

2025-05-16 Thread via GitHub
ancapdev commented on PR #966: URL: https://github.com/apache/iceberg-rust/pull/966#issuecomment-2887285508 > Your comments are welcome! Hi, is the consensus here to abandon the idea of a C API that can serve as a foundation to other language implementations? Java is pretty awk

Re: [PR] 1.9.1 release fixes [iceberg]

2025-05-16 Thread via GitHub
RussellSpitzer commented on PR #13081: URL: https://github.com/apache/iceberg/pull/13081#issuecomment-2887285271 Note if someone tries to merge this that isn't me, we want to do a Rebase and merge for this PR and not a Squash

Re: [PR] Parquet: Fix to ensure that last updated sequence numbers for V2 and earlier tables are null [iceberg]

2025-05-16 Thread via GitHub
RussellSpitzer commented on PR #13001: URL: https://github.com/apache/iceberg/pull/13001#issuecomment-2887274515 Removed from Milestone because this doesn't affect 1.9.0: 1.9.0 is missing #12736, which makes this bug possible.

Re: [PR] fix: correct `UUIDType` partition representation for `BucketTransform` [iceberg-python]

2025-05-16 Thread via GitHub
DinGo4DEV commented on PR #2003: URL: https://github.com/apache/iceberg-python/pull/2003#issuecomment-2887262755 @Fokko Thank you for taking the time to review. I appreciate your thoughtful feedback and the effort you put into this. To fully support the UUID type, it looks like we'll need

Re: [PR] [Core] Add max files rewrite option for RewriteAction [iceberg]

2025-05-16 Thread via GitHub
coderfender commented on code in PR #12824: URL: https://github.com/apache/iceberg/pull/12824#discussion_r2092711533 ## flink/v1.20/flink/src/main/java/org/apache/iceberg/flink/maintenance/api/RewriteDataFiles.java: ## @@ -170,6 +170,12 @@ public Builder maxFileGroupSizeBytes(lo

Re: [PR] Add Hugging Face filesystem support to fsspec [iceberg-python]

2025-05-16 Thread via GitHub
kevinjqliu commented on PR #1997: URL: https://github.com/apache/iceberg-python/pull/1997#issuecomment-2887215434 Thanks @lhoestq for the contribution and @Fokko for the review :)

Re: [PR] Add Hugging Face filesystem support to fsspec [iceberg-python]

2025-05-16 Thread via GitHub
kevinjqliu merged PR #1997: URL: https://github.com/apache/iceberg-python/pull/1997

Re: [PR] Flink: add snapshot expiration reset strategy [iceberg]

2025-05-16 Thread via GitHub
mxm commented on PR #12639: URL: https://github.com/apache/iceberg/pull/12639#issuecomment-2887181211 I tend to lean towards what @pvary pointed out, that this is a critical condition which warrants manual intervention. However, I think the use case is valid and I agree that there should be

Re: [PR] Range distribution iceberg sink [iceberg]

2025-05-16 Thread via GitHub
rodmeneses commented on PR #12071: URL: https://github.com/apache/iceberg/pull/12071#issuecomment-2887181029 Hi @stevenzwu thanks for reopening this. I will try to finish it this coming week. I think it needs only to port some fixes recently made in the FlinkSink RANGE distribution mode, as

Re: [PR] Range distribution iceberg sink [iceberg]

2025-05-16 Thread via GitHub
mxm commented on PR #12071: URL: https://github.com/apache/iceberg/pull/12071#issuecomment-2887169195 +1 it would be nice to follow up with this. If @rodmeneses is busy, maybe @Guosmilesmile could also take this one?

Re: [I] [Feature] Add Support for Distributed Write [iceberg-python]

2025-05-16 Thread via GitHub
andormarkus commented on issue #1751: URL: https://github.com/apache/iceberg-python/issues/1751#issuecomment-2877526453 Hello @potatochipcoconut, my solution is tested and working with AWS Lambda + AWS SQS. How soon do you need a solution? I can share our code, however I need to

Re: [PR] Core: validate file format compatibility with v3 [iceberg]

2025-05-16 Thread via GitHub
danielcweeks commented on code in PR #13060: URL: https://github.com/apache/iceberg/pull/13060#discussion_r2093326193 ## core/src/main/java/org/apache/iceberg/TableMetadata.java: ## @@ -61,6 +65,9 @@ public class TableMetadata implements Serializable { static final int INITIA

Re: [PR] Flink: Backport support zookeeper lock in TableMaintenance to Flink 1.19 and 2.0 [iceberg]

2025-05-16 Thread via GitHub
mxm commented on PR #13063: URL: https://github.com/apache/iceberg/pull/13063#issuecomment-2887153556 @pvary for merging. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To

Re: [PR] Core: validate file format compatibility with v3 [iceberg]

2025-05-16 Thread via GitHub
danielcweeks commented on PR #13060: URL: https://github.com/apache/iceberg/pull/13060#issuecomment-2887147085 > The tests are failing because we have a few tests that use a parameterization that uses "avro" as the data file format. Currently we do this for all versions I'll chase th

Re: [PR] Flink 2.0: Remove the JUnit4 dependency [iceberg]

2025-05-16 Thread via GitHub
nastra commented on code in PR #13021: URL: https://github.com/apache/iceberg/pull/13021#discussion_r2093318132 ## flink/v2.0/flink/src/test/java/org/apache/iceberg/flink/source/TestIcebergSourceFailover.java: ## @@ -314,20 +313,6 @@ private void createBoundedStreams(StreamExec

Re: [PR] API: Compute truncate decimal result precision based on lowest value bound [iceberg]

2025-05-16 Thread via GitHub
RussellSpitzer commented on code in PR #12969: URL: https://github.com/apache/iceberg/pull/12969#discussion_r2093307904 ## api/src/test/java/org/apache/iceberg/transforms/TestTruncate.java: ## @@ -85,6 +87,43 @@ public void testTruncateDecimal() { assertThat(trunc.apply(new

Re: [PR] [Spark]Add max files rewrite option for RewriteAction [iceberg]

2025-05-16 Thread via GitHub
sfc-gh-rspitzer commented on code in PR #12824: URL: https://github.com/apache/iceberg/pull/12824#discussion_r2087441627 ## core/src/main/java/org/apache/iceberg/actions/BinPackRewriteFilePlanner.java: ## @@ -199,30 +214,48 @@ protected long defaultTargetFileSize() { public F

Re: [PR] Core: validate file format compatibility with v3 [iceberg]

2025-05-16 Thread via GitHub
RussellSpitzer commented on PR #13060: URL: https://github.com/apache/iceberg/pull/13060#issuecomment-2887094938 The tests are failing because we have a few tests that use a parameterization that uses "avro" as the data file format. Currently we do this for all versions -- This is an aut

Re: [PR] API: Compute truncate decimal result precision based on lowest value bound [iceberg]

2025-05-16 Thread via GitHub
RussellSpitzer commented on code in PR #12969: URL: https://github.com/apache/iceberg/pull/12969#discussion_r2093302548 ## api/src/main/java/org/apache/iceberg/transforms/Truncate.java: ## @@ -513,5 +516,35 @@ public UnboundPredicate projectStrict( } return null;

Re: [PR] Core: validate file format compatibility with v3 [iceberg]

2025-05-16 Thread via GitHub
RussellSpitzer commented on code in PR #13060: URL: https://github.com/apache/iceberg/pull/13060#discussion_r2093292196 ## core/src/main/java/org/apache/iceberg/TableMetadata.java: ## @@ -1907,5 +1916,23 @@ private boolean isAddedSnapshot(long snapshotId) { private Stream

Re: [PR] Core: validate file format compatibility with v3 [iceberg]

2025-05-16 Thread via GitHub
RussellSpitzer commented on code in PR #13060: URL: https://github.com/apache/iceberg/pull/13060#discussion_r2093289066 ## core/src/main/java/org/apache/iceberg/TableMetadata.java: ## @@ -61,6 +65,9 @@ public class TableMetadata implements Serializable { static final int INIT

Re: [PR] Flink: Dynamic Iceberg Sink Contribution [iceberg]

2025-05-16 Thread via GitHub
mxm commented on PR #12424: URL: https://github.com/apache/iceberg/pull/12424#issuecomment-2887074164 I decided to break down the PRs. So far: 1. #12996 2. #13032 3. #13080 4. Table update operator (outstanding) 5. Putting it all together (outstanding) -- This is an automa

Re: [I] Partition Query with more details for spark SQL [iceberg]

2025-05-16 Thread via GitHub
RussellSpitzer commented on issue #13079: URL: https://github.com/apache/iceberg/issues/13079#issuecomment-2886964916 Could you elaborate on what you expect? The struct format has this information? -- This is an automated message from the Apache Git Service. To respond to the message, ple

Re: [PR] [Core] Add max files rewrite option for RewriteAction [iceberg]

2025-05-16 Thread via GitHub
pvary commented on code in PR #12824: URL: https://github.com/apache/iceberg/pull/12824#discussion_r2092664923 ## flink/v1.20/flink/src/main/java/org/apache/iceberg/flink/maintenance/api/RewriteDataFiles.java: ## @@ -170,6 +170,12 @@ public Builder maxFileGroupSizeBytes(long ma

Re: [PR] Flink 2.0: Remove the JUnit4 dependency [iceberg]

2025-05-16 Thread via GitHub
JeonDaehong commented on code in PR #13021: URL: https://github.com/apache/iceberg/pull/13021#discussion_r2093234600 ## flink/v2.0/flink/src/test/java/org/apache/iceberg/flink/source/TestIcebergSourceFailover.java: ## @@ -314,20 +313,6 @@ private void createBoundedStreams(Strea
