Re: [PR] refine: refine interface of ManifestWriter [iceberg-rust]

2024-12-03 Thread via GitHub
ZENOTME commented on code in PR #738: URL: https://github.com/apache/iceberg-rust/pull/738#discussion_r1868871965 ## crates/iceberg/src/spec/manifest.rs: ## @@ -203,12 +206,80 @@ impl ManifestWriter { partition_summary } -/// Write a manifest. -pub async

Re: [PR] feat: support arrow_struct_to_iceberg_struct [iceberg-rust]

2024-12-03 Thread via GitHub
ZENOTME commented on PR #731: URL: https://github.com/apache/iceberg-rust/pull/731#issuecomment-2516429652 > Hi, @ZENOTME Thanks for this pr! I'm thinking that instead of array transformation, should we consider transforming arrow record batch to/from array of iceberg datum? For now

Re: [PR] Flink: Maintenance - RewriteDataFiles [iceberg]

2024-12-03 Thread via GitHub
pvary commented on code in PR #11497: URL: https://github.com/apache/iceberg/pull/11497#discussion_r1868806860 ## flink/v1.20/flink/src/main/java/org/apache/iceberg/flink/maintenance/operator/DataFileRewritePlanner.java: ## @@ -0,0 +1,228 @@ +/* + * Licensed to the Apache Softwa

Re: [PR] Flink: Maintenance - RewriteDataFiles [iceberg]

2024-12-03 Thread via GitHub
pvary commented on code in PR #11497: URL: https://github.com/apache/iceberg/pull/11497#discussion_r1868801347 ## flink/v1.20/flink/src/main/java/org/apache/iceberg/flink/maintenance/operator/DataFileRewriteCommitter.java: ## @@ -0,0 +1,304 @@ +/* + * Licensed to the Apache Soft

Re: [PR] Flink: Maintenance - RewriteDataFiles [iceberg]

2024-12-03 Thread via GitHub
pvary commented on code in PR #11497: URL: https://github.com/apache/iceberg/pull/11497#discussion_r1868809006 ## flink/v1.20/flink/src/main/java/org/apache/iceberg/flink/maintenance/operator/DataFileRewriteCommitter.java: ## @@ -0,0 +1,318 @@ +/* + * Licensed to the Apache Soft

Re: [PR] REST: AuthManager API [iceberg]

2024-12-03 Thread via GitHub
nastra commented on PR #10753: URL: https://github.com/apache/iceberg/pull/10753#issuecomment-2516323173 thanks for being so patient here. I'll try to review this PR today/tomorrow -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitH

Re: [PR] Flink: Maintenance - RewriteDataFiles [iceberg]

2024-12-03 Thread via GitHub
pvary commented on code in PR #11497: URL: https://github.com/apache/iceberg/pull/11497#discussion_r1868782263 ## flink/v1.20/flink/src/main/java/org/apache/iceberg/flink/maintenance/operator/DataFileRewriteCommitter.java: ## @@ -0,0 +1,304 @@ +/* + * Licensed to the Apache Soft

Re: [PR] refine: refine writer interface [iceberg-rust]

2024-12-03 Thread via GitHub
ZENOTME commented on code in PR #741: URL: https://github.com/apache/iceberg-rust/pull/741#discussion_r1868789693 ## crates/iceberg/src/writer/mod.rs: ## @@ -83,13 +81,15 @@ pub trait IcebergWriter: Send + 'static { /// The current file status of iceberg writer. It implement

Re: [PR] test: append partition data file [iceberg-rust]

2024-12-03 Thread via GitHub
feniljain commented on PR #742: URL: https://github.com/apache/iceberg-rust/pull/742#issuecomment-2516279844 Hey @Fokko 👋🏻 Thanks a lot for checking up in detail! Can I take up both of the issues, as both correspond to the same test? Also a small idea 💡, do you think we sho

Re: [PR] Hive: Optimize tableExists API in hive catalog [iceberg]

2024-12-03 Thread via GitHub
dramaticlly commented on code in PR #11597: URL: https://github.com/apache/iceberg/pull/11597#discussion_r1868763730 ## api/src/main/java/org/apache/iceberg/catalog/Catalog.java: ## @@ -271,7 +271,7 @@ default Transaction newReplaceTableTransaction( } /** - * Check whe

Re: [PR] Hive: Optimize tableExists API in hive catalog [iceberg]

2024-12-03 Thread via GitHub
szehon-ho commented on code in PR #11597: URL: https://github.com/apache/iceberg/pull/11597#discussion_r1868757904 ## api/src/main/java/org/apache/iceberg/catalog/Catalog.java: ## @@ -271,7 +271,7 @@ default Transaction newReplaceTableTransaction( } /** - * Check wheth

Re: [PR] Bump moto from 5.0.21 to 5.0.22 [iceberg-python]

2024-12-03 Thread via GitHub
Fokko merged PR #1399: URL: https://github.com/apache/iceberg-python/pull/1399

Re: [PR] "how to release", add `gh` command to trigger workflow [iceberg-python]

2024-12-03 Thread via GitHub
Fokko commented on code in PR #1387: URL: https://github.com/apache/iceberg-python/pull/1387#discussion_r1868746609 ## mkdocs/docs/how-to-release.md: ## @@ -130,6 +130,15 @@ Run the [`Python release` Github Action](https://github.com/apache/iceberg-pytho * Tag: Use the newly c

[I] Deleting namespaces and tables of JDBC Catalog [iceberg-python]

2024-12-03 Thread via GitHub
ArijitSinghEDA opened a new issue, #1400: URL: https://github.com/apache/iceberg-python/issues/1400 ### Question I have a catalog created with a Postgres+Minio. When I run the function `catalog.drop_namespace()`, the namespace with its properties (if any), do not get purged from the

Re: [PR] "how to release", add `gh` command to trigger workflow [iceberg-python]

2024-12-03 Thread via GitHub
kevinjqliu commented on PR #1387: URL: https://github.com/apache/iceberg-python/pull/1387#issuecomment-2516214341 going to reuse this PR based on the new release instructions from #1391

Re: [PR] Build: Don't run CI on unrelated changes [iceberg-python]

2024-12-03 Thread via GitHub
kevinjqliu commented on code in PR #1395: URL: https://github.com/apache/iceberg-python/pull/1395#discussion_r1868729404 ## .github/workflows/python-ci.yml: ## @@ -24,6 +24,19 @@ on: branches: - 'main' pull_request: +paths: +- '**' +- '!.github/ISSUE_TEM

Re: [PR] refine: refine writer interface [iceberg-rust]

2024-12-03 Thread via GitHub
ZENOTME commented on code in PR #741: URL: https://github.com/apache/iceberg-rust/pull/741#discussion_r1868720714 ## crates/iceberg/src/writer/file_writer/mod.rs: ## @@ -37,11 +37,11 @@ pub trait FileWriterBuilder: Send + Clone + 'static { /// The associated file writer ty

Re: [PR] Build: Upgrade to RAT 0.16.1, scanning hidden directories and adding missing ASF headers [iceberg-python]

2024-12-03 Thread via GitHub
kevinjqliu merged PR #1396: URL: https://github.com/apache/iceberg-python/pull/1396

Re: [PR] Build: Upgrade to RAT 0.16.1, scanning hidden directories and adding missing ASF headers [iceberg-python]

2024-12-03 Thread via GitHub
kevinjqliu commented on PR #1396: URL: https://github.com/apache/iceberg-python/pull/1396#issuecomment-2516199136 thanks for the contribution @manuzhang !

Re: [PR] Hive: Optimize tableExists API in hive catalog [iceberg]

2024-12-03 Thread via GitHub
kevinjqliu commented on code in PR #11597: URL: https://github.com/apache/iceberg/pull/11597#discussion_r1868686474 ## api/src/main/java/org/apache/iceberg/catalog/Catalog.java: ## @@ -271,7 +271,7 @@ default Transaction newReplaceTableTransaction( } /** - * Check whet

Re: [PR] Hive: Optimize tableExists API in hive catalog [iceberg]

2024-12-03 Thread via GitHub
dramaticlly commented on code in PR #11597: URL: https://github.com/apache/iceberg/pull/11597#discussion_r1868679172 ## api/src/main/java/org/apache/iceberg/catalog/Catalog.java: ## @@ -271,7 +271,7 @@ default Transaction newReplaceTableTransaction( } /** - * Check whe

Re: [PR] Hive: Optimize tableExists API in hive catalog [iceberg]

2024-12-03 Thread via GitHub
kevinjqliu commented on code in PR #11597: URL: https://github.com/apache/iceberg/pull/11597#discussion_r1868677442 ## api/src/main/java/org/apache/iceberg/catalog/Catalog.java: ## @@ -271,7 +271,7 @@ default Transaction newReplaceTableTransaction( } /** - * Check whet

Re: [PR] refine: refine interface of ManifestWriter [iceberg-rust]

2024-12-03 Thread via GitHub
liurenjie1024 commented on code in PR #738: URL: https://github.com/apache/iceberg-rust/pull/738#discussion_r1868674850 ## crates/iceberg/src/spec/manifest.rs: ## @@ -203,12 +206,80 @@ impl ManifestWriter { partition_summary } -/// Write a manifest. -pub

Re: [PR] refine: refine writer interface [iceberg-rust]

2024-12-03 Thread via GitHub
liurenjie1024 commented on code in PR #741: URL: https://github.com/apache/iceberg-rust/pull/741#discussion_r1868671378 ## crates/iceberg/src/writer/file_writer/mod.rs: ## @@ -37,11 +37,11 @@ pub trait FileWriterBuilder: Send + Clone + 'static { /// The associated file wri

Re: [PR] Build: Don't run CI on unrelated changes [iceberg-python]

2024-12-03 Thread via GitHub
manuzhang commented on code in PR #1395: URL: https://github.com/apache/iceberg-python/pull/1395#discussion_r1868671842 ## .github/workflows/python-ci-docs.yml: ## @@ -24,7 +24,19 @@ on: branches: - 'main' pull_request: - +paths: +- '**' +- '!.github/I

Re: [PR] WIP: Use localhost instead of container hostname [iceberg-rust]

2024-12-03 Thread via GitHub
liurenjie1024 commented on PR #748: URL: https://github.com/apache/iceberg-rust/pull/748#issuecomment-2516087345 cc @Fokko Do we still need this change? I see your comment https://github.com/apache/iceberg-rust/issues/719#issuecomment-2511739559 here.

Re: [PR] fix: equality delete writer field id project [iceberg-rust]

2024-12-03 Thread via GitHub
liurenjie1024 merged PR #751: URL: https://github.com/apache/iceberg-rust/pull/751

Re: [PR] fix: equality delete writer field id project [iceberg-rust]

2024-12-03 Thread via GitHub
liurenjie1024 commented on PR #751: URL: https://github.com/apache/iceberg-rust/pull/751#issuecomment-2516080373 Thanks @ZENOTME for fixing this pr!

Re: [PR] Added force virtual addressing configuration for S3, Alibaba OSS protocol to use PyArrowFileIO [iceberg-python]

2024-12-03 Thread via GitHub
kevinjqliu commented on code in PR #1392: URL: https://github.com/apache/iceberg-python/pull/1392#discussion_r1868658040 ## pyiceberg/io/pyarrow.py: ## @@ -350,7 +351,7 @@ def parse_location(location: str) -> Tuple[str, str, str]: return uri.scheme, uri.netloc, f"{u

Re: [PR] API: Support removeUnusedSpecs in ExpireSnapshots [iceberg]

2024-12-03 Thread via GitHub
advancedxy commented on code in PR #10755: URL: https://github.com/apache/iceberg/pull/10755#discussion_r1868635951 ## core/src/main/java/org/apache/iceberg/TableMetadata.java: ## @@ -1108,6 +1108,25 @@ public Builder setDefaultPartitionSpec(int specId) { return this;

Re: [PR] Added force virtual addressing configuration for S3, Alibaba OSS protocol to use PyArrowFileIO [iceberg-python]

2024-12-03 Thread via GitHub
helmiazizm commented on code in PR #1392: URL: https://github.com/apache/iceberg-python/pull/1392#discussion_r1868642737 ## pyiceberg/io/pyarrow.py: ## @@ -350,7 +351,7 @@ def parse_location(location: str) -> Tuple[str, str, str]: return uri.scheme, uri.netloc, f"{u

Re: [PR] feat: support arrow_struct_to_iceberg_struct [iceberg-rust]

2024-12-03 Thread via GitHub
liurenjie1024 commented on PR #731: URL: https://github.com/apache/iceberg-rust/pull/731#issuecomment-2516049034 Hi, @ZENOTME Thanks for this pr! I'm thinking that instead of array transformation, should we consider transforming arrow record batch to/from array of iceberg datum? It maybe al

Re: [PR] Document procedure for stats collection [iceberg]

2024-12-03 Thread via GitHub
szehon-ho commented on code in PR #11606: URL: https://github.com/apache/iceberg/pull/11606#discussion_r1868637948 ## docs/docs/spark-procedures.md: ## @@ -937,7 +937,7 @@ as an `UPDATE_AFTER` image, resulting in the following pre/post update images: | 3 | Robert | UPDATE_BE

Re: [PR] API: Support removeUnusedSpecs in ExpireSnapshots [iceberg]

2024-12-03 Thread via GitHub
advancedxy commented on code in PR #10755: URL: https://github.com/apache/iceberg/pull/10755#discussion_r1868635560 ## api/src/main/java/org/apache/iceberg/ExpireSnapshots.java: ## @@ -118,4 +118,16 @@ public interface ExpireSnapshots extends PendingUpdate<List<Snapshot>> { * @return this

Re: [PR] Added force virtual addressing configuration for S3, Alibaba OSS protocol to use PyArrowFileIO [iceberg-python]

2024-12-03 Thread via GitHub
kevinjqliu commented on PR #1392: URL: https://github.com/apache/iceberg-python/pull/1392#issuecomment-2515997559 looks like there's a linter issue, you can run `make lint` locally to resolve it

Re: [PR] Added force virtual addressing configuration for S3, Alibaba OSS protocol to use PyArrowFileIO [iceberg-python]

2024-12-03 Thread via GitHub
kevinjqliu commented on code in PR #1392: URL: https://github.com/apache/iceberg-python/pull/1392#discussion_r1868592536 ## pyiceberg/io/pyarrow.py: ## @@ -350,7 +351,7 @@ def parse_location(location: str) -> Tuple[str, str, str]: return uri.scheme, uri.netloc, f"{u

Re: [PR] Added force virtual addressing configuration for S3, Alibaba OSS protocol to use PyArrowFileIO [iceberg-python]

2024-12-03 Thread via GitHub
kevinjqliu commented on code in PR #1392: URL: https://github.com/apache/iceberg-python/pull/1392#discussion_r1868591185 ## pyiceberg/io/pyarrow.py: ## @@ -350,7 +351,7 @@ def parse_location(location: str) -> Tuple[str, str, str]: return uri.scheme, uri.netloc, f"{u

Re: [PR] Added force virtual addressing configuration for S3, Alibaba OSS protocol to use PyArrowFileIO [iceberg-python]

2024-12-03 Thread via GitHub
kevinjqliu commented on code in PR #1392: URL: https://github.com/apache/iceberg-python/pull/1392#discussion_r1868590099 ## pyiceberg/io/pyarrow.py: ## @@ -350,7 +351,7 @@ def parse_location(location: str) -> Tuple[str, str, str]: return uri.scheme, uri.netloc, f"{u

Re: [PR] Added force virtual addressing configuration for S3, Alibaba OSS protocol to use PyArrowFileIO [iceberg-python]

2024-12-03 Thread via GitHub
kevinjqliu commented on code in PR #1392: URL: https://github.com/apache/iceberg-python/pull/1392#discussion_r1868589197 ## pyiceberg/io/pyarrow.py: ## @@ -350,7 +351,7 @@ def parse_location(location: str) -> Tuple[str, str, str]: return uri.scheme, uri.netloc, f"{u

Re: [PR] Kafka Connect: Add mechanisms for routing records by topic name [iceberg]

2024-12-03 Thread via GitHub
mun1r0b0t commented on PR #11623: URL: https://github.com/apache/iceberg/pull/11623#issuecomment-2515913225 IMO making it a plugin gives the user the flexibility to choose. User can add their own routing plugin to the runtime or use SMT or some other means. I think the connector shouldn't p

Re: [PR] Add missing license headers [iceberg-python]

2024-12-03 Thread via GitHub
kevinjqliu commented on PR #1396: URL: https://github.com/apache/iceberg-python/pull/1396#issuecomment-2515885124 Ok looks like RAT v0.15 does not scan hidden directories by default. See similar issue [apache/datafusion#9851](https://github.com/apache/datafusion/issues/9851) The [Jav

Re: [PR] REST: AuthManager API [iceberg]

2024-12-03 Thread via GitHub
dimas-b commented on PR #10753: URL: https://github.com/apache/iceberg/pull/10753#issuecomment-2515873034 Glad to hear that this PR is about to merge :) Thanks for your time and effort in reviewing it @danielcweeks and @nastra (in advance :) )

Re: [I] Iceberg supports binlog logs [iceberg]

2024-12-03 Thread via GitHub
github-actions[bot] commented on issue #10452: URL: https://github.com/apache/iceberg/issues/10452#issuecomment-2515852859 This issue has been automatically marked as stale because it has been open for 180 days with no activity. It will be closed in next 14 days if no further activity occur

Re: [PR] Build: Don't run CI on unrelated changes [iceberg-python]

2024-12-03 Thread via GitHub
kevinjqliu commented on code in PR #1395: URL: https://github.com/apache/iceberg-python/pull/1395#discussion_r1868521541 ## .github/workflows/python-ci.yml: ## @@ -24,6 +24,24 @@ on: branches: - 'main' pull_request: +paths-ignore: +- '.github/ISSUE_TEMPLATE/

Re: [PR] Flink: Maintenance - RewriteDataFiles [iceberg]

2024-12-03 Thread via GitHub
stevenzwu commented on code in PR #11497: URL: https://github.com/apache/iceberg/pull/11497#discussion_r1868455346 ## flink/v1.20/flink/src/main/java/org/apache/iceberg/flink/maintenance/operator/DataFileRewritePlanner.java: ## @@ -0,0 +1,228 @@ +/* + * Licensed to the Apache So

Re: [PR] Kafka Connect: Add mechanisms for routing records by topic name [iceberg]

2024-12-03 Thread via GitHub
bryanck commented on PR #11623: URL: https://github.com/apache/iceberg/pull/11623#issuecomment-2515748508 I feel your solution is reasonable, though I'm trying to reconcile this with the need for a more flexible, pluggable way to route records. For example, one case we had was to support dy

Re: [PR] REST: AuthManager API [iceberg]

2024-12-03 Thread via GitHub
danielcweeks commented on PR #10753: URL: https://github.com/apache/iceberg/pull/10753#issuecomment-2515741069 @adutra I did some testing and things worked against different REST implementations, so I think this is pretty much good to go (minor comment to resolve with @nastra above and addr

Re: [PR] REST: AuthManager API [iceberg]

2024-12-03 Thread via GitHub
danielcweeks commented on code in PR #10753: URL: https://github.com/apache/iceberg/pull/10753#discussion_r1868471895 ## core/src/main/java/org/apache/iceberg/rest/auth/AuthConfig.java: ## @@ -47,7 +47,7 @@ default String scope() { return OAuth2Properties.CATALOG_SCOPE;

Re: [PR] Flink: Maintenance - RewriteDataFiles [iceberg]

2024-12-03 Thread via GitHub
stevenzwu commented on code in PR #11497: URL: https://github.com/apache/iceberg/pull/11497#discussion_r1868471586 ## flink/v1.20/flink/src/main/java/org/apache/iceberg/flink/maintenance/operator/DataFileRewriteExecutor.java: ## @@ -0,0 +1,257 @@ +/* + * Licensed to the Apache S

Re: [PR] REST: AuthManager API [iceberg]

2024-12-03 Thread via GitHub
danielcweeks commented on code in PR #10753: URL: https://github.com/apache/iceberg/pull/10753#discussion_r1868471393 ## aws/src/main/java/org/apache/iceberg/aws/s3/signer/S3V4RestSignerClient.java: ## @@ -81,14 +71,12 @@ public abstract class S3V4RestSignerClient private sta

Re: [PR] Flink: Maintenance - RewriteDataFiles [iceberg]

2024-12-03 Thread via GitHub
stevenzwu commented on code in PR #11497: URL: https://github.com/apache/iceberg/pull/11497#discussion_r1868466879 ## flink/v1.20/flink/src/main/java/org/apache/iceberg/flink/maintenance/api/RewriteDataFiles.java: ## @@ -0,0 +1,232 @@ +/* + * Licensed to the Apache Software Foun

[PR] Bump moto from 5.0.21 to 5.0.22 [iceberg-python]

2024-12-03 Thread via GitHub
dependabot[bot] opened a new pull request, #1399: URL: https://github.com/apache/iceberg-python/pull/1399 Bumps [moto](https://github.com/getmoto/moto) from 5.0.21 to 5.0.22. Changelog: sourced from [moto's changelog](https://github.com/getmoto/moto/blob/master/CHANGELOG.md).

Re: [PR] Flink: Maintenance - RewriteDataFiles [iceberg]

2024-12-03 Thread via GitHub
stevenzwu commented on code in PR #11497: URL: https://github.com/apache/iceberg/pull/11497#discussion_r1868461497 ## flink/v1.20/flink/src/main/java/org/apache/iceberg/flink/maintenance/operator/TaskResultAggregator.java: ## @@ -0,0 +1,154 @@ +/* + * Licensed to the Apache Soft

Re: [PR] Flink: Maintenance - RewriteDataFiles [iceberg]

2024-12-03 Thread via GitHub
stevenzwu commented on code in PR #11497: URL: https://github.com/apache/iceberg/pull/11497#discussion_r1868392426 ## flink/v1.20/flink/src/main/java/org/apache/iceberg/flink/maintenance/operator/DataFileRewriteCommitter.java: ## @@ -0,0 +1,304 @@ +/* + * Licensed to the Apache

[PR] Bump mypy-boto3-glue from 1.35.65 to 1.35.74 [iceberg-python]

2024-12-03 Thread via GitHub
dependabot[bot] opened a new pull request, #1398: URL: https://github.com/apache/iceberg-python/pull/1398 Bumps [mypy-boto3-glue](https://github.com/youtype/mypy_boto3_builder) from 1.35.65 to 1.35.74. Commits See full diff in https://github.com/youtype/mypy_boto3_builder/commi

[PR] Bump getdaft from 0.3.14 to 0.3.15 [iceberg-python]

2024-12-03 Thread via GitHub
dependabot[bot] opened a new pull request, #1397: URL: https://github.com/apache/iceberg-python/pull/1397 Bumps [getdaft](https://github.com/Eventual-Inc/Daft) from 0.3.14 to 0.3.15. Release notes: sourced from [getdaft's releases](https://github.com/Eventual-Inc/Daft/releases).

Re: [PR] Flink: Maintenance - RewriteDataFiles [iceberg]

2024-12-03 Thread via GitHub
stevenzwu commented on code in PR #11497: URL: https://github.com/apache/iceberg/pull/11497#discussion_r1868436095 ## flink/v1.20/flink/src/main/java/org/apache/iceberg/flink/maintenance/operator/DataFileRewriteCommitter.java: ## @@ -0,0 +1,318 @@ +/* + * Licensed to the Apache

Re: [PR] Flink: Maintenance - RewriteDataFiles [iceberg]

2024-12-03 Thread via GitHub
stevenzwu commented on code in PR #11497: URL: https://github.com/apache/iceberg/pull/11497#discussion_r1866780897 ## flink/v1.20/flink/src/main/java/org/apache/iceberg/flink/maintenance/operator/DataFileRewritePlanner.java: ## @@ -0,0 +1,228 @@ +/* + * Licensed to the Apache So

Re: [PR] Flink: Maintenance - RewriteDataFiles [iceberg]

2024-12-03 Thread via GitHub
stevenzwu commented on code in PR #11497: URL: https://github.com/apache/iceberg/pull/11497#discussion_r1868426860 ## flink/v1.20/flink/src/main/java/org/apache/iceberg/flink/maintenance/operator/DataFileRewriteCommitter.java: ## @@ -0,0 +1,304 @@ +/* + * Licensed to the Apache

Re: [PR] Flink: Maintenance - RewriteDataFiles [iceberg]

2024-12-03 Thread via GitHub
stevenzwu commented on code in PR #11497: URL: https://github.com/apache/iceberg/pull/11497#discussion_r1868393393 ## flink/v1.20/flink/src/main/java/org/apache/iceberg/flink/maintenance/operator/DataFileRewriteCommitter.java: ## @@ -0,0 +1,304 @@ +/* + * Licensed to the Apache

Re: [PR] Document procedure for stats collection [iceberg]

2024-12-03 Thread via GitHub
szehon-ho commented on code in PR #11606: URL: https://github.com/apache/iceberg/pull/11606#discussion_r1868409126 ## docs/docs/spark-procedures.md: ## @@ -936,3 +936,40 @@ as an `UPDATE_AFTER` image, resulting in the following pre/post update images: |-||-

Re: [PR] Materialized View Spec [iceberg]

2024-12-03 Thread via GitHub
szehon-ho commented on code in PR #11041: URL: https://github.com/apache/iceberg/pull/11041#discussion_r1868399418 ## format/view-spec.md: ## @@ -42,12 +42,28 @@ An atomic swap of one view metadata file for another provides the basis for maki Writers create view metadata fil

Re: [PR] Hive: Optimize tableExists API in hive catalog [iceberg]

2024-12-03 Thread via GitHub
dramaticlly commented on code in PR #11597: URL: https://github.com/apache/iceberg/pull/11597#discussion_r1868399294 ## hive-metastore/src/main/java/org/apache/iceberg/hive/HiveCatalog.java: ## @@ -412,6 +412,34 @@ private void validateTableIsIcebergTableOrView( } } +

Re: [PR] Hive: Optimize tableExists API in hive catalog [iceberg]

2024-12-03 Thread via GitHub
dramaticlly commented on code in PR #11597: URL: https://github.com/apache/iceberg/pull/11597#discussion_r1868398567 ## hive-metastore/src/test/java/org/apache/iceberg/hive/HiveTableTest.java: ## @@ -388,6 +388,41 @@ public void testHiveTableAndIcebergTableWithSameName(TableTyp

Re: [PR] Hive: Optimize tableExists API in hive catalog [iceberg]

2024-12-03 Thread via GitHub
dramaticlly commented on code in PR #11597: URL: https://github.com/apache/iceberg/pull/11597#discussion_r1868397742 ## hive-metastore/src/test/java/org/apache/iceberg/hive/HiveTableTest.java: ## @@ -388,6 +388,41 @@ public void testHiveTableAndIcebergTableWithSameName(TableTyp

Re: [PR] Hive: Optimize tableExists API in hive catalog [iceberg]

2024-12-03 Thread via GitHub
szehon-ho commented on code in PR #11597: URL: https://github.com/apache/iceberg/pull/11597#discussion_r1868396253 ## hive-metastore/src/main/java/org/apache/iceberg/hive/HiveCatalog.java: ## @@ -412,6 +412,34 @@ private void validateTableIsIcebergTableOrView( } } + @

Re: [PR] Flink: Maintenance - RewriteDataFiles [iceberg]

2024-12-03 Thread via GitHub
stevenzwu commented on code in PR #11497: URL: https://github.com/apache/iceberg/pull/11497#discussion_r1868379639 ## flink/v1.20/flink/src/main/java/org/apache/iceberg/flink/maintenance/operator/DataFileRewriteCommitter.java: ## @@ -0,0 +1,304 @@ +/* + * Licensed to the Apache

Re: [PR] Kafka Connect: Add mechanisms for routing records by topic name [iceberg]

2024-12-03 Thread via GitHub
mun1r0b0t commented on PR #11623: URL: https://github.com/apache/iceberg/pull/11623#issuecomment-2515568996 Yeah, but the code in your link does not do that. To do that in your version, it'll have to loop through all the table configurations for each record. The abstraction around the routi

Re: [PR] Kafka Connect: Add mechanisms for routing records by topic name [iceberg]

2024-12-03 Thread via GitHub
munir0b0tcs commented on PR #11623: URL: https://github.com/apache/iceberg/pull/11623#issuecomment-2515567389 Yeah, but the code in your link does not do that. To do that in your version, it'll have to loop through all the table configurations for each record. The abstraction around the rou

[I] Write `null` for `current-snapshot-id` [iceberg-rust]

2024-12-03 Thread via GitHub
Fokko opened a new issue, #752: URL: https://github.com/apache/iceberg-rust/issues/752 Based on the checks in https://github.com/apache/iceberg-rust/pull/742#issuecomment-2515564242, it looks like we write `-1` when there is no snapshot, which is not correct.

Re: [PR] test: append partition data file [iceberg-rust]

2024-12-03 Thread via GitHub
Fokko commented on PR #742: URL: https://github.com/apache/iceberg-rust/pull/742#issuecomment-2515564242 Did some checks: First `metadata.json`: ```json { "format-version" : 2, "table-uuid" : "eb83b77f-c2c3-473c-a138-444a3de61213", "location" : "s3://icebergda

Re: [PR] Kafka Connect: Add mechanisms for routing records by topic name [iceberg]

2024-12-03 Thread via GitHub
bryanck commented on PR #11623: URL: https://github.com/apache/iceberg/pull/11623#issuecomment-2515542879 The topic name can be mapped to a table via static routing, isn't that what your `TopicRecordRouter` is doing?

Re: [PR] Kafka Connect: Add mechanisms for routing records by topic name [iceberg]

2024-12-03 Thread via GitHub
mun1r0b0t commented on PR #11623: URL: https://github.com/apache/iceberg/pull/11623#issuecomment-2515527010 But that's very limiting in terms of what it can do. The table name has to be the topic name, which is extremely restrictive, and it uses magic string to determine topic vs field. It

Re: [PR] Core: Merge conflicting deletion vectors [iceberg]

2024-12-03 Thread via GitHub
amogh-jahagirdar commented on code in PR #11693: URL: https://github.com/apache/iceberg/pull/11693#discussion_r1868323735 ## core/src/main/java/org/apache/iceberg/MergingSnapshotProducer.java: ## @@ -823,35 +833,138 @@ protected void validateAddedDVs( parent);

Re: [PR] Core: Merge conflicting deletion vectors [iceberg]

2024-12-03 Thread via GitHub
amogh-jahagirdar commented on code in PR #11693: URL: https://github.com/apache/iceberg/pull/11693#discussion_r1868322612 ## core/src/main/java/org/apache/iceberg/MergingSnapshotProducer.java: ## @@ -823,35 +833,138 @@ protected void validateAddedDVs( parent);

Re: [PR] Core: Merge conflicting deletion vectors [iceberg]

2024-12-03 Thread via GitHub
amogh-jahagirdar commented on code in PR #11693: URL: https://github.com/apache/iceberg/pull/11693#discussion_r1868313674 ## core/src/main/java/org/apache/iceberg/deletes/DVFileWriter.java: ## @@ -36,6 +36,14 @@ public interface DVFileWriter extends Closeable { */ void de

Re: [PR] Core: Merge conflicting deletion vectors [iceberg]

2024-12-03 Thread via GitHub
amogh-jahagirdar commented on code in PR #11693: URL: https://github.com/apache/iceberg/pull/11693#discussion_r1868316838 ## core/src/test/java/org/apache/iceberg/FileGenerationUtil.java: ## @@ -102,14 +102,18 @@ public static DeleteFile generateEqualityDeleteFile(Table table,

[PR] Core: Merge conflicting deletion vectors [iceberg]

2024-12-03 Thread via GitHub
amogh-jahagirdar opened a new pull request, #11693: URL: https://github.com/apache/iceberg/pull/11693 This PR adds the ability to merge conflicting deletion vectors for a given data file. In parallel, every conflicting DV will be merged with the committed DV, and a new Puffin with the merge

Re: [PR] Kafka Connect: Add mechanisms for routing records by topic name [iceberg]

2024-12-03 Thread via GitHub
bryanck commented on PR #11623: URL: https://github.com/apache/iceberg/pull/11623#issuecomment-2515462998 I was thinking something [like this](https://github.com/bryanck/iceberg/commit/1967c5e823e86b6f4ca8595db967e52a44688fd9)

Re: [PR] Materialized View Spec [iceberg]

2024-12-03 Thread via GitHub
danielcweeks commented on code in PR #11041: URL: https://github.com/apache/iceberg/pull/11041#discussion_r1868293537 ## format/view-spec.md: ## @@ -42,12 +42,28 @@ An atomic swap of one view metadata file for another provides the basis for maki Writers create view metadata

Re: [PR] Materialized View Spec [iceberg]

2024-12-03 Thread via GitHub
danielcweeks commented on code in PR #11041: URL: https://github.com/apache/iceberg/pull/11041#discussion_r1868289462 ## format/view-spec.md: ## @@ -82,9 +98,13 @@ Each version in `versions` is a struct with the following fields: | _required_ | `representations` | A list of

Re: [PR] Kafka Connect: Add mechanisms for routing records by topic name [iceberg]

2024-12-03 Thread via GitHub
mun1r0b0t commented on PR #11623: URL: https://github.com/apache/iceberg/pull/11623#issuecomment-2515392653 With the abstraction, it doesn't need to check the configuration for each record. I want to use the connector with ~30 tables and not having to parse them all for each record will hel

Re: [PR] Hive: Optimize tableExists API in hive catalog [iceberg]

2024-12-03 Thread via GitHub
szehon-ho commented on code in PR #11597: URL: https://github.com/apache/iceberg/pull/11597#discussion_r1868258764 ## hive-metastore/src/main/java/org/apache/iceberg/hive/HiveCatalog.java: ## @@ -412,6 +412,34 @@ private void validateTableIsIcebergTableOrView( } } + @

[PR] AWS: Add integration with Glue catalog extensions for Amazon SageMaker Lakehouse [iceberg]

2024-12-03 Thread via GitHub
sachet-saurabh opened a new pull request, #11692: URL: https://github.com/apache/iceberg/pull/11692 This PR adds integration with [Glue extensions for Iceberg](https://github.com/awslabs/glue-extensions-for-iceberg/), to enable access to the new Glue multi-catalog hierarchy in the Amazon Sa

Re: [PR] Spark: Add view support to SparkSessionCatalog [iceberg]

2024-12-03 Thread via GitHub
danielcweeks merged PR #11388: URL: https://github.com/apache/iceberg/pull/11388

Re: [PR] Flink: Maintenance - RewriteDataFiles [iceberg]

2024-12-03 Thread via GitHub
pvary commented on code in PR #11497: URL: https://github.com/apache/iceberg/pull/11497#discussion_r1868114928 ## flink/v1.20/flink/src/main/java/org/apache/iceberg/flink/maintenance/operator/DataFileRewritePlanner.java: ## @@ -0,0 +1,228 @@ +/* + * Licensed to the Apache Softwa

Re: [PR] Flink: Maintenance - RewriteDataFiles [iceberg]

2024-12-03 Thread via GitHub
pvary commented on code in PR #11497: URL: https://github.com/apache/iceberg/pull/11497#discussion_r1868109673 ## flink/v1.20/flink/src/main/java/org/apache/iceberg/flink/maintenance/api/RewriteDataFiles.java: ## @@ -0,0 +1,232 @@ +/* + * Licensed to the Apache Software Foundati

Re: [PR] Flink: Maintenance - RewriteDataFiles [iceberg]

2024-12-03 Thread via GitHub
pvary commented on code in PR #11497: URL: https://github.com/apache/iceberg/pull/11497#discussion_r1868103159 ## flink/v1.20/flink/src/main/java/org/apache/iceberg/flink/maintenance/api/RewriteDataFiles.java: ## @@ -0,0 +1,232 @@ +/* + * Licensed to the Apache Software Foundati

Re: [PR] Flink: Maintenance - RewriteDataFiles [iceberg]

2024-12-03 Thread via GitHub
pvary commented on code in PR #11497: URL: https://github.com/apache/iceberg/pull/11497#discussion_r1868101342 ## flink/v1.20/flink/src/main/java/org/apache/iceberg/flink/maintenance/api/RewriteDataFiles.java: ## @@ -0,0 +1,232 @@ +/* + * Licensed to the Apache Software Foundati

Re: [PR] Flink: Maintenance - RewriteDataFiles [iceberg]

2024-12-03 Thread via GitHub
pvary commented on code in PR #11497: URL: https://github.com/apache/iceberg/pull/11497#discussion_r1868100304 ## flink/v1.20/flink/src/main/java/org/apache/iceberg/flink/maintenance/operator/TaskResultAggregator.java: ## @@ -0,0 +1,154 @@ +/* + * Licensed to the Apache Software

Re: [PR] Flink: Maintenance - RewriteDataFiles [iceberg]

2024-12-03 Thread via GitHub
pvary commented on code in PR #11497: URL: https://github.com/apache/iceberg/pull/11497#discussion_r1868098943 ## flink/v1.20/flink/src/main/java/org/apache/iceberg/flink/maintenance/operator/DataFileRewriteExecutor.java: ## @@ -0,0 +1,257 @@ +/* + * Licensed to the Apache Softw

Re: [PR] Flink: Maintenance - RewriteDataFiles [iceberg]

2024-12-03 Thread via GitHub
pvary commented on code in PR #11497: URL: https://github.com/apache/iceberg/pull/11497#discussion_r1868096866 ## flink/v1.20/flink/src/main/java/org/apache/iceberg/flink/maintenance/operator/DataFileRewriteCommitter.java: ## @@ -0,0 +1,304 @@ +/* + * Licensed to the Apache Soft

Re: [PR] Flink: Maintenance - RewriteDataFiles [iceberg]

2024-12-03 Thread via GitHub
pvary commented on code in PR #11497: URL: https://github.com/apache/iceberg/pull/11497#discussion_r1868095868 ## flink/v1.20/flink/src/main/java/org/apache/iceberg/flink/maintenance/operator/DataFileRewriteCommitter.java: ## @@ -0,0 +1,304 @@ +/* + * Licensed to the Apache Soft

Re: [PR] Flink: Maintenance - RewriteDataFiles [iceberg]

2024-12-03 Thread via GitHub
pvary commented on code in PR #11497: URL: https://github.com/apache/iceberg/pull/11497#discussion_r1868091732 ## flink/v1.20/flink/src/main/java/org/apache/iceberg/flink/maintenance/operator/DataFileRewriteCommitter.java: ## @@ -0,0 +1,304 @@ +/* + * Licensed to the Apache Soft

Re: [PR] Flink: Maintenance - RewriteDataFiles [iceberg]

2024-12-03 Thread via GitHub
pvary commented on code in PR #11497: URL: https://github.com/apache/iceberg/pull/11497#discussion_r1868087488 ## flink/v1.20/flink/src/main/java/org/apache/iceberg/flink/maintenance/operator/DataFileRewriteCommitter.java: ## @@ -0,0 +1,304 @@ +/* + * Licensed to the Apache Soft

Re: [PR] Flink: Maintenance - RewriteDataFiles [iceberg]

2024-12-03 Thread via GitHub
pvary commented on code in PR #11497: URL: https://github.com/apache/iceberg/pull/11497#discussion_r1868086690 ## flink/v1.20/flink/src/main/java/org/apache/iceberg/flink/maintenance/operator/DataFileRewriteCommitter.java: ## @@ -0,0 +1,304 @@ +/* + * Licensed to the Apache Soft
