Re: [PR] [catalog] Expose TableCommit construction to public [iceberg-rust]

2025-04-26 Thread via GitHub
dentiny commented on PR #1252: URL: https://github.com/apache/iceberg-rust/pull/1252#issuecomment-2832913608 > Hi, @dentiny You can use Transaction api to do that. Thank you @liurenjie1024 for the quick reply! Yeah I listed it as the second alternative in the PR description, the onl

Re: [I] Add more variants to `ErrorKind` [iceberg-rust]

2025-04-26 Thread via GitHub
Xuanwo commented on issue #1038: URL: https://github.com/apache/iceberg-rust/issues/1038#issuecomment-2833199052 Hi, @dentiny. I believe our current consensus aligns with your comments. I think we can start by adding the following variants: ``` pub enum ErrorKind { Une

Re: [D] Ideas: add directory support for `FileIO` [iceberg-rust]

2025-04-26 Thread via GitHub
GitHub user Xuanwo added a comment to the discussion: Ideas: add directory support for `FileIO` Hi, thank you @dentiny for starting this. The topic of a file-based catalog has come up multiple times throughout Iceberg's history and has been rejected for similar reasons: - Iceberg is designed

Re: [PR] Core, Puffin: Add DV file writer [iceberg]

2025-04-26 Thread via GitHub
dentiny commented on PR #11476: URL: https://github.com/apache/iceberg/pull/11476#issuecomment-2833112453 Hi team, may I ask a dummy question? I'm wondering if this is the official java implementation for deletion vector mentioned at https://iceberg.apache.org/puffin-spec/#deletion-vecto

[PR] Build: Bump jackson-bom from 2.18.3 to 2.19.0 [iceberg]

2025-04-26 Thread via GitHub
dependabot[bot] opened a new pull request, #12903: URL: https://github.com/apache/iceberg/pull/12903 Bumps `jackson-bom` from 2.18.3 to 2.19.0. Updates `com.fasterxml.jackson:jackson-bom` from 2.18.3 to 2.19.0 Commits https://github.com/FasterXML/jackson-bom/commit/077244b059

[PR] Build: Bump testcontainers from 1.20.6 to 1.21.0 [iceberg]

2025-04-26 Thread via GitHub
dependabot[bot] opened a new pull request, #12904: URL: https://github.com/apache/iceberg/pull/12904 Bumps `testcontainers` from 1.20.6 to 1.21.0. Updates `org.testcontainers:testcontainers` from 1.20.6 to 1.21.0 Release notes Sourced from https://github.com/testcontainers/testco

Re: [PR] Build: Bump software.amazon.awssdk:bom from 2.29.52 to 2.31.25 [iceberg]

2025-04-26 Thread via GitHub
dependabot[bot] commented on PR #12850: URL: https://github.com/apache/iceberg/pull/12850#issuecomment-2833076310 Superseded by #12905. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specifi

Re: [PR] Parquet variant array write [iceberg]

2025-04-26 Thread via GitHub
aihuaxu commented on code in PR #12847: URL: https://github.com/apache/iceberg/pull/12847#discussion_r2062057137 ## core/src/main/java/org/apache/iceberg/variants/ValueArray.java: ## @@ -127,4 +127,9 @@ private int writeTo(ByteBuffer buffer, int offset) { return (dataOffs

[PR] Build: Bump software.amazon.awssdk:bom from 2.29.52 to 2.31.30 [iceberg]

2025-04-26 Thread via GitHub
dependabot[bot] opened a new pull request, #12905: URL: https://github.com/apache/iceberg/pull/12905 Bumps software.amazon.awssdk:bom from 2.29.52 to 2.31.30. [![Dependabot compatibility score](https://dependabot-badges.githubapp.com/badges/compatibility_score?dependency-name=soft

[PR] Build: Bump org.apache.httpcomponents.client5:httpclient5 from 5.4.3 to 5.4.4 [iceberg]

2025-04-26 Thread via GitHub
dependabot[bot] opened a new pull request, #12906: URL: https://github.com/apache/iceberg/pull/12906 Bumps [org.apache.httpcomponents.client5:httpclient5](https://github.com/apache/httpcomponents-client) from 5.4.3 to 5.4.4. Changelog Sourced from https://github.com/apache/httpcom

Re: [PR] Build: Bump software.amazon.awssdk:bom from 2.29.52 to 2.31.25 [iceberg]

2025-04-26 Thread via GitHub
dependabot[bot] closed pull request #12850: Build: Bump software.amazon.awssdk:bom from 2.29.52 to 2.31.25 URL: https://github.com/apache/iceberg/pull/12850

[PR] Build: Bump nessie from 0.103.3 to 0.103.5 [iceberg]

2025-04-26 Thread via GitHub
dependabot[bot] opened a new pull request, #12902: URL: https://github.com/apache/iceberg/pull/12902 Bumps `nessie` from 0.103.3 to 0.103.5. Updates `org.projectnessie.nessie:nessie-client` from 0.103.3 to 0.103.5 Updates `org.projectnessie.nessie:nessie-jaxrs-testextension` from 0.

Re: [PR] Parquet variant array write [iceberg]

2025-04-26 Thread via GitHub
aihuaxu commented on PR #12847: URL: https://github.com/apache/iceberg/pull/12847#issuecomment-2833073669 > Looks good to me, maybe put the toString method for VariantArray in another PR as Ryan said? Thanks for reviewing. I need this to debug while working on this PR. So think of ke

Re: [D] REST catalog with S3T [iceberg-rust]

2025-04-26 Thread via GitHub
GitHub user dentiny edited a comment on the discussion: REST catalog with S3T My concern for the S3 table catalog is it doesn't implement `update_table`: https://github.com/apache/iceberg-rust/blob/5677d446dec1361459d42a6c3672d8ec605a289a/crates/catalog/s3tables/src/catalog.rs#L476-L485 which i

Re: [D] REST catalog with S3T [iceberg-rust]

2025-04-26 Thread via GitHub
GitHub user dentiny added a comment to the discussion: REST catalog with S3T Yes, the official document provides pyiceberg access example for rest catalog: https://docs.aws.amazon.com/AmazonS3/latest/userguide/s3-tables-integrating-open-source.html GitHub link: https://github.com/apache/iceber

Re: [D] REST catalog with S3T [iceberg-rust]

2025-04-26 Thread via GitHub
GitHub user liurenjie1024 added a comment to the discussion: REST catalog with S3T I'm surprised that `S3TableClient` didn't provide a table commit api, seems rest catalog is the only way to do it. GitHub link: https://github.com/apache/iceberg-rust/discussions/1239#discussioncomment-1295844

Re: [PR] [catalog] Expose TableCommit construction to public [iceberg-rust]

2025-04-26 Thread via GitHub
liurenjie1024 closed pull request #1252: [catalog] Expose TableCommit construction to public URL: https://github.com/apache/iceberg-rust/pull/1252

Re: [I] Add more variants to `ErrorKind` [iceberg-rust]

2025-04-26 Thread via GitHub
dentiny commented on issue #1038: URL: https://github.com/apache/iceberg-rust/issues/1038#issuecomment-2832921814 Hi team, I'm wondering if there're further discussions or conclusions for error type? One thing @Xuanwo mentioned in the error handling design philosophy is > The calle

Re: [I] repair_table (or similar) tool/procedure call for iceberg/spark [iceberg]

2025-04-26 Thread via GitHub
manuzhang commented on issue #12883: URL: https://github.com/apache/iceberg/issues/12883#issuecomment-2832941113 It looks like we can remove corrupted files after https://github.com/apache/iceberg/pull/12861

Re: [PR] feat: `validation_history` and `ancestors_between` [iceberg-python]

2025-04-26 Thread via GitHub
sungwy commented on code in PR #1935: URL: https://github.com/apache/iceberg-python/pull/1935#discussion_r2061932681 ## tests/table/test_validate.py: ## @@ -0,0 +1,88 @@ +# Licensed to the Apache Software Foundation (ASF) under one +# or more contributor license agreements. See

Re: [D] REST catalog with S3T [iceberg-rust]

2025-04-26 Thread via GitHub
GitHub user dentiny added a comment to the discussion: REST catalog with S3T My concern for the S3 table catalog is it doesn't implement `update_table`: https://github.com/apache/iceberg-rust/blob/5677d446dec1361459d42a6c3672d8ec605a289a/crates/catalog/s3tables/src/catalog.rs#L476-L485 which is

Re: [D] Ideas: add directory support for `FileIO` [iceberg-rust]

2025-04-26 Thread via GitHub
GitHub user dentiny edited a comment on the discussion: Ideas: add directory support for `FileIO` Thank you for the reply! I agree with your point; "directory" concepts, AFAIK, exist mostly on POSIX systems and GCS. Right now I'm implementing a filesystem-based catalog, which definitely requ

Re: [D] REST catalog with S3T [iceberg-rust]

2025-04-26 Thread via GitHub
GitHub user dentiny added a comment to the discussion: REST catalog with S3T I see... I'm sticking to the latest release version, and I guess for fast-developing repos I should switch to dev branch. GitHub link: https://github.com/apache/iceberg-rust/discussions/1239#discussioncomment-1295837

Re: [D] REST catalog with S3T [iceberg-rust]

2025-04-26 Thread via GitHub
GitHub user liurenjie1024 added a comment to the discussion: REST catalog with S3T Hi, there is a `S3TableCatalog`, could you have a try? GitHub link: https://github.com/apache/iceberg-rust/discussions/1239#discussioncomment-12958317 This is an automatically sent email for issues@iceber

Re: [D] REST catalog with S3T [iceberg-rust]

2025-04-26 Thread via GitHub
GitHub user liurenjie1024 added a comment to the discussion: REST catalog with S3T Oh, I see. The s3table catalog support in this repo has not been released yet, and will be included in the 0.5.0 release, see https://github.com/apache/iceberg-rust/blob/45312032665f21425215fe3cf26f53ca9c552a6b/crat

Re: [D] Ideas: add directory support for `FileIO` [iceberg-rust]

2025-04-26 Thread via GitHub
GitHub user dentiny edited a comment on the discussion: Ideas: add directory support for `FileIO` Thank you for the reply! I agree with your point; "directory" concepts, AFAIK, exist mostly on POSIX systems and GCS. Right now I'm implementing a filesystem-based catalog, which definitely requ

Re: [PR] feat: `validation_history` and `ancestors_between` [iceberg-python]

2025-04-26 Thread via GitHub
sungwy commented on code in PR #1935: URL: https://github.com/apache/iceberg-python/pull/1935#discussion_r2061935546 ## pyiceberg/table/update/validate.py: ## @@ -0,0 +1,71 @@ +# Licensed to the Apache Software Foundation (ASF) under one +# or more contributor license agreements

Re: [D] Ideas: add directory support for `FileIO` [iceberg-rust]

2025-04-26 Thread via GitHub
GitHub user liurenjie1024 added a comment to the discussion: Ideas: add directory support for `FileIO` cc @Xuanwo, who is more familiar with storage systems; maybe he could help you here. GitHub link: https://github.com/apache/iceberg-rust/discussions/1246#discussioncomment-12958351

Re: [D] REST catalog with S3T [iceberg-rust]

2025-04-26 Thread via GitHub
GitHub user dentiny edited a comment on the discussion: REST catalog with S3T Hi @liurenjie1024 , if I don't search it wrong, - this is the crate page: https://crates.io/crates/iceberg-s3tables-catalog/0.7.0 - this is the S3Table catalog implementation: https://github.com/JanKaul/iceberg-rust

Re: [D] REST catalog with S3T [iceberg-rust]

2025-04-26 Thread via GitHub
GitHub user dentiny added a comment to the discussion: REST catalog with S3T Hi @liurenjie1024 , if I don't search it wrong, - this is the crate page: https://crates.io/crates/iceberg-s3tables-catalog/0.7.0 - this is the S3Table catalog implementation: https://github.com/JanKaul/iceberg-rust/

Re: [PR] Scan Delete Support Part 4: Delete File Loading; Skeleton for Processing [iceberg-rust]

2025-04-26 Thread via GitHub
liurenjie1024 commented on code in PR #982: URL: https://github.com/apache/iceberg-rust/pull/982#discussion_r2061973945 ## crates/iceberg/src/arrow/delete_file_manager.rs: ## @@ -47,47 +60,533 @@ impl DeleteFileManager for CachingDeleteFileManager { )) } } +// Equ

Re: [D] Ideas: add directory support for `FileIO` [iceberg-rust]

2025-04-26 Thread via GitHub
GitHub user dentiny added a comment to the discussion: Ideas: add directory support for `FileIO` Thank you for the reply! I agree with your point; "directory" concepts, AFAIK, exist mostly on POSIX systems and GCS. Right now I'm implementing a filesystem-based catalog, which definitely requi

Re: [PR] Scan Delete Support Part 4: Delete File Loading; Skeleton for Processing [iceberg-rust]

2025-04-26 Thread via GitHub
liurenjie1024 commented on PR #982: URL: https://github.com/apache/iceberg-rust/pull/982#issuecomment-2832925165 > Actually @liurenjie1024 I'll go ahead with the structural changes to split this into separate loader and filter structs and update this PR. Sorry for late reply. I'm fine

Re: [I] Feature request: Allow pagination on list operations [iceberg-rust]

2025-04-26 Thread via GitHub
liurenjie1024 commented on issue #1251: URL: https://github.com/apache/iceberg-rust/issues/1251#issuecomment-2832916835 Thanks @dentiny , this looks good to me. [Rest catalog spec](https://github.com/apache/iceberg/blob/8ed1c216503b7193924ca57bd2694025660ac02c/open-api/rest-catalog-open-api.

Re: [D] Ideas: add directory support for `FileIO` [iceberg-rust]

2025-04-26 Thread via GitHub
GitHub user liurenjie1024 added a comment to the discussion: Ideas: add directory support for `FileIO` Thanks @dentiny for raising this. `FileIO` is an abstraction over different storage systems, and we should only provide a minimum interface so that most storage systems can fit into it. As w

Re: [PR] [catalog] Expose TableCommit construction to public [iceberg-rust]

2025-04-26 Thread via GitHub
liurenjie1024 commented on PR #1252: URL: https://github.com/apache/iceberg-rust/pull/1252#issuecomment-2832911037 > Hi @liurenjie1024 , thank you for the review! I appreciate your effort to add documentation about the visibility as well; The main reason I opened this PR is easier testing f

Re: [PR] Add more error types [iceberg-rust]

2025-04-26 Thread via GitHub
liurenjie1024 commented on PR #1250: URL: https://github.com/apache/iceberg-rust/pull/1250#issuecomment-2832913401 See https://github.com/apache/iceberg-rust/issues/1249#issuecomment-2832907748 I'll close this pr first before we have a clear conclusion and design.

Re: [PR] Add more error types [iceberg-rust]

2025-04-26 Thread via GitHub
liurenjie1024 closed pull request #1250: Add more error types URL: https://github.com/apache/iceberg-rust/pull/1250

Re: [PR] Parquet variant array write [iceberg]

2025-04-26 Thread via GitHub
aihuaxu commented on code in PR #12847: URL: https://github.com/apache/iceberg/pull/12847#discussion_r2061957794 ## parquet/src/main/java/org/apache/iceberg/parquet/ParquetVariantWriters.java: ## @@ -360,6 +381,92 @@ public void setColumnStore(ColumnWriteStore columnStore) {

Re: [I] [Feature request] Add more error types for iceberg [iceberg-rust]

2025-04-26 Thread via GitHub
dentiny commented on issue #1249: URL: https://github.com/apache/iceberg-rust/issues/1249#issuecomment-2832911450 > Hi, we already have some discussions in [#1038](https://github.com/apache/iceberg-rust/issues/1038) . We could continue discussion there, and close this for now. Feel free to

Re: [PR] [easy] Add comment on non-existent namespace/table at drop [iceberg-rust]

2025-04-26 Thread via GitHub
liurenjie1024 merged PR #1245: URL: https://github.com/apache/iceberg-rust/pull/1245

Re: [PR] [catalog] Expose TableCommit construction to public [iceberg-rust]

2025-04-26 Thread via GitHub
dentiny commented on PR #1252: URL: https://github.com/apache/iceberg-rust/pull/1252#issuecomment-2832908296 Hi @liurenjie1024 , thank you for the review! I appreciate your effort to add documentation about the visibility as well; The main reason I opened this PR is easier testing for my

Re: [I] [Feature request] Add more error types for iceberg [iceberg-rust]

2025-04-26 Thread via GitHub
liurenjie1024 closed issue #1249: [Feature request] Add more error types for iceberg URL: https://github.com/apache/iceberg-rust/issues/1249

Re: [I] [Feature request] Add more error types for iceberg [iceberg-rust]

2025-04-26 Thread via GitHub
liurenjie1024 commented on issue #1249: URL: https://github.com/apache/iceberg-rust/issues/1249#issuecomment-2832907748 Hi, we already have some discussions in #1038 . We could continue discussion there, and close this for now. Feel free to reopen it if you still feel necessary.

Re: [PR] feat: `validation_history` and `ancestors_between` [iceberg-python]

2025-04-26 Thread via GitHub
sungwy commented on code in PR #1935: URL: https://github.com/apache/iceberg-python/pull/1935#discussion_r2061931523 ## tests/table/test_validate.py: ## @@ -0,0 +1,88 @@ +# Licensed to the Apache Software Foundation (ASF) under one +# or more contributor license agreements. See

Re: [PR] feat: add file reader interface [iceberg-cpp]

2025-04-26 Thread via GitHub
wgtmac commented on code in PR #88: URL: https://github.com/apache/iceberg-cpp/pull/88#discussion_r2061940987 ## src/iceberg/file_reader.h: ## @@ -0,0 +1,142 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See th

[I] Add doc to explain why `TableCommit` should be private. [iceberg-rust]

2025-04-26 Thread via GitHub
liurenjie1024 opened a new issue, #1262: URL: https://github.com/apache/iceberg-rust/issues/1262 ### Is your feature request related to a problem or challenge? Currently the `TableCommit` constructor is marked as crate private for some reason. Users not familiar with this design want to ma

Re: [PR] [catalog] Expose TableCommit construction to public [iceberg-rust]

2025-04-26 Thread via GitHub
liurenjie1024 commented on PR #1252: URL: https://github.com/apache/iceberg-rust/pull/1252#issuecomment-2832878219 I'll close this pr first, and add doc to explain why we mark it as private. Feel free to reopen it if you feel necessary.

Re: [PR] [catalog] Expose TableCommit construction to public [iceberg-rust]

2025-04-26 Thread via GitHub
liurenjie1024 commented on PR #1252: URL: https://github.com/apache/iceberg-rust/pull/1252#issuecomment-2832877231 Hi @dentiny The build of `TableCommit` is marked as private on purpose. `TableCommit` should be constructed by the transaction api rather than by the user directly, which is error prone.

Re: [PR] feat: `validation_history` and `ancestors_between` [iceberg-python]

2025-04-26 Thread via GitHub
sungwy commented on PR #1935: URL: https://github.com/apache/iceberg-python/pull/1935#issuecomment-2832872903 Hi @jayceslesar ! Thank you for pressing on with this PR! This looks almost good to merge. I share the same concerns with @Fokko regarding the validation check - I fear that it's cu

Re: [PR] feat: `validation_history` and `ancestors_between` [iceberg-python]

2025-04-26 Thread via GitHub
sungwy commented on code in PR #1935: URL: https://github.com/apache/iceberg-python/pull/1935#discussion_r2061928860 ## pyiceberg/table/update/validate.py: ## @@ -0,0 +1,71 @@ +# Licensed to the Apache Software Foundation (ASF) under one +# or more contributor license agreements

Re: [PR] feat: add file reader interface [iceberg-cpp]

2025-04-26 Thread via GitHub
lidavidm commented on code in PR #88: URL: https://github.com/apache/iceberg-cpp/pull/88#discussion_r2061916317 ## src/iceberg/file_reader.h: ## @@ -0,0 +1,142 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See

Re: [PR] OpenAPI: Use more clear language in recommending error responses [iceberg]

2025-04-26 Thread via GitHub
sungwy commented on PR #12376: URL: https://github.com/apache/iceberg/pull/12376#issuecomment-2832831939 > Hi @sungwy, are you still working on it? Hi @flyrain I'm still waiting on an approval to merge this in. Is the direction that we want to make it more clear that we are dep

Re: [PR] API: Follow up on adding Variant data type to implement sanitizing fo… [iceberg]

2025-04-26 Thread via GitHub
github-actions[bot] commented on PR #12611: URL: https://github.com/apache/iceberg/pull/12611#issuecomment-2832810951 This pull request has been marked as stale due to 30 days of inactivity. It will be closed in 1 week if no further activity occurs. If you think that’s incorrect or this pul

Re: [I] Nessie should throw a NoSuchNamespaceException when listing a non-existing namespace [iceberg]

2025-04-26 Thread via GitHub
coderfender commented on issue #12875: URL: https://github.com/apache/iceberg/issues/12875#issuecomment-2832791979 @akshatmardia sorry just saw your comment. Raised a Draft PR while I update upstream test cases unless @akshatmardia you have already made some progress in which case I can pi

Re: [PR] Spec: Add details on GZIP compressed metadata files [iceberg]

2025-04-26 Thread via GitHub
emkornfield commented on PR #12598: URL: https://github.com/apache/iceberg/pull/12598#issuecomment-2832775953 I'll start a vote

Re: [I] 0.9.1 release tracking [iceberg-python]

2025-04-26 Thread via GitHub
kevinjqliu closed issue #1849: 0.9.1 release tracking URL: https://github.com/apache/iceberg-python/issues/1849

Re: [I] 0.9.1 release tracking [iceberg-python]

2025-04-26 Thread via GitHub
kevinjqliu commented on issue #1849: URL: https://github.com/apache/iceberg-python/issues/1849#issuecomment-2832743813 devlist discussion, https://lists.apache.org/thread/gbl7bh43fcd0lzjxct6g8zmtpnrjx459 0.9.1rc1, https://lists.apache.org/thread/tbr90g0kx201dj5323hwqyd6rbkjtjyc

Re: [PR] Pyarrow data type, default to small type and fix large type override [iceberg-python]

2025-04-26 Thread via GitHub
kevinjqliu commented on code in PR #1859: URL: https://github.com/apache/iceberg-python/pull/1859#discussion_r2061808804 ## pyiceberg/io/pyarrow.py: ## @@ -626,7 +626,7 @@ def field(self, field: NestedField, field_result: pa.DataType) -> pa.Field: def list(self, list_typ

Re: [I] Caused by: java.lang.ClassCastException: class org.apache.iceberg.shaded.org.apache.parquet.schema.Messa [iceberg]

2025-04-26 Thread via GitHub
mcagriaktas commented on issue #12846: URL: https://github.com/apache/iceberg/issues/12846#issuecomment-2832635916 Hello, flink=1.20.0 iceberg=1.8.1 If you get the error: ```text Caused by: java.lang.ClassCastException: class org.apache.iceberg.shaded.org.apache.parque

Re: [PR] Add all filles metadata tables [iceberg-python]

2025-04-26 Thread via GitHub
soumya-ghosh commented on code in PR #1626: URL: https://github.com/apache/iceberg-python/pull/1626#discussion_r2061539548 ## pyiceberg/table/inspect.py: ## @@ -523,7 +523,62 @@ def history(self) -> "pa.Table": return pa.Table.from_pylist(history, schema=history_schem

Re: [PR] OpenAPI: Use more clear language in recommending error responses [iceberg]

2025-04-26 Thread via GitHub
flyrain commented on PR #12376: URL: https://github.com/apache/iceberg/pull/12376#issuecomment-2832630055 Hi @sungwy, are you still working on it?

Re: [PR] refactor partition_summary_limit into SnapshotSummaryCollector constr… [iceberg-python]

2025-04-26 Thread via GitHub
Fokko commented on code in PR #1940: URL: https://github.com/apache/iceberg-python/pull/1940#discussion_r2061602570 ## pyiceberg/table/snapshots.py: ## @@ -272,10 +272,10 @@ class SnapshotSummaryCollector: partition_metrics: DefaultDict[str, UpdateMetrics] max_changed_

Re: [I] [feat] add missing metadata tables [iceberg-python]

2025-04-26 Thread via GitHub
soumya-ghosh commented on issue #1053: URL: https://github.com/apache/iceberg-python/issues/1053#issuecomment-2832536611 @kevinjqliu I agree with your point that while time-travelling to older snapshots, metadata tables should adhere to schema as of that snapshot. > I did a test to see

Re: [PR] [Spark]Add max files rewrite option for RewriteAction [iceberg]

2025-04-26 Thread via GitHub
coderfender commented on code in PR #12824: URL: https://github.com/apache/iceberg/pull/12824#discussion_r2061052784 ## spark/v3.5/spark/src/main/java/org/apache/iceberg/spark/actions/RewriteDataFilesSparkAction.java: ## @@ -407,15 +409,37 @@ private Builder doExecuteWithPartial

Re: [PR] feat: enable arrow to build parquet [iceberg-cpp]

2025-04-26 Thread via GitHub
wgtmac commented on code in PR #89: URL: https://github.com/apache/iceberg-cpp/pull/89#discussion_r2061397093 ## cmake_modules/IcebergThirdpartyToolchain.cmake: ## @@ -82,13 +84,15 @@ function(resolve_arrow_dependency) ON CACHE BOOL "" FORCE) set(ARROW_DEPENDENC

Re: [PR] refactor partition_summary_limit into SnapshotSummaryCollector constr… [iceberg-python]

2025-04-26 Thread via GitHub
stevie9868 commented on code in PR #1940: URL: https://github.com/apache/iceberg-python/pull/1940#discussion_r2061391541 ## pyiceberg/table/snapshots.py: ## @@ -272,10 +272,10 @@ class SnapshotSummaryCollector: partition_metrics: DefaultDict[str, UpdateMetrics] max_cha

Re: [PR] refactor partition_summary_limit into SnapshotSummaryCollector constr… [iceberg-python]

2025-04-26 Thread via GitHub
stevie9868 commented on PR #1940: URL: https://github.com/apache/iceberg-python/pull/1940#issuecomment-2832299637 I have rebased and fixed the test, thanks!

Re: [PR] Added ExpireSnapshots Feature [iceberg-python]

2025-04-26 Thread via GitHub
ForeverAngry commented on code in PR #1880: URL: https://github.com/apache/iceberg-python/pull/1880#discussion_r2061325688 ## pyiceberg/table/update/snapshot.py: ## @@ -55,6 +55,7 @@ from pyiceberg.partitioning import ( PartitionSpec, ) +from pyiceberg.table.refs import S

Re: [I] Caused by: java.lang.ClassCastException: class org.apache.iceberg.shaded.org.apache.parquet.schema.Messa [iceberg]

2025-04-26 Thread via GitHub
mcagriaktas commented on issue #12846: URL: https://github.com/apache/iceberg/issues/12846#issuecomment-2832219920 Same error for me! heres my iceberg jars: ```bash #!/bin/bash FLINK_LIB_DIR=/opt/flink/lib FLINK_VERSION=1.20.0 HADOOP_VERSION=3.4.1 ICEBERG_VERSION

Re: [I] org.apache.thrift.TApplicationException: Invalid method name: 'get_table' [iceberg]

2025-04-26 Thread via GitHub
manuzhang commented on issue #12878: URL: https://github.com/apache/iceberg/issues/12878#issuecomment-2832184174 We don't support hive 4 metastore yet.

Re: [PR] Added ExpireSnapshots Feature [iceberg-python]

2025-04-26 Thread via GitHub
ForeverAngry commented on code in PR #1880: URL: https://github.com/apache/iceberg-python/pull/1880#discussion_r2061300464 ## tests/integration/test_partition_evolution.py: ## @@ -140,6 +140,14 @@ def test_add_hour(catalog: Catalog) -> None: _validate_new_partition_fields(t

Re: [PR] Added ExpireSnapshots Feature [iceberg-python]

2025-04-26 Thread via GitHub
ForeverAngry commented on code in PR #1880: URL: https://github.com/apache/iceberg-python/pull/1880#discussion_r2061298694 ## pyiceberg/table/update/snapshot.py: ## @@ -843,3 +849,64 @@ def remove_branch(self, branch_name: str) -> ManageSnapshots: This for method c

Re: [PR] Added ExpireSnapshots Feature [iceberg-python]

2025-04-26 Thread via GitHub
ForeverAngry commented on code in PR #1880: URL: https://github.com/apache/iceberg-python/pull/1880#discussion_r2061298441 ## tests/expressions/test_literals.py: ## @@ -744,7 +744,7 @@ def test_invalid_decimal_conversions() -> None: def test_invalid_string_conversions() -> None

Re: [PR] Flink: Backport add StreamingStartingStrategy.INCREMENTAL_FROM_LATEST_SNAPSHOT_EXCLUSIVE to Flink 1.19 [iceberg]

2025-04-26 Thread via GitHub
morhidi commented on PR #12899: URL: https://github.com/apache/iceberg/pull/12899#issuecomment-2832106399 > @Guosmilesmile Absolutely, there was a bit of race condition with merging the original PR and the Flink 2.0 PR. Thanks for pointing that out! > > @morhidi Could you also include

Re: [PR] Flink: Add lockFactory open in LockRemover [iceberg]

2025-04-26 Thread via GitHub
Guosmilesmile commented on PR #12900: URL: https://github.com/apache/iceberg/pull/12900#issuecomment-2832054333 Since it's a trivial change, I put the changes in all Flink versions. If needed, I can split it into individual PRs.

[PR] Flink: Add lockFactory open in LockRemover [iceberg]

2025-04-26 Thread via GitHub
Guosmilesmile opened a new pull request, #12900: URL: https://github.com/apache/iceberg/pull/12900 In LockRemover, I noticed we missed `lockFactory.open()` before `this.lock = lockFactory.createLock()`. This could lead to an NPE since the factory isn't initialized (similar to the pool in jd

Re: [PR] Docs: Incorrect property in CREATE CATALOG for Flink [iceberg]

2025-04-26 Thread via GitHub
mrsubhash commented on code in PR #12894: URL: https://github.com/apache/iceberg/pull/12894#discussion_r2061221293 ## docs/docs/aws.md: ## @@ -84,7 +84,7 @@ With those dependencies, you can create a Flink catalog like the following: CREATE CATALOG my_catalog WITH ( 'type'='

Re: [PR] Flink: Backport add StreamingStartingStrategy.INCREMENTAL_FROM_LATEST_SNAPSHOT_EXCLUSIVE to Flink 1.19 [iceberg]

2025-04-26 Thread via GitHub
mxm commented on PR #12899: URL: https://github.com/apache/iceberg/pull/12899#issuecomment-2831946767 @Guosmilesmile Absolutely, there was a bit of race condition with merging the original PR and the Flink 2.0 PR. Thanks for pointing that out! @morhidi Could you also include the Flink

Re: [I] load_table showing old table schema [iceberg-python]

2025-04-26 Thread via GitHub
guptaakashdeep commented on issue #1948: URL: https://github.com/apache/iceberg-python/issues/1948#issuecomment-2831943321 @scottjarman you can see all the metadata changes that have happened on a table using the Iceberg metadata table called `metadata_log_entries`. In PyIceberg, you can see th

Re: [PR] Flink: Backport add StreamingStartingStrategy.INCREMENTAL_FROM_LATEST_SNAPSHOT_EXCLUSIVE to Flink 1.19 [iceberg]

2025-04-26 Thread via GitHub
Guosmilesmile commented on PR #12899: URL: https://github.com/apache/iceberg/pull/12899#issuecomment-2831935716 Hi, @mxm I have a small question regarding future backports. Since Flink 2.0 has been introduced, should we also prepare new backports for the 2.0 version going forward?